RFC: clang-tidy readability check to reduce clutter: unnecessary use of auto and ->

[Please reply *only* to the list and do not include my email directly
in the To: or Cc: of your reply; otherwise I will not see your reply.
Thanks.]

Hello tidy programmers,

There seems to be a certain breed of programmer that once having
learned a new feature of C++ intends to get out their new feature
hammer and bang on everything in sight.

Case in point:

N2541 <http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2541.htm>
introduced auto and the trailing return type syntax.

Now people are using auto and -> on function signatures where the
return type is stated explicitly anyway. This results in code like:

auto main() -> int
{
  return 0;
}

Srsly? Is this really improving anything over

int main()
{
  return 0;
}

To me this is introducing unnecessary clutter and isn't the reason why
auto and -> for function return types were introduced.

I would like to propose a clang-tidy readability check that turns the
former into the latter when the return type doesn't use anything that
needs auto. In other words, this code would be left alone:

template <typename T, typename U>
auto add(T x, U y) -> decltype(x + y)
{
  return x + y;
}

There is one note in N2541 that states how the new syntax eliminates
an ambiguity with this example:

auto f() -> int (*)[4];
// function returning a pointer to array[4] of int,
// not function returning array[4] of pointer to int.

This seems to be another case that should be left unchanged.
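
As an illustration of the declarator point above, here is a minimal sketch (not taken from N2541; the names f_classic and f_trailing are invented for this example). It asks the compiler to confirm that the classic spelling and the trailing spelling declare the same function type, which suggests a check could safely leave the trailing form alone in this case.

```cpp
#include <type_traits>

// Classic declarator syntax: the return type wraps around the function name.
int (*f_classic())[4];

// Trailing return type: the same type, read left to right.
auto f_trailing() -> int (*)[4];

// Both are "function returning pointer to array[4] of int".
static_assert(std::is_same<decltype(f_classic), decltype(f_trailing)>::value,
              "both spellings declare the same function type");
static_assert(std::is_same<decltype(f_classic()), int (*)[4]>::value,
              "the return type is a pointer to int[4]");
```

The functions are only declared, never defined, because they are used solely inside decltype.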

Proposed check name: readability-redundant-auto-return-type

What are your thoughts?

There seems to be a certain breed of programmer that once having
learned a new feature of C++ intends to get out their new feature
hammer and bang on everything in sight.

I'm not a native English speaker, but apparently "breed" is mostly
reserved for plants and animals.
It doesn't read that nicely, IMHO. Perhaps avoid such a term for
programmers in the future. My $0.02.

[... ] In other words, this code would be left alone:

template <typename T, typename U>
auto add(T x, U y) -> decltype(x + y)
{
  return x + y;
}

There is one note in N2541 that states how the new syntax eliminates
an ambiguity with this example:

auto f() -> int (*)[4];
// function returning a pointer to array[4] of int,
// not function returning array[4] of pointer to int.

This seems to be another case that should be left unchanged.

In

struct Outer {
  struct Inner {};
  Inner getInner() { ... }
};

Instead of writing

Outer::Inner Outer::getInner() { ... }

I noticed that I can now write

auto Outer::getInner() -> Inner { ... }

If Outer is a long type name, not having to repeat it is kinda nice.
Of course client code must use Outer::Inner, unless it itself uses auto...

Similarly, in

template <typename TPrivilege>
auto PrivilegeSet<TPrivilege>::insert(const value_type& val)
  -> std::pair<iterator, bool>
{ ... }

iterator is a nested type, and not having to qualify it, nor using
typename, thanks to trailing return, is "nice" IMHO.
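
A compilable sketch of the two cases above, under stated assumptions: SmallSet is a hypothetical stand-in for PrivilegeSet (whose definition is not shown in this thread), and the function bodies are invented so that the example links and runs.

```cpp
#include <cassert>
#include <utility>
#include <vector>

struct Outer {
  struct Inner { int value; };
  Inner getInner();
};

// After the declarator-id Outer::getInner, names are looked up in the scope
// of Outer, so the trailing return type can say Inner, not Outer::Inner.
auto Outer::getInner() -> Inner { return Inner{42}; }

// Hypothetical stand-in for the PrivilegeSet template mentioned above.
template <typename T>
class SmallSet {
public:
  using value_type = T;
  using iterator = typename std::vector<T>::iterator;
  auto insert(const value_type& val) -> std::pair<iterator, bool>;
private:
  std::vector<T> items_;
};

// The classic syntax would need the return type spelled as
//   std::pair<typename SmallSet<T>::iterator, bool>
template <typename T>
auto SmallSet<T>::insert(const value_type& val) -> std::pair<iterator, bool> {
  for (auto it = items_.begin(); it != items_.end(); ++it)
    if (*it == val) return {it, false};  // already present
  items_.push_back(val);
  return {items_.end() - 1, true};       // newly inserted
}

// Small usage check for the sketch.
inline void check_nested_names() {
  Outer o;
  assert(o.getInner().value == 42);
  SmallSet<int> s;
  assert(s.insert(7).second);
  assert(!s.insert(7).second);
}
```

A check that rewrites trailing return types into the classic form would have to add the Outer:: qualification and the typename keyword in cases like these.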

So am I "one of those people" with the above, or is this acceptable to you?
--DD

While this seems like a good idea, perhaps the tidy check should examine the complexity of the type being "auto"ed. The new return value syntax seems good for moving a long type to the end of a function declaration to help code clarity, even if it is not strictly necessary.

auto f() -> my_super_duper<long,long,long,long,long> {
  ...
}

For a simple type name (e.g., int or Airplane) the proposed refactoring seems okay; beyond that it seems less of a clarity win to me.

In article <CAFdkBy5EvyngcLBraxjd1Qq+XQAsy_QWPzdinsMfTj_ODG6eVw@mail.gmail.com>,
    Tim Halloran via cfe-dev <cfe-dev@lists.llvm.org> writes:

While this seems like a good idea, perhaps the tidy it should examine the
complexity of the type being "auto"ed. The new return value syntax seems
good to move a long type to the end of a function declaration to help code
clarity, even if it is not strictly necessary.

auto f() -> my_super_duper<long,long,long,long,long> {
  ...
}

A simple type name (e.g., int or Airplane) seems okay to do the proposed
refactoring, beyond that seems less of a clarity win to me.

If I understand you correctly, you're saying that really long
complicated types hinder readability.

I agree.

I don't see how moving the really long difficult to understand type
name from the front to the end increases readability. The source of
the difficulty in readability is the long type name, not its
placement. The traditional solution to this problem of legibility is
to use a typedef:

typedef my_super_duper<long,long,long,long,long> super_duper_t;

super_duper_t f() {
...
}

Not only are f's declaration and definition going to need this
difficult to understand templated type instantiation, but it is likely
that f's callers will also need it. If my function g returns the result
of f, then I also need this ugly type.

An alternative solution to a typedef is to wrap the type in an
intention revealing name that expresses the concept more directly.
For instance, it is very common to use a Point class to directly
represent 2D points, even though we could have used std::pair<float,float>.
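
A minimal sketch of that second alternative, assuming a hypothetical Point type and midpoint function invented for this example:

```cpp
#include <utility>

// A bare pair works, but callers must remember that .first is x and .second is y.
using RawPoint = std::pair<float, float>;

// Hypothetical intention-revealing alternative: the members name the concept.
struct Point {
  float x;
  float y;
};

// Signatures that use Point stay short whether the return type is written
// first or trailing.
constexpr Point midpoint(Point a, Point b) {
  return Point{(a.x + b.x) / 2, (a.y + b.y) / 2};
}

static_assert(midpoint(Point{0.0f, 0.0f}, Point{2.0f, 4.0f}).x == 1.0f,
              "x of the midpoint");
```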

Hi,

[...]

so?

no, but it is not worse either.

it is more uniform, and for mathematically inclined people it is also
more natural. just because it is a new syntax it doesn't make it more
cluttered or anything.

Herb Sutter recommends this style in his Almost Always Auto article:

I actually like the style especially when declaring interfaces:

struct X {
  auto foo(int) -> bool;
  auto bar(double, Y) -> Z;
  ...
};

because it naturally aligns the most crucial information, the function
name at the beginning of the line and mostly at the same column.

certainly not, but it is a nice side effect.

I was thinking of exactly the opposite, turn all "classic" declarations
into the new style :-)

I guess this is really a matter of taste. I would hope that clang-tidy
(or would this be in the realm of clang-format?) would offer both options.

Best
Fabio

In article <563DC703.4000507@gmx.net>,
    Fabio Fracassi via cfe-dev <cfe-dev@lists.llvm.org> writes:

Hi,

[...]
> N2541 <http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2541.htm>
> introduced auto and the trailing return type syntax.
>
> Now people are using auto and -> on function signatures where the
> return type is stated explicitly anyway. This results in code like:
>
> auto main() -> int
> {
> return 0;
> }
so?

> Srsly? Is this really improving anything over
>
> int main()
> {
> return 0;
> }
no, but it is not worse either.

> To me this is introducing unnecessary clutter
it is more uniform,

It *is* more clutter in the sense that it introduces unnecessary tokens.

clang-tidy's modernize transformation that uses override eliminates the
unnecessary keyword virtual because it is clutter that isn't needed.
One could argue against this saying "it is more uniform" to use virtual
on all virtual methods, even those with the keyword override.

Similarly, one can argue for uniformity in using 'void f(void);'
instead of 'void f();' but here the extra 'void' in the argument list
adds nothing and is merely clutter. Command methods (as in
command/query separation; google it if you're unfamiliar with the
design principle) return void. Writing 'auto f() -> void;' obscures
that this is a command until I get to the end of the declaration. In
fact writing 'auto' puts me immediately in the mindset of "this
function is going to return a value" only to make me change my mind
when I reach the end and learn "oh, the value is void."
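
To make the override comparison concrete, here is a small sketch. The class names are invented for illustration, and the check being referred to is presumably modernize-use-override. Both derived classes behave identically; the only difference is the redundant virtual token.

```cpp
#include <cassert>

struct Command {
  virtual ~Command() = default;
  virtual int run() { return 0; }
};

struct Verbose : Command {
  // 'virtual' is redundant here: 'override' already requires that a virtual
  // function in the base class is being overridden.
  virtual int run() override { return 1; }
};

struct Tidy : Command {
  // The same declaration with the redundant keyword removed.
  int run() override { return 2; }
};

// Calls through the base class to show that both overrides dispatch the same way.
inline int dispatch(Command& c) { return c.run(); }

inline void check_dispatch() {
  Verbose v;
  Tidy t;
  assert(dispatch(v) == 1);
  assert(dispatch(t) == 2);
}
```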

and for mathematically inclined people it is also
more natural.

I didn't know you were appointed spokesman for all mathematically
inclined people? Seriously, I'm a mathematically inclined person and
I don't find this more natural.

just because it is a new syntax it doesn't make it more cluttered or
anything.

To me, clutter is adding extra stuff in that's not necessary like
(void) argument lists or gratuitous use of auto.

Herb Sutter recommends this style in his Almost Always Auto article:

I assume you are talking about this:
<http://herbsutter.com/2013/08/12/gotw-94-solution-aaa-style-almost-always-auto/>

That article is from 2013 and in subsequent talks on YouTube I've
seen people comment that "whacking everything with an auto hammer"
doesn't yield the benefits that were claimed at the time. Furthermore,
just because Herb Sutter likes a certain style doesn't mean that I do.

With clang-tidy we're not talking about the compiler creating a warning
or error because you use a certain code construct. clang-tidy is
like clang-format. You don't like LLVM code formatting conventions?
Nothing's forcing you to use them.

> and isn't the reason why
> auto and -> for function return types were introduced.
certainly not, but it is a nice side effect.

I read the paper at the link I posted before sending the email.
While the author of the paper doesn't state a rationale or design
motivation for the paper -- because it is simply a list of diffs to be
applied to the standard -- I think it is pretty clear what is motivating
the new syntax because of the examples. I suppose if we really want to
settle this fine point we can email the author Jason Merrill and ask him.
I haven't attempted to do so.

[Please reply *only* to the list and do not include my email directly
in the To: or Cc: of your reply; otherwise I will not see your reply.
Thanks.]

I thought I did, I explicitly deleted your address from the to field, ...

In article <563DC703.4000507@gmx.net>,
     Fabio Fracassi via cfe-dev <cfe-dev@lists.llvm.org> writes:

Hi,

[...]

N2541 <http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2541.htm>
introduced auto and the trailing return type syntax.

Now people are using auto and -> on function signatures where the
return type is stated explicitly anyway. This results in code like:

auto main() -> int
{
    return 0;
}

so?

Srsly? Is this really improving anything over

int main()
{
    return 0;
}

no, but it is not worse either.

To me this is introducing unnecessary clutter

it is more uniform,

It *is* more clutter in the sense that it introduces unnecessary tokens.

From this perspective you are right in calling it clutter.

[...]
design principle) return void. Writing 'auto f() -> void;' obscures
that this is a command until I get to the end of the declaration. In
fact writing 'auto' puts me immediately in the mindset of "this
function is going to return a value" only to make me change my mind
when I reach the end and learn "oh, the value is void."

Actually for exactly this reason I make an "exception" when using void, because void is special.
And since void, like auto has exactly four characters it still fits nicely into the uniformity:

auto f(int) -> bool; // reads: I have a return value ... and it is bool
auto f() { return ...; } // reads: I have a return value ... and it is whatever I return
void g(); // reads: I have no return value
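
A compilable variant of the three declarations above, as a sketch: the function names and bodies are invented here, and the deduced-return-type form needs C++14 or later.

```cpp
#include <type_traits>

auto is_nonzero(int x) -> bool { return x != 0; }  // explicit trailing return type
auto answer() { return 42; }                       // return type deduced from the body (C++14)
void reset() {}                                    // no return value

static_assert(std::is_same<decltype(is_nonzero(0)), bool>::value,
              "trailing return type is bool");
static_assert(std::is_same<decltype(answer()), int>::value,
              "deduced return type is int");
static_assert(std::is_void<decltype(reset())>::value,
              "reset returns nothing");
```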

and for mathematically inclined people it is also
more natural.

I didn't know you were appointed spokesman for all mathematically
inclined people?

Sorry, this was not my intention. IMO the "naturalness" is there because the mathematical notation of a function is
f: param1_t, param2_t -> return_t
which is closer to
auto f(param1_t, param2_t) -> return_t

just because it is a new syntax it doesn't make it more cluttered or
anything.

To me, clutter is adding extra stuff in that's not necessary like
(void) argument lists or gratuitous use of auto.

To me clutter is something that hinders readability without adding information.
I think these extra tokens help rather than hinder readability (even if they do not add information), so I objected to the negative connotation of "clutter".

Herb Sutter recommends this style in his Almost Always Auto article:

I assume you are talking about this:
<http://herbsutter.com/2013/08/12/gotw-94-solution-aaa-style-almost-always-auto/>

That article is from 2013 and in subsequent talks on YouTube I've
seen people comment that "whacking everything with an auto hammer"
doesn't yield the benefits that were claimed at the time. Furthermore,
just because Herb Sutter likes a certain style doesn't mean that I do.

Sure, I just wanted to show that there are well reasoned people out there that think differently about this topic.

This is ultimately a stylistic issue, and thus a matter of taste, which is hard to discuss. There are (more or less good) reasons to use both styles, and you can find (more or less good) reasons against each.

With clang-tidy we're not talking about the compiler creating a warning
or error because you use a certain code construct. clang-tidy is
like clang-format. You don't like LLVM code formatting conventions?
Nothing's forcing you to use them.

I wasn't aware that clang-tidy and clang-format were there to (exclusively) enforce LLVM's style; I was under the impression that they were general tools to clean up code bases to their individual styles.
I didn't want to make any comment on what style LLVM should use, just that I would hope that clang-tidy would support (or at least be open to support if someone bothers to write a patch) both, so that it is more generally useful.

Sorry for the noise,
I do appreciate the great work you guys are doing on clang and the nice clang tools.

Best
Fabio

Just as a side note, clang-tidy supports many checks that are not applicable to llvm, e.g. in the cert, google and cppcoreguideline groups.

+1

People will read code more than they write it, and finding
where functions are declared is easier if their names are
in the same column. After all, people are used to this when
looking in a dictionary, where the names are always in the
far left column, and declarations do form a sort of dictionary
of names. That's also why I prefer:

  using T = //some complicated type expression
    ;
rather than:

  typedef //some complicated type expression using symbol T
           //where the name of the typedef name appears.
    ;
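
A concrete instance of that comparison, as a sketch (the alias names are invented for this example). With a function-pointer type the typedef name sits in the middle of the declarator, while the using form puts it first:

```cpp
#include <type_traits>

// typedef: the new name is buried inside the declarator.
typedef int (*CompareOld)(const void*, const void*);

// using: the new name comes first, followed by the type it stands for.
using CompareNew = int (*)(const void*, const void*);

static_assert(std::is_same<CompareOld, CompareNew>::value,
              "both aliases name the same type");
```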

-regards,
Larry

In article <563DC703.4000507@gmx.net>,
    Fabio Fracassi via cfe-dev <cfe-dev@lists.llvm.org> writes:

Hi,

[...]

N2541 <http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2541.htm>
introduced auto and the trailing return type syntax.

Now people are using auto and -> on function signatures where the
return type is stated explicitly anyway. This results in code like:

auto main() -> int
{
   return 0;
}

so?

Srsly? Is this really improving anything over

int main()
{
   return 0;
}

[snip]

and for mathematically inclined people it is also
more natural.

I didn't know you were appointed spokesman for all mathematically
inclined people? Seriously, I'm a mathematically inclined person and
I don't find this more natural.

Isn't Haskell a more mathematically inclined language, and doesn't
it use:

  (S1,S2) -> T

to represent the same type as T(*)(S1,S2) in C++?
Also, I see the expression S -> T in math texts whereas I
don't remember ever seeing the c++ syntax to represent
function types in math texts. Of course I'm no math
expert, but it sure seems that S -> T is used much
more frequently in math texts.
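
For comparison within C++ itself, function types can also be written in the arrow form. A small sketch, with alias names invented for this example and int, int, long standing in for S1, S2, T:

```cpp
#include <type_traits>

// Classic spelling: pointer to function taking (int, int) and returning long.
using ClassicPtr = long (*)(int, int);

// Arrow spelling of the function type, which reads like (int, int) -> long.
using ArrowFn = auto (int, int) -> long;
using ArrowPtr = ArrowFn*;

static_assert(std::is_same<ClassicPtr, ArrowPtr>::value,
              "both spellings name the same pointer-to-function type");
```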

-regards,
Larry

The comments on this check seem to be asking for the *reverse* check:

switch everything to use auto

Yes?

The comments on this check seem to be asking for the *reverse* check:

switch everything to use auto

Commenting from the sidelines, I see two things:
1. For more "traditional" (for lack of a better term) C++ codebases,
removing "unnecessary" uses of the -> syntax is probably a worthwhile thing
on the grounds of consistency (we would probably want something like that
in LLVM, not that we seem to have that problem in LLVM). I.e., the policy
is roughly "use the -> syntax only where strictly necessary, but otherwise,
for consistency, use the traditional syntax"
2. The `auto f() -> T` syntax opens up a new style altogether, and green
pasture projects might want to uniformly use that style.

So I think your original post is basically about 1. and some commenters
have noted the existence of 2.

They're basically separate use cases and I don't think one should hold up
the other as far as writing clang-tidy checks. Do you know of any projects
that are having issues with -> being used "unnecessarily"?

-- Sean Silva

The comments on this check seem to be asking for the *reverse* check:

switch everything to use auto

Maybe it could be configured to do either?

exactly.

+1

One thing that bothered me though is that the keyword 'auto' was reused
simply because... it was there. There's nothing "auto" going on in any
sense of the word.

This can be fixed though. I guess I may go to hell for it, but I found a
magic trick:

#define let auto

together with custom syntax highlighting rules making let a statement (and
auto a type rather than a storage specifier, for the rare case you need it).

Watch it in action:

let frobnicate( int x ) -> int { ... }

let constexpr addone( auto x ) { return x+1; } // g++ extension I think?

And it works excellently for initialized variable declarations as well:

let x = 42;
let const debug = false;

for( let &x : things ) { ... }

Annoyingly not everything can be written like this, although after a while
of writing in this style, the inconsistency of the exceptions has actually
become an eye-sore to the point I'd now sooner write
let i = u64 {};
than
u64 i;
even though I'm normally a fan of conciseness.
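
A self-contained sketch of the macro in use. The functions and the kThings array are invented for this example, and the constexpr loop needs C++14 or later.

```cpp
#define let auto  // the macro from the message; purely a respelling of auto

constexpr int kThings[3] = {1, 2, 3};

// Trailing return type introduced by 'let' instead of 'auto'.
constexpr let add_one(int x) -> int { return x + 1; }

// Variable declarations and range-for work the same way.
constexpr let sum(const int (&things)[3]) -> int {
  let total = 0;
  for (let& x : things) total += x;
  return total;
}

static_assert(add_one(41) == 42, "let behaves exactly like auto");
static_assert(sum(kThings) == 6, "range-for with let& works too");
```

Since let is an ordinary macro, it would also rewrite any identifier spelled let in headers included after the definition, which is one reason to treat this as a curiosity rather than a recommendation.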

You can admire this and other horrible coding sins in some actual code e.g.
here:
  https://github.com/dutchanddutch/jbang/blob/master/src/hw-subarctic.cc
though without the custom syntax highlighting rules it can't be fully
appreciated.

*runs for cover* ;-)

Matthijs

One thing that bothered me though is that the keyword 'auto' was reused
simply because... it was there. There's nothing "auto" going on in any
sense of the word.

'static' is much worse... in any context other than a function-local
variable, it's just a completely arbitrary we-already-had-this-reserved
keyword. Put it on a file-level variable, and it determines... visibility!
Put it on a class method, and... something with no one-word description
happens! There really is no possible English definition of 'static' that
could apply to either of those cases.

OOTC:
'auto' applied to a return-type could at least be interpreted as
"automatically decide what this type is" so 'auto' is kind-of sensible
there. Although I have to say, 'let' has some appeal.
--paulr

'static' is much worse... in any context other than a function-local
variable, it's just a completely arbitrary we-already-had-this-reserved
keyword. Put it on a file-level variable, and it determines... visibility!
Put it on a class method, and... something with no one-word description
happens! There really is no possible English definition of 'static' that
could apply to either of those cases.

Well, if I squint a bit I can see a connection between its use in functions
and in classes, but its effect in file-scope is really unrelated indeed.
The biggest irony is that its use inside functions/classes makes the
declaration behave (except for the scoping of its name) as if it were a
declaration at file scope, but *not* a static one there!

'auto' applied to a return-type could at least be interpreted as
"automatically decide what this type is"

Except support for return-type inference is more recent than support for
using 'auto' to introduce the trailing return type syntax. Technically,
there is therefore now an ambiguity between those two uses of the
keyword, disambiguated by the presence or absence of an actual trailing
return type.

You can of course also combine them, i.e.
let foo( ... ) -> auto { ... }
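
Spelled with plain auto, the combination looks like the sketch below (function names invented for this example; C++14 or later). The leading auto only introduces the trailing syntax, while the placeholder after the arrow is what actually requests deduction.

```cpp
#include <type_traits>

// Leading 'auto' introduces the trailing syntax; the trailing 'auto' is deduced.
auto twice(int x) -> auto { return 2 * x; }

// decltype(auto) as the trailing type preserves references.
int counter = 0;
auto counter_ref() -> decltype(auto) { return (counter); }

static_assert(std::is_same<decltype(twice(1)), int>::value,
              "deduced as int");
static_assert(std::is_same<decltype(counter_ref()), int&>::value,
              "deduced as int&");
```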

Although I have to say, 'let' has some appeal.

I think it's really remarkable how well it fits and natural it feels, far
more so than 'auto' (at least to me; I may be biased due to exposure to
languages like OCaml).

Matthijs