draft rule for naming types/functions/variables

Hi guys,

Based on our discussion last week, I put together a new coding style
rule regarding the naming of types/functions/variables. I've uploaded
the patch to

http://codereview.appspot.com/3264041

Please let me know what you think. My idea is to start with something
non-controversial such that we can get the baseline committed soon.
We can then tweak the rule as needed later to cover more specific
scenarios. Thanks,

Looks great to me. One place I'd be happy to experiment with widespread name changes is the static analyzer. It might be a good isolated "experiment" of showing how these coding conventions could improve readability of the code.

Looks great to me. One place I'd be happy to experiment with widespread name changes is the static analyzer. It might be a good isolated "experiment" of showing how these coding conventions could improve readability of the code.

Great to hear about your plan for static analyzer, Ted! I'll be happy
to contribute refactoring patches to make it conform to the rule.

Please let me know what you think. My idea is to start with something
non-controversial such that we can get the baseline committed soon.
We can then tweak the rule as needed later to cover more specific
scenarios. Thanks,

LGTM

--
Zhanyong

Cheers,
Rafael

Sorry, this seems to have not followed the thread, but here was my comment:

http://codereview.appspot.com/3264041/diff/1/docs/CodingStandards.html#newcode801
docs/CodingStandards.html:801: camel case (e.g. TextFileReader
and isLValue).
I would really prefer some stylistic difference between variables and
types/functions. This is mostly a problem (for me) with local variables,
where having some signifier of the locality helps me in reading it.

The most common style I have worked with is to use
under_score_separators for variables. Personally, I would advocate for
sharing this style with member variables as well, but those aren’t as
important to me when reading code.

Sorry, this seems to have not followed the thread, but here was my comment:
docs/CodingStandards.html - Issue 3264041: add rule for naming type/function/variable to the coding standards - Code Review
docs/CodingStandards.html:801: camel case (e.g. <tt>TextFileReader</tt>
and <tt>isLValue</tt>).
I would really prefer some stylistic difference between variables and
types/functions. This is mostly a problem (for me) with local variables,
where having some signifier of the locality helps me in reading it.

Agreed. That's why types start with upper-case while variables start
with lower-case in my proposal.

I chose to start function names with lower-case, but don't feel
strongly about it. If people prefer them to start with upper-case,
fine by me.

The most common style I have worked with is to use
under_score_separators for variables. Personally, I would advocate for
sharing this style with member variables as well, but those aren't as
important to me when reading code.

I think camelCase works as well as underscore_separated for variable
names, and the former is much more common in the existing code.
That's why I picked it. I could go with underscore_separated were we
starting from scratch.

Sorry, this seems to have not followed the thread, but here was my comment:
docs/CodingStandards.html - Issue 3264041: add rule for naming type/function/variable to the coding standards - Code Review
docs/CodingStandards.html:801: camel case (e.g. <tt>TextFileReader</tt>
and <tt>isLValue</tt>).
I would really prefer some stylistic difference between variables and
types/functions. This is mostly a problem (for me) with local variables,
where having some signifier of the locality helps me in reading it.

Agreed. That's why types start with upper-case while variables start
with lower-case in my proposal.

I chose to start function names with lower-case, but don't feel
strongly about it. If people prefer them to start with upper-case,
fine by me.

The reason I feel comfortable with using the same style for both
functions and variables is that in the vast majority of the cases, a
function name is followed by () (so the distinction between the two is
obvious).

Sorry, this seems to have not followed the thread, but here was my comment:
http://codereview.appspot.com/3264041/diff/1/docs/CodingStandards.html#newcode801
docs/CodingStandards.html:801: camel case (e.g. TextFileReader
and isLValue).
I would really prefer some stylistic difference between variables and
types/functions. This is mostly a problem (for me) with local variables,
where having some signifier of the locality helps me in reading it.

Agreed. That’s why types start with upper-case while variables start
with lower-case in my proposal.

This however leaves no distinction between functions and variables. I suppose this isn’t a terrible compromise because of the ()s, but I would like it best to have a different style for each.

I chose to start function names with lower-case, but don’t feel
strongly about it. If people prefer them to start with upper-case,
fine by me.

I wonder which is better for function names. I don’t mind collision in style between function and type names as much as between variable names and other names, but that’s just my personal preference.

I think camelCase works as well as underscore_separated for variable
names, and the former is much more common in the existing code.
That’s why I picked it. I could go with underscore_separated were we
starting from scratch.

This is a good point regarding consistency. I’m curious if this is a good time for more dramatic changes, and if so whether we want to consider having distinctions for all of the above, but if not, I’m not opposed to the proposed change. I see it as a definite improvement.

I think that the type/function name convention makes sense.

However, I don't fully agree for local variable names. For them, there are two cases, things with small lifetimes where having a simple short name is good, and things with longer lifetimes where you want something descriptive.

For example, naming a variable i here is perfectly fine:

for (unsigned i = 0; i != 100; ++i)
  A[i] = 0;

Naming it "ArrayIndex" would not make it more clear :slight_smile:

For capitalization, I generally prefer capital names with the exception being one character names that are often metasyntactic names (like i/j).

Also, since this applies to LLVM as a whole, I'd suggest moving this to llvmdev, which will reach a larger audience.

-Chris

I don’t find this to be a compelling argument. By that reasoning, all field names should be m_foo and such horrible things. If a local variable has such a large live range to make it ambiguous or confusing, then something else is already wrong.

To me, the most important thing about having a naming convention is to just make it completely consistent and predictable everywhere instead of people having to remember that one API is isFoo() and another is IsBar().

Don’t try to go all hungarian and encode semantics into names through subtle distinctions…

-Chris

+llvmdev

Thanks for the comments, Chris.

Hi guys,

Based on our discussion last week, I put together a new coding style
rule regarding the naming of types/functions/variables. I've uploaded
the patch to

Issue 3264041: add rule for naming type/function/variable to the coding standards - Code Review

Please let me know what you think. My idea is to start with something
non-controversial such that we can get the baseline committed soon.
We can then tweak the rule as needed later to cover more specific
scenarios. Thanks,

I think that the type/function name convention makes sense.

However, I don't fully agree for local variable names. For them, there are two cases, things with small lifetimes where having a simple short name is good, and things with longer lifetimes where you want something descriptive.

For example, naming a variable i here is perfectly fine:

for (unsigned i = 0; i != 100; ++i)
A[i] = 0;

Naming it "ArrayIndex" would not make it more clear :slight_smile:

Good point. I actually have this in the example:

  828 VehicleMaker m; // Bad (abbreviation and non-descriptive); might be
  829 // OK for a local variable if its role is obvious.

I'll reword the rule to match what you have in mind.

For capitalization, I generally prefer capital names with the exception being one character names that are often metasyntactic names (like i/j).

If possible, I'd prefer that all variable names have the same style.
I'm afraid that we'll end up with the current inconsistent style if we
leave it to people to interpret whether a name is metasyntactic and
thus should be lower-case.

Also, having both types and variables in StrictCamelCase increases the
chance of clashing between the two and thus sometimes makes it hard to
choose good variable names. For example, if you have a function that
takes a Type parameter, how would you name the parameter if it has to
start with an upper-case? There are several obvious choices:

  void VisitType(Type T); // Bad -- T is too generic and could be
mistaken for "temporary".
  void VisitType(Type Ty); // Bad -- Ty is not a well-known abbreviation.
  void VisitType(Type AType); // Unnecessarily awkward.

In contrast,

  void VisitType(Type type);

is readable and natural. The same argument applies to other kinds of variables.

Another reason for preferring lower-case-started variable names is, as
I wrote in the proposed rule, it helps a lot with readability to know
at a glance whether something is a type or not -- at least that's my
experience.

So, would you be fine with making all variables start with lower-case?

Also, since this applies to LLVM as a whole, I'd suggest moving this to llvmdev, which will reach a larger audience.

Good point. Done. Hopefully this doesn't bring on bike shedding. :wink:

+llvmdev

Thanks for the comments, Chris.

Hi guys,

Based on our discussion last week, I put together a new coding style
rule regarding the naming of types/functions/variables. I've uploaded
the patch to

Issue 3264041: add rule for naming type/function/variable to the coding standards - Code Review

Please let me know what you think. My idea is to start with something
non-controversial such that we can get the baseline committed soon.
We can then tweak the rule as needed later to cover more specific
scenarios. Thanks,

I think that the type/function name convention makes sense.

However, I don't fully agree for local variable names. For them, there are two cases, things with small lifetimes where having a simple short name is good, and things with longer lifetimes where you want something descriptive.

For example, naming a variable i here is perfectly fine:

for (unsigned i = 0; i != 100; ++i)
A[i] = 0;

Naming it "ArrayIndex" would not make it more clear :slight_smile:

Good point. I actually have this in the example:

828 VehicleMaker m; // Bad (abbreviation and non-descriptive); might be
829 // OK for a local variable if its role is obvious.

I'll reword the rule to match what you have in mind.

I've made the change and uploaded the new patch to
Issue 3264041: add rule for naming type/function/variable to the coding standards - Code Review -- you can also find it attached
to this message. Thanks,

naming.patch (3.1 KB)

Hi,

   void VisitType(Type T); // Bad -- T is too generic and could be
mistaken for "temporary".
   void VisitType(Type Ty); // Bad -- Ty is not a well-known abbreviation.

I think there is a logical flaw in this argument: it doesn't matter if Ty is
not well-known in the wide world of programmers as long as it is well-known
inside LLVM: if "Ty" is used consistently all over the code base, then I think
that's good enough.

Good point. Done. Hopefully this doesn't bring on bike shedding. :wink:

It just did :slight_smile:

Ciao,

Duncan.

Zhanyong Wan (λx.x x) wrote:

If possible, I'd prefer that all variable names have the same style.
I'm afraid that we'll end up with the current inconsistent style if
we
leave it to people to interpret whether a name is metasyntactic and
thus should be lower-case.

Also, having both types and variables in StrictCamelCase increases
the
chance of clashing between the two and thus sometimes makes it hard
to
choose good variable names. For example, if you have a function
that
takes a Type parameter, how would you name the parameter if it has
to
start with an upper-case? There are several obvious choices:

void VisitType(Type T); // Bad -- T is too generic and could be
mistaken for "temporary".
void VisitType(Type Ty); // Bad -- Ty is not a well-known
abbreviation. void VisitType(Type AType); // Unnecessarily
awkward.

In contrast,

void VisitType(Type type);

is readable and natural. The same argument applies to other kinds
of variables.

I don't think this is natural at all. :slight_smile:

When the rule is to select readable and descriptive names, using names that only differ in case seems non-optimal. What about tYpe, tyPe, or typE?

Bo Persson

What if the variable is an function object?

Bo Persson

Zhanyong Wan (λx.x x) wrote:

If possible, I'd prefer that all variable names have the same style.
I'm afraid that we'll end up with the current inconsistent style if
we
leave it to people to interpret whether a name is metasyntactic and
thus should be lower-case.

Also, having both types and variables in StrictCamelCase increases
the
chance of clashing between the two and thus sometimes makes it hard
to
choose good variable names. For example, if you have a function
that
takes a Type parameter, how would you name the parameter if it has
to
start with an upper-case? There are several obvious choices:

void VisitType(Type T); // Bad -- T is too generic and could be
mistaken for "temporary".
void VisitType(Type Ty); // Bad -- Ty is not a well-known
abbreviation. void VisitType(Type AType); // Unnecessarily
awkward.

In contrast,

void VisitType(Type type);

is readable and natural. The same argument applies to other kinds
of variables.

I don't think this is natural at all. :slight_smile:

When the rule is to select readable and descriptive names, using names
that only differ in case seems non-optimal. What about tYpe, tyPe, or
typE?

These are not allowed because they are not in camel case:

I would really appreciate it if we could focus on the high-level
picture for the initial version of the rule. Given the complexity of
C++, we'll always be able to find corner cases that the rule doesn't
handle well, whatever rule we decide to use. As I mentioned in my
initial post, my goal is to set up a baseline now and tweak it later
based on feedback from actually using the rule.

Thanks,

+llvmdev

Thanks for the comments, Chris.

Hi guys,

Based on our discussion last week, I put together a new coding style
rule regarding the naming of types/functions/variables. I've uploaded
the patch to

Issue 3264041: add rule for naming type/function/variable to the coding standards - Code Review

Please let me know what you think. My idea is to start with something
non-controversial such that we can get the baseline committed soon.
We can then tweak the rule as needed later to cover more specific
scenarios. Thanks,

I think that the type/function name convention makes sense.

However, I don't fully agree for local variable names. For them, there are two cases, things with small lifetimes where having a simple short name is good, and things with longer lifetimes where you want something descriptive.

For example, naming a variable i here is perfectly fine:

for (unsigned i = 0; i != 100; ++i)
A[i] = 0;

Naming it "ArrayIndex" would not make it more clear :slight_smile:

Good point. I actually have this in the example:

828 VehicleMaker m; // Bad (abbreviation and non-descriptive); might be
829 // OK for a local variable if its role is obvious.

I'll reword the rule to match what you have in mind.

For capitalization, I generally prefer capital names with the exception being one character names that are often metasyntactic names (like i/j).

If possible, I'd prefer that all variable names have the same style.
I'm afraid that we'll end up with the current inconsistent style if we
leave it to people to interpret whether a name is metasyntactic and
thus should be lower-case.

Also, having both types and variables in StrictCamelCase increases the
chance of clashing between the two and thus sometimes makes it hard to
choose good variable names. For example, if you have a function that
takes a Type parameter, how would you name the parameter if it has to
start with an upper-case? There are several obvious choices:

void VisitType(Type T); // Bad -- T is too generic and could be
mistaken for "temporary".
void VisitType(Type Ty); // Bad -- Ty is not a well-known abbreviation.
void VisitType(Type AType); // Unnecessarily awkward.

In contrast,

void VisitType(Type type);

is readable and natural. The same argument applies to other kinds of variables.

Another reason for preferring lower-case-started variable names is, as
I wrote in the proposed rule, it helps a lot with readability to know
at a glance whether something is a type or not -- at least that's my
experience.

So, would you be fine with making all variables start with lower-case?

Being new to the clang code base I can add my experience from the
patch I implemented.

Especially for somebody new to the code, getting to interpret "P" and
"PP" correctly depending on the context you're in makes it really hard
to follow the code in many places. And if you have a variable of type
"Parser" what else do you call it? I fully expect that you will be
able to read the code just fine if you're a full time developer
spending all your time in the code base. For contributions by people
using the code, who are finding bugs or missing features, "first
time"-readability can greatly reduce the time needed to write a patch
though.

If there's one thing I could wish for in the clang style, it would be
a differentiation in style between variables and types.

Cheers,
/Manuel

Hi,

I enjoyed the new coding style in recent patches. Camel case makes it easy to pick a descriptive name. Starting functions and variables with lower cases reduces chances to conflict with a type name.

2010/11/23 Zhanyong Wan (λx.x x) <wan@google.com>

I enjoyed the new coding style in recent patches. Camel case makes it easy
to pick a descriptive name. Starting functions and variables with lower
cases reduces chances to conflict with a type name.

Honestly speaking, I don't. Especially in the cases when varname is
made from an
acronym. E.g. MachineInstr *MI looks much better than MachineInstr *mi, etc.

See latest Rafael's patch as an example.