sample of running google c++ lint script

Just as an example, I picked totally at random, one c++ program to run the google code style checker.

There are clearly some valid points it found. I think it would good to start to adapt this tool
or write a new tool to do style checking and to start to better formalize the llvm rules.

I ran it against Target/Target.cpp

Target.cpp:0: No copyright message found. You should have a line: "Copyright [year] <Copyright Owner>" [legal/copyright] [5]
Target.cpp:10: Line ends in whitespace. Consider deleting these extra spaces. [whitespace/end_of_line] [4]
Target.cpp:22: Found C++ system header after other header. Should be: Target.h, c system, c++ system, other. [build/include_order] [4]
Target.cpp:24: Do not use namespace using-directives. Use using-declarations instead. [build/namespaces] [5]
Target.cpp:26: Is this a non-const reference? If so, make const or use a pointer. [runtime/references] [2]
Target.cpp:65: Use int16/int64/etc, rather than the C type long [runtime/int] [4]
Target.cpp:69: Use int16/int64/etc, rather than the C type long [runtime/int] [4]
Target.cpp:73: Use int16/int64/etc, rather than the C type long [runtime/int] [4]
Target.cpp:95: Use int16/int64/etc, rather than the C type long [runtime/int] [4]
Target.cpp:100: Lines should be <= 80 characters long [whitespace/line_length] [2]
Target.cpp:100: Use int16/int64/etc, rather than the C type long [runtime/int] [4]
Target.cpp:49: Add #include <string> for string [build/include_what_you_use] [4]
Done processing Target.cpp
Total errors found: 12

Just as an example, I picked totally at random, one c++ program to run
the google code style checker.

There are clearly some valid points it found. I think it would good to
start to adapt this tool
or write a new tool to do style checking and to start to better
formalize the llvm rules.

I ran it against Target/Target.cpp

Target.cpp:10: Line ends in whitespace. Consider deleting these extra
spaces. [whitespace/end_of_line] [4]

We have an incredible amount of those, and fixing them would create far too much churn. I generally fix them on the code I touch.

Target.cpp:22: Found C++ system header after other header. Should be:
Target.h, c system, c++ system, other. [build/include_order] [4]

The LLVM convention is to reverse the "usual" order. The first file any .cpp includes should be its own .h. Then other project headers. Then, in the case of Clang, the LLVM headers. And finally system and standard headers.

Target.cpp:24: Do not use namespace using-directives. Use
using-declarations instead. [build/namespaces] [5]

LLVM convention is to use a using directive for the project namespace.

Target.cpp:26: Is this a non-const reference? If so, make const or use
a pointer. [runtime/references] [2]

What? Why?

Sebastian

Did you agree with the comment about the use of long long from the tool?

Anyway, it's not really important to me that we adopt any specific google code rules over the
current llvm rule.

The point is to that Google has a deeper set of conventions and it would be a good starting point for us.
Also, they have a tool which checks a lot of this. The lack of a tool for llvm style check is what originally caught my attention in this area. I don't like doing work that a robot can do for me.

The Google tool has a wish list of other things that I would not be opposed to personally adding
should we go that way. The point is to collect the kinds of style errors that send a patch back to the author and to try and implement those in the style checker. I would even open up a bug against the
style checker for each such thing that is not checked.

Not wanting to clean up the white space is exactly a simple but good example of technical debt that we are incurring.

In that case it's very simple to see. We have a rule about that for our style and because we are too
busy with other things, then we have allowed the technical debt to rise to a point where we don't
want to ever pay it, or so you say.

When I ran a script over the Target subtree, my simple check showed 20,000 violations of tab, line length and spaces at the end of line rules.

Just as an example, I picked totally at random, one c++ program to run
the google code style checker.

There are clearly some valid points it found. I think it would good to
start to adapt this tool
or write a new tool to do style checking and to start to better
formalize the llvm rules.

I ran it against Target/Target.cpp

Target.cpp:10: Line ends in whitespace. Consider deleting these extra
spaces. [whitespace/end_of_line] [4]

We have an incredible amount of those, and fixing them would create far
too much churn. I generally fix them on the code I touch.

That's been the prevailing opinion, except it doesn't seem to hold -
both SubVersion and Git have the ability to annotate/diff with
whitespace ignorance, so adding/removing whitespace shouldn't affect
your source control experience (ok, I guess ViewVC might not have such
smarts) - unless a whitespace run is entirely removed (which isn't
relevant to trailing whitespace removal, since there's always the
newline there - and 80 col fixes are generally appreciated by Chris at
least, even if they're not in code being touched by the developer -
not sure how he feels about a blanket 80 col fix, though I'd be more
inclined to do that if we had a tool to ensure no regressions)

That said, if we could get tools in place that could break the build
when (and before) we submit new violations, that's probably sufficient
- no CR headaches and no big churn. It's just that most such tools
aren't smart enough to just do diff based validation, so then you end
up arguing in favor of fixing the whole codebase before you add
validators.

Target.cpp:22: Found C++ system header after other header. Should be:
Target.h, c system, c++ system, other. [build/include_order] [4]

The LLVM convention is to reverse the "usual" order. The first file any
.cpp includes should be its own .h. Then other project headers. Then, in
the case of Clang, the LLVM headers. And finally system and standard
headers.

Target.cpp:24: Do not use namespace using-directives. Use
using-declarations instead. [build/namespaces] [5]

LLVM convention is to use a using directive for the project namespace.

Target.cpp:26: Is this a non-const reference? If so, make const or use
a pointer. [runtime/references] [2]

What? Why?

One of Google's style guidelines, also not relevant/used by LLVM -
LLVM is just fine having non-const ref out or in/out parameters.

Did you agree with the comment about the use of long long from the tool?

Anyway, it's not really important to me that we adopt any specific
google code rules over the
current llvm rule.

The point is to that Google has a deeper set of conventions and it would
be a good starting point for us.
Also, they have a tool which checks a lot of this. The lack of a tool
for llvm style check is what originally caught my attention in this
area. I don't like doing work that a robot can do for me.

The Google tool has a wish list of other things that I would not be
opposed to personally adding
should we go that way. The point is to collect the kinds of style errors
that send a patch back to the author and to try and implement those in
the style checker. I would even open up a bug against the
style checker for each such thing that is not checked.

Not wanting to clean up the white space is exactly a simple but good
example of technical debt that we are incurring.

In that case it's very simple to see. We have a rule about that for our
style and because we are too
busy with other things, then we have allowed the technical debt to rise
to a point where we don't
want to ever pay it, or so you say.

That's sort of a misunderstanding of the issue - it's not that we're
too busy. It's that the tools (version control mostly) don't support
this use case very well. If it were just about not having the time, it
would be easy for some newbie contributors to go through & cleanup the
whole code base - such patches are not accepted, though, because of
the VCS churn. I don't entirely agree there - tools already have some
support, if there are other tools that don't, we could look into what
it'd take to improve/replace/etc. But that's what it is today.

When I ran a script over the Target subtree, my simple check showed
20,000 violations of tab, line length and spaces at the end of line rules.

& what exactly is the cost there? in most cases it's just a few
characters over the limit here & there - I'm not much of a stickler
for those rules anyway. It seems the major benefit would be if we
actually had a tool that autoformatted/enforced those guidelines -
then it'd be potentially worth the cleanup to get us into a good state
from which we could never diverge. But better than that would be a
tool that could just ratchet up the quality by not allowing us to
regress without forcing us to fix everything up front. Such a tool is
more complicated to build/use, so it's a tradeoff between whether it's
better to fix everything or create/use such a tool.

I’m not sure what you’re trying to accomplish here.

The Google style guide is inapplicable to LLVM’s. This has been stated repeatedly, and you’re not likely to enact change through these emails. If you want to influence the LLVM style, become a sufficiently significant contributor, and then argue the merits of specific points.

Considering the tool separate from the style is hard, in part because (as I told you in the other thread) the tool is terribly architected and very inflexible. There are folk interested in making a real and powerful coding convention check on top of Clang. If you have time to invest in coding conventions, I encourage you to invest it here and not in bad python scripts that implement a set of rules which don’t apply to this project. Certainly, I don’t see the relevance of the latter to this mailing list.

Finally, let’s be clear:

Not wanting to clean up the white space is exactly a simple but good
example of technical debt that we are incurring.

I agree, it is a great example of long term technical debt.

In that case it's very simple to see. We have a rule about that for our
style and because we are too
busy with other things, then we have allowed the technical debt to rise
to a point where we don't
want to ever pay it, or so you say.

When I ran a script over the Target subtree, my simple check showed
20,000 violations of tab, line length and spaces at the end of line rules.

We should ask the LLVM steering committee to require a balanced
budget, including detailed plans for both for lines and whitespace, so
we know what targets we should be hitting. That way we know if we
expect to add 10k LOC to TableGen, and where that 10k LOC is going to
be removed from to make up for it.

While humorous, let’s dial back the trolling at this point. =] This discussion is a largely serious discussion, and shouldn’t get derailed.

The tablegen part, maybe.
Discussing whether whitespace is technical debt, i hope that's not a
serious discussion :slight_smile:

It doesn't even meet wikipedia's somewhat muddled
discussion/definition of technical debt.

Even broader style conformance issues are not technical debt, in the
sense that you can eliminate it any time you want, 99.8%
automatically.

While not technical debt, for some it is a broken window. Broken windows can lead to increases in technical debt. Humans are so strange.

Cheers,
/Manuel