Submitting simple NFC patches

I’m working on the Clang Static Analyzer and I frequently see code that could be replaced by an llvm-alternative.

For example:
Replacing ad-hoc lambda implementing the operator< on pairs, or using std::sort(C.begin(). C.end()). Such code could be replaced by llvm::less_first and llvm::sort() respectively.
There are many more examples like this that I’m not listing here.

I could make such changes to the StaticAnalyzer part of the repo, but I’m curious if I shall grep for more and replace similar issues at the monorepo level.

If the community is on board with this, who shall review such changes?
I know that there is the llvm-project/CODE_OWNERS.TXT, but it feels like overkill to invite the code owners for such changes.
@clattner @chandlerc @AaronBallman

PS: I’m only aiming for the most simple, no-brainer refactor changes, in small pieces.

As far as I can remember, the policy (albeit potentially unwritten) about formatting patches is that they should not be “overdone”. Assuming that these changes incur neither run-time performance gain nor a penalty, I believe we could say they are essentially akin to formatting patches. So submitting the patches (or without a review, the commits themselves) just for the sake of “having things formatted” (or replaced with the STLExtras library calls) we should not put additional strain on the already humongous repository history (keep in mind that a fresh clone of the Monorepo is around 2 GiB of compressed history).

The counter-argument, however, is that if we keep “formatting” only pieces of code that we touch, then changed lines will be attributed by a history entry that shows a change that is semantically unrelated to what the commit message and the related patch describes. Also, littering “format changes” into the reviews of something professionally significant (i.e. requiring explicit scrutiny from the community) might end up significantly wasting reviewer effort.

As for NFC patches, the general rule is that if the commit is yours, so is the responsibility. I would personally not overstep the boundaries of the parts of the project that I am familiar with even with NFC patches unless they are truly absolutely minimal and trivial (like typo changes or documentation that I was reading being incorrect or rendered broken).

Also, it should be noted that LLVM is compiled with many many toolchains currently across various platforms, sometimes even with otherwise conflicting flags (I think part of the project already requires C++17, while others are pegged at C++14.) So I would make it extra sure that whatever change is being made also works on “ancient” compilers too! A huge change’s revert due to something that is perfectly valid C++14 not being understood by GCC 5 or GCC 7 [†] would result in a revert commit that is equally huge.

So even if I went forward with the changes, I would definitely break them up at least per project (or subproject, in the case of clang-tools-extra/clangd). This is the “extreme care should be taken!” area.


[†]: I always like to tout the story that explicit specialisations of template members, while perfectly fine, are not possible if you are using GCC 7 to compile. And officially, as far as I know, we support a GCC 5 toolchain!

1 Like

If your question is “how much review should be applied to making code in LLVM follow LLVM conventions”, then I think that answer is “it depends”. That said, I want to encourage you to make LLVM as a whole better, not just “stay in your lane”. I think we want the LLVM community grow to be more consistent and integrated over time, and want to avoid devolving into subcultures.

LLVM intentionally has a “yes, you can apply common sense judgement to things” policy when it comes to code review. If you are doing mechanical patches (e.g. adopting less_first) that apply to the entire monorepo, then you don’t need everyone in the monorepo to sign off on it. Having some +1 validation from someone is useful, but you don’t need everyone whose code you touch to weigh in.

If you’re changing algorithmic behavior, that’s more complicated. For example, the use of std::map is generally the wrong thing, but mechanically changing all std::map<string,x> to StringMap isn’t safe or correct due to iterator invalidation semantics etc. For changes like that, you should get more careful review.

Does this make sense to you? Is this the question you’re asking?

-Chris

2 Likes

LLVM intentionally has a “yes, you can apply common sense judgement to things” policy when it comes to code review. If you are doing mechanical patches (e.g. adopting less_first) that apply to the entire monorepo, then you don’t need everyone in the monorepo to sign off on it. Having some +1 validation from someone is useful, but you don’t need everyone whose code you touch to weigh in.

I was exactly about this.

If you’re changing algorithmic behavior, that’s more complicated. For example, the use of std::map is generally the wrong thing, but mechanically changing all std::map<string,x> to StringMap isn’t safe or correct due to iterator invalidation semantics etc. For changes like that, you should get more careful review.

I’m focusing purely on syntactic constructs this time. I’ll have careful attention though.

Does this make sense to you? Is this the question you’re asking?

Absolutely! Thanks Chris.

I basically agree with what Chris is saying here. We hope to avoid needless churn that impacts activities like git blame (like formatting every file, or stripping all trailing whitespace, etc), but we want to encourage people to make changes that materially improve the quality of the code base even if they’re NFC changes… and that sometimes requires a value judgement on whether a change should or should not be reviewed.

FWIW, my rule of thumb is “when in doubt, ask for a review” – put up a patch with some changes (but perhaps not all of them) and ask your reviewers whether they’d like to see follow-up reviews for similar changes or whether they’re comfortable with those being commit as NFC changes.