Hi,
Libc++ has been using clang-format for 1-2 years. We’ve been enforcing that all new code written is clang-format clean by various means, the most recent incarnation being an LLVM-wide Github Action that enforces clang-format on PRs.
However, we have a good amount of existing code that isn’t clang-format clean. The idea was that we would gradually clang-format more and more code as we’re making changes to the code base, and eventually we’d be clang-format clean. We also previously maintained a list of unformatted files for which we would not enforce clang-format to reduce churn when touching these files. This ended up creating a lot of confusion for contributors, who would often be asked to update the list of unformatted files when it changed. Because of that, we recently removed this exclude list.
As a result, any patch touching any part of the code now has to be clang-format clean in the vicinity of the patch, even in a file that is otherwise not clang-formatted. This is more straightforward than the previous situation, but it’s still a pretty big annoyance and I expect it to start creating confusion right away for contributors.
Therefore, I propose clang-formatting the whole code base once and for all so we can be done with it. As clang-format evolves, we’ll still run into conflicts in the future but those will be significantly smaller and less confusing than the current situation. 99% of the time, we’ll be touching code that is already formatted properly.
There are mainly three things to keep in mind when thinking about such a change:
- Contributors who have outstanding patches and will eventually need to rebase them across the clang-format change. For all intents and purposes, this is a very small number of core contributors.
- Downstreams who have internal diffs and who will need to merge their differences across the clang-format change.
- The ability to git-blame the library.
I actually believe that clang-formatting the whole codebase in one go is the best course of action to address all of these concerns. Indeed, the alternative is to incrementally clang-format the code base like we’ve been doing, but this has worse effects in the long run on the above points.
First, doing it in one go reduces the overall amount of work that must be done by contributors and downstreams compared to incremental formatting. We sit down once, do it, and then never think about it again. There can be no additional formatting-related conflicts in the future.
For rebasing older patches across the formatting change, it is actually possible to automate the process using a merge driver, so that rebases work 100% transparently. The programmer only needs to install the merge driver by running a git
command once. I implemented a script that does that based on Nico’s notes for Chromium and tested it with our code base, and it is able to rebase patches across clang-format without any intervention. This also applies to downstreams that have an internal diff.
Furthermore, doing it at once allows the clang-format commit to be added to .git-blame-ignore-revs
, which is the best tool available to keep a sensible git blame
. The alternative of incrementally clang-formatting the code leads to patches touching lines that would otherwise be untouched if not for clang-format, which breaks git blame
pretty badly.
For all these reasons, I’d like to make this change once and for all. It will simplify the day-to-day operation of code reviews and making contributions, which is currently pretty annoying for something that we shouldn’t be thinking so much about.
Current status
Week of Nov 27: Propose an optional change to also rename _LIBCPP_INLINE_VISIBILITY
to _LIBCPP_HIDE_FROM_ABI
and _VSTD::
to std::
prior to the formatting change, since we could kill two birds with one stone. This is optional and I don’t want to necessarily couple the two changes, but I think that this additional change is simple enough that it should be pretty uncontroversial. _LIBCPP_INLINE_VISIBILITY
change / _VSTD
change
Week of Nov 27: Put out for review the merge driver script that would allow transparently rebasing across clang-format changes. PR HERE
Dec 4th: Open a PR that clang-formats the whole code base. We will iterate on that PR, extracting bits where we need clang-format off/on
out of the PR until we’re happy. PR HERE
Dec 11th (goal): Land the clang-format patch assuming we’re ready. This date might slip depending on what we find in the previous step.