Which version of clang-format to use?

Over the past several months, I’ve always used the latest version of clang-format built from the latest version of LLVM. Currently, this is version 13.0.0. When recently reviewing Phabricator requests, I’ve noticed differences caused by changes in clang-format. So I decided to clean up all of the differences and filed https://reviews.llvm.org/D102424. But when Phabricator displayed my request, it annotated it with several comments saying that clang-format needed to be run on some of the files. Apparently Phabricator is using a different version of clang-format, so I filed https://github.com/google/llvm-premerge-checks/issues/303 for this issue.

What’s the proper process? Which version of clang-format should we all be using?

Pete

I think the only canonical version is the version in-tree at the same
revision. Everything else would require another discussion and/or
manual work with the bots after clang has had a new release.

Maybe the pre-merge bot could be changed to also keep clang-format and
clang-tidy up-to-date?

Michael

It seems the pre-merge has just moved to LLVM 12.

I think it is actually sensible to use the latest released version (so in this case 12, as 13 is not out yet), otherwise whenever you rebase a patch that’s in review you also need to rebuild clang-format locally (instead of just flang + tests). In addition, there’s no connection in Phabricator between a patch and the commit it is based on, so it’s not clear what the in-tree version is. If you mean ToT, then that can change while a patch is in review. Think about what it would mean for a flang patch if a clang-format patch is committed, then reverted for bug-fixes, then committed again, all while the flang patch is still in review. Does that mean the pre-merge checks should be run 3 times, potentially with different results each time? That would be both confusing and very resource intensive. On the other hand, if the pre-merge checks are only run once, when you first upload the patch, then you’d essentially have non-deterministic formatting, depending on what happened to be in tree at that exact moment. Using the release binaries gives much more stability.

There may be some delay between when the release binaries become available and when the pre-merge bots are updated, but that’s probably something that can be mitigated. In this case, the delay was a whole month, which is admittedly quite long, especially since the release cycles are only a couple of months long if you count minor releases too. But with the release process moving towards becoming more and more automated, I suspect this step could be automated too, and then there would be no delay at all. You could still end up in a situation where you submitted a patch just before the release went public, and then you’d have to update your formatting to match the new release, but that should only happen infrequently.

Just my 2 cents, I hope I didn’t misinterpret what you were saying.

Cheers,
Diana

I think it is actually sensible to use the latest released version (so in this case 12, as 13 is not out yet), otherwise whenever you rebase a patch that's in review you also need to rebuild clang-format locally (instead of just flang + tests). In addition, there's no connection in Phabricator between a patch and the commit it is based on, so it's not clear what the in-tree version is. If you mean ToT, then that can change while a patch is in review. Think about what it would mean for a flang patch if a clang-format patch is committed, then reverted for bug-fixes, then committed again, all while the flang patch is still in review. Does that mean the pre-merge checks should be run 3 times, potentially with different results each time? That would be both confusing and very resource intensive. On the other hand, if the pre-merge checks are only run once, when you first upload the patch, then you'd essentially have non-deterministic formatting, depending on what happened to be in tree at that exact moment. Using the release binaries gives much more stability.

As you mentioned, a new version of clang can be released during an
ongoing review as well. But I can expect that patches are rebased to
ToT, i.e. the matching version of clang-format will be available
anyway. If using the latest release, I am effectively forced to also
have compiled clang around that matches the pre-merge bot's version.

I wouldn't call calling "git clang-format origin/main" once before
uploading a new patch labour-intensive. You should do this anyway.
Arcanist is even configured to use clang-format before updating a
patch.

There may be some delay between when the release binaries become available and when the pre-merge bots are updated, but that's probably something that can be mitigated.

This would have to be discussed with the maintainer which I would not
impose additional work on. It is an event occuring only twice a year,
it will be open whether it actually works until the next release and
thus I doubt it is worth automating. Not only the pre-merge bots are
affected, but every single developer as well.

Michael

> I think it is actually sensible to use the latest released version (so in this case 12, as 13 is not out yet), otherwise whenever you rebase a patch that's in review you also need to rebuild clang-format locally (instead of just flang + tests). In addition, there's no connection in Phabricator between a patch and the commit it is based on, so it's not clear what the in-tree version is. If you mean ToT, then that can change while a patch is in review. Think about what it would mean for a flang patch if a clang-format patch is committed, then reverted for bug-fixes, then committed again, all while the flang patch is still in review. Does that mean the pre-merge checks should be run 3 times, potentially with different results each time? That would be both confusing and very resource intensive. On the other hand, if the pre-merge checks are only run once, when you first upload the patch, then you'd essentially have non-deterministic formatting, depending on what happened to be in tree at that exact moment. Using the release binaries gives much more stability.

As you mentioned, a new version of clang can be released during an
ongoing review as well. But I can expect that patches are rebased to
ToT, i.e. the matching version of clang-format will be available
anyway. If using the latest release, I am effectively forced to also
have compiled clang around that matches the pre-merge bot's version.

I wouldn't call calling "git clang-format origin/main" once before
uploading a new patch labour-intensive. You should do this anyway.
Arcanist is even configured to use clang-format before updating a
patch.

I agree that you should be running "git clang-format" before uploading
a patch. What I was saying is that if you intend to use the ToT
clang-format, then you'd also need to run "ninja clang-format" before
that. In general I wouldn't expect people to build clang-format if
they're only touching flang. Furthermore, if they're using an
out-of-tree build of flang for development, then in order to obtain
the ToT clang-format they'd also have to switch to the build directory
that contains it first. It's not the end of the world, but it's a
couple of extra steps that people might forget to do. In practice it
probably doesn't matter much, since people can just use whatever
clang-format they have around and only build the ToT one when the
premerge bots complain about formatting.

Anyway, that was just to clarify my point. Sorry about the rant. I
think either policy is fine as long as we try to document it
somewhere. We should probably have a larger discussion with the LLVM
community and the pre-merge owners in particular to decide which
policy works best for them.

I think it is actually sensible to use the latest released version (so in this case 12, as 13 is not out yet), otherwise whenever you rebase a

patch that's in review you also need to rebuild clang-format locally (instead of just flang + tests). In addition, there's no connection in Phabricator between a patch and the commit it is based on, so it's not clear what the in-tree version is. If you mean ToT, then that can change while a patch is in review. Think about what it would mean for a flang patch if a clang-format patch is committed, then reverted for bug-fixes, then committed again, all while the flang patch is still in review. Does that mean the pre-merge checks should be run 3 times, potentially with different results each time? That would be both confusing and very resource intensive. On the other hand, if the pre-merge checks are only run once, when you first upload the patch, then you'd essentially have non-deterministic formatting, depending on what happened to be in tree at that exact moment. Using the release binaries gives much more stability.

As you mentioned, a new version of clang can be released during an
ongoing review as well. But I can expect that patches are rebased to
ToT, i.e. the matching version of clang-format will be available
anyway. If using the latest release, I am effectively forced to also
have compiled clang around that matches the pre-merge bot's version.

I wouldn't call calling "git clang-format origin/main" once before
uploading a new patch labour-intensive. You should do this anyway.
Arcanist is even configured to use clang-format before updating a
patch.

I agree that you should be running "git clang-format" before uploading
a patch. What I was saying is that if you intend to use the ToT
clang-format, then you'd also need to run "ninja clang-format" before
that. In general I wouldn't expect people to build clang-format if
they're only touching flang. Furthermore, if they're using an
out-of-tree build of flang for development, then in order to obtain
the ToT clang-format they'd also have to switch to the build directory
that contains it first. It's not the end of the world, but it's a
couple of extra steps that people might forget to do. In practice it
probably doesn't matter much, since people can just use whatever
clang-format they have around and only build the ToT one when the
premerge bots complain about formatting.

Anyway, that was just to clarify my point. Sorry about the rant. I
think either policy is fine as long as we try to document it
somewhere. We should probably have a larger discussion with the LLVM
community and the pre-merge owners in particular to decide which
policy works best for them.

IMHO, there is little gain in forcing a second version of clang-format
that does not match ToT. Running `make clang-format` is cheap at the
end of the day. Having multiple versions of LLVM on your path can have
arbitrarily high cost, e.g., if something goes wrong and you debug for
days why stuff happens only to discover you mixed tools, include files,
or something else.

The benefit of ToT is also that you can fix it if there is an obvious
problem rather than add a workaround for 6 months.

We are effectively using ToT clang-format for Attributor files and (some
of) the OpenMP runtimes for a while now, there has not be a problem I'm
aware of.

~ Johannes