Python code-style and reformatting, status update

Hi everyone,

As discussed earlier this year, we want to enforce a code style for all the Python code in LLVM since we are racking up quite a bit of code that’s python these days.

At EuroLLVM last week, I was reminded about this effort and picked it up again, so I wanted to give a brief status update on this.

Feel free to review or test if you see something weird.

As usual with mass reformats, it’s going to be a bit disruptive for people working in these files currently, but I hope it won’t be that bad, considering Python is more a util language in LLVM.

Next steps:

  • Post more mass-reformating, I am planning to do this sub-dir, by sub-dir, except for LLDB, which @JDevlieghere wanted to handle since he is much more involved in that.
  • Pre-commit CI step, to check the format. We want to run black --check on all .py files posted, but I am not sure it makes sense to work on this for Phabricator, considering we are planning to switch to GitHub “soon”. But I would like to hear input from @kwk @goncharov and @tstellar on this.

Thanks for your time!

1 Like

I saw that your mass refactor contains changes to llvm/bindings/python. I was sure those were removed from tree.

There is a short thread on their removal here:

@tstellar Did you change your mind about removal? Should I go ahead with removal? They have now not seen much meaningful changes in like 9 years.

EDIT: Here patch for removal of llvm python bindings: ⚙ D150642 [bindings] Remove LLVM python bindings

1 Like

@tobiashieta having worked with Golang for quite a while I simply loved how the debates about style vanished from the picture. Black seems to aim at doing exactly that with Python. I’m all for it.

But I think that your question was more targeted towards whether or not we should put this on Phabricator, right? Well, I’d rather not put it into some existing check but instead have a distinct one just for python format checking. Otherwise it won’t get the attention it needs. Other test(s) simply fail to often and are therefore ignored.

What about using precommit git hooks for this? Then we don’t have to run any of this on the server as long as people install the hooks: pre-commit. This won’t prevent code from being gated but at least there’s a standard way how to get to proper formatted commits in the first place. I think both parts are needed: the pre-commit hook and the CI.

Have you considered what to do about lit tests that contain python snippets, often split out via the split-file util? These won’t have .py suffixes (though the test-time generated files might).

I was interested in what the lower bound is, given the last thread on this topic. The above-mentioned doc-change points to GettingStarted.rst, which specifies python >=3.6, which was last updated 2.5 years ago.

I’d suggest to adopt something like “the last Python version that’s still supported (by CPython)”, which would currently be 3.7 (until this June), and would give 5 years of support for any given version.

No, I haven’t - I was not aware of such files. Could you link me to one to see what they look like?

I don’t have a strong opinion on the matter; if someone that works with these files has, please let me know.

I think that makes sense, but it should be its own discussion. If I don’t remember incorrectly, there was already talk about bumping this in a recent thread.

A quick look and there are a few different styles. There are some files that use %python -c to run an inline snippet of python. I don’t think we should try to format these, as they are usually formatted weirdly anyway to ensure the python fits inline. Some tests (e.g. tools/llvm-objcopy/ELF/preserve-segment-contents.test`) create the python script at test time, so that it can contain references to lit substitutions. In the specific example, it writes these files using echo, and so I don’t think we’ll be able to get black to format this, even if we wanted to, so I wouldn’t bother.

I found very few examples (I thought there were more) using split-file and %python, so maybe it’s not a real issue. One concrete example is lld/test/ELF/lto/bitcode-wrapper.ll, which has a “file” at the end of the test file. I suppose these should be formatted, but I don’t know if black can do partial file reformatting. I don’t think it’s worth trying too hard if it isn’t going to work.

1 Like

Update 23-05-17

llvm subdir reformatting has landed. With a few issues, I forgot to update the commit message telling people how to handle merge problems. It will be included in the future commits. I also forgot to format lit.local.cfg files that are all python. I have posted a follow-up below.

Now I have reformatted all the files except lldb, I’ll leave that up to @JDevlieghere

@tobiashieta please, don’t forget to add the series of commits to .git-blame-ignore-revs.

Hi! Everything but MLIR and LLDB has now landed.

Any updates on the LLDB changes @JDevlieghere ?

I was out of office, but plan to have it done before the end of the week.

Edit: ⚙ D151460 [NFC][Py Reformat] Reformat python files in lldb

1 Like

Hi! All open diffs have landed, and we have reformatted the whole tree’s .py files. We need to add some pre-commit ci checks, but I hope someone can step up and help with this. Otherwise, I hope reviewers can remind people to format with black.

Thanks for all the help landing this!