[RFC] Upgrading LLVM's minimum required Python version

Proposal to update the minimum required Python version to 3.8.

What version do you suggest to increase to?

3.8 (3.7 is EOL in ~5 months)

Why this upgrade?

Existing minimum has been EOL since Dec 2021.

Where is the pain in keeping the current min version?

LLVM devs are unable to test with the current minimum version (e.g., it is no longer carried by some distros or Homebrew), and code that passes on newer versions fails on the older one due to (among other things) changes in type specification.
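As a hedged illustration of the typing point (the post does not name a specific breakage): built-in generics such as `list[int]` are only subscriptable at runtime from Python 3.9 (PEP 585), so an annotation written that way fails under 3.6 to 3.8 unless its evaluation is deferred, while the `typing.List` spelling works on every version discussed here. The function names below are made up for the example.

```python
import sys
from typing import List, Optional


def first_even(values: List[int]) -> Optional[int]:
    """Return the first even number in values, or None if there is none."""
    for value in values:
        if value % 2 == 0:
            return value
    return None


def builtin_generics_supported() -> bool:
    """Report whether `list[int]` can be evaluated on this interpreter."""
    try:
        list[int]  # raises TypeError before Python 3.9
    except TypeError:
        return False
    return True


if __name__ == "__main__":
    print(sys.version_info[:2], builtin_generics_supported())
    print(first_even([1, 3, 4]))
```

Code written in the first style keeps working on an old minimum; code that relies on the second silently raises the effective minimum to 3.9.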

What (common) dists / OSes would be affected (not have access to this version by default)

Finding these has been a bit difficult. pkgs.org only shows openSUSE 15.4, ALT Linux P9, Amazon Linux 2, and Rocky Linux 8 as ones where 3.6 is available but not 3.8, but even for some of these, a spot check showed 3.8 was as easy to install as 3.6.

CentOS 7 (CentOS 8 seems to support it)
Ubuntu 18.04 default Python install
RHEL 8: the default seems to be Python 2.7, and the recommendation is to use SCL for Python 3 (which has 3.8)
openSUSE 15.4 (only in experimental, official in Tumbleweed)

What’s the process if you don’t have the right version, for users and developers?

Installing a new version of Python is relatively easy and doesn’t require root access. Projects like pyenv and conda/miniconda/mamba make it easy to install additional Python versions. For many distros it is just a case of installing a package (e.g., for Ubuntu 18.04/the mlir-nvidia buildbot one needs only apt-get install python3.8 instead of python; CentOS 7 requires using SCL). Or one can just build from source.

We could start with updating the build bots and then set the minimum version in CMake.

and code that passes on newer version fails on the older due to (among others) changes in type specification.

Do you have an example of where this has happened?

What (common) dists / OSes would be affected

Debian buster (oldstable) is on 3.7.

Do we even currently codify the minimum version in cmake or some other place? I didn’t know we have a minimum version requirement.

It would also be worth codifying which components require Python. Is there anything beyond the Python bindings? Ideally you don’t pay for what you don’t need.
I can think of LLDB which has python support (I don’t know if this is enabled by a flag?).

In the Pre-RFC, you can see that it is documented, but it is not enforced by the build system:

Lit uses quite a bit of Python. Additionally, each individual project may have a nontrivial amount of project-specific code; in particular, libcxx has quite a lot of test framework code of its own, running on top of lit. Libcxx also has some scripts used for wrapping execution (libcxx/utils, run.py and ssh.py). The main one, run.py, gets used in most cases, while ssh.py is only used in remote testing setups (so a regular run through a normal test setup wouldn’t exercise that code).

Additionally, a bunch of scripts in llvm/utils (update_*_test_checks etc) also are written in Python.

LLDB is a heavy user of Python: integrated into the debugger and test suite ( -DLLDB_ENABLE_PYTHON=On).

cmake will move to 3.20. Since 3.19 you can enforce a Python version with find_package:
https://cmake.org/cmake/help/latest/module/FindPython3.html

The build system enforces a minimum Python requirement, see

I’m personally fine with moving to Python 3.8. In addition to 3.6 being EOL, Python 3.7 + 3.8 bring (see What’s New In Python 3.8 — Python 3.11.1 documentation and What’s New In Python 3.7 — Python 3.11.1 documentation)

Yes, Buildbot and I commented on a commit where it seemed like it would break too (but to test, they or I have to fire up Docker). The reason I mentioned the change in typing support is that it has changed how many folks write new Python code.

@mstorsjo and @tschuett mentioned the others. lit is the one that I was thinking of as being least easy to ignore (well, unless you aren’t testing, which I would consider very rare, but I have no data to back that). So we can require it conditionally, same as today, only if you are testing/using lit, building LLDB, or generating MLIR Python bindings (and I’d prefer that to be one version). We could expand the minimum requirement there to specify which components need it, or potentially instead add “asserts” in the CMake file of those components to make the tracking more active.

I misread the CMake documentation.

In general, I think we should try to keep the requirements to build LLVM as low as possible. It’s already pretty complex to get started with as it is.

LLVM has always required fairly specific versions of C++ compiler, since that’s a core technology that the project uses to the max, but for auxiliary tools like Python I don’t think it makes sense to be as aggressive.

Perhaps sub-projects that are Python heavy (LLDB, MLIR?) could have stricter requirements, but for just llvm+clang+lit tests it would be nice if pretty much any Python 3 version was sufficient.

Just discovered that BOLT requires at least 3.7 (due to how it uses subprocess.run). It’d be nice if enabled projects could contribute to checking the minimal version – that would have saved a bunch of confusion.
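The post does not say which part of `subprocess.run` BOLT depends on. One plausible candidate, offered only as an assumption, is the `capture_output` and `text` keyword arguments, both of which were added in Python 3.7. A sketch of the 3.7+ call next to a spelling that also works on 3.6 (the helper names are invented for the example):

```python
import subprocess
import sys


def run_new_style(cmd):
    """Python 3.7+ only: capture_output and text were added in 3.7."""
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout


def run_old_style(cmd):
    """Equivalent call that also works on Python 3.6."""
    result = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,  # the pre-3.7 name for text=True
        check=True,
    )
    return result.stdout


if __name__ == "__main__":
    cmd = [sys.executable, "-c", "print('hello')"]
    print(run_new_style(cmd) == run_old_style(cmd))
```

On 3.6 the first helper fails with a `TypeError` about an unexpected keyword argument, which is the kind of error that is easy to miss when nobody tests on the documented minimum.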

While Python kept the “3” major version and there wasn’t anything of the magnitude of Python 2->3, Python has changed by leaps and bounds within the “3” timeline, which has been running for almost 15 years already.

“Any Python 3 version” is just not realistic IMO. Of course there’s a very core subset of functionality that has stayed identical, but old 3.x Python versions aren’t tested anymore in the wild and there will be behaviour changes in the stdlib (like urnathan mentioned), third party libraries will not work or pull in ancient versions, etc.

At the very least, it should be something like “the oldest non-EOL Python version is tested, anything below that might break”, perhaps with a CMake variable to opt out of the version bound.
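A rough sketch of what such a policy could look like, written in Python rather than CMake for brevity. The function name, the `allow_untested` switch (standing in for the opt-out variable), and the `(3, 8)` bound are all assumptions for illustration, not anything LLVM implements:

```python
import sys
import warnings


def check_python_version(current, oldest_tested, allow_untested=False):
    """Return True if `current` is at or above `oldest_tested`.

    Below the bound, raise unless `allow_untested` is set; in that case
    warn and return False so the caller knows it is on untested ground.
    """
    if current >= oldest_tested:
        return True
    message = "Python %d.%d is older than the oldest tested version %d.%d" % (
        current[0], current[1], oldest_tested[0], oldest_tested[1])
    if allow_untested:
        warnings.warn(message + "; continuing anyway")
        return False
    raise RuntimeError(message)


if __name__ == "__main__":
    # (3, 8) is only the value proposed in this thread, not an enforced bound.
    check_python_version(sys.version_info[:2], (3, 8))
```

The default is a hard failure, so anyone running below the tested bound has to opt in explicitly and sees a warning each time.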

Hi all, I wanted to revisit this topic, since I would really like to add some Python backwards-compatibility checking for the libclang Python binding, but it is difficult to add Python 3.6 CI (GitHub’s ubuntu-latest doesn’t support 3.6 anymore). Plus, the type annotation features added in 3.7 are significant.

See “[GitHub] Add python 3.7 to libclang python test” by linux4life798 (Pull Request #77219, llvm/llvm-project) and “[clang/python] Add type annotation to libclang Python binding” (Issue #76664, llvm/llvm-project).

Has enough time passed to consider bumping the version to 3.8, or at least 3.7? Again, Python version 3.7 corresponds to Debian buster (old old stable) and was already EOL’d in mid 2023.

Best,

Craig

Strong +1 on following Python’s EOL policy. Straying outside of this just becomes problematic to verify and keep CI running.

Note also that the next few versions of python seem to be heading towards some rapid evolution and deprecations. We may ultimately need a policy where we test latest and oldest supported.

I would definitely like to see the minimum Python version bumped, but my thoughts alone probably shouldn’t count for much on this issue as I’m unfamiliar with the efforts involved in maintaining recent LLVM toolchains on older systems (I know some go back to distros like RHEL 6) that I believe would be the biggest group impacted by this change. Some buildbots still running buster would probably have to upgrade too, but that’s already an ongoing issue with the minimum CMake version being higher than what is available by default on buster and bullseye. Debian Bullseye ships with Python 3.9.

I think a reasonable compromise if bumping from 3.6 would actually cause a lot of work for downstreams would be to specify different minimum versions for different parts of the project. It has already been mentioned in this thread that specific projects de facto require versions higher than 3.6. Specific elements within LLVM like Python bindings, ML Infrastructure, and Python CI infrastructure I think can easily require a higher version without impacting anything downstream. The testing infrastructure (llvm-lit and lit config scripts) would probably require a lot more work to get working if they started requiring features from newer versions for specific configurations. Scripts that get shipped to the user (like git-clang-format) should try and support the smallest possible version, but I don’t think in practice supporting anything less than the minimum version makes a big practical difference, at least for a conventional Linux distro setup where the Python version used for the build would probably be the same as the deployed version.
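To make the per-component idea concrete, here is a small sketch. The component names and version numbers are placeholders: the thread only establishes 3.6 as the documented baseline and 3.7 as BOLT’s de facto floor, and the 3.8 entry for the bindings is purely hypothetical.

```python
# Hypothetical per-component minimums, for illustration only.
BASELINE = (3, 6)
COMPONENT_MIN_PYTHON = {
    "lit": (3, 6),
    "bolt": (3, 7),              # de facto floor mentioned in this thread
    "python-bindings": (3, 8),   # placeholder, not a stated requirement
}


def required_python(enabled_components):
    """Return the highest minimum among the enabled components."""
    minimums = [COMPONENT_MIN_PYTHON.get(name, BASELINE)
                for name in enabled_components]
    return max([BASELINE] + minimums)


def unsupported_components(enabled_components, current):
    """List the enabled components whose minimum exceeds `current`."""
    return sorted(
        name for name in enabled_components
        if COMPONENT_MIN_PYTHON.get(name, BASELINE) > current
    )


if __name__ == "__main__":
    enabled = ["lit", "bolt", "python-bindings"]
    print(required_python(enabled))
    print(unsupported_components(enabled, (3, 6)))
```

With a table like this, a build that only enables lit stays at the baseline, and a failure names the component that raised the bar rather than the project as a whole.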

I’m still +1 on bumping. I don’t mind keeping llvm-lit and the like as simple as possible so that they might still work with the old version, but we are 6 months past the end of life of 3.7, and 3.6 (which reached end of life on 2021-12-23) is still being used, so trying to test that these work is painful for everything that is already EOL’d.

I’d be pro 3.9 at this point, which means the next bump would be towards the end of 2025. 3.8 is fine too, but it may just mean another bump in 8 months.
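The timing argument can be checked with a little date arithmetic. The sketch below uses the upstream end-of-life months from the Python release schedule (3.8 in October 2024, 3.9 in October 2025, at month precision) and, purely as an assumption, treats February 2024 as “now”, since that is the date that makes the “8 months” figure come out.

```python
from datetime import date

# End-of-life months per the upstream release schedule (month precision).
EOL_MONTH = {
    (3, 8): date(2024, 10, 1),
    (3, 9): date(2025, 10, 1),
}


def months_until(target, today):
    """Whole calendar months from `today` to `target`."""
    return (target.year - today.year) * 12 + (target.month - today.month)


def months_until_eol(version, today):
    """Months until `version` reaches its upstream end of life."""
    return months_until(EOL_MONTH[version], today)


if __name__ == "__main__":
    assumed_today = date(2024, 2, 1)  # assumption, not stated in the post
    for version in sorted(EOL_MONTH):
        print(version, months_until_eol(version, assumed_today))
```

Under that assumption, choosing 3.9 over 3.8 buys an extra 12 months before the minimum is EOL again.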

IMO, Python’s upstream EOL policy isn’t a very interesting target. What matters more is what versions people are using – e.g., if a commonly-used distro only has a version of Python which is EOL upstream, but the distro is still being maintained, we should take that into account when deciding to upgrade the minimum.

That’s the same way we look at the C++ minimum compiler updates, e.g. last time in RFC: Increasing the GCC and Clang requirements to support C++17 in LLVM

But, SGTM to go to 3.8, since:

  • Debian “Bullseye” (currently oldstable) has Python3.9
  • Ubuntu 20.04 has Python3.8
  • RHEL 8 Software Collections has Python3.8
  • openSUSE 15.4 apparently does have a python310 package available (just not the default version).

CentOS 7 was already too old as of the C++ compiler minimum update.

So, effectively, we’d be dropping support for:

  • Ubuntu 18.04
  • Debian “Buster” (currently oldoldstable)

CentOS 7 is definitely not “too old”, since it has GCC 11 as part of its devtoolset.

It does only have Python 3.6.8 in its official repos as far as I’m aware though.

Is CMake 3.20 or newer available for CentOS 7? I believe this is our current minimum requirement.

CMake is definitely a good point of comparison. The minimum CMake version isn’t available by default on most of the distributions previously mentioned:

  • Bullseye has 3.18.4
  • Ubuntu 20.04 has 3.16.3
  • RHEL 8 seems like it might ship with CMake 3.11?
  • OpenSUSE 15.4 looks like it ships with something earlier by default, but can’t really tell.
  • CentOS 7 looks like it ships with 3.17, but someone will probably want to double check that as well.

So even the older distributions that are supported in terms of Python version are still not supported in terms of the CMake version.
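The comparison above can be reproduced with a few lines of Python. The version numbers are copied from the list in this post (including the ones flagged there as uncertain), and 3.20 is the minimum CMake version mentioned earlier in the thread; the helper names are invented for the sketch. Versions are compared as integer tuples, because comparing the raw strings would wrongly rank “3.9” above “3.20”.

```python
def parse_version(text):
    """Turn a dotted version string like '3.18.4' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


MINIMUM_CMAKE = parse_version("3.20")

# Default CMake versions as reported in this post; the RHEL 8 and CentOS 7
# values were flagged there as needing a double check.
DISTRO_CMAKE = {
    "Debian Bullseye": "3.18.4",
    "Ubuntu 20.04": "3.16.3",
    "RHEL 8": "3.11",
    "CentOS 7": "3.17",
}


def below_minimum(distro_versions, minimum):
    """Return the distros whose default version is older than `minimum`."""
    return sorted(
        name for name, version in distro_versions.items()
        if parse_version(version) < minimum
    )


if __name__ == "__main__":
    for name in below_minimum(DISTRO_CMAKE, MINIMUM_CMAKE):
        print(name, DISTRO_CMAKE[name])
```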
