[RFC] Upgrading LLVM’s minimum required Python version

It’s that time of year again: the time when we celebrate the autumnal equinox, pay our taxes, and consider bumping LLVM’s minimum required Python version. The last time we did this was Jan 2023, so we’re a little behind: at the end of this month Python 3.9 will stop getting security fixes. So ideally we should bump from 3.8 to 3.10 (Python 3.8 was already EOL’ed last year).

To that end I’ve sent a “trial balloon” PR that bumps LLVM_MINIMUM_PYTHON_VERSION to 3.10. In addition, I kicked off all the builder bots. The current score is 26 failing, 125 successful checks, so about 20% of bots don’t have a recent enough version (though the numbers are actually a little better than that, because some of the failures are the typical flakes…). Anyway, if we reach consensus on bumping, I’m fine taking on the task of chasing down the relevant parties to get those bots upgraded (and then bumping only if/when that occurs). Regardless of when we perform the bump, I’d suggest we add an if (Python_VERSION < LLVM_MINIMUM_PYTHON_VERSION) message(WARNING ...) check to the CMake now to give people a heads up.
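For readers who want to see the shape of such a guard, here is a minimal Python sketch of the same idea (the constant and function names are illustrative only, not LLVM’s actual CMake or lit code; the concrete suggestion above is a CMake message(WARNING ...)):

```python
import sys
import warnings

# Illustrative only: mirrors the idea behind LLVM_MINIMUM_PYTHON_VERSION.
MINIMUM_PYTHON = (3, 10)

def check_python_version(current=None):
    """Return True if `current` (default: this interpreter) meets the minimum.

    Emits a warning instead of failing hard, matching the "heads up"
    behavior suggested above.
    """
    if current is None:
        current = sys.version_info[:2]
    if current < MINIMUM_PYTHON:
        warnings.warn(
            "Python %d.%d is below the required %d.%d; please upgrade."
            % (current + MINIMUM_PYTHON)
        )
        return False
    return True
```

The soft-warning form gives builders time to upgrade before the check is ever made a hard error.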

11 Likes

What’s the motivation behind this? I’m not sure there’s a large benefit to raising the minimum just for the sake of raising the minimum. Just because the official security fix deadline has passed does not even mean that support is completely over (although at this point it will mostly be paid third parties), much less that people aren’t trying to build/test/use LLVM on systems with said Python version.

Raising the Python version definitely has a downstream cost on top of the upstream cost of getting buildbots updated. I think that cost should be justified by some clearly articulated benefit to the health of the upstream project, rather than just seasonal cleaning.

3 Likes

I think there are many reasons to drop support for Python 3.8 and 3.9 beyond the lack of new security patches (though that is also an important factor). I’ll take the MLIR Python bindings as an example, since I’m less familiar with other Python components in LLVM.

First, if you search for #if PY_VERSION_HEX in MLIR, you’ll find quite a few pieces of code that provide different implementations depending on the Python version. This is because (before Python 3.12) CPython didn’t offer a stable cross-version API, and each version might introduce new features or better alternatives. In short, just like many projects avoid supporting multiple C++ standards simultaneously, supporting many Python versions inevitably introduces maintenance overhead. Following Python’s official support window helps keep the project maintainable and reduces this burden.
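For context on those guards: PY_VERSION_HEX packs the version into a single integer (major, minor, micro, release level, serial), which is why C extensions gate code with checks like #if PY_VERSION_HEX >= 0x030A0000 for 3.10. Python exposes the same encoding as sys.hexversion; a small sketch decoding it (the helper name is mine, for illustration):

```python
import sys

def decode_hex_version(hex_version):
    """Decode a PY_VERSION_HEX-style value into (major, minor, micro)."""
    return (
        (hex_version >> 24) & 0xFF,  # major
        (hex_version >> 16) & 0xFF,  # minor
        (hex_version >> 8) & 0xFF,   # micro
    )

# sys.hexversion uses the same layout as the C macro PY_VERSION_HEX,
# so the same arithmetic works on both sides of the bindings.
```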

Second, to give a more concrete example — if you frequently use Python type hints, you’ll notice that they changed quite a lot around Python 3.10 (see the documentation here: typing — Support for type hints — Python 3.14.0 documentation). For instance, using X | Y for Union and other syntax improvements. Supporting older versions often means being forced to stick to legacy syntax and avoid new features, which limits how we can write and evolve the codebase.

I don’t think the existence of some commercial services providing paid security fixes means that an EOL Python version is “safe.” Not everyone can or will use those services. Also, the fact that some people might still build or test LLVM on older Python versions doesn’t, in itself, justify continued support for them — at least not without stronger reasons.

I do have some doubts about how much effort the LLVM upstream should devote to mitigating downstream costs. I understand that maintainers at companies maintaining Linux distributions care deeply about version constraints to make package management work smoothly with their default system Python. But we also need to consider whether this is worth sacrificing upstream maintainability — and if so, where to draw that balance.

As I mentioned above, I believe that regular cleanup is itself a strong justification. Periodic cleanup is an important way to preserve maintainability, and in this case, I think that’s a valid and sufficient reason.

1 Like

As a point of reference, I tested this on my system: downloading and building a scratch Python install at a specific version took under 90 seconds and required 3 Linux shell commands (on my MacBook sitting in South Africa it took 190 seconds, but brew update and cleanup also ran, so that inflated it a little). And it doesn’t require root.

Given that it should be a one-off cost, the quantified cost and expertise needed are low IMHO.

I’ll add that, since I’m not installing past-EOL software on my corp machine, I’d only discover that a feature commonly used in the code around me for the last 3+ years counts as “modern” when some build bot fails.

This is one of the problems: from my view, Python in LLVM was historically centered around lit. Many folks don’t even know that lit or lit configs are Python; they often never have to interact with them, and the majority who do just fork some config, change a few fields to make things work, and move on. So considerations around UX, type hints, debug location info, generality, performance etc. are much different. There is probably an O(2000)-line interpreter that could be used instead of regular Python.

Now, this has changed with lldb and others also using Python. But still quite different from regular Python usage.

There is also a desire to keep the version constant across the LLVM project (understandable). But that forces “we use Python incidentally because we have lit testing” and “we use Python as a language” to share the same version. The arguments that convince the former of the need to upgrade are quite different from those for the latter. In another world we could have a “minimum version for lit” and a “minimum version for Python bindings”. So the current middle ground is conservative (3.10 was released 4 years ago, nowhere near cutting edge).

1 Like

This is the big one for me. Our company has automated software that detects when we’ve got out-of-date (i.e. EOL) versions of programs installed, such as Python, and we need a strong business justification to have any hope of getting an exemption. I don’t believe this would pass that hurdle, and I doubt we’re the only company in this position.

So, the question is: do we want developers in this situation to be able to investigate and resolve failures owing to Python version mismatches (e.g. due to overly modern Python constructs)? If the answer is yes, we have to have a policy of a minimum Python version that is not EOL.

I’d like to register a strong -1 on the overall premise of this thread: That we should be regularly bumping the minimum Python version “just because”.

Bumping the Python version should follow the same process as bumping our other host toolchain requirements. This requires analysis of the specific benefits of the newer version, as well as availability of the toolchain across various distros.

The fact that upstream no longer supports our minimum Python version should not factor into our policy, in much the same way that only Clang 21 being supported by upstream should not mean that LLVM should drop support for building with older Clang versions (in fact, we support all the way down to Clang 5 currently).


Now with that generic comment out of the way, regarding the specific proposal to raise to Python 3.10: This would be a problem for packaging LLVM on RHEL 9 (and all derivative operating systems), because it uses Python 3.9 as the system Python version. This is something we can deal with, but using the non-default Python does introduce a lot of problems.

Raising the requirement to Python 3.9 should be unproblematic for us.


I think “quite a few pieces of code” is overstating it a bit. As far as I can tell, there are just 6 of these. Two of them could be removed by raising the minimum to Python 3.9, and one more by raising to Python 3.10. It’s not zero, but I don’t think there is a major maintenance burden here.

This is a more compelling argument in abstract, but doesn’t this already work with from __future__ import annotations since Python 3.7? Adding one line seems like a low cost for compatibility.

Yes, I think this is the crux of the problem. MLIR’s Python usage is something of an outlier in the LLVM project. It already requires a bleeding-edge nanobind version, so combining that with a five-year-old Python version is somewhat odd.

I do think there is value in a common Python baseline for the LLVM project, but we do have to acknowledge that there are different requirements here – so if something has to give, we should probably give the MLIR python bindings a separate minimum version requirement. (Though based on the preceding responses, I don’t see any particularly strong reasons for doing that yet.)

1 Like

Yes, before this thread posted, I mentioned in the Discord discussion whether it would make sense to set a separate Python baseline for MLIR — especially considering that MLIR’s Python bindings are currently disabled by default in the standard build (though if they become enabled by default later, users might become more sensitive to the version requirements). As @jpienaar pointed out, developing a Python package with C extensions is fundamentally different from writing single-file Python scripts (or those used in llvm-lit). The former is much more sensitive to Python version changes — in terms of API evolution, new features, performance, and overall UX.

While I do agree to some extent that there’s no blocking reason that forces us to move to 3.10 right now, in day-to-day development, even a few small pain points can make the experience noticeably frustrating.

In addition, for reference, here are some popular Python packages on PyPI that include C extensions, along with the range of CPython versions supported by their latest releases:

  • numpy 2.3.4 — supports Python 3.11–3.14
  • pandas 2.3.3 — supports Python 3.9–3.14
  • torch 2.9.0 — supports Python 3.10–3.14
  • matplotlib 3.10.7 — supports Python 3.10–3.14
  • numba 0.62.1 — supports Python 3.10–3.13
  • scikit-learn 1.7.2 — supports Python 3.10–3.14
  • pyarrow 21.0.0 — supports Python 3.9–3.13

Off topic but I have to ask -

How did you manage to do this? We’ve been telling people you can’t run a buildbot on pre-commit code.

I guess the key is that the code is a user branch on llvm-project/llvm, instead of a fork. Did you have to go to each bot in turn and request a build?

(and I can confirm that clang-aarch64-sve-vla-2stage is a flake, all of Linaro’s buildbots are on Python >= 3.10)

I have been trying out Vermin for checking required Python versions (the name is a bit dark but you see the pun, version-minimum). My system’s default is 3.10 and it found I had used a 3.10 feature in some LLDB work.

Use of such checkers is separate from upgrading our version requirement, and I’m not sure this specific tool is appropriate, but it’s something I’m looking into, as it would be cheaper and easier to organise than a minimum-Python buildbot, especially if that Python is an unsupported version.

Just want to chime in and say +1 on regularly bumping the Python version. I am one of the few people who maintain the Rust debugger visualizer scripts. The scripts have to assume LLDB’s minimum Python version because Rust doesn’t know which LLDB the user has. There are quite a few things added in 3.9–3.10 that would make those scripts easier to write and maintain. To list a few that I’ve personally run into:

  • significant performance improvements
  • str.removeprefix and str.removesuffix
  • functools.cache
  • generic type hints (e.g. list[T] instead of from typing import List)
  • match statements and structural pattern matching in general
  • int.bit_count()
7 Likes

I think macOS is also on Python 3.9.

1 Like

As someone heavily involved in the python ecosystem (and particularly distribution through conda-forge), my experience is that actually, a time-based “just because” is a good enough reason.

That’s because every python version has a maintenance cost (and people do expect compatibility with new python versions to be added quickly), and because keeping things compatible with EOL’d python versions for a dwindling set of users is just not a good use of maintainer time.

To avoid regurgitating the same kind of “should we drop py3.X now?” discussions, the scientific Python ecosystem[1] came up with policies (e.g. NEP29, now SPEC0) that codify something like “we drop support for a given Python version X years after its release”.

The details of those policies are not applicable to LLVM (it’d be possible to come up with rules that are more suited to the project), but the overall direction is still beneficial. That is, unless people are keen on having the same discussion year after year, and to decide on a case-by-case basis.

But IMO all the ingredients are pretty predictable these days (release cadence of Python, LLVM, etc.), even something like “last RHEL version not in extended support” would increment with some predictability[2].


  1. disproportionately affected due to lots of native code under the hood, which needs to be compiled per Python version ↩︎

  2. though I wouldn’t take super-longterm support distros as the baseline personally; and those distros already provide backports of newer python versions anyway. ↩︎

1 Like

Weak -1 from me as long as this requirement is global. I mainly build LLVM libs and use lit tests and tooling from utils. There is very little value in bumping the Python version for those, but it comes with lots of annoyance when you have to update all of your build environments/build scripts to ensure they have the right Python version. LLVM is not a Python project, so I don’t see why it should depend on a recent version of Python so badly.

Perhaps having two separate Python version requirements would make sense: “this is required to build LLVM+Clang and run tests” and “minimum for Python bindings/MLIR/etc. (lldb? I can see the argument for it being in either category)”.
I don’t see why someone building Clang needs to be concerned with having the latest Python version on their system.

2 Likes

Adding to this, I don’t see how the lack of upstream security fixes is relevant: we don’t expose Python to untrusted inputs during our build process.

1 Like

This doesn’t actually make the new syntax backwards compatible; it just trades a parse failure for a runtime failure:

from __future__ import annotations
import typing

def foo(x: int | list):
    return "bob"

typing.get_type_hints(foo)
# On Python < 3.10:
# TypeError: unsupported operand type(s) for |: 'type' and 'type'

So e.g., this blocks in-tree type checking tests.

I’m fine with this compromise; I only didn’t propose it initially because my understanding (from the previous bump) was that this wouldn’t fly. I can re-run the “trial balloon” PR with just a bump to MLIR’s minimum version.

It’s true that it’s slightly easier if the branch is on the mothership repo (llvm/llvm-project) but FYI all github forks are actually branches on the original repo and every PR is actually a merge commit to main against that “secret” branch (something like refs/remotes/pull/163822/merge). So you can do this with any fork. And yes I triggered each bot by hand (but not really - I did one, grabbed the curl from chrome, and did the rest in a bash script). Hopefully writing this down is more helpful than hurtful vis-a-vis DDoS (I’m assuming Force Build can’t be triggered by non-committers anyway).

Yea I’m totally sympathetic to this - we should foist the minimal set of requirements on users/devs.

In effect we do, since there are (many) Python tests. So at minimum we are exposing builder bots to untrusted sources, but we have (and support) fully executable tests, so I guess that cat got out of the bag a long, long time ago. It’s true the security concerns are not the primary reason to bump (the aforementioned features are).

1 Like

Yes, the Python3.framework in Xcode is Python 3.9. That’s what’s used by the shims in the OS (e.g. /usr/bin/python3) and also what LLDB links against.

We have a documented process for how we go about “Updating Toolchain Requirements”. It is currently written for C++ toolchain dependencies, but IMO we should use effectively the same process for updates to the minimum Python version. As critical infrastructure, we owe it to our users to keep wider compatibility than we perhaps might wish to, and not to just drop versions because time has passed.

The critical pieces of information to include in an RFC are (verbatim from the doc; read “C++” as “Python” in this case, of course):

  • Detail upsides of the version increase (e.g. which newer C++ language or library features LLVM should use; avoid miscompiles in particular compiler versions, etc).
  • Detail downsides on important platforms (e.g. Ubuntu LTS status).

Many of the replies here are people asking about these issues in various ways – but they should be addressed in the initial proposal. I would suggest to collect that data, and make a new proposal with it included.

4 Likes

Sure - will do. Thanks for the link to the doc.

Interestingly enough, a similar discussion back in 2012 had similar points.

I would avoid this. It would make it harder for RedHat-based builds to use our Python bindings.

We don’t need to make 3.10 a firm requirement in order to test it across the board and proactively fix issues, while still keeping compatibility with older (3.9) versions.

Just want to point out this characterization (“just because”) is wrong and unnecessary. The original post had a clear motivation behind it:

Whether this is important enough for LLVM to make the lives of RedHat users a pain or not is orthogonal to the underlying point, which is clear and perfectly valid.

My personal view is that the security fixes is important but less critical for LLVM (since we’re not a Python project and projects don’t need to use our Python for security purposes), so the 3.9 requirement is sufficient for now.

1 Like

Let me clarify what I meant by this: I strongly believe that we should never perform time-based minimum version bumps, for Python or any other toolchain dependencies. Our minimum versions should always be requirements-based.

My objection here was to the framing in the very first sentence of this RFC that bumping the Python version is something we should be doing routinely every single year. Updating Python requirements is not paying your taxes.

For example, the next time we bump our C++ compiler requirements, it will be either because this enables us to use C++20 features, or because we run into too many bugs with one of the old compiler versions. It’s not going to be just because N years have passed since the last bump. The conversation around the Python version requirement needs to be the same.

1 Like

These two things are always intertwined. We have updated compilers over the years because enough people were using the new ones and we wanted the new features. But often we had to delay our bump due to some distros (cough RedHat cough) not catching up. That is also time-based, just the other way around.

But the point here, IIUC, is that the requirement (updates and security fixes) for Python IS time based. While we won’t get new Python features, we may run into security problems in the Python side.

But the counter-point is that LLVM is not a Python project and does not rely on it to function, so Python’s update cadence is less relevant to us than, for example, system compilers and libraries. Note the heavy use of “may” in the paragraph above.

So, we agree with each other, perhaps very strongly, but the choice of words (“just because”) was indeed wrong (there was a requirement, security fixes, although it was time-based due to how Python development works) and unnecessary (it gave the impression that there was nothing else, when in fact there was, just not as relevant).

1 Like