[RFC] Python 2 / Python 3 status

Hi folks,

Python 2 has reached end of support [0], and many core Python packages are
dropping Python 2 support [1].

This subject comes up periodically on this list, with a rather strong
no in 2018 [-1] and a slow move in 2019 [-2, -3].

Even if Python is not a core build requirement, it's used during some
configuration steps (e.g. in the CMake export_executable_symbols function),
for testing, and it is definitely part of the user experience: several clang
tools depend on it, as well as many lldb features. So it's not only about utility
scripts.

Fedora has moved to Python 3 [2], and RHEL8 ships Python 3 by default [3] (RHEL7
still ships Python 2, but it has reached « End of Full Support »).
The latest OSX version also explicitly obsoletes Python 2 [4]. Debian always ships
both interpreters, even in stable [5].

There has been an effort in LLVM/lldb/clang/... to make all Python scripts
portable across Python 2 and Python 3 [6], but if the documentation and
``GettingStarted.rst`` are to be trusted, we only require >=2.7. Supporting both
may be considered a maintenance burden, see for instance this discussion [7]. Plus,
LLVM doesn't have a strongly conservative culture, e.g. wrt CMake requirements.
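
To make the maintenance burden concrete, here is a minimal sketch of the kind of shim that dual-version scripts tend to carry around string decoding. The helper name `to_text` is hypothetical and is not taken from any LLVM script; it only illustrates the pattern.

```python
import sys

# Hypothetical helper, for illustration only: subprocess output is `bytes`
# on Python 3 but a native `str` on Python 2, so portable scripts need a
# small shim like this around every read.
def to_text(data, encoding="utf-8"):
    """Return `data` as a native text string on both Python 2 and 3."""
    if sys.version_info[0] >= 3 and isinstance(data, bytes):
        return data.decode(encoding)
    return data
```

Once Python 3 is the minimum, a helper like this collapses to a plain `data.decode(encoding)` at the call site.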

My personal take on this would be to start moving forward. Still supporting both
versions this year, but obsoleting Python 2.7 and requiring, say, Python 3.6
starting January 2021 looks like a good compromise.

Any thoughts?

[-3] [llvm-dev] Python 2 end-of-life and llvm
[-2] [llvm-dev] Python 2 compatibility for utility scripts
[-1] http://lists.llvm.org/pipermail/llvm-dev/2018-January/120826.html
[0] PEP 373 – Python 2.7 Release Schedule | peps.python.org
[1] https://python3statement.org/
[2] Finalizing Fedora's Switch to Python 3 - Fedora Project Wiki
[3] 8.0 Release Notes Red Hat Enterprise Linux 8 | Red Hat Customer Portal
[4] Apple Developer Documentation
[5] Python - Debian Wiki
[6] https://github.com/llvm/llvm-project/search?q=python+compatibility&unscoped_q=python+compatibility&type=Commits
[7] ⚙ D73011 [opt viewer] Python compat - decode/encode string

My personal take on this would be to start moving forward. Still supporting both
versions this year, but obsoleting Python 2.7 and requiring, say, Python 3.6
starting January 2021 looks like a good compromise.

I'm relatively new to LLVM so my +1 doesn't count for much, but I am
very supportive of this idea! I maintain a downstream fork of LLVM at
Facebook, and Python 2 will no longer be made available in our
developer environment at some point in 2020. So if LLVM is still
maintaining Python 2-compatible programs in 2021, I'll need to install
a Python 2 interpreter on a personal computer of mine to test any
changes I make, which would be a big inconvenience.

Even if Python is not a core build requirement, it's used during some
configuration steps (e.g. in the CMake export_executable_symbols function),
for testing, and it is definitely part of the user experience: several clang
tools depend on it, as well as many lldb features. So it's not only about utility
scripts.

My understanding is that the build relies upon 232 LLVMBuild.txt
files, which are processed by a Python program llvm-build. I think
that if one uses CMake to build, then Python is in fact a core build
requirement (I'm not sure if gn uses llvm-build?).

I don't want to derail the discussion here, but I have been meaning to
ask: I recall that it was once used more widely, but these days, why
does llvm-build exist? (Again, I'm a beginner in the project, so
forgive my ignorance here!)

I read through some of the code and from what I understand, one of its
main functions is to define the library dependencies of LLVM libraries
like LLVMX86CodeGen or LLVMObject, both as global properties in CMake
(build/LLVMBuild.cmake), and as a C struct that can be used by
llvm-config.cpp (build/tools/llvm-config/LibraryDependencies.inc). But
is there a reason these dependencies couldn't/shouldn't be hardcoded,
instead of being generated by the llvm-build Python executable?

I'm wondering if there's some dynamic build configuration feature of
llvm-build that I'm missing. What makes it hard to remove?

Any thoughts?

Pardon my tangential questions about llvm-build -- thank you for
proposing this timeline!

- Brian Gesiak

Sounds good to me. Keeping the window of time during which we support both Python 2 and 3 as small as possible would be nice.

I don’t know how Python-infested the build system is, but certainly the basic test suite (lit) is all Python.

Just to add to the notes about distros, on Windows it looks like Visual Studio 2017 comes with Python 3.6.

–paulr

I don't want to derail the discussion here, but I have been meaning to
ask: I recall that it was once used more widely, but these days, why
does llvm-build exist? (Again, I'm a beginner in the project, so
forgive my ignorance here!)

llvm-build exists for two reasons, one legacy, one still an issue. Back when we supported both autoconf and CMake, llvm-build provided a way to express the build dependency graph in one place and have both build systems respect it. Additionally, it generates the tables that drive llvm-config. Deprecating llvm-config in favor of pkg-config and CMake packages has been discussed, but there has been no decision.

Removing llvm-build is technically feasible even if we continue supporting llvm-config, but it would take a not insignificant amount of work to get CMake to generate the headers needed in the llvm-config build.

-Chris

My personal take on this would be to start moving forward. Still supporting both
versions this year, but obsoleting Python 2.7 and requiring, say, Python 3.6
starting January 2021 looks like a good compromise.

Sounds good to me. Keeping the window of time during which we support both
Python 2 and 3 as small as possible would be nice.

One blocker may be the shebang. `#!/usr/bin/env python` may pick either
Python 2 or Python 3, depending on whether the distribution has migrated.

Some scripts have already done
`#!/usr/bin/env python` -> `#!/usr/bin/env python3` but some have not.
If we can update the rest, it may make our overall migration easier.

I also recall Reid said `#!/usr/bin/env python` might make Windows
developers happier but I forget the details.

My personal take on this would be to start moving forward. Still supporting both
versions this year, but obsoleting Python 2.7 and requiring, say, Python 3.6
starting January 2021 looks like a good compromise.

Sounds good to me. Keeping the window of time during which we support both
Python 2 and 3 as small as possible would be nice.

+1

I also recall Reid said #!/usr/bin/env python might make Windows
developers happier but I forget the details.

Python on Windows has a wrapper that parses this line and selects which Python to use based on that.

I believe many developers develop from the “git bash” msys shell. In this context, bash will interpret the shebang line. If the shebang line is #!/usr/bin/env python3 and there is no python3.exe on PATH, that will be an error.

For the Linux distros that are removing Python 2, will “python” find Python 3 in the future, or will we have to say “python3” explicitly?
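
For anyone who wants to check what a given machine would actually resolve, here is a small sketch using the standard library's `shutil.which`. The function name `resolve_interpreters` is made up for this example; on Windows, `shutil.which` honours `PATHEXT`, so `python3` also matches `python3.exe`.

```python
import shutil

# Hypothetical helper: report which interpreter names are reachable via PATH.
def resolve_interpreters(names=("python", "python2", "python3")):
    """Map each interpreter name to its resolved path, or None if absent."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in resolve_interpreters().items():
        print("%-8s -> %s" % (name, path or "not found on PATH"))
```

A `None` for `python3` is exactly the situation where a `#!/usr/bin/env python3` shebang would fail.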

If the shebang line is #!/usr/bin/env python3 and there is no python3.exe on PATH, that will be an error.

Anyway, as current policy supports Python 2.7, it would be a kind of policy violation to require Python 3 [1],
so I don’t think we should go that way until we explicitly require Python 3. Once that milestone is reached,
let’s say in January 2021, my educated guess is that python3 and python will be equivalent on most systems and we won’t need to make a move.

What about this:

  1. until autumn: do nothing, try to preserve py2 / py3 compat

  2. in autumn: prepare a patch that
    2.1 updates the doc requirement and enforces it in CMake
    2.2 is tested on all buildbots

  3. in winter: (snakes brumate in winter)
    3.1 merge the above patch
    3.2 remove the compatibility layer wherever it makes sense, in an iterative, distributed manner
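
Step 2.1 talks about enforcing the requirement in CMake; as a complement on the script side (a different mechanism, not a replacement for the CMake check), a guard along these lines could sit at the top of a script. This is only a sketch: the names `MINIMUM`, `meets_minimum` and `require_minimum` are hypothetical, and 3.6 is just the version floated in this thread.

```python
import sys

# Assumption: 3.6 is the minimum suggested in this thread, not a decided policy.
MINIMUM = (3, 6)


def meets_minimum(version_info=None, minimum=MINIMUM):
    """Return True if the (major, minor) of `version_info` is >= `minimum`."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) >= tuple(minimum)


def require_minimum(version_info=None, minimum=MINIMUM):
    """Exit with a readable message instead of failing later on a syntax error."""
    if not meets_minimum(version_info, minimum):
        raise SystemExit("Python %d.%d or newer is required" % tuple(minimum))
```

Calling `require_minimum()` before any Python 3 only import gives Python 2 users a clear error rather than a traceback.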

Concerning 2.2: Buildbot being an old Python package, it may live on old machines with only Python 2, so there may be some friction. That being said,
Buildbot no longer supports Python 2.7 on the Buildbot master…

I don’t know if 3.2 makes sense, or if we should just keep existing code as is and simply not care about compatibility for newer code.
The good thing is that 3.2 can be spread across multiple people and over time.

[1] as of 058070893428a480b76a137f647ae6b9c89ac2d4, the only scripts that explicitly require python3 are the following:
./libclc/generic/lib/gen_convert.py:#!/usr/bin/env python3
./llvm/utils/release/github-upload-release.py:#!/usr/bin/env python3
./llvm/utils/add_argument_names.py:#!/usr/bin/env python3
./llvm/utils/update_analyze_test_checks.py:#!/usr/bin/env python3
./llvm/utils/llvm-compilers-check:#!/usr/bin/python3
./llvm/utils/update_test_checks.py:#!/usr/bin/env python3
./llvm/utils/update_mca_test_checks.py:#!/usr/bin/env python3
./llvm/utils/update_llc_test_checks.py:#!/usr/bin/env python3
./llvm/utils/update_mir_test_checks.py:#!/usr/bin/env python3
./llvm/utils/lit/install/bin/lit:#!/opt/rh/rh-python36/root/usr/bin/python
./llvm/tools/sancov/coverage-report-server.py:#!/usr/bin/env python3
./mlir/utils/spirv/gen_spirv_dialect.py:#!/usr/bin/env python3
./clang/utils/convert_arm_neon.py:#!/usr/bin/env python3
./polly/test/update_check.py:#! /usr/bin/env python3
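
A list like the one above can be regenerated, and extended to the scripts that still use an unversioned `python`, with a small scanner. The following is a rough sketch, not the command that produced the list; `classify_shebang` and `scan_tree` are hypothetical names.

```python
import os
import re


def classify_shebang(line):
    """Return 'python3', 'python2', 'python' (unversioned), or None."""
    if not line.startswith("#!"):
        return None
    parts = line[2:].split()
    if not parts:
        return None
    prog = os.path.basename(parts[0])
    # With "/usr/bin/env python3" the interpreter is the second word.
    if prog == "env" and len(parts) > 1:
        prog = os.path.basename(parts[1])
    match = re.match(r"python(\d)?", prog)
    if match is None:
        return None
    return "python" + (match.group(1) or "")


def scan_tree(root):
    """Yield (path, kind) for every file under `root` with a Python shebang."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, "rb") as handle:
                    first = handle.readline(256)
            except OSError:
                continue  # unreadable file: skip it
            kind = classify_shebang(first.decode("utf-8", errors="replace"))
            if kind is not None:
                yield path, kind
```

Filtering the results of `scan_tree(".")` on `kind == "python"` would give the scripts whose shebang has not been updated yet.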