Migrate utils/ Python 2 scripts to Python 3

Hi LLVM-Devs,

I noticed that many Python scripts under utils/ have a shebang of
`#!/usr/bin/python` (which is a symlink to python2.7 on many platforms), and some of them use Python 2 syntax that is not compatible with Python 3 (e.g. print statements, str/bytes handling).
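To make the two incompatibilities concrete, here is a minimal sketch of code written so that it runs under both Python 2.7 and Python 3. The helper name `run_and_decode` is hypothetical and is not taken from any script in utils/; it only illustrates the print and str/bytes differences.

```python
from __future__ import print_function  # print() becomes a function on Python 2.7 too

import subprocess
import sys


def run_and_decode(cmd):
    """Run cmd and return its stdout as text on both Python 2.7 and Python 3."""
    out = subprocess.check_output(cmd)  # bytes on Python 3, plain str on Python 2
    # On Python 2, bytes is an alias of str, so this branch is skipped there.
    if isinstance(out, bytes) and not isinstance(out, str):
        out = out.decode("utf-8")
    return out


# Works unchanged on either interpreter.
print(run_and_decode([sys.executable, "-c", "print('hello')"]).strip())
```

A script that instead uses `print "hello"` or concatenates the undecoded output with a str literal fails on Python 3, which is the kind of breakage described above.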

I created a revision, D42450 ([utils] Convert update_{llc_,}test_checks.py to Python 3), to migrate utils/update_{llc_,}test_checks.py from Python 2 to Python 3. The arguments against making it Python 3 are:

* There are many Mac OS X users and Mac OS X 10.8 comes with Python 2.7 pre-installed [0] but not Python 3.
* Forcing scripts to use Python 3 (by doing so we can avoid some compatibility trouble) may not be a good idea.

Python 2.7 was released in 2010 and was planned as the last of the 2.x releases. It will not be maintained past 2020, and there is also a retirement page: https://pythonclock.org/.

The second argument would not need to be addressed if the first one did not place too much of a burden on developers. After all, we already have to install cmake, ninja or GNU Make, and libedit (for lldb) to build LLVM. These packages are not installed by default on many platforms.

Thoughts on deprecating Python 2 for utils/ scripts (different from libclang or lldb scripts) which are not user-facing?

[0]: 5. Using Python on a Mac — Python 3.10.6 documentation

Thanks,
Fangrui

Hi Fangrui, for what it's worth regarding lldb, Zachary Turner has already done a lot of work to make the majority of its python code (most importantly, its testsuite) work with Python 3 as well as Python 2.

J

+1 from me for changing LLVM's minimum requirements (for compiling llvm) to python3 and dropping python2 support.

In the case of lit, though, we probably need to keep the code working with both python 2 and 3, since that affects various projects outside of core llvm that have started adopting it.

- Matthias

-1 from my POV. Dropping python2 would presumably have a large effect on
anyone who maintains build infrastructure for LLVM (bots, companies'
build systems, anyone who builds and uses LLVM as part of a product or
OS...). This seems like a lot of busy work for little gain to me.

Does Python 3 have features we want to use in the LLVM codebase for which there is no workaround? If so, please give some examples. I think that would make the discussion more concrete.

Personally, every machine I work with only has Python 2.7.

Justin is correct that there is a non-trivial amount of effort to convert the bots.

Python 3 is wonderful. But a Python 3 dependency seems like one burden that could be avoided. We have already made that trade-off in the past, for example by only using standard python packages, so there is little or nothing to pip install when getting started. Dependencies like the host compiler and cmake totally make sense given they are central to how LLVM is made. I don’t think the same is true for the python code.

I’m in favor of 2/3 compatibility until the death clock ends.

Thanks for the information. Then how about standalone scripts (many
one-file) like utils/update_test_checks.py that are unrelated to lit
or other important infrastructure? Can they be changed from
`#!/usr/bin/python2.7` to `#!/usr/bin/python3` shebang?

No. Every machine running macOS has /usr/bin/python and /usr/bin/python2.7 and because of system integrity protection it isn't even possible to install a python3 binary into /usr/bin (You can install it to /usr/local/bin though).

-- adrian

The suggested way to do this on OSX is using env:

#!/usr/bin/env python3

+1 to what Chris and Justin said.

I see no strong benefit to moving to python3 and substantial costs.

Philip

Since we seem to be voting, I'll -1 it. It's pretty ridiculous to have
a system without Python 3 in 2018 and anyone supplying such a
monstrosity should be encouraged to stop it.

Tim.

You might want to tell that to the Prominent North American Enterprise Linux Vendor that everybody is using... :)

-Dimitry

Both Red Hat Software Collections and EPEL have Python 3 (epel has
python 3.6.3 even) so there is support for all north american hat
themed distros.

As a first step, though, just running the scripts through futurize and
setting up a build to make sure they support both python 2 and 3 should
make the future transition much, much easier.
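As a rough sketch of what such a build check could look like, the function below tries to compile each script with whichever interpreter runs it, so a bot could invoke it once under python2.7 and once under python3. The name `syntax_errors` is hypothetical, and this only catches syntax-level problems such as print statements; it does not detect str/bytes bugs, which only show up at run time.

```python
from __future__ import print_function


def syntax_errors(paths):
    """Return (path, message) pairs for files that do not compile
    under the interpreter running this function."""
    failures = []
    for path in paths:
        # Read as bytes so the file's own encoding declaration is honored.
        with open(path, "rb") as f:
            source = f.read()
        try:
            compile(source, path, "exec")
        except SyntaxError as e:
            failures.append((path, str(e)))
    return failures
```

A wrapper could call `syntax_errors(sys.argv[1:])`, print each failure, and exit non-zero if the list is not empty.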

Sadly, neither the latest version of RedHat (released in 2014) nor the latest version of macOS (released in 2017) has any version of python3 available in the default system. On the other hand, TTBOMK, every system that does have python3 available also makes python2.7 easily available.

LLVM is not a primarily python project, so keeping up with the latest features of the language, and making them available for developers of LLVM is not terribly interesting (contrast with C++ – enabling developers of LLVM to use new C++ features does have a lot of value). There’s also no particular reason to think that developers will have already installed the latest version of python on their systems, if it takes effort to do so.

Python is used here as a convenient scripting language primarily because it is commonly/conveniently available across platforms. Requiring a python version which does not have the widespread availability (and without a compelling reason to want the higher version, other than that it’s newer) does not seem to make much sense.

So, ISTM that unfortunately, the decision to require python3 is still premature, even some 10 years after its initial release. (Whether that state of the world is ridiculous, or whose fault that is, is irrelevant here; anyone who wants to have that discussion can go have it on python-dev for the 20000th time.)

That said, I do think it could make sense to prepare llvm for the world in which “python” is python3 on some systems. So, I’d propose the following:

  1. Change all #! lines to say “#!/usr/bin/env python2.7” instead of “#!/usr/bin/env python”, if they only work with 2.7.
  2. If someone feels motivated, and if it doesn’t make the code obtuse, port scripts to work with either version – and for such scripts, change the #! line to say “#!/usr/bin/env python”.
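The two steps above could be applied mechanically along these lines. This is only a sketch: `rewrite_shebang` and its `works_on_both` flag are hypothetical names, and deciding whether a given script really works on both versions still has to be done by hand or by testing.

```python
import re

# Matches "#!/usr/bin/python", "#!/usr/bin/python2.7", "#!/usr/bin/env python",
# "#!/usr/bin/env python2", and similar first lines.
_PY_SHEBANG = re.compile(
    r"^#!\s*(/usr/bin/env\s+python|/usr/bin/python)[0-9.]*\s*$")


def rewrite_shebang(source, works_on_both):
    """Return source with a Python shebang normalized per the proposal:
    'python2.7' for 2.7-only scripts, plain 'python' for 2/3-compatible ones."""
    lines = source.split("\n")
    if lines and _PY_SHEBANG.match(lines[0]):
        interpreter = "python" if works_on_both else "python2.7"
        lines[0] = "#!/usr/bin/env " + interpreter
    return "\n".join(lines)
```

Files whose first line is not a Python shebang (for example `#!/bin/sh`) are returned unchanged.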

As mentioned in https://docs.python.org/3/using/unix.html#miscellaneous, for Python 3 the shebang line should be:

#!/usr/bin/env python3

For Python 2 the shebang line should probably be:

#!/usr/bin/env python2

but as Python 3 should never install its executable under the name “python”, you could also let it stay at:

#!/usr/bin/env python

instead.

-Dimitry

Unfortunately, macOS never shipped with this symlink AFAIK, so python2.7 should work for a larger number of macOS users.

- Matthias

Sorry, I use `#!/usr/bin/env python3` in my patch but wrote `#!/usr/bin/python3` in the email :)

Nope.

Regarding “python” potentially pointing to python3:

Arch Linux has done that for years. That unilateral decision on their part was widely-decried as a mistake at the time, and spawned the python doc you reference saying that shouldn’t be done. However, Fedora is now making noises about doing the same, in a few years, after driving a change in the upstream recommendation. While it’s certainly not finalized, I’d fully expect this to happen at some point.

https://fedoraproject.org/wiki/FinalizingFedoraSwitchtoPython3

https://wiki.archlinux.org/index.php/python

And, regarding using “python2” instead of “python2.7”:

Since python2.7 is the last-ever python2 version, and also the minimum version required by llvm, the name “python2” is a false generality. Additionally, the “python2” symlink is missing on many systems (not just macOS). So, the choice of “python2.7” is clearly the better option here.

I would like to raise the issue that neither python2 nor python2.7 exist on Windows. This is making it so that I can no longer directly execute .py files in the git for windows bash shell. Can we revert all this stuff and go back to the ‘#!/usr/bin/env python’ that worked?
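The underlying problem is that `/usr/bin/env NAME` only works when an executable with exactly that name is on PATH. A minimal way to see which of the names discussed in this thread resolve on a given machine is sketched below; `first_available` is a hypothetical helper, and it assumes it is itself run with Python 3.3 or later, since it relies on `shutil.which`.

```python
import shutil


def first_available(names=("python2.7", "python2", "python")):
    """Return the full path of the first name found on PATH, or None."""
    for name in names:
        path = shutil.which(name)
        if path:
            return path
    return None


# On a system where only "python" exists, the first two names are skipped.
print(first_available())
```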

I've started the work:

    https://reviews.llvm.org/D55121

More patches may come if there is interest in that.