A little background: The single biggest pain point for working with LLDB on Windows currently is Python. There is a long, well-documented history of problems with Python extension modules on Windows. Part of it is Microsoft’s fault, part of it is Python’s fault (PEP 384 attempts to solve this, but it appears stalled), but the end result is that it’s really terrible and there’s nothing anyone can do about it.
The implications of this for LLDB on Windows are the following:
- Anyone building LLDB on Windows needs to compile Python from source
- Even after doing so, configuring the LLDB build is difficult and requires a lot of manual steps
- You can’t use any other extension modules in your custom-built version of Python, including any of the over 50,000 from PyPI, unless you also build them from source, which isn’t even always possible.
If you want to be compatible with the binary release of Python and interoperate with the version of Python that probably 99% of Windows people have on their machine, you must compile your extension module with a very old version of the compiler. So old, in fact, that it’s almost end-of-lifed. If it weren’t for Python, it would have been end-of-lifed already; Microsoft ships a free version of this toolchain for the express purpose of compiling Python extension modules.
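To see which compiler a given Python binary was built with (and therefore which one an extension module would need to match), the standard library can report it. This is only a small illustrative sketch; the helper name `describe_interpreter_build` is made up for this example and is not part of LLDB.

```python
import platform
import sys


def describe_interpreter_build():
    """Return the version of the running Python and the compiler that built it.

    Hypothetical helper for illustration only.
    """
    return {
        # Major.minor.micro of the interpreter running this script.
        "version": "%d.%d.%d" % sys.version_info[:3],
        # Free-form string identifying the compiler used to build Python.
        "compiler": platform.python_compiler(),
    }


if __name__ == "__main__":
    info = describe_interpreter_build()
    print("Python %s built with %s" % (info["version"], info["compiler"]))
```

On an official Windows binary release the compiler string identifies the MSVC version, which is the version an extension module has to be built with to share the same CRT.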
I’ve been thinking about this for many months, and I believe I finally have a solution. Two solutions, actually, and I think we should do both (I’ll get to that later). Both will probably be painful, but in the end they will be for the better on all platforms.
Solution 1: Decouple embedded python code from LLDB
Rationale: Currently, calls to embedded Python code live in a few different places. They are primarily in source/Interpreter (e.g. ScriptInterpreterPython) and the generated SWIG code, but there are a few utility headers scattered around.
I’m proposing moving all of this code to a single shared library with nothing else and creating some heavy restrictions on what kind of code can go into this library, primarily to satisfy the requirement that it be compilable with the old version of the compiler in question. These restrictions would be:
- Can’t use fancy C++. It’s hard to say exactly what subset of C++ we could use here, but a good rough approximation is to say C++98.
- Can’t depend on any LLVM headers or even other LLDB libraries. Other LLDB projects could depend on it, but it has to stand alone to guarantee that it doesn’t pick up C++ that won’t compile under the old MSVC compiler.
I understand that the first restriction is very annoying, but it would only be imposed upon a very small amount of source code. About 2 or 3 source files.
The second restriction, and the decoupling as a whole, will probably cause some pain and breakage for out-of-tree code, but I believe it’s a positive in the long run. In particular, it opens the door to replacing the embedded interpreter with that of a different language. With the code separated in this fashion, all one has to do is write a new module for their own language and initialize it instead of ScriptInterpreterPython. I’ve already seen people asking on the list about generating bindings for other languages and replacing the interpreter, and I believe a separation of this kind is a prerequisite anyway.
Solution 2: Upgrade to Python 3.5
Rationale: Hopefully I didn’t send you running for the hills. This is going to have to happen someday anyway. Python 2.7 end of life is set for 2020. That seems like a long time, but it’ll be here before we know it, and if we aren’t prepared, it’s going to be a world of hurt.
Why does this help us on Windows? Visual Studio 2015, which releases sometime this year, is finally set to have a stable ABI for its CRT. This means that any application compiled against VS2015 or later will be forward compatible with future versions of the CRT forever. Python 3.5 is not yet released, but the current proposal is for Python 3.5 to ship its binary release compiled with VC++ 2015, effectively killing this problem.
I understand that there is a lot of out-of-tree code that is written against Python 2.7. I would encourage the people who own this code to begin thinking about migrating to Python 3 soon. In the meantime, I believe we can begin to address this in-tree in 3 main phases.
- Port the test suite to Python 3.5, using a subset of Python 3.5 that is also compatible with 2.7. This ensures no out-of-tree code is broken.
- Upgrade ScriptInterpreterPython with preprocessor flags to select whether you want to link against Python 2.7 or 3.5, and expose these options from CMake and Xcode so you can choose what you link against.
- Set up multiple buildbots with different configurations and different versions of Python to get better coverage and ensure that tests work under both language versions.
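To give a feel for what "a subset of Python 3.5 that is also compatible with 2.7" means in the first phase, here is a rough sketch. The function names are invented for illustration and do not come from the LLDB test suite; the point is only that each construct behaves the same under both interpreters.

```python
import sys

# True when running under Python 3.x; usable for the rare unavoidable branch.
PY3 = sys.version_info[0] >= 3


def to_text(data, encoding="utf-8"):
    """Decode bytes to text; return anything else unchanged.

    Hypothetical helper: on 2.7 `bytes` is an alias for `str`, so the same
    check works on both versions.
    """
    if isinstance(data, bytes):
        return data.decode(encoding)
    return data


def summarize(results):
    """Format a pass/fail summary from a dict of test name -> bool."""
    total = len(results)
    # .values() exists on both versions (unlike .itervalues()).
    passed = sum(1 for ok in results.values() if ok)
    # Multiply by a float so '/' means true division on 2.7 as well.
    percent = 100.0 * passed / total if total else 0.0
    return "%d/%d passed (%.1f%%)" % (passed, total, percent)


if __name__ == "__main__":
    # Single-argument print(...) parses identically on 2.7 and 3.x.
    print(summarize({"TestA": True, "TestB": False}))
```

Code written this way avoids `print` statements, `dict.iteritems()`, implicit integer division, and assumptions about `str` versus `bytes`, which are the usual sources of breakage when a 2.7 code base is first run under Python 3.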
I realize that the impact of both of these changes is high, but we have a very strong desire to solve this problem on Windows and so we need to push some kind of solution forward. I think both solutions actually contribute to the long term benefit of LLDB as a whole, and so I think it’s worth seriously considering both of these and trying to come up with a path forward.