Adding support for this format to LLD will allow users to link object files generated with GCC LTO.
Description
This is not feature complete yet, but this prototype is able to link simple executables.
I believe it’s a good time to start getting early feedback from the community in order to guide me while I implement the missing features.
Split the LTO code and start adding an initial implementation for GCC;
Add the remaining implementation for linking a simple executable;
During this work, I decided not to add support for the -pass-through= option. While GCC may use it, ld.bfd completely ignores it. Example of GCC usage: -plugin-opt=-pass-through=-lgcc -plugin-opt=-pass-through=-lgcc_s -plugin-opt=-pass-through=-lc -plugin-opt=-pass-through=-lgcc -plugin-opt=-pass-through=-lgcc_s.
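To illustrate what skipping -pass-through= could look like, here is a minimal sketch. It is not the code from the prototype: the function name filterPluginOpts and the idea of collecting the forwarded options into a vector are assumptions made for illustration only.

```cpp
#include <cassert>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical helper (not taken from the prototype): collects the values of
// -plugin-opt= arguments that would be forwarded to the plugin, dropping any
// -pass-through= entries.
std::vector<std::string> filterPluginOpts(const std::vector<std::string> &args) {
  constexpr std::string_view pluginOpt = "-plugin-opt=";
  constexpr std::string_view passThrough = "-pass-through=";

  std::vector<std::string> kept;
  for (const std::string &arg : args) {
    std::string_view s = arg;
    // Arguments that are not plugin options are not this helper's concern.
    if (s.substr(0, pluginOpt.size()) != pluginOpt)
      continue;
    s.remove_prefix(pluginOpt.size());
    // Skip -pass-through=, matching the ld.bfd behavior described above.
    if (s.substr(0, passThrough.size()) == passThrough)
      continue;
    kept.emplace_back(s);
  }
  return kept;
}
```

With the GCC command line quoted above, every -plugin-opt=-pass-through=... argument would be dropped, while other plugin options would be kept with the -plugin-opt= prefix removed.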
As you mentioned, liblto_plugin is licensed under GPLv3. Apache v2.0 and GPLv3 are compatible with each other. There is also agreement on this topic in this thread from 2015.
AFAIU, the situation is very similar to using Clang to build LLD and linking to libstdc++ or libgcc.
With that said, I believe there are some downstream members of the community who may not want LLD to be linked to GPLv3 software. In that case, we could adopt the suggestion from GitHub user Artoria2e5 (thanks @teresajohnson for the link):
On the CMake configuration side, this may take the form of:
An LLVM_ENABLE_GPL option that allows for a GPL build
An LLVM_ENABLE_LLD_BFD_PLUGIN option that actually controls the feature
It doesn’t hurt to mention that these options would be disabled by default.
Can you clarify what this compatibility actually means? To my understanding, this is a one-way compatibility: GPLv3-licensed projects can use code under Apache 2.0, because the latter doesn’t place any additional restrictions beyond the ones in GPLv3. The opposite is not true, however. Which, I believe, is one of the reasons why GPLv3-licensed test code lives in llvm-test-suite repo to avoid tainting the monorepo.
As has been pointed out to us many times over the years, legal questions regarding the project should be sent to the LLVM Foundation Board to get an answer from an actual lawyer.
Can you clarify what this compatibility actually means?
This is the kind of question that is best answered by a lawyer, which I am not.
With that said, let me try to get you an answer.
The FSF explains it as:
What does it mean to say a license is “compatible with the GPL”?
It means that the other license and the GNU GPL are compatible; you can combine code released under the other license with code released under the GNU GPL in one larger program.
Their explanation is longer and also gives more details about the difference between GPLv2 and GPLv3.
Which, I believe, is one of the reasons why GPLv3-licensed test code lives in llvm-test-suite repo to avoid tainting the monorepo.
Just to make it clear: all the code contributed to the monorepo in this RFC will be licensed under Apache v2.
This Apache v2 code, when enabled, will be dynamically linked to a library that is licensed under GPLv3.
As has been pointed out to us many times over the years, legal questions regarding the project should be sent to the LLVM Foundation Board to get an answer from an actual lawyer.
AFAIU, this proposal is not doing anything new regarding licensing.
As explained before, linking to GPLv3 libraries already happens, for example, when lld is compiled by Clang and linked to libstdc++ or libgcc.
It doesn’t hurt to repeat the proposal in my last comment:
The new code would be disabled by default.
In order to enable it, one would have to enable a CMake option that makes it clear the user wants to link to GPLv3 code.
I don’t think this is right. I am no lawyer either - but I am pretty sure it’s different to load GPL code into LLVM compared to linking something where the output contains GPL.
I think you need to contact the foundation and let their lawyer look at this before we move forward with your patch.
From a practical perspective, who would be the maintainer of the gcc support in lld, address user issues, and how would it be tested (e.g. do you plan to add a public build bot)?
IMHO we can try to reuse Linux builders that already have gcc installed, e.g. llvm-clang-x86_64-gcc-ubuntu in order to run the tests that still need to be developed.
If this is not possible, I volunteer to add a new BuildBot builder (not to be confused with worker/machine).
If the new test ends up requiring a new worker/machine, then I’ll need to look for options.
mold does indeed support LTO using the LTO plugin mechanism. This is the only way to support LTO for both GCC and LLVM, and I found it to be quite useful at times, because it allows the compiler and linker to be updated independently. OTOH, lld and LLVM must be of the same version to do LTO.
FWIW, I don’t think it makes sense to include GNU binutils’ plugin-api.h just for the constants declared in that file. You should define them yourselves instead, so that users don’t have to tell the build system where the file is at build time. Here’s a list of the constants required to support the LTO plugin: mold/src/lto.h at main · rui314/mold · GitHub
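As a rough sketch of what defining these constants locally could look like, the snippet below declares a small subset of them. The namespace name is hypothetical, the subset is illustrative rather than complete, and the numeric values are written from memory of the plugin API, so they should be checked against plugin-api.h (or mold’s lto.h) before being relied on.

```cpp
#include <cassert>

// Hypothetical namespace; only a small, illustrative subset is shown.
// Values are assumed to match GNU binutils' plugin-api.h and must be
// verified against that header.
namespace gccplugin {

// Status codes returned by plugin and linker callbacks.
enum PluginStatus {
  LDPS_OK = 0,
  LDPS_NO_SYMS,
  LDPS_BAD_HANDLE,
  LDPS_ERR,
};

// Tags of the transfer vector entries the linker passes to the plugin.
enum PluginTag {
  LDPT_NULL = 0,
  LDPT_API_VERSION = 1,
  LDPT_GOLD_VERSION = 2,
  LDPT_LINKER_OUTPUT = 3,
  LDPT_OPTION = 4,
  LDPT_REGISTER_CLAIM_FILE_HOOK = 5,
  LDPT_REGISTER_ALL_SYMBOLS_READ_HOOK = 6,
  LDPT_REGISTER_CLEANUP_HOOK = 7,
  LDPT_ADD_SYMBOLS = 8,
  LDPT_GET_SYMBOLS = 9,
  LDPT_ADD_INPUT_FILE = 10,
  LDPT_MESSAGE = 11,
};

} // namespace gccplugin
```

Keeping the definitions in the tree means the build needs no path to a binutils checkout; the trade-off is that the values have to be kept in sync with the plugin API by hand.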
Although this document focuses on gold, a similar approach can also be implemented in GNU ld.[1]
At the same time gold has been deprecated:
Perhaps the most significant change is the absence of the “gold” linker, which is deprecated and about to disappear entirely. Gold appeared in 2008 with some fanfare as a faster linker, but it has suffered from a lack of maintenance in recent years.[2]
How are these bfd/gold/whopr and LTO related? What is different to LLVMgold.so? How does the eventual removal of gold affect this?
My understanding is that the reason GCC’s documents mention gold in the LTO context is because the LTO plugin was originally developed for gold. After that, BFD ld and mold gained support for GCC LTO using the same plugin mechanism, and their support for LTO is now on par with gold’s. So, IIUC, as long as a linker supports the plugin mechanism, it supports all GCC LTO features.
I’m grateful for your work on implementing the GCC LTO feature.
I tried to find similar license discussions. The most relevant one is rustc_codegen_gcc using libgccjit.
For better or worse, the FSF holds copyright on libgccjit (FWIW, I used to be OK with this, but I’ve been reconsidering my views on the FSF lately …but that’s a whole other issue).
libgccjit is a GPLv3 library; in particular, it’s essentially a thin wrapper around GCC’s implementation (but designed to work as a shared library rather than a command-line tool). Despite the name, it can also be run as an ahead-of-time compiler, which is how this project is using it.
As I understand it, any host code directly linking with libgccjit needs to comply with the GPLv3, but the target code generated by libgccjit isn’t affected by the GPLv3 (but might link against the target libgcc runtime library, which has its own license); this is analogous to the classic usage of GCC as a command-line tool. My understanding is that the FSF is OK with GCC being used to develop code under other licenses (including proprietary), and GCC’s license only affects that code in-as-much as it links to the target libgcc runtime library (which is under a different license). It might be worth having your counsel check that license.
Note that LLVMgold.so, built from llvm/tools/gold, supports all of GNU ld, gold, and ar, despite the “gold” in its name.
ar requires a symbol table (not useful nowadays; see “Archives and --start-lib | MaskRay”).
Since LLVMgold.so is a shared object, its dependency scenario differs from adding a dependency to the LLD executable.
I’ve received a reply from the LLVM Foundation board.
They confirmed there is no problem contributing regular Apache 2.0 WITH LLVM-exception-licensed code that links to GPLv3 code; the two licenses are compatible.
They also clarified that static linking, dynamic linking, or loading makes no difference when the licenses involved are Apache 2.0 and GPLv3.
I think there is an important caveat: these licenses are compatible, but when you link together GPLv3 and Apache 2.0 licensed code, the resulting binary is a derived work that carries the requirements of the GPLv3. We wouldn’t want the LLVM build system to unconditionally statically link GPLv3 libraries, or all of our downstream forks would have to abide by the terms of the GPLv3.
It seems to me (as a non-lawyer) that the plugin architecture is an important aspect of what would allow us to add support for loading GCC-derived plugins, because the plugin architecture ensures that our build system doesn’t produce GPLv3 encumbered artifacts.
Agreed. That’s why I’m planning to follow the proposal from GitHub user Artoria2e5, which disables building this code by default and only builds it if the user explicitly requests both “GPL code” and “GCC LTO support”.
I didn’t understand this part. How does a plugin ensure the build system doesn’t produce GPLv3 encumbered artifacts?