By organizing it as a library, I'm expecting something coarse. I don't
expect to reorganize the linker itself as a collection of small libraries,
but make the entire linker available as a library, so that you can link
stuff in-process. More specifically, I expect that the library would
basically export one function, link(std::vector<StringRef>), which takes
command line arguments, and returns a memory buffer for a newly created
executable. We may want to allow a mix of StringRef and MemoryBuffer as
input, so that you can directly pass in-memory objects to the linker, but
the basic idea remains the same.

Are we on the same page?
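To make the shape of that single entry point concrete, here is a minimal sketch, not actual LLD code. Everything in it is an assumption made for illustration: the `sketch` namespace, `Buffer` (standing in for a MemoryBuffer), `InMemoryObject`, `LinkInput`, and the use of `std::string_view` in place of StringRef so that it compiles without LLVM headers. The "linking" is a placeholder that only concatenates bytes.

```cpp
#include <optional>
#include <string>
#include <string_view>
#include <variant>
#include <vector>

namespace sketch {

// Hypothetical stand-in for a MemoryBuffer: just owned bytes.
using Buffer = std::vector<char>;

// Hypothetical in-memory input, passed directly instead of a file path.
struct InMemoryObject {
  std::string name; // identifier used in diagnostics
  Buffer contents;  // raw object file bytes
};

// An input is either a command-line argument or an in-memory object.
using LinkInput = std::variant<std::string_view, InMemoryObject>;

// The one exported function: returns the output image, or nullopt on failure.
inline std::optional<Buffer> link(const std::vector<LinkInput> &inputs) {
  Buffer image;
  bool sawObject = false;
  for (const LinkInput &in : inputs) {
    if (const auto *obj = std::get_if<InMemoryObject>(&in)) {
      // Placeholder for real linking: concatenate the object bytes.
      image.insert(image.end(), obj->contents.begin(), obj->contents.end());
      sawObject = true;
    }
    // Command-line arguments would be parsed here; the sketch ignores them.
  }
  if (!sawObject)
    return std::nullopt; // nothing to link
  return image;
}

} // namespace sketch
```

A caller would build a vector mixing arguments such as `"-o"` with `InMemoryObject` values and check whether the returned optional holds an image.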
Let me answer this below, where I think you get to the core of the problem.
In the process of migrating from old lld ELF linker to new
(previously ELF2) I noticed the interface lost several important features
(ordered by importance for my use case):

1. Detecting errors in the first place. New linker seems to call
   exit(1) for any error.

2. Reporting messages to non-stderr outputs. Previously all link
   functions had a raw_ostream argument so it was possible to delay the error
   output, aggregate it for multiple linked files, output via a different
   format, etc.

3. Linking multiple outputs in parallel (useful for test drivers) in
   a single process. Not really an interface issue but there are at least two
   global pointers (Config & Driver) that refer to stack variables and are
   used in various places in the code.

All of this seems to indicate a departure from the linker being
useable as a library. To maintain the previous behavior you'd have to use a
linker binary & popen.

Is this a conscious design decision or a temporary limitation?
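As a rough illustration of the three points above, the sketch below shows one way an entry point could report failure through its return value, write diagnostics to a caller-supplied stream, and keep its state in a per-call context instead of global pointers. It is not the old or new LLD interface: `LinkContext`, `LinkResult`, `linkCaptured`, and the `sketch` namespace are hypothetical names, and `std::ostream` stands in for raw_ostream so the example builds without LLVM.

```cpp
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

namespace sketch {

// Hypothetical per-link state, owned by one link() call instead of being
// reached through a process-wide global pointer.
struct LinkContext {
  std::ostream &diag; // where diagnostics go (stand-in for raw_ostream)
  unsigned errorCount = 0;
};

// Records an error and returns to the caller; it never calls exit().
inline void error(LinkContext &ctx, const std::string &msg) {
  ctx.diag << "error: " << msg << '\n';
  ++ctx.errorCount;
}

// Returns true on success. Every failure is reported through `diag`.
inline bool link(const std::vector<std::string> &args, std::ostream &diag) {
  LinkContext ctx{diag};
  if (args.empty())
    error(ctx, "no input files");
  for (const std::string &arg : args)
    if (arg.empty())
      error(ctx, "empty argument");
  return ctx.errorCount == 0;
}

// Convenience wrapper: hands the diagnostics back as a string so the caller
// can delay, aggregate, or reformat them.
struct LinkResult {
  bool ok;
  std::string diagnostics;
};

inline LinkResult linkCaptured(const std::vector<std::string> &args) {
  std::ostringstream diag;
  bool ok = link(args, diag);
  return {ok, diag.str()};
}

} // namespace sketch
```

Because each call owns its own `LinkContext` and stream, a test driver could in principle run several such links on separate threads without them sharing mutable state.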
That the new ELF and COFF linkers are designed as commands instead of
libraries is very much an intended design change.

I disagree.
During the discussion, there was a *specific* discussion of both the
new COFF port and ELF port continuing to be libraries with a common command
line driver.

There was a discussion that we would keep the same entry point for the
old and the new, but I don't remember if I promised that we were going to
organize the new linker as a library.

Ok, myself and essentially everyone else thought this was clear. If it
isn't, let's clarify:

I think it is absolutely critical and important that LLD's architecture
remain one where all functionality is available as a library. This is *the*
design goal of LLVM and all of LLVM's infrastructure. This applies just as
much to LLD as it does to Clang.

You say that it isn't compelling to match Clang's design, but in fact it
is. You would need an overwhelming argument to *diverge* from Clang's
design.

The fact that it makes the design more challenging is not compelling at
all. Yes, building libraries that can be re-used and making the binary
calling it equally efficient is more challenging, but that is the express
mission of LLVM and every project within it.

The new one is designed as a command from day one. (Precisely speaking,
the original code propagates errors all the way up to the entry point, so
you can call it and expect it to always return. Rafael introduced the error()
function later and we now depend on that function not returning.)

I think this last was a mistake.
The fact that the code propagates errors all the way up is fine, and
even good. We don't necessarily need to be able to *recover* from link
errors and try some other path.

But we absolutely need the design to be a *library* that can be embedded
into other programs and tools. I can't even begin to count the use cases
for this.

So please, let's go back to where we *do not* rely on never-returning
error handling. That is an absolute mistake.

If you want to consider changing that, we should have a fresh (and
broad) discussion, but it goes pretty firmly against the design of the
entire LLVM project. I also don't really understand why it would be
beneficial.

I'm not against organizing it as a library as long as it does not make
things too complicated

I am certain that it will make things more complicated, but that is the
technical challenge that we must overcome. It will be hard, but I am
absolutely confident it is possible to have an elegant library design here.
It may not be as simple as a pure command line tool, but it will be
*dramatically* more powerful, general, and broadly applicable.

The design of LLVM is not the simplest way to build a compiler. But it
is valuable to all of those working on it precisely because of this
flexibility imparted by its library oriented design. This is absolutely not
something that we should lose from the linker.

, and I guess reorganizing the existing code as a library is relatively
easy because it's still pretty small, but I don't really want to focus on
that until it becomes usable as an alternative to GNU ld or gold. I want to
focus on the linker features themselves at this moment. Once it's complete,
it becomes more clear how to organize it.

Ok, now we're talking about something totally reasonable.
If it is easier for you all to develop this first as a command line
tool, and then make it work as a library, sure, go for it. You're doing the
work, I can hardly tell you how to go about it. ;]

It is not only easier for me to develop but is also super important for
avoiding over-designing the API of the library. Until we know what we need
to do and what can be done, it is too easy to make the mistake of designing
an API that is supposed to cover everything -- including hypothetical,
unrealistic use cases. Such an API would slow down development significantly,
and it's a pain to abandon it once we realize that it was not needed.

I'm very sympathetic to the problem of not wanting to design an API until
the concrete use cases for it appear. That makes perfect sense.

We just need to be *ready* to extend the library API (and potentially find
a more fine grained layering if one is actually called for) when a
reasonable and real use case arises for some users of LLD. Once we have
people that actually have a use case and want to introduce a certain
interface to the library that supports it, we need to work with them to
figure out how to effectively support their use case.

At the least, we clearly need the super simple interface[1] that the
command line tool would use, but an in-process linker could also probably
use.
Okay. I understand that a fairly large number of people want to use the
linker without starting a new process, even if it just provides a super simple
interface which is essentially equivalent to command line options. That can
be done by removing a few global variables and sprinkling ErrorOr<> in many
places, so that you can call the linker's main() function from your
program. That's bothersome but should not be that painful. I put it on my
todo list. It's not at the top of the list, but I recognize the need and
will do it at some point. Current top priorities are speed and achieving
feature parity with GNU -- we are trying to create a linker which everybody
wants to switch to. Library design probably comes next. (And I guess if we
succeed on the former, the need for the latter rises, since more
people would want to use our linker.)
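A minimal sketch of what "sprinkling ErrorOr<>" could look like is shown below. It is an illustration only, not LLD code: the `ErrorOr` template here is a small stand-in written for this example (LLVM's real llvm::ErrorOr similarly pairs a value with a std::error_code), and `readObject`, `linkerMain`, and the ".o" suffix check are hypothetical. The point is that each step returns its failure as a value, so the entry point always returns to its caller.

```cpp
#include <cassert>
#include <string>
#include <system_error>
#include <utility>
#include <variant>
#include <vector>

namespace sketch {

// Minimal stand-in for llvm::ErrorOr<T>: holds a T or a std::error_code.
template <typename T> class ErrorOr {
  std::variant<std::error_code, T> storage;

public:
  ErrorOr(std::error_code ec) : storage(std::in_place_index<0>, ec) {}
  ErrorOr(T value) : storage(std::in_place_index<1>, std::move(value)) {}

  // True when a value is present.
  explicit operator bool() const { return storage.index() == 1; }

  // The stored error, or a default (success) code when a value is present.
  std::error_code getError() const {
    return *this ? std::error_code() : std::get<0>(storage);
  }

  T &get() {
    assert(*this && "no value present");
    return std::get<1>(storage);
  }
};

// Hypothetical step: "reads" an input and fails for names not ending in ".o".
inline ErrorOr<std::string> readObject(const std::string &path) {
  if (path.size() < 2 || path.compare(path.size() - 2, 2, ".o") != 0)
    return std::make_error_code(std::errc::invalid_argument);
  return "contents of " + path;
}

// Callable entry point: errors travel back to the caller as values.
inline ErrorOr<std::string> linkerMain(const std::vector<std::string> &args) {
  std::string output;
  for (const std::string &arg : args) {
    ErrorOr<std::string> obj = readObject(arg);
    if (!obj)
      return obj.getError(); // propagate instead of calling exit(1)
    output += obj.get();
    output += '\n';
  }
  return output;
}

} // namespace sketch
```

A host program would call `sketch::linkerMain(args)`, test the result, and decide for itself how to report `getError()`.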
We might need minor extensions to effectively support Arseny's use case (I
think an in-process linker is a *very* reasonable thing to support, I'd
even like to teach the Clang driver to optionally work that way to be more
efficient on platforms like Windows). But I have to imagine that the
interface for an in-process static linker and the command line linker are
extremely similar if not precisely the same.

At some point, it might also make sense to support more interesting
linking scenarios such as linking a PIC "shared object" that can be mapped
into the running process for JIT users. But I think it is reasonable to
build the interface that those users need when those users are ready to
leverage LLD. That way we can work with them to make sure we don't build
the wrong interface or an overly complicated one (as you say).
I can imagine that there may be a chance to support such an API in the future,
but I honestly don't know enough to say whether it makes sense or not at this
moment. Linking against the current process image is pretty different from
regular static linking, so most of the linker is probably not useful for it.
Some parts of relocation handling might be found useful, but it is too early
to say anything about that. We should revisit this when the linker becomes
mature and an actual need arises.