lld: ELF/COFF main() interface

By organizing it as a library, I'm expecting something coarse. I don't
expect to reorganize the linker itself as a collection of small libraries,
but make the entire linker available as a library, so that you can link
stuff in-process. More specifically, I expect that the library would
basically export one function, link(std::vector<StringRef>), which takes
command line arguments, and returns a memory buffer for a newly created
executable. We may want to allow a mix of StringRef and MemoryBuffer as
input, so that you can directly pass in-memory objects to the linker, but
the basic idea remains the same.
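
(For concreteness, a minimal sketch of what such a coarse entry point could
look like; the names and exact signature below are illustrative assumptions,
not an actual lld API:)

  // Illustrative sketch only -- not lld's actual interface. One coarse
  // entry point: command-line-style arguments in, linked image out as an
  // in-memory buffer (nullptr on failure).
  #include "llvm/ADT/StringRef.h"
  #include "llvm/Support/MemoryBuffer.h"
  #include <memory>
  #include <vector>

  namespace lld {
  std::unique_ptr<llvm::MemoryBuffer>
  link(const std::vector<llvm::StringRef> &Args);
  } // namespace lld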

Are we on the same page?

Let me answer this below, where I think you get to the core of the problem.

In the process of migrating from the old lld ELF linker to the new one
(previously ELF2), I noticed the interface lost several important features
(ordered by importance for my use case):

1. Detecting errors in the first place. The new linker seems to call
exit(1) for any error.

2. Reporting messages to non-stderr outputs. Previously all link
functions had a raw_ostream argument so it was possible to delay the error
output, aggregate it for multiple linked files, output via a different
format, etc.

3. Linking multiple outputs in parallel (useful for test drivers) in
a single process. Not really an interface issue but there are at least two
global pointers (Config & Driver) that refer to stack variables and are
used in various places in the code.

All of this seems to indicate a departure from the linker being
usable as a library. To maintain the previous behavior you'd have to use a
linker binary & popen.

Is this a conscious design decision or a temporary limitation?

That the new ELF and COFF linkers are designed as commands instead of
libraries is very much an intended design change.

I disagree.

During the discussion, there was a *specific* discussion of both the
new COFF port and ELF port continuing to be libraries with a common command
line driver.

There was a discussion that we would keep the same entry point for the
old and the new, but I don't remember if I promised that we were going to
organize the new linker as a library.

Ok, myself and essentially everyone else thought this was clear. If it
isn't, let's clarify:

I think it is absolutely critical and important that LLD's architecture
remain one where all functionality is available as a library. This is *the*
design goal of LLVM and all of LLVM's infrastructure. This applies just as
much to LLD as it does to Clang.

You say that it isn't compelling to match Clang's design, but in fact it
is. You would need an overwhelming argument to *diverge* from Clang's
design.

The fact that it makes the design more challenging is not compelling at
all. Yes, building libraries that can be reused and making the binary that
calls them equally efficient is more challenging, but that is the express
mission of LLVM and every project within it.

The new one is designed as a command from day one. (Precisely speaking,
the original code propagates errors all the way up to the entry point, so
you can call it and expect it to always return. Rafael introduced the
error() function later, and we now depend on that function not returning.)

I think this last was a mistake.

The fact that the code propagates errors all the way up is fine, and
even good. We don't necessarily need to be able to *recover* from link
errors and try some other path.

But we absolutely need the design to be a *library* that can be embedded
into other programs and tools. I can't even begin to count the use cases
for this.

So please, let's go back to where we *do not* rely on never-returning
error handling. That is an absolute mistake.

If you want to consider changing that, we should have a fresh (and
broad) discussion, but it goes pretty firmly against the design of the
entire LLVM project. I also don't really understand why it would be
beneficial.

I'm not against organizing it as a library as long as it does not make
things too complicated

I am certain that it will make things more complicated, but that is the
technical challenge that we must overcome. It will be hard, but I am
absolutely confident it is possible to have an elegant library design here.
It may not be as simple as a pure command line tool, but it will be
*dramatically* more powerful, general, and broadly applicable.

The design of LLVM is not the simplest way to build a compiler. But it
is valuable to all of those working on it precisely because of this
flexibility imparted by its library oriented design. This is absolutely not
something that we should lose from the linker.

, and I guess reorganizing the existing code as a library is relatively
easy because it's still pretty small, but I don't really want to focus on
that until it becomes usable as an alternative to GNU ld or gold. I want to
focus on the linker features themselves at this moment. Once it's complete,
it will become clearer how to organize it.

Ok, now we're talking about something totally reasonable.

If it is easier for you all to develop this first as a command line
tool, and then make it work as a library, sure, go for it. You're doing the
work, I can hardly tell you how to go about it. ;]

It is not only easier for me to develop but is also super important for
avoiding over-designing the API of the library. Until we know what we need
to do and what can be done, it is too easy to make the mistake of designing
an API that is supposed to cover everything -- including hypothetical,
unrealistic use cases. Such an API would slow down development significantly,
and it's painful to abandon it once we realize it was not needed.

I'm very sympathetic to the problem of not wanting to design an API until
the concrete use cases for it appear. That makes perfect sense.

We just need to be *ready* to extend the library API (and potentially find
a more fine grained layering if one is actually called for) when a
reasonable and real use case arises for some users of LLD. Once we have
people that actually have a use case and want to introduce a certain
interface to the library that supports it, we need to work with them to
figure out how to effectively support their use case.

At the least, we clearly need the super simple interface[1] that the
command line tool would use, but an in-process linker could also probably
use.

Okay. I understand that a fairly large number of people want to use the
linker without starting a new process, even if it just provides a super
simple interface that is essentially equivalent to the command line options.
That can be done by removing a few global variables and sprinkling ErrorOr<>
in many places, so that you can call the linker's main() function from your
program. That's bothersome but should not be that painful. I put it on my
todo list. It's not at the top of the list, but I recognize the need and
will do it at some point. Current top priorities are speed and achieving
feature parity with GNU -- we are trying to create a linker that everybody
wants to switch to. Library design probably comes next. (And I guess if we
succeed at the former, the need for the latter rises, since more
people would want to use our linker.)
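
(As a rough sketch of what that would look like -- hypothetical code, not
the current implementation; openInput is a made-up helper used only to show
the pattern of returning failures instead of exiting:)

  // Hypothetical sketch, not current lld code: fallible steps return
  // ErrorOr<T> and the caller decides what to do, instead of a noreturn
  // error() that ends the process with exit(1).
  #include "llvm/ADT/StringRef.h"
  #include "llvm/Support/ErrorOr.h"
  #include "llvm/Support/MemoryBuffer.h"
  #include <memory>
  #include <system_error>

  static llvm::ErrorOr<std::unique_ptr<llvm::MemoryBuffer>>
  openInput(llvm::StringRef Path) {
    auto BufOrErr = llvm::MemoryBuffer::getFile(Path);
    if (std::error_code EC = BufOrErr.getError())
      return EC; // propagate to the caller instead of exiting
    return std::move(*BufOrErr);
  }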

We might need minor extensions to effectively support Arseny's use case (I
think an in-process linker is a *very* reasonable thing to support, I'd
even like to teach the Clang driver to optionally work that way to be more
efficient on platforms like Windows). But I have to imagine that the
interface for an in-process static linker and the command line linker are
extremely similar if not precisely the same.

At some point, it might also make sense to support more interesting
linking scenarios such as linking a PIC "shared object" that can be mapped
into the running process for JIT users. But I think it is reasonable to
build the interface that those users need when those users are ready to
leverage LLD. That way we can work with them to make sure we don't build
the wrong interface or an overly complicated one (as you say).

I can imagine that there may be a chance to support such an API in the
future, but I honestly don't know enough to say whether it makes sense or
not at this moment. Linking against the current process image is pretty
different from regular static linking, so most parts of the linker are
probably not useful. Some parts of relocation handling might turn out to be
useful, but it is too early to say anything about that. We should revisit
this when the linker becomes mature and an actual need arises.

Well, we definitely have a need for it; take a look at rtdyld in tree. It’d be nice to use the same relocation handling interfaces here.

-eric

Hi Rui,

This is tangential to the library design discussion, but may be interesting:

Just to clarify this: errors can be fatal as far as the linking process
is concerned, but they should allow enough recovery to get rid of the
LLD instance and its associated resources.

Joerg

I remember talking with Rafael about some topics similar to what is in this
thread, and he pointed out something that I think is very important: all
the inputs to the linker are generated by other programs.
So the situation for LLD being used as a library is analogous to LLVM IR
libraries being used from clang. In general, clang knows that it just
generated the IR and that the IR is correct (otherwise it would be a bug in
clang), and thus it disables the verifier.
I suspect a number of the "noreturn" error handling situations will
be pretty much in line with LLVM's traditional use of report_fatal_error
(which is essentially the same thing), and so we won't need to thread
ErrorOr through quite as many places as we might initially suspect.

-- Sean Silva

Having perfectly consistent object files does not necessarily mean that the linker is always able to link them, because of various higher-layer errors such as failing to resolve symbols. But yes, if you consider the compiler and the linker as one system, we can treat a corrupted file as an internal error that should never happen, and I think most of the error() calls are for such errors.

So the situation for LLD being used as a library is analogous to LLVM IR libraries being used from clang. In general, clang knows that it just generated the IR and that the IR is correct (otherwise it would be a bug in clang), and thus it disables the verifier.

That sounds good, but we can only usually trust our inputs. There are still some awfully crusty object files out in the world, so we need to verify any objects coming in from disk, and at least as it applies to libObject that doesn’t fit the current lazy error handling scheme.

That the new ELF and COFF linkers are designed as commands instead of
libraries is very much an intended design change.

I disagree.

During the discussion, there was a *specific* discussion of both the new
COFF port and ELF port continuing to be libraries with a common command
line driver.

There was a discussion that we would keep the same entry point for the old
and the new, but I don't remember if I promised that we were going to
organize the new linker as a library. The new one is designed as a command
from day one.

My recollection is that the plan was to basically factor out of COFF/ELF2
and into LLVM whenever we saw the need to share functionality (not just
between COFF and ELF2, or with MCJIT, but in general).

I guess the question is whether eventually enough stuff will be factored
out into libraries that we will have a coherent set of reusable "linker
libraries" (in which case it might make sense to pull those out of LLVM and
into lld/{include,lib}) and the COFF/ELF2 command line programs will
naturally become "thin wrappers around the library code".

At this point it is unclear (to me, at least) whether that will happen or
not. If not, then in order to meet the use case of linker functionality in
a library, we will have to take the command line programs and do some
straightforward transformations to get a "main() in a library" type
interface (MemoryBuffer's instead of filenames, etc.) to meet those use
cases.

Of course, the "main() in a library" scenario is a bit depressing and we
should avoid it if possible, since it really does go against the whole LLVM
philosophy. The atom LLD warned us of the dangers of
overgeneralizing/overabstracting important details/specifics that linkers
actually have to deal with, so it's natural that COFF/ELF2 is a reaction to
this. We just need to make sure that we find the right middle ground and
don't get stuck in the opposite extreme that the atom LLD got stuck in.
There is a saying that "you have to go too far to know how far is far
enough"; I think right now we are exploring the "not a library / not
generalized / not abstracted" side of things and will incrementally pull
back towards the right middle ground as we gain experience and new and
interesting use cases come within reach.

-- Sean Silva

> So the situation for LLD being used as a library is analogous to LLVM
IR libraries being used from clang. In general, clang knows that it just
generated the IR and that the IR is correct (otherwise it would be a bug in
clang), and thus it disables the verifier.

That sounds good, but we can only usually trust our inputs. There are
still some awfully crusty object files out in the world, so we need to
verify any objects coming in from disk, and at least as it applies to
libObject that doesn't fit the current lazy error handling scheme.

FWIW I believe Kevin Enderby has used a similar up-front object
verification scheme before in CCTools, and we may end up implementing
something like that again if we do a custom MachO class (and relegate
MachOObjectFile to a view).

I'd imagine that verifying all input object files up front, at the start
of linking, would be a significant overhead since it reads all data whether
it will be used or not. For example, such a safeguard would read all
relocations for comdat functions which will be uniquified and discarded by
the linker. So if we take this path, I guess we don't want to have the
verifier as a part of the linker, but instead create it as an independent
feature (probably in libObject), and call the verifier only for object
files that can be broken (e.g. read from disk instead of created by LLVM
itself).

Hi Rui,

Yep. “Verifying up front” is an oversimplification. I don’t know the details of Kevin’s original scheme in CCTools, but I’m imagining that we would provide coarse-grained verification for different portions of the object file (e.g. verification of all load commands, verification of symbol tables, verification of relocations for a section), while providing accessors that assume a well-formed data structure. If/when verification gets run would be up to the tools that use the class. For the linker you could defer most of the verification until you know an object file will actually be used. If the linker is consuming trusted input from the compiler you might forgo verification altogether (that’s analogous to how we treat IR, as Sean pointed out). On the other hand, something like llvm-objdump might run all verification up front, since it’s not time-sensitive, and it’s nice to be warned loudly about malformed objects even if the portion you asked about wasn’t malformed.

As for the impact on library design: Any library interface that can’t assume good input is going to have some sort of error return. None of this affects that fundamental requirement, but it might change the granularity of error-checking within the library.
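
(A sketch of the shape this could take; the class and method names below are
made up for illustration and do not exist in libObject:)

  // Hypothetical sketch only. Coarse verification entry points that a tool
  // may call (or skip for trusted input), next to accessors that assume the
  // data is well formed and therefore do not report errors.
  #include "llvm/ADT/ArrayRef.h"
  #include <cstdint>
  #include <system_error>

  class MachOFileView {
  public:
    // Optional, coarse-grained verification passes.
    std::error_code verifyLoadCommands() const;
    std::error_code verifySymbolTable() const;
    std::error_code verifyRelocations(unsigned SectionIndex) const;

    // Accessors that assume already-verified (or trusted) input.
    llvm::ArrayRef<uint8_t> getSectionContents(unsigned SectionIndex) const;
    uint64_t getSymbolAddress(unsigned SymbolIndex) const;
  };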

Cheers,
Lang.

Sorry for being late on this thread.

I just wanted to say I am strongly on Rui's side on this one.

The current design is for lld *not* to be a library, and I think
that is important. That has saved us a tremendous amount of work writing
library-like code and designing library interfaces.
The comparison of old and new ELF code is night and day as far as
productivity and performance are concerned. Designing right now would
be premature because it is not clear what the commonalities will be
or how to refactor them.

For example, both MCJIT and lld apply relocations, but there are
tremendously different options on how to factor this

* Have MC produce position dependent code and MCJIT would be a bit
more like other jits and not need relocations.
* Move relocation processing to LLVM somewhere and have lld and MCJIT use it.
* Have MC produce shared objects directly, saving MCJIT the
complication of using relocatable objects.
* Have MCJIT use lld as a trivial library that implements "ld foo.o -o
foo.so -shared".

The situation is even less clear for the other parts we are missing in
llvm: objcopy, readelf, etc.

We have to discuss and prototype these before we can make a decision.
Committing now would be premature design and would stall the progress on the
one thing we are sure we need: a high-quality, BSD-licensed linker. Let's
get that implemented. While we do that, MCJIT will move along and we will be
in a position to productively discuss what can be shared and at what
cost (complexity and performance).

Last but not least, anything that is not needed in two different areas
should remain application code. The only point of paying the
complexity of writing a library is if it is used.

Cheers,
Rafael

I strongly disagree about some of this, but agree about other aspects. I feel like there are two issues conflated here:

  1. Having a fundamentally library-oriented structure of code and design philosophy.

  2. Having general APIs for a library of code that allows it to be reused in different ways by different clients.

For #1, let me indicate the kinds of things I’m thinking about here:

  • Cannot rely on global state
  • Cannot directly call “exit” (but can call “abort” for programmer errors like asserts)
  • Cannot leak memory

There are probably others, but this is the gist of it. Now, you could still design everything with the simplest imaginable API, that is incredibly narrow and specialized for a single user. But there are still fundamentals of the style of code that are absolutely necessary to build a library. And the only way to make sure we get this right, is to have the single user of the code use it as a library and keep all the business logic inside the library.

This pattern is fundamental to literally every part of LLVM, including Clang, LLDB, and thus far LLD. I think it is a core principle of the project as a whole. I think that unless LLD continues to follow this principle, it doesn’t really fit in the LLVM project at all.

But for #2, I actually completely agree with you. We will never guess the right general purpose API for different users to share logic until we actually have those different users. I very much like lazy design of APIs as users for those APIs arrive. It’s one of the reasons I’m so strongly in favor of the lack of API stability in LLVM – it allows us to figure these APIs out as the actual use cases emerge and we learn what they need to do.

One of the nice things about changing APIs though is that there tends to be a clear incremental path to evolve the API. But if your code doesn’t use basic memory management techniques, or if even reportable errors (as opposed to asserted programmer errors) are inherently fatal, fixing that can be incredibly hard and present a huge barrier to adoption of the library.

So, I encourage LLD to keep its interfaces highly specialized for the users it actually has – and indeed today that may be exactly one user, the command line linker.

But when a new user for the libraries arrives, it needs to adapt to support an API that they can use, provided the use case is reasonable for the LLD code to support.

And most importantly, it needs to be engineered as at least a fundamentally library oriented body of code.

Finally, I will directly state that we (Google) have a specific interest in both linking LLD libraries into the Clang executable rather than having separate binaries, and in invoking LLD to link many different executables from a single process. So there is at least one concrete user here today. Now, the API we would need for both of these is exactly the API that the existing command line linker would need. But the functionality would have to be reasonable to access via a library call.

-Chandler

I haven't heard of that until now. :) What is the point of doing that?

There are probably others, but this is the gist of it. Now, you could still
design everything with the simplest imaginable API, that is incredibly
narrow and specialized for a *single* user. But there are still fundamentals
of the style of code that are absolutely necessary to build a library. And
the only way to make sure we get this right, is to have the single user of
the code use it as a library and keep all the business logic inside the
library.

This pattern is fundamental to literally every part of LLVM, including
Clang, LLDB, and thus far LLD. I think it is a core principle of the project
as a whole. I think that unless LLD continues to follow this principle, it
doesn't really fit in the LLVM project at all.

The single user so far is the one that the people actually coding the
project care about. It seems odd to say that it doesn't fit in the LLVM
project when it has attracted a lot of contributors and hit some
important milestones.

So, I encourage LLD to keep its interfaces highly specialized for the users
it actually has -- and indeed today that may be exactly one user, the
command line linker.

We have a highly specialized API consisting of one function:
elf2::link(ArrayRef<const char *> Args). That fits 100% of the uses we
have. If there is ever another use we can evaluate the cost of
supporting it, but first we need to actually write the linker.
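
(Calling it is essentially the same as invoking the command line. A sketch
below; the function name and parameter type are the ones stated above, but
the return type, the header, and the argument strings -- including whether a
program name is expected as the first element -- are assumptions:)

  // Sketch of using the narrow entry point; details marked above as
  // assumptions are illustrative only.
  #include "llvm/ADT/ArrayRef.h"

  namespace lld { namespace elf2 {
  void link(llvm::ArrayRef<const char *> Args); // assumed return type
  } }

  int main() {
    const char *Args[] = {"lld", "foo.o", "bar.o", "-o", "foo"};
    lld::elf2::link(Args);
    return 0;
  }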

Note that this is history replaying itself on a bigger scale. We used
to have a fancy library to handle archives and llvm-ar was written on
top of it. It was the worst ar implementation by far. It had horrible
error handling, incompatible options and produced ar files with
indexes that no linker could use.

I nuked the library and wrote llvm-ar as the trivial program that it
is. To the best of my knowledge it was then the fastest ar in
existence, actually useful (linkers can use it's .a files) and far
easier to maintain.

When the effort to support windows came up, there was a need to create
archives from within lld since link.exe can run lib.exe. The
maintainable code was easy to refactor into one library function
llvm::writeArchive. If another use ever shows up, we evaluate it. If
not, we keep the very narrow interface.

Finally, I will directly state that we (Google) have a specific interest in
both linking LLD libraries into the Clang executable rather than having
separate binaries, and in invoking LLD to link many different executables
from a single process. So there is at least one concrete user here today.
Now, the API we would need for both of these is *exactly* the API that the
existing command line linker would need. But the functionality would have to
be reasonable to access via a library call.

Given that clang can fork, I assume that this new clang+lld can fork.
If so, you might actually already be able to do it: just call
elf2::link(ArrayRef<const char *> Args) in a new process. It is
guaranteed to not crash your program or leak resources (short of a
kernel bug).

Cheers,
Rafael

FWIW I totally agree with all of Chandler’s points.

There are probably others, but this is the gist of it. Now, you could still
design everything with the simplest imaginable API, that is incredibly
narrow and specialized for a single user. But there are still fundamentals
of the style of code that are absolutely necessary to build a library. And
the only way to make sure we get this right, is to have the single user of
the code use it as a library and keep all the business logic inside the
library.

This pattern is fundamental to literally every part of LLVM, including
Clang, LLDB, and thus far LLD. I think it is a core principle of the project
as a whole. I think that unless LLD continues to follow this principle, it
doesn’t really fit in the LLVM project at all.

The single user so far is the one that the people actually coding the
project care about. It seems odd to say that it doesn’t fit in the LLVM
project when it has attracted a lot of contributors and hit some
important milestones.

I don’t think that every open source effort relating to compilers belongs in the LLVM project. I think it would have to fit with the overarching goals and design of the LLVM project as a whole. This includes, among other things, being modular and reusable.

So, I encourage LLD to keep its interfaces highly specialized for the users
it actually has – and indeed today that may be exactly one user, the
command line linker.

We have a highly specialized API consisting of one function:
elf2::link(ArrayRef<const char *> Args). That fits 100% of the uses we
have. If there is ever another use we can evaluate the cost of
supporting it, but first we need to actually write the linker.

Note that I’m perfectly happy with this interface today, provided it is genuinely built as a library and can be used in that context. See below.

Note that this is history replaying itself on a bigger scale. We used
to have a fancy library to handle archives and llvm-ar was written on
top of it. It was the worst ar implementation by far. It had horrible
error handling, incompatible options and produced ar files with
indexes that no linker could use.

I nuked the library and wrote llvm-ar as the trivial program that it
is. To the best of my knowledge it was then the fastest ar in
existence, actually useful (linkers can use its .a files) and far
easier to maintain.

The fact that it was in a library, IMO, is completely orthogonal to the fact that the design of that library ended up not working.

Bad library code is indeed bad. That doesn’t mean that it is terribly hard to write good library code, as you say:

When the effort to support windows came up, there was a need to create
archives from within lld since link.exe can run lib.exe. The
maintainable code was easy to refactor into one library function
llvm::writeArchive. If another use ever shows up, we evaluate it. If
not, we keep the very narrow interface.

Yes, +1 to narrow interface, but I think it should always be in a library. That impacts much more than just the interface.

Finally, I will directly state that we (Google) have a specific interest in
both linking LLD libraries into the Clang executable rather than having
separate binaries, and in invoking LLD to link many different executables
from a single process. So there is at least one concrete user here today.
Now, the API we would need for both of these is exactly the API that the
existing command line linker would need. But the functionality would have to
be reasonable to access via a library call.

Given that clang can fork, I assume that this new clang+lld can fork.

No, it cannot in all cases. We have genuine use cases where forking isn’t realistically an option. As an example, imagine that you want to use something like ClangMR (which I presented ages ago) but actually link code massively at scale to do post-link analysis of the binaries? There are environments where we need to be able to run the linker on multiple different threads in a single address space and collect the linked object in an in-memory buffer.

Also, one of the other possible motivations of using LLD directly from Clang would be to avoid process overhead on operating systems where that is a much more significant part of the compile time cost. We could today actually take the fork out of the Clang driver because the Clang frontend is designed in this way. But we would also need LLD to work in this way.

As a person who started this thread I should probably comment on the interface.

My needs only require a library-like version of a command-line interface. Just to be specific, the interface that would work okay is the old high-level lld interface:

bool link(ArrayRef<const char*> args, raw_ostream& diagnostics)

This would require round-tripping data through files which is not ideal but is not too bad.

The ideal version for my cases would be

bool link(ArrayRef<const char*> args, raw_ostream& diagnostics, ArrayRef<unique_ptr<MemoryBuffer>> inputs)

This is slightly harder to implement since you have to have a layer of argument parsing in the lld command-line driver that separates inputs from other args, but it’s probably not too bad.

So note that the easiest interface that would satisfy my needs is similar to a command-line interface; it should not write errors to stderr and should return on errors instead of aborting. These requirements do not seem like they would severely complicate the design. They also do not seem contentious - there is no risk of overdesigning! I don’t see a reason why having this interface would be bad. The only “challenge” really is error propagation, which is mostly an implementation concern and, as mentioned previously, there could be partial solutions where you rely on validity of inputs within reasonable boundaries and/or provide separate validation functions and keep the core of the linker lean.
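
(As a sketch of how a caller might use the second form; the link()
declaration is the one proposed above, not an existing lld function, and
linkInMemory plus the argument strings are placeholders:)

  // Sketch only. Diagnostics are collected into a string so the caller can
  // aggregate or reformat them later instead of writing to stderr.
  #include "llvm/ADT/ArrayRef.h"
  #include "llvm/Support/MemoryBuffer.h"
  #include "llvm/Support/raw_ostream.h"
  #include <memory>
  #include <string>
  #include <vector>

  using llvm::ArrayRef;
  using llvm::MemoryBuffer;
  using llvm::raw_ostream;
  using std::unique_ptr;

  bool link(ArrayRef<const char *> args, raw_ostream &diagnostics,
            ArrayRef<unique_ptr<MemoryBuffer>> inputs);

  bool linkInMemory(std::vector<unique_ptr<MemoryBuffer>> &objects,
                    std::string &errors) {
    llvm::raw_string_ostream diag(errors);
    const char *args[] = {"-o", "out.elf"}; // placeholder arguments
    bool ok = link(args, diag, objects);
    diag.flush();
    return ok;
  }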

Also, not all platforms have forking working, or fast process startup times, or any other things that may be used to suggest workarounds for the current situations. I can already invoke system linker using these facilities - lld is (was) attractive exactly because it does not require this! (plus it provides more consistent performance, it’s way easier to debug/enhance/profile within one process etc. etc.) Honestly, old lld is very much like LLVM in that I can freely use it in my environment, whereas new lld reminds me of my experience integrating mcpp (a C preprocessor) a few years ago into an in-engine shader compiler with all the same issues (exit() on error, stdout, etc.) that had to be worked around by modifying the codebase.

Arseny

I'm not going to argue that we don't want to support the library use
scenario as we discussed in this thread, but I'd like to point out that it
is natural that all users want to have a linker that is usable both as a
command and as a library, because if you compare a linker/linker-library
with just a linker, the former is precisely a superset of the latter. If I
were a user, I would definitely want the former instead of the latter
because the former just provides more. In that sense, only the developers
would argue that there's a good reason to provide less (and that is the case
here, as only Rafael and I were arguing that way). The current design is not
the result of a short-sighted choice but the result of a deliberate
trade-off. I worked on the old LLD for more than a year, and I made a
decision based on that experience. I still think that it was a good design
choice not to start writing the new LLD as a library. Again, we are open to
future changes, but it is probably not the right time.

To pull from Chandler's email, the things that are being asked are pretty
straightforward:
"""
- Cannot rely on global state
- Cannot directly call "exit" (but can call "abort" for *programmer* errors
like asserts)
- Cannot leak memory
"""

These are pretty basic and I think LLD is actually quite close (for
example, I think the LLD developers currently consider it a bug if LLD
leaks memory) and the global state is already nicely compartmentalized into
some structs (which I think have some sort of actual lifetime expectations
and aren't just willy-nilly global state).

Concretely speaking, I think that all LLD needs to make everybody in this
thread happy is to take the current global state (which is fairly nicely
compartmentalized anyway) and thread it through (including a handler for
"exit"). Threading this state through to where it needs to get is never
going to get easier (and is likely to get harder, possibly spiraling out of
control as mutable global state tends to do).
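
(A sketch of what that threading could look like; LinkContext, FatalHandler,
and runLink are illustrative stand-ins, not the actual Config/Driver types:)

  // Hypothetical sketch. The point is only that the state and the "exit"
  // handler become explicit parameters rather than globals, so multiple
  // links can run in one process and the host decides how to react to a
  // fatal diagnostic.
  #include "llvm/ADT/ArrayRef.h"
  #include "llvm/ADT/StringRef.h"
  #include <functional>
  #include <string>

  struct LinkContext {
    std::string OutputPath;
    // Invoked on a fatal diagnostic instead of calling exit(1).
    std::function<void(llvm::StringRef)> FatalHandler;
  };

  bool runLink(LinkContext &Ctx, llvm::ArrayRef<const char *> Args) {
    if (Args.empty()) {
      if (Ctx.FatalHandler)
        Ctx.FatalHandler("no input files");
      return false;
    }
    // ... the rest of the driver takes Ctx explicitly instead of reading
    // global Config/Driver pointers ...
    return true;
  }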

-- Sean Silva

If I were a user, I definitely want the former instead of the latter because the former just provides more.

This is if you wanted to use the library (e.g. embed the linker into clang, do parallel linking of many executables from the same process, etc.). For some use cases there’s no difference because the only thing you’ll do with a library is link it into a command-line executable and run it.

The current design is not the result of a short-sighted choice but the result of a deliberate trade-off.

I don’t really understand where this is coming from, honestly. I understand the frustration of dealing with layers of abstractions that do not fit perfectly within the established framework. I do not understand the resistance to not using global state and propagating errors to the top level.

Arseny