LLD status update and performance chart

I am on PTO, so slow to respond.

Some items that are left:

* Debug fission
* Single file debug fission
* Range extension thunks
* All of FreeBSD links and works
* Very good performance when all that is in

Looks like we have an initial version of debug fission implemented.
Commits r289790 and r289810 from yesterday did the rest of the main job, I believe.
I do not know what "Single file debug fission" is (quick googling gives nothing, and I don't think I had heard of it before).

George.

I talked to several people and found that this is more a communication issue than a technical/philosophical one. I believe communication problems won't solve themselves. As a person who is on the owners file of LLD, I think I need to say something about the issue. Also, I guess people who were just watching this thread wondered why my happy pre-holiday status report suddenly turned into a heated discussion, and they are probably still wondering what's wrong with LLD. I want to address that, too.

So, as a project, there is no anti-library policy in LLD. I think this is the misunderstanding one side had. We already provide a main-as-a-library feature so that you can embed the linker in your program. We as a project welcome other ideas to export linker features at a well-defined boundary. For example, I think abstracting file system access so that you can hook file operations could be a well-defined, useful API for those who want to do in-memory linking (I expressed that opinion earlier in this thread). Just like LLVM, we won't guarantee API compatibility between releases, and we are unlikely to be able to expose the deep internals of the linker, but as long as you think you have found a reasonable, coarse API boundary, there should be nothing preventing you from bringing that to the table.
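
The file-system hook idea above could be sketched as a small C++ interface. This is a minimal illustration under assumed names (`FileSystem`, `MemoryFS`, `readFile`), not LLD's actual API:

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical sketch: an abstract file-access interface a linker could
// accept, so a host program can serve object files from memory instead
// of the real file system. The names here are illustrative, not LLD's.
struct FileSystem {
  virtual ~FileSystem() = default;
  // Returns true and fills `out` with the file contents on success.
  virtual bool readFile(const std::string &path, std::vector<char> &out) = 0;
};

// In-memory implementation: the "files" live in a map, so linking a
// just-produced object never touches the disk.
struct MemoryFS : FileSystem {
  std::map<std::string, std::vector<char>> files;
  bool readFile(const std::string &path, std::vector<char> &out) override {
    auto it = files.find(path);
    if (it == files.end())
      return false;
    out = it->second;
    return true;
  }
};
```

A linker entry point taking a `FileSystem&` could then be driven entirely from memory; the default implementation would simply forward to the real file system.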

On the other hand, as far as I have talked to people, no one on the "library" side requested that LLD expose deep internals. This is the misunderstanding the other side had. If we as a project said that LLD should not support any library interface at all, they would be upset and speak out loudly, but again, that's not a project policy.

So, correct me if I'm wrong, but I don't see any serious conflicts here. The conflict I saw in the thread is, I believe, superficial, and I strongly believe that it could have been addressed calmly and nicely if we had used more words to explain our thoughts instead of a small number of strong words.

Hope this helps.

Rui

Hi Rui,

Thank you for your comments. I agree with your view that the issue
was much more about communication than technical substance. It's how
things are said, rather than what is said, and we should put that aside.

On the technical side, I agree that the project needs a solid base
before fancy new features get incorporated, but I also believe that
parallel development, even on trunk, can happen as it does in LLVM
(see GlobalISel, pass managers, register allocators, back-ends).

I'm very happy that LLD works well on AArch64, bootstrapping Clang and
passing the test-suite. We aim to reach the same objective next year
on ARM, and go beyond. I'm also happy that the FreeBSD community is
looking at it with serious eyes and trying to make it the default
linker on x86_64.

In the interest of collaboration, I think we should set some goals for
the project in general, getting feedback from the community that works
in it and the stakeholders, at least until we reach production quality
in one architecture. From my point of view, having LLD as the default
linker for FreeBSD/ELF/x86_64 is a strong indication that it "works
well in a large range of situations". I'd happily call that stage
"Production Beta". But that's not all. We need other architectures
(AArch64 and ARM will probably come next), as well as different object
formats (COFF, MachO) to be able to call it "Production Stable".

However, my humble opinion is that we don't need to be at "Production
Stable" to start adding new features, especially when that means a
larger portion of the LLVM ecosystem will begin to contribute more to
the project. Those new features can stay disabled by default and be
isolated from the main code, like we do in LLVM. Yes, that will mean
more work for all of us. Yes, that will mean longer test cycles and
more test configurations. But it will also mean more people working on
it, validating it in ways you didn't even know were possible. That
value is worth the extra trouble, IMVHO, and LLVM's success is living
proof of that.

LLD may be a separate project, and a young one full of energy, but it
is also an "LLVM Project". As such, it's bound to the same set of
design goals that all LLVM projects share. Not all projects share all
values, but two that we share amongst all of them are collaboration and
modularisation. Those values reflect our multiple ranges of users and
developers, as well as the need to value code re-use above raw
performance.

As was said on this list, Clang has not been faster than GCC for a
long time, even though speed was once one of its shiny distinctions.
The faster the code we produce, the slower the compiler runs. It's an
obvious relationship, and one that will (hopefully) happen to the
linker if it has any expectation of being used by the whole community,
not just the limited number of developers today.

cheers,
--renato

Thank you for writing this!

I totally agree with this characterization of the discussion as a huge misunderstanding – I was planning to write an email saying effectively the same thing, except posed more as a question. :)

Hi Rui

I agree that separating the components out into libraries only makes sense when there is a clear reason to do so. However, just this year there was a very involved discussion about what it means to be a library. Specifically, I don't think your current 'main-as-library' argument is valid while you call exit or rely on mutable global state. Having a single entry point via a main function is fine, but that function cannot then kill the process it's linked into.

If you want context then the relevant piece of the thread is http://lists.llvm.org/pipermail/llvm-dev/2016-January/093760.html.

Arseny summarized things very well there, so I'll just quote him at the end here. I understand that you and others want to first write a fast linker tool, and I don't think anyone has a problem with that, but there is also a clear desire from folks to have it be usable as a library, and I would hope any patches to that end are accepted, even if they make the code more complex or slower.

> On Thu, Jan 7, 2016 at 7:03 AM, Arseny Kapoulkine via llvm-dev
> <llvm-dev at lists.llvm.org> wrote:
>
>> In the process of migrating from the old lld ELF linker to the new
>> (previously ELF2) one, I noticed the interface lost several important
>> features (ordered by importance for my use case):
>>
>> 1. Detecting errors in the first place. The new linker seems to call
>> exit(1) for any error.
>>
>> 2. Reporting messages to non-stderr outputs. Previously all link
>> functions had a raw_ostream argument, so it was possible to delay the
>> error output, aggregate it for multiple linked files, output via a
>> different format, etc.
>>
>> 3. Linking multiple outputs in parallel (useful for test drivers) in
>> a single process. Not really an interface issue, but there are at
>> least two global pointers (Config & Driver) that refer to stack
>> variables and are used in various places in the code.
>>
>> All of this seems to indicate a departure from the linker being
>> usable as a library. To maintain the previous behavior you'd have to
>> use a linker binary & popen.

Pete

> So, correct me if I'm wrong, but I don't see no serious conflicts
> here. The conflict I saw in the thread is I believe superficial, and I
> strongly believe that it could have been addressed calmly and nicely
> if we have used more words to explain thoughts instead of small number
> of strong words.

Hi Rui,

Thank you for your comments. I agree with your views that the issue
was much more about communication than technical. It's how things are
said, rather that what is said, and we should put that aside.

On the technical side, I agree that the project needs a solid base
before fancy new features get incorporated, but I also believe that
parallel development, even on trunk, can happen as it does in LLVM
(see GlobalISel, pass managers, register allocators, back-ends).

I'm very happy that LLD works well on AArch64, bootstrapping Clang and
passing the test-suite. We aim to reach the same objective next year
on ARM, and go beyond. I'm also happy that the FreeBSD community is
looking at it with serious eyes and trying to make it the default
linker on x86_64.

In the interest of collaboration, I think we should set some goals for
the project in general, getting feedback from the community that works
in it and the stakeholders, at least until we reach production quality
in one architecture. From my point of view, having LLD as the default
linker for FreeBSD/ELF/x86_64 is a strong indication that it "works
well in a large range of situations". I'd happily call that stage
"Production Beta". But that's not all. We need other architectures
(AArch64 and ARM will probably come next), as well as different object
formats (COFF, MachO) to be able to call it "Production Stable".

However, my humble opinion is that we don't need to be in "Production
Stable" to start adding new features, especially when that means a
larger portion of the LLVM ecosystem will begin to contribute more to
the project. Those new features can stay disabled by default and be
isolated from the main code, like we do in LLVM. Yes, that will mean
more work for all of us. Yes, that will mean longer test cycles, more
test configurations. But that will also mean more people working on
it, validating in ways you didn't even know it was possible. That
value is worth the extra trouble, IMVHO, and LLVM's success is living
proof of that.

Currently, as you know, we are working on an as-needed basis. There are
people working on AArch64, MIPS, FreeBSD, performance optimization,
code maintainability, better error reporting, etc. We have opened many
fronts already because people wish to work on these fronts, so there
should be nothing preventing us from adding one or two more.

It may be good to compile a wishlist for the linker to collect feature
ideas people wish to use. I honestly know of only one major request:
embedding the linker in a program. I guess that this single feature
satisfies a majority of needs, as per the 80:20 rule, but I really want
to know what people wish to have.

LLD may be a separate project, and a young one full of energy, but it

is also an "LLVM Project". As such, it's bound to the same level of
design goals that all LLVM projects share. Not all share all values,
but two that we share amongst all projects is collaboration and
modularisation. Those values reflect our multiple ranges of users and
developers as well as the need to re-use code above raw performance.

I agree, with the fine print that I wish people working on LLVM and
clang were aware that best practices in compilers may not be directly
transferable to linkers. Linkers and compilers are different kinds of
programs. They are no more different than, say, clang and llvm-objdump
are, but the two are still different for sane reasons. In terms of
size, LLD will never become as large as LLVM or Clang and will stay one
or two orders of magnitude smaller than them. Naturally, a lot of
things can be different. Therefore, "because we do that in LLVM or
clang" is not that convincing unless it is also shown to make sense in
the linker's context. (Renato, I know you are not saying that.)

We already owe a lot to the main LLVM project. The reason LLD is small
and easy to maintain is partly because it uses libObject, libSupport,
ADT and libLTO (and in that sense it's already modularized into small
pieces). The first thing that comes to mind with regard to pushing LLD
forward as an LLVM project is to move more code into LLVM libraries so
that we can make LLD smaller. What do you think?

As was said in this list, Clang is not faster than GCC for a long

Hi Rui

I agree separating the components out in to libraries only makes sense
when there is a clear reason to do so. However, just this year there was a
very involved discussion about what it means to be a library.
Specifically, I don't think your current 'main-as-library' argument is
valid while you call exit or (if you) rely on mutable global state. Having
a single entry point via a main function is fine, but that function cannot
then kill the process which its linked in to.

Our main function returns as long as input object files are not
corrupted. If you are doing in-memory linking, I think it is unlikely
that the object files in memory are corrupted (especially when you have
just created them using LLVM), so I think this satisfies most users'
needs in practice. Do you have a concern about that?

For the situation where you need to handle foreign object files in the
same process (I'd recommend sandboxing the process in that case,
though), we can write a verifier that checks file correctness
rigorously, so that we can guarantee that such object files are as
trustworthy as freshly created object files. I think this feature is a
reasonable addition to the linker.
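
Such a verifier could start as small as checking a file's structural invariants before the rest of the linker trusts it. A toy sketch follows; real verification would cover section headers, offsets, and string-table bounds, and `looksLikeValidElf` is an illustrative name, not LLD code:

```cpp
#include <cstddef>
#include <cstdint>

// Toy verifier sketch: accept a buffer only if it is big enough to hold
// an ELF64 header and starts with the ELF magic bytes. A real verifier
// would go much further (section offsets, sizes, table bounds, ...).
bool looksLikeValidElf(const uint8_t *data, size_t size) {
  if (size < 64) // sizeof(Elf64_Ehdr) is 64 bytes
    return false;
  return data[0] == 0x7f && data[1] == 'E' &&
         data[2] == 'L' && data[3] == 'F';
}
```

The point is that once a buffer passes such a check, downstream code can treat malformed input as "impossible", the same contract LLVM's IR verifier provides.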

As to the mutable shared state, my current (unproven) idea is to make
it thread-local variables. Since no one has yet come to us to say "hey,
we are actually trying to run multiple instances of the linker in the
same process simultaneously, but LLD doesn't allow that", this is not
implemented yet, but technically I think it's doable, and it is,
needless to say, a reasonable feature request.

As I have repeatedly said in this thread, speed is not the only goal
for us. Honestly, it's going to be the best selling point of LLD,
because most people do not use that many linker features but just use
it to create executables (and sometimes wait a long period of time for
that). I reported on the performance in this thread because I thought
people would be happy to hear about the speed improvement we've made
this year. Also, because I was happy about that, I probably emphasized
it too much. But that's not our single goal.

If you want context then the relevant piece of the thread is

Ultimately my concern is that there is any code path calling exit. I would say that this prevents the lld library from being used in-process. But others' opinions may differ, and I honestly don't have a use case in mind, just that I don't think library code should ever call exit.

That sounds great. Having written some parts of the MachO lld linker and seen Kevin's work on llvm-objdump, I can appreciate that it is not easy. For example, I wrote the logic to process EH FDEs, which may need to error out if invalid. You don't necessarily want to validate them all up front, as it may be too slow, so I can understand that this isn't necessarily trivial to handle in a performant way.

LLVM uses the LLVMContext for this (and begs users to look the other way with regard to cl::opts). I don't know if there's been a discussion in LLVM about whether TLVs would be better there too, but it seems like a reasonable discussion to have. Certainly I don't think anyone should say you can't use them without good reason.
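
The LLVMContext-style alternative mentioned here is to hang all mutable state off an explicit context object rather than globals or thread-locals. A minimal sketch, where `LinkContext` and `reportError` are hypothetical names:

```cpp
#include <string>

// All mutable state lives in a caller-owned object, so any number of
// independent instances can coexist, even on a single thread.
struct LinkContext {
  std::string outputFile;
  int errorCount = 0;
};

void reportError(LinkContext &ctx, const std::string &msg) {
  ++ctx.errorCount; // mutation is scoped to this context only
  (void)msg;        // a real linker would record or print the message
}
```

The trade-off against thread-locals is explicitness: every function must thread the context through its parameters, but in exchange nothing is hidden in per-thread storage.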

I meant to commend you both for sending out a summary email and for the results. Having this fast a linker on ELF/COFF is going to be a huge win for developers. And I personally really like status updates for major projects/features, as it can be hard to follow along with all the email traffic. So thank you for doing that.

My only concern with performance is that I felt like you would be against changes to the code which make it slower but add functionality. Error handling is such a use case. LLVM and clang continue to get bigger each year, and sometimes that means a little slower too. The linker may be faster next year than it is now, or it may be slower but have a feature which makes that a worthwhile tradeoff. I don't want to slow down any of the code for any reason, but it's natural that sometimes it'll happen with good reason.

Thanks,
Pete

It may be good to compile a wishlist about a linker to collect feature ideas
people wish to use. I honestly know only one major request: embedding a
linker to a program. I guess that this single feature satisfies a majority
of needs as per 80:20 rule, but I really want to know what people wish to
have.

Indeed. Getting a list of what people need and who's committing to do
those things is a good idea.

At this stage, we should refrain from adding any wild wish, or it'll
be impossible to do anything.

But given developer involvement in adding functionality as well as
keeping the promise of stability and performance, a next-steps list is
a good way to go.

I agree with a fine print that I wish people working on LLVM and clang aware
that best practices in compilers may not be directly transferable to
linkers. Linkers and compilers are different kinds of programs.

Wholeheartedly agree!

This is already true for compiler-RT, libc++, the test-suite, polly
and many other "LLVM branded" projects.

The first thing that comes up to my mind with regard to pushing LLD forward
as an LLVM project is to move more code to LLVM libraries so that we can
make LLD smaller. What do you think?

As you and Rafael have said in this thread, the reusable part of
linkers is not that big, so having LLD libraries in objdump (for
example) may not be practical, but having a separate (small) symbol
handling library that both use could be a potential way forward. I
don't know enough about LLD or objdump to have any concrete opinion,
but I have a feeling that objdump is a much bigger program than it
should be.

People usually say it's because the code is not really reusable. That
may be true, but if we can find reusability between small tools and
LLD, that'd at least reduce code a bit. Given that LLD already depends
on Clang and LLVM, there's no downside to the increased dependency (is
there?).

But I would personally follow a more pragmatic approach. It should be
ok for LLD to have its own infrastructure, as long as it's not
completely duplicating and increasing maintenance for both LLD and
LLVM developers.

cheers,
--renato

Hi Rui

I agree separating the components out in to libraries only makes sense
when there is a clear reason to do so. However, just this year there was a
very involved discussion about what it means to be a library.
Specifically, I don't think your current 'main-as-library' argument is
valid while you call exit or (if you) rely on mutable global state. Having
a single entry point via a main function is fine, but that function cannot
then kill the process which its linked in to.

Our main function returns as long as input object files are not corrupted.
If you are doing in-memory linking, I think it is unlikely that the object
files in memory are corrupted (especially when you just created them using
LLVM), so I think this satisfies most users needs in practice. Do you have
a concern about that?

Ultimately my concern is that there is *any* code path calling exit. I
would say that this prevents the lld library from being used in-process.
But others opinions may differ, and I honestly don't have a use case in
mind, just that I don't think library code should ever call exit.

There is a duality to LLD: lld-as-a-command and lld-as-a-library. This
duality is not necessarily a bad thing. Given that we have a verifier,
any path that checks for an impossible error condition and calls exit()
should be thought of as an assert() when LLD is used as a library,
since such conditions should never happen unless there is a bug in the
code (and that's what assert actually does). We already have lots of
asserts in our libraries, and I think this is essentially the same.
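
One way to express that duality in code: the same fatal-error helper exits in the standalone tool but surfaces the condition to the caller in library mode. This is a hedged sketch, not LLD's actual error machinery; `LibraryMode` and `fatal` are illustrative, and the exception is just one possible reporting channel:

```cpp
#include <stdexcept>
#include <string>

// Illustrative toggle between lld-as-a-command and lld-as-a-library.
bool LibraryMode = true;

void fatal(const std::string &msg) {
  if (LibraryMode)
    throw std::runtime_error(msg); // caller can catch and recover
  // In the standalone tool this branch would print `msg` and exit(1);
  // with a verifier up front, reaching here from valid input is a bug,
  // i.e. morally an assert().
}
```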

For the situation that you need to handle foreign object files in the same
process (I'd recommend you to sandbox a process in that case though), we
can write a verifier to check for file correctness rigorously so that we
can guarantee that object files are as trustworthy as freshly-created
object files. I think this feature is a reasonable addition to the linker.

That sounds great. Having written some parts of the MachO lld linker and
seen Kevin's work on llvm-objdump, I can appreciate that is not easy. For
example, I wrote the logic to process EH FDE's which may need to error out
if invalid. You don't necessarily want to validate them all up front as it
may be too slow, so I can understand that this isn't necessarily trivial to
handle in a performant way.

That I don't know yet. My gut feeling is that doing error checking
beforehand makes code easier to read and maintain, just like semantic
analysis doesn't have to handle syntactic errors. But I don't know the
answer, so I cannot exclude either possibility. We have to experiment
and compare.

As to the mutable shared state, my current (unproved) idea is to make them
thread local variables. Since no one yet has come up to say "hey, we are
actually trying to run multiple instances of the linker in the same process
simultaneously but LLD doesn't allow that", that's not implemented yet, but
technically I think it's doable, and that's needless to say a reasonable
feature request.

LLVM uses the LLVMContext for this (and begs users to look the other way
with regards to cl::opt's). I don't know if there's been a discussion in
LLVM about whether TLV's would be better there too, but seems like a
reasonable discussion to have. Certainly I don't think anyone should say
you can't use them without good reason.

That's another thing no one knows the answer to. As far as I can say,
global state in LLD/ELF makes things easy to maintain, and it looks
like a majority of the people working on it are in favor of it. Of
course, people who have a different taste may not like it that much; I
understand that, and I don't say it's the best way, but it's there and
it works fairly satisfactorily. The most important thing for external
users is the API, no? We can discuss what the best way to keep
linker-global state internally is, but as long as we provide a sane
API, everything else should fall into the internal-design category.

As I repeatedly said in the thread that speed is not the only goal for us.
Honestly, it's going to be the best selling point of LLD, because most
people do not use that many linker features but just use it to create
executables (and sometimes wait for a long period of time). I reported
about the performance in this thread because I thought people would be
happy to hear the speed improvement we've made this year. Also, because I
was happy about that, I probably emphasized that too much. But that's not
our single goal.

I meant to commend you for both sending out a summary email, and the
results. Having this fast a linker on ELF/COFF is going to be a huge win
for developers. And I personally really like status updates for major
projects/features as it can be hard to follow along with all the email
traffic. So thank you for doing that.

My only concern with performance is that I felt like you would be against
changes to the code which make it slower but add functionality. Error
handling is such a use case. LLVM and clang continue to get bigger each
year and sometimes that means a little slower too. The linker may be
faster next year than it is now, or it may be slower but have a feature
which makes that a worthwhile tradeoff. I don't want to slow down any of
the code for any reason, but its natural that sometimes it'll happen with
good reason.

I don't know if you will believe me if I just repeat what I have said
many times in this thread, but I did not sacrifice functionality for
speed.

If you take a look at the performance chart that I sent in this thread,
you'll notice a pattern: the linker gradually became slower and then
suddenly became faster. As we add more safety measures, error checks
and features, the linker gets slower and slower. Each change is small,
but they accumulate. Then we would sometimes run a profiler to nail
down a bottleneck, come up with a good optimization, and implement it.
That's what you see as the steep speedups in the chart. We did not
optimize by removing features. We just did better.

So, I don't know what I can do to make you believe me, but I have never
said that performance is the only goal, and you can see that in my
actual behavior. I believe I've been trying to always be helpful.
Please ask the LLD/ELF developers about that. If you find me doing the
opposite in code review or discussion, please point it out so that I
can correct it.

> It may be good to compile a wishlist about a linker to collect feature
> ideas people wish to use. I honestly know only one major request:
> embedding a linker to a program. I guess that this single feature
> satisfies a majority of needs as per 80:20 rule, but I really want to
> know what people wish to have.

Indeed. Getting a list of what people need and who's committing to do
those things is a good idea.

At this stage, we should refrain from adding any wild wish, or it'll
be impossible to do anything.

But given developer involvement in adding functionality as well as
keeping the promise of stability and performance, a next-steps list is
a good way to go.

> I agree with a fine print that I wish people working on LLVM and clang
> aware that best practices in compilers may not be directly
> transferable to linkers. Linkers and compilers are different kinds of
> programs.

Wholeheartedly agree!

This is already true for compiler-RT, libc++, the test-suite, polly
and many other "LLVM branded" projects.

> The first thing that comes up to my mind with regard to pushing LLD
> forward as an LLVM project is to move more code to LLVM libraries so
> that we can make LLD smaller. What do you think?

As you and Rafael have said in this thread, the reusable part of
linkers is not that big, so having LLD libraries in objdump (for
example) may not be practical, but having a separate (small) symbol
handling library that both use could be a potential way forward. I
don't know enough about LLD or objdump to have any concrete opinion,
but I have a feeling that objdump is a much bigger program than it
should be.

People usually say it's because the code is not really reusable. That
may be true, but if we can find reusability on small tools and LLD,
that'd at least reduce code a bit. Given that LLD already depends on
Clang and LLVM, there's no bad side of the increased dependency. (is
there?).

Nope, it doesn't depend on Clang. We use a lot of LLVM libraries that
Clang also depends on, but not Clang itself.

Ah, I stand corrected! I can't see what LLD would need to share with
Clang anyway. :)

--renato

Hi Rui

I agree separating the components out in to libraries only makes sense
when there is a clear reason to do so. However, just this year there was a
very involved discussion about what it means to be a library.
Specifically, I don't think your current 'main-as-library' argument is
valid while you call exit or (if you) rely on mutable global state. Having
a single entry point via a main function is fine, but that function cannot
then kill the process which its linked in to.

Our main function returns as long as input object files are not corrupted.
If you are doing in-memory linking, I think it is unlikely that the object
files in memory are corrupted (especially when you just created them using
LLVM), so I think this satisfies most users needs in practice. Do you have
a concern about that?

Ultimately my concern is that there is *any* code path calling exit. I
would say that this prevents the lld library from being used in-process.
But others opinions may differ, and I honestly don't have a use case in
mind, just that I don't think library code should ever call exit.

I agreed with the sentiment at first, but after thinking about it for a
while, I actually have convinced myself that it doesn't hold water under
closer inspection.

The fundamental thing is that the LLVM libraries actually do have tons
of fatal errors; they're just in the form of asserts (or we'll
dereference a null pointer, or run off the end of a data structure, or
go into an infinite loop, etc.).

If you pass a corrupted Module to LLVM through the library API, you can
certainly trip tons of "fatal errors" (in the form of failed assertions or
UB). The way that LLVM gets around this is by having a policy of "if you
pass it corrupted Module that doesn't pass the verifier, it's your fault,
you're using our API wrong". Why can't an LLD library API have that same
requirement?

If it is safe for clang to use the LLVM library API without running the
verifier as its default configuration for non-development builds, why
would it be unsafe for (say) clang to pass an object file directly to
LLD as a library without verification? Like Rui said, it's absolutely
possible to create a verifier pass for LLD; it just hasn't been written
because most object files we've seen so far seem to come from a small
number of well-tested codepaths that always (as in the `assert` meaning
of "always") create valid ELF files. In fact, we've added graceful
recovery where appropriate (e.g. r259831), which is a step above error
handling!

Also, I'd like to point out that Clang, even when it does run the LLVM
verifier (which is not the default except in development builds), runs it
with fatal error handling. Is anybody aware of a program that uses LLVM as
a library, produces IR in memory, runs the verifier, and does not simply
abort if the verifier fails in non-development builds?

-- Sean Silva

Hi Rui

I agree separating the components out in to libraries only makes sense
when there is a clear reason to do so. However, just this year there was a
very involved discussion about what it means to be a library.
Specifically, I don't think your current 'main-as-library' argument is
valid while you call exit or (if you) rely on mutable global state. Having
a single entry point via a main function is fine, but that function cannot
then kill the process which its linked in to.

Our main function returns as long as input object files are not
corrupted. If you are doing in-memory linking, I think it is unlikely that
the object files in memory are corrupted (especially when you just created
them using LLVM), so I think this satisfies most users needs in practice.
Do you have a concern about that?

Ultimately my concern is that there is *any* code path calling exit. I
would say that this prevents the lld library from being used in-process.
But others opinions may differ, and I honestly don't have a use case in
mind, just that I don't think library code should ever call exit.

I agreed with the sentiment at first, but after thinking about it for a
while, I actually have convinced myself that it doesn't hold water under
closer inspection.

The fundamental thing is that the LLVM libraries actually do have tons of
fatal errors; they're just in the form of assert's (or we'll dereference a
null pointer, or run off the end of a data structure, or go into an
infinite loop, etc.).

If you pass a corrupted Module to LLVM through the library API, you can
certainly trip tons of "fatal errors" (in the form of failed assertions or
UB). The way that LLVM gets around this is by having a policy of "if you
pass it corrupted Module that doesn't pass the verifier, it's your fault,
you're using our API wrong". Why can't an LLD library API have that same
requirement?

I agree that if an API user violates the API of a library, it is
appropriate for the library to abort with a fatal error.

However if the API is used correctly, but some error occurs, this error
should be reported back to the API consumer.

If it is safe for clang to use the LLVM library API without running the
verifier as its default configuration for non-development builds, why would
it be unsafe for (say) clang to pass an object file directly to LLD as a
library without verification? Like Rui said, it's absolutely possible to
create a verifier pass for LLD; it just hasn't been written because most
object files we've seen so far seem to come from a small number of
well-tested codepaths that always (as in the `assert` meaning of "always")
create valid ELF files. In fact, we've added graceful recovery as
appropriate (e.g. r259831), which is a step above error handling!

Also, I'd like to point out that Clang, even when it does run the LLVM
verifier (which is not the default except in development builds), runs it
with fatal error handling. Is anybody aware of a program that uses LLVM as
a library, produces IR in memory, runs the verifier, and does not simply
abort if the verifier fails in non-development builds?

I'm doing the same as clang:

#ifndef NDEBUG
    /* Debug builds only: verify the module and abort on failure,
       mirroring what clang does. */
    char *error = nullptr;
    LLVMVerifyModule(g->module, LLVMAbortProcessAction, &error);
#endif

However the LLVM API is defined such that trying to call codegen functions
on an invalid module is undefined, and aborting in this case makes sense.
This is really just a more sophisticated assert().

I'm a fan of assert(); I like having assertions on during development. It
makes sense for a library to assert if its API is violated. But errors not due
to API violations should be reported back to the caller instead of aborting.

It would be OK, but in practice this is not what happens, is it?
We run the IR verifier on every LTO input, for instance.

Well, LLD/ELF's API is also documented to not be guaranteed to return if
you pass it corrupted object files:

The current policy is that it is your responsibility to give trustworthy
object files. The function is guaranteed to return as long as you do not
pass corrupted or malicious object files. A corrupted file could cause a
fatal error or SEGV. That being said, you don't need to worry too much
about it if you create object files in the usual way and give them to the
linker. It is naturally expected to work, or otherwise it's a linker's bug.

-- Sean Silva

If it is safe for clang to use the LLVM library API without running the
verifier as its default configuration for non-development builds, why would
it be unsafe for (say) clang to pass an object file directly to LLD as a
library without verification?

It would be OK, but in practice this is not what happens, is it?

That is what happens in practice. Also, even if you opt in to the verifier
(which is enabled by an internal option), it uses fatal error handling to
report the verifier error.

-- Sean Silva

I was talking about "clang to pass an object file directly to LLD".

Ah, sorry. I misunderstood what you were saying. Currently, no. But my post
above was primarily talking about the in-memory use case; I was just using
clang passing object files through memory to LLD as a hypothetical since
that has come up before.

-- Sean Silva

<unlurking>

Is it? If you pass an invalid fd to libc, it replies with EBADF; it
doesn't crash hard. Most mature libraries have guards against invalid
or inconsistent parameter values, and return error codes to the caller.

As someone who maintains and uses an LLVM binding to Python (llvmlite),
it's one of the annoyances we have faced: if someone makes a mistake
when calling one of the exposed APIs, that API may crash the process
(while, as Python programmers, they would rather get an exception,
which at least makes it easier to debug and diagnose the issue).
Getting a crude assert-induced crash on a CI machine or a user's
machine is no fun.

Of course, a C or C++ library cannot guard against everything,
especially not against invalid pointers or corrupted memory. But large
classes of user errors may be better served by actually returning an
error code rather than failing on an assert.

</unlurking>

Regards

Antoine.