Hi Rui
I agree separating the components out into libraries only makes sense
when there is a clear reason to do so. However, just this year there was a
very involved discussion about what it means to be a library.
Specifically, I don't think your current 'main-as-library' argument is
valid while you call exit or rely on mutable global state. Having a single
entry point via a main function is fine, but that function cannot then
kill the process which it's linked into.
Our main function returns as long as input object files are not corrupted.
If you are doing in-memory linking, I think it is unlikely that the object
files in memory are corrupted (especially when you just created them using
LLVM), so I think this satisfies most users' needs in practice. Do you have
a concern about that?
Ultimately my concern is that there is *any* code path calling exit. I
would say that this prevents the lld library from being used in-process.
But others' opinions may differ, and I honestly don't have a use case in
mind, just that I don't think library code should ever call exit.
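To make the concern concrete, here is a minimal sketch of an entry point that reports failure to its caller instead of calling exit. The names LinkResult and linkInProcess are hypothetical and are not LLD's actual API; the input checks are placeholders for real validation.

```cpp
#include <string>
#include <vector>

// Hypothetical result type for illustration; not part of LLD.
struct LinkResult {
  bool ok;
  std::string error; // empty on success
};

// Hypothetical library entry point. On bad input it returns an error to
// the caller instead of calling exit(), so the host process stays alive.
LinkResult linkInProcess(const std::vector<std::string> &inputs) {
  if (inputs.empty())
    return {false, "no input files"};
  for (const std::string &path : inputs)
    if (path.empty())
      return {false, "empty input path"};
  // ... real linking work would go here ...
  return {true, ""};
}
```

A command-line wrapper could still turn a failed LinkResult into a non-zero exit code, so the command and the library would share one code path.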
There is a duality of LLD: lld-as-a-command and lld-as-a-library. This
duality is not necessarily a bad thing. Given that we have a verifier, any
path that checks for an impossible error condition and calls exit() should
be thought of as an assert() when LLD is used as a library, since such a
condition should never happen unless there is a bug in the code (and that's
what assert is actually for). We already have lots of asserts in our
libraries, and I think that's essentially the same.
For the situation that you need to handle foreign object files in the same
process (I'd recommend you to sandbox a process in that case though), we
can write a verifier to check for file correctness rigorously so that we
can guarantee that object files are as trustworthy as freshly-created
object files. I think this feature is a reasonable addition to the linker.
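As a rough sketch of the verify-first idea: the toy format below (a 4-byte little-endian payload size followed by the payload) and the names verify and checksum are assumptions made up for illustration, not anything in LLD. The point is only that once verify() has accepted a buffer, later code can rely on the invariant and needs no error path of its own.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy "object file" for illustration: a 4-byte little-endian payload size
// followed by exactly that many payload bytes.
bool verify(const std::vector<uint8_t> &buf) {
  if (buf.size() < 4)
    return false;
  uint32_t n = static_cast<uint32_t>(buf[0]) |
               (static_cast<uint32_t>(buf[1]) << 8) |
               (static_cast<uint32_t>(buf[2]) << 16) |
               (static_cast<uint32_t>(buf[3]) << 24);
  return buf.size() - 4 == n;
}

// Precondition: verify(buf) returned true. Because the header was checked
// up front, this function has no error handling of its own.
uint32_t checksum(const std::vector<uint8_t> &buf) {
  uint32_t sum = 0;
  for (std::size_t i = 4; i < buf.size(); ++i)
    sum += buf[i];
  return sum;
}
```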
That sounds great. Having written some parts of the MachO lld linker and
seen Kevin's work on llvm-objdump, I can appreciate that this is not easy.
For example, I wrote the logic to process EH FDEs, which may need to error out
if invalid. You don't necessarily want to validate them all up front as it
may be too slow, so I can understand that this isn't necessarily trivial to
handle in a performant way.
That I don't know yet. My gut feeling is that doing error checking
beforehand makes code easier to read and maintain, just as semantic
analysis doesn't have to handle syntactic errors. But I don't know the
answer, so I cannot exclude either possibility. We have to experiment with
that and compare.
As to the mutable shared state, my current (unproven) idea is to turn it
into thread-local variables. Since no one has yet come forward to say "hey,
we are actually trying to run multiple instances of the linker in the same
process simultaneously but LLD doesn't allow that", that's not implemented
yet, but technically I think it's doable, and needless to say that's a
reasonable feature request.
LLVM uses the LLVMContext for this (and begs users to look the other way
with regard to cl::opt's). I don't know if there's been a discussion in
LLVM about whether TLVs would be better there too, but it seems like a
reasonable discussion to have. Certainly I don't think anyone should say
you can't use them without good reason.
That's another thing no one knows the answer to. As far as I can tell,
global state in LLD/ELF makes things easy to maintain, and it looks like
a majority of people working on it are in favor of it. Of course people who
have different tastes may not like it that much, I understand that, and I
am not saying that it's the best way, but it's there and it works fairly
satisfactorily. The most important thing for external users is the API, no?
We can discuss the best way to keep linker-global state internally, but as
long as we provide a sane API, everything else should fall into the
internal design category.
As I have repeatedly said in this thread, speed is not the only goal for us.
Honestly, it's going to be the best selling point of LLD, because most
people do not use that many linker features but just use it to create
executables (and sometimes wait for a long period of time). I reported
on the performance in this thread because I thought people would be
happy to hear the speed improvement we've made this year. Also, because I
was happy about that, I probably emphasized that too much. But that's not
our single goal.
I meant to commend you for both sending out a summary email, and the
results. Having this fast a linker on ELF/COFF is going to be a huge win
for developers. And I personally really like status updates for major
projects/features as it can be hard to follow along with all the email
traffic. So thank you for doing that.
My only concern with performance is that I felt like you would be against
changes to the code which make it slower but add functionality. Error
handling is such a use case. LLVM and clang continue to get bigger each
year and sometimes that means a little slower too. The linker may be
faster next year than it is now, or it may be slower but have a feature
which makes that a worthwhile tradeoff. I don't want to slow down any of
the code for any reason, but it's natural that sometimes it'll happen with
good reason.
I don't know if you will believe me if I repeat what I have said many times
in this thread, but I did not sacrifice functionality for speed.
If you take a look at the performance chart that I sent in this thread,
you'll notice the pattern that the linker gradually became slower and then
suddenly became faster. As we added more safety measures, error checks and
features, the linker got slower and slower. Each slowdown was small, but
they accumulated. And then we sometimes ran a profiler to nail down a
bottleneck, came up with a good optimization, and implemented it. That's what
you see as steep speedups in the chart. We do not optimize by removing
features. We just did better.
So, I don't know what I can do to make you believe me, but I have never said
that performance is the only goal, and you can see that from my actual
behavior. I believe I've always been trying to be helpful. Please ask
LLD/ELF developers about that. If you find me doing the opposite in code
review or discussion, please point it out, so that I can correct it.