LLD status update and performance chart

Mehdi Amini <mehdi.amini@apple.com> writes:

From: "Rafael Avila de Espindola via llvm-dev" <llvm-dev@lists.llvm.org>
To: "Mehdi Amini" <mehdi.amini@apple.com>
Cc: "llvm-dev" <llvm-dev@lists.llvm.org>
Sent: Tuesday, December 13, 2016 11:40:43 AM
Subject: Re: [llvm-dev] LLD status update and performance chart

Mehdi Amini <mehdi.amini@apple.com> writes:

>>
>> Sean Silva via llvm-dev <llvm-dev@lists.llvm.org> writes:
>>> This will also greatly facilitate certain measurements I'd like
>>> to do
>>> w.r.t. different strategies for avoiding memory costs for input
>>> files (esp.
>>> minor faults and dTLB costs). I've almost gotten to the point of
>>> implementing this just to do those measurements.
>>
>> If you do please keep it local. The bare minimum we have of
>> library
>> support is already disproportionately painful and prevents easier
>> sharing
>> with COFF. We should really not add more until the linker is done.
>
> This is so much in contrast with the LLVM development, I find it
> quite hard to see this as an acceptable position on llvm-dev.

Why? What is wrong with setting priorities and observing that what
library support we already have has had a disproportional cost?

Can you please elaborate on this disproportional cost? I think it is really important to be specific about these kinds of things for the benefit of all potential contributors.

Thanks again,
Hal

I think the discussion has started to turn into generalities. There is an
apparent disagreement here, but perhaps only in philosophical stance and
not in actual practice.

Perhaps no decision with regard to this philosophy needs to be made in
this mailing list; we can evaluate code contributions on a case-by-case
basis and have a more concrete talking point.

The library-hostile lld development goes against one of the core principles that, I believe, drive LLVM development: providing libraries and reusable components.

From: "Mehdi Amini via llvm-dev" <llvm-dev@lists.llvm.org>
To: "Rafael Espíndola" <rafael.espindola@gmail.com>
Cc: "llvm-dev" <llvm-dev@lists.llvm.org>
Sent: Tuesday, December 13, 2016 11:47:43 AM
Subject: Re: [llvm-dev] LLD status update and performance chart

>
> Mehdi Amini <mehdi.amini@apple.com> writes:
>
>>>
>>> Sean Silva via llvm-dev <llvm-dev@lists.llvm.org> writes:
>>>> This will also greatly facilitate certain measurements I'd like
>>>> to do
>>>> w.r.t. different strategies for avoiding memory costs for input
>>>> files (esp.
>>>> minor faults and dTLB costs). I've almost gotten to the point of
>>>> implementing this just to do those measurements.
>>>
>>> If you do please keep it local. The bare minimum we have of
>>> library
>>> support is already disproportionately painful and prevents easier
>>> sharing
>>> with COFF. We should really not add more until the linker is
>>> done.
>>
>> This is so much in contrast with the LLVM development, I find it
>> quite hard to see this as an acceptable position on llvm-dev.
>
> Why? What is wrong with setting priorities and observing that what
> library support we already have has had a disproportional cost?

The library-hostile lld development goes against one of the core
principles that, I believe, drive LLVM development: providing
libraries and reusable components.

I certainly agree with this, but I think that we should listen to any technical rationale that Rafael and others have.

We have always considered the marginal short-term cost of coding separable, reusable components worthwhile because of the longer-term benefits (including the benefit of serving yet-undefined future use cases). I'd like to understand why the cost/benefit tradeoff is claimed to be different in this case. We should also understand the relationship here with the JIT runtime linker functionality with which we probably want to unify this codebase.

-Hal

LLD is a subproject of the LLVM project, but as a product, LLD itself is
neither LLVM nor Clang, so some technical decisions that make sense for them
are not directly applicable to LLD, or are even inappropriate. As a person who
spent almost two years on the old LLD and 1.5 years on the new LLD, I can say
that Rafael's stance of focusing on making a good linker first really makes
sense. I can easily imagine that if we hadn't focused on that, we couldn't
have made this much progress over the past 1.5 years and would have stagnated
at a very basic level. Do you know that I'm a person who worked really hard on
the old (and probably "modular", whatever that means) linker? I'm speaking
from that experience. If you have a concrete idea of how to construct a
linker from smaller modules, please tell me. I still don't get what you
want. We can discuss concrete proposals, but "making it (more) modular" is
too vague and not really a proposal, so it cannot lead to a productive
discussion.

That said, I think our current "API", which lets users call our linker's
main function, hits the sweet spot. I know at least a few LLVM-based language
developers who want to eliminate external dependencies and embed a linker
in their compilers. That's a reasonable use, and I think allowing them to
pass a map from filename to MemoryBuffer objects makes sense, too. That
could be done without affecting the overall linker architecture. I'm not
opposed to that idea, and if someone wrote a patch, I'd be fine with that.
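The idea of passing a map from filename to in-memory buffers can be sketched roughly as follows. This is only an illustration under stated assumptions, not lld's actual code: `InMemoryFiles` and `readInput` are hypothetical names, and `std::string` stands in for an LLVM MemoryBuffer so that the example needs nothing beyond the standard library. The lookup checks the caller-supplied map first and only then falls back to the filesystem.

```cpp
#include <fstream>
#include <map>
#include <sstream>
#include <string>

// Hypothetical stand-in for a table of in-memory inputs. A real patch would
// presumably store MemoryBuffer objects instead of std::string.
using InMemoryFiles = std::map<std::string, std::string>;

// Hypothetical input-opening helper. Looks Path up in the caller-supplied
// overlay first; if it is not there, reads the file from disk.
// Returns true and fills Out on success, false if neither source has Path.
bool readInput(const std::string &Path, const InMemoryFiles &Overlay,
               std::string &Out) {
  auto It = Overlay.find(Path);
  if (It != Overlay.end()) {
    Out = It->second; // served from memory, no filesystem access
    return true;
  }

  std::ifstream In(Path, std::ios::binary);
  if (!In)
    return false;
  std::ostringstream Contents;
  Contents << In.rdbuf();
  Out = Contents.str();
  return true;
}
```

A host compiler could populate the map with the object files it has just generated and pass it alongside the usual command-line style arguments, so the link needs neither temporary files nor a separate process.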

Hal Finkel <hfinkel@anl.gov> writes:

Why? What is wrong with setting priorities and observing that what
library support we already have has had a disproportional cost?

Can you please elaborate on this disproportional cost? I think it is really important to be specific about these kinds of things for the benefit of all potential contributors.

Yes. Getting the early shutdown code was way more work than it would
have been if this was just a program. The code is also quite ugly.

It also prevents us from sharing error handling functions with the COFF
linker.

So, please, I *BEG YOU*, let us write a linker. Once we have a
working, production-quality linker we can set up a performance
tracking bot and evaluate each librification change for its cost in
performance and code quality. We are just not there yet and not in a
position to take those patches.

Cheers,
Rafael
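The early-shutdown cost mentioned above can be illustrated with a small sketch. It is not taken from lld; `fatalAsProgram`, `LinkContext`, `resolveSymbols` and `linkAsLibrary` are hypothetical names chosen for the example. The point is only the general difference: a standalone program may terminate the process at the first fatal error, whereas code meant to be called from a host process has to record the error and unwind through every caller.

```cpp
#include <cstdlib>
#include <iostream>
#include <string>
#include <vector>

// Program style: a fatal error can simply end the process.
[[noreturn]] void fatalAsProgram(const std::string &Msg) {
  std::cerr << "error: " << Msg << "\n";
  std::exit(1);
}

// Library style: errors are collected, and callers must check for them.
struct LinkContext {
  std::vector<std::string> Errors;
  void error(const std::string &Msg) { Errors.push_back(Msg); }
  bool hasError() const { return !Errors.empty(); }
};

// Hypothetical link step: reports each undefined symbol instead of exiting.
bool resolveSymbols(LinkContext &Ctx,
                    const std::vector<std::string> &Undefined) {
  for (const std::string &Name : Undefined)
    Ctx.error("undefined symbol: " + Name);
  return !Ctx.hasError();
}

// Hypothetical entry point: every step needs an explicit early return so
// that control goes back to the host instead of killing its process.
bool linkAsLibrary(LinkContext &Ctx,
                   const std::vector<std::string> &Undefined) {
  if (!resolveSymbols(Ctx, Undefined))
    return false; // early shutdown: skip writing the output
  // Writing the output file would go here.
  return true;
}
```

In the library style, the check-and-return pattern has to be repeated at every level of the call chain, which is the kind of extra work and clutter that the program style avoids.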

Mehdi Amini <mehdi.amini@apple.com> writes:

From: "Rafael Avila de Espindola via llvm-dev" <llvm-dev@lists.llvm.org>
To: "Mehdi Amini" <mehdi.amini@apple.com>
Cc: "llvm-dev" <llvm-dev@lists.llvm.org>
Sent: Tuesday, December 13, 2016 12:10:08 PM
Subject: Re: [llvm-dev] LLD status update and performance chart

Mehdi Amini <mehdi.amini@apple.com> writes:

>>
>> Mehdi Amini <mehdi.amini@apple.com> writes:
>>
>>>>
>>>> Sean Silva via llvm-dev <llvm-dev@lists.llvm.org> writes:
>>>>> This will also greatly facilitate certain measurements I'd like
>>>>> to do
>>>>> w.r.t. different strategies for avoiding memory costs for input
>>>>> files (esp.
>>>>> minor faults and dTLB costs). I've almost gotten to the point
>>>>> of
>>>>> implementing this just to do those measurements.
>>>>
>>>> If you do please keep it local. The bare minimum we have of
>>>> library
>>>> support is already disproportionately painful and prevents
>>>> easier sharing
>>>> with COFF. We should really not add more until the linker is
>>>> done.
>>>
>>> This is so much in contrast with the LLVM development, I find it
>>> quite hard to see this as an acceptable position on llvm-dev.
>>
>> Why? What is wrong with setting priorities and observing that what
>> library support we already have has had a disproportional cost?
>
> The library-hostile lld development goes against one of the core
> principles that, I believe, drive LLVM development: providing
> libraries and reusable components.

Because it is trying to do something fundamentally different. We are
trying to write a *program*.

But this is not a technical argument. As a project, we rarely write programs, as such. We generally create reusable components that happen to have driver executables. At least long term, I think there's consensus that this is the best path. If we're going to make a different choice in this case, we need concrete reasons. We should discuss this in the context of the reasons you've provided (error handling, etc.).

-Hal

Hal Finkel <hfinkel@anl.gov> writes:

But this is not a technical argument. As a project, we rarely write programs, as such. We generally create reusable components that happen to have driver executables. At least long term, I think there's consensus that this is the best path. If we're going to make a different choice in this case, we need concrete reasons. We should discuss this in the context of the reasons you've provided (error handling, etc.).

We are not in a position to judge and will not be until the linker is
done. From what library support we have I think that was a horrible
mistake with a disproportionate cost.

*Once* the linker is done, if someone is able to first define what a
modular linker means, write a patch for it and show that it doesn't
degrade the linker performance or maintainability that is awesome. But
we don't have a linker yet, so, *PLEASE* let us write a linker. I don't
agree with how other parts of llvm are written but I don't keep
permanently nagging people to implement it the way *I* think is
correct.

Cheers,
Rafael

From: "Rafael Avila de Espindola" <rafael.espindola@gmail.com>
To: "Hal Finkel" <hfinkel@anl.gov>
Cc: "llvm-dev" <llvm-dev@lists.llvm.org>, "Mehdi Amini" <mehdi.amini@apple.com>
Sent: Tuesday, December 13, 2016 12:09:00 PM
Subject: Re: [llvm-dev] LLD status update and performance chart

Hal Finkel <hfinkel@anl.gov> writes:
>> Why? What is wrong with setting priorities and observing that what
>> library support we already have has had a disproportional cost?
>
> Can you please elaborate on this disproportional cost? I think it
> is really important to be specific about these kinds of things for
> the benefit of all potential contributors.

Yes. Getting the early shutdown code was way more work than it would
have been if this was just a program. The code is also quite ugly.

This is a general issue across LLVM's codebase. Is there some reason this is worse in lld than anywhere else in the project?

It also prevents us from sharing error handling functions with the
COFF
linker.

Is this a technical problem or just a lack of needed refactoring?

Thanks again,
Hal

Please tell me what you think reusable components would look like.
Which parts of the linker can be reusable components? Is that really
feasible? You are saying that the linker should be written in a different way
by comparing it with an ideal linker (modular, reusable and fast! and by
the way the current LLD is much more reusable and extensible than before, in
my opinion), but you can say anything if you compare with an ideal one. You
need to show that it's doable before arguing that we should do it that way
instead of this way. We (or I) ran a large experiment with the old LLD for
years but couldn't find a way to make it possible in a reasonable manner. I'm
still trying to find one, by distilling the common parts of the ELF and COFF
linkers, but I still haven't managed it.

From: "Rafael Avila de Espindola" <rafael.espindola@gmail.com>
To: "Hal Finkel" <hfinkel@anl.gov>
Cc: "llvm-dev" <llvm-dev@lists.llvm.org>, "Mehdi Amini" <mehdi.amini@apple.com>
Sent: Tuesday, December 13, 2016 12:46:04 PM
Subject: Re: [llvm-dev] LLD status update and performance chart

Hal Finkel <hfinkel@anl.gov> writes:
> But this is not a technical argument. As a project, we rarely write
> programs, as such. We generally create reusable components that
> happen to have driver executables. At least long term, I think
> there's consensus that this is the best path. If we're going to
> make a different choice in this case, we need concrete reasons. We
> should discuss this in the context of the reasons you've provided
> (error handling, etc.).

We are not in a position to judge and will not be until the linker is
done. From what library support we have I think that was a horrible
mistake with a disproportionate cost.

*Once* the linker is done,

I think this seems like a dangerous predicate. I've very rarely worked on a software project that reached the state of "done", even in terms of core features.

if someone is able to first define what a
modular linker means, write a patch for it and show that it doesn't
degrade the linker performance or maintainability that is awesome.
But
we don't have a linker yet, so,

lld can self host and work is actively proceeding on various optimizations. That definitely seems done enough so that if someone were to, "define what a modular linker means, write a patch for it and show that it doesn't degrade the linker performance or maintainability", then it is time to discuss it.

*PLEASE* let us write a linker. I
don't
agree with how other parts of llvm are written but I don't keep
permanently nagging people to implement it the way *I* think is
correct.

If you were to propose patches to address your concerns, we would definitely consider them as well.

Thanks again,
Hal

I’m totally willing to believe you that it is not possible to write the fastest ELF linker on earth (or in the universe) with a library based and reusable components approach. But clang is not the fastest C/C++ compiler available, and LLVM is not the fastest compiler framework either!

So as a project, it seems to me that LLVM has historically not prioritized speed/efficiency when it came at the expense of layering/component/modularity/reusability/…

Writing the fastest linker possible is a nice goal; I regret that an LLVM subproject is putting this goal above layering/component/modularity/reusability/… though.

I’m not sure how it compares now, but at least when I started contributing, and for a year or two after, the speed of clang (especially at O0, for fast compile-test-debug cycles) was one of its big selling points.

David

As an LLVM-based language developer, this is exactly what I want to do. In
short:

* Avoid depending on an external binary
* Avoid a fork+exec
* Avoid unnecessary use of the filesystem

Does this feature compromise progress on the linker binary?

Would it be reasonable for this feature to exist in the next minor version
release of lld?

I've never mentioned that creating the fastest linker is the only goal.

Mehdi, please tell me how you would *actually* layer linkers with fine-grained
components. You are just saying that the current LLD is worse than an
imaginary super-beautiful linker.

Some data: http://baptiste-wicht.com/posts/2016/11/zapcc-a-faster-cpp-compiler.html

That does not mean we’re not taking compile time seriously: we are and we (at least a few people I know of) are planning on improving it, hopefully significantly. I don’t believe we’d ever go against the design principles (modularity, etc.) though.

Also, the above data is just about clang as a C/C++ compiler. LLVM is a different beast and its overhead (which is somehow inherent in the features provided) is "well known" (for example, I believe this drove https://webkit.org/blog/5852/introducing-the-b3-jit-compiler/ ; but you can find other examples).

That said, I think our current "API", which lets users call our linker's
main function, hits the sweet spot. I know at least a few LLVM-based language
developers who want to eliminate external dependencies and embed a linker
in their compilers. That's a reasonable use, and I think allowing them to
pass a map from filename to MemoryBuffer objects makes sense, too. That
could be done without affecting the overall linker architecture. I'm not
opposed to that idea, and if someone wrote a patch, I'd be fine with that.

As an LLVM-based language developer, this is exactly what I want to do. In
short:

* Avoid depending on an external binary
* Avoid a fork+exec
* Avoid unnecessary use of the filesystem

Does this feature compromise progress on the linker binary?

I don't think so. There is a function in the driver to open input files,
and you can make a change to that function.

Would it be reasonable for this feature to exist in the next minor version
release of lld?

By the next minor release, do you mean the next LLVM release that's planned
early next year? If you write a patch, I can review (and Rafael would want
to take a look too).

I believe this has clearly been put ahead of the other design aspects I mentioned, hasn’t it?

That’s not a bait I’m gonna bite.

Using superlatives and characterizing what I’m writing as an "imaginary super-beautiful linker" is just a caricature that diverts from the whole point. LLVM is quite far from an "imaginary super-beautiful compiler framework", yet it is modular and usable as a library.