RFC: ThinLTO Implementation Plan

I've included below an RFC for implementing ThinLTO in LLVM. I look
forward to feedback and questions.
Thanks!
Teresa

RFC to discuss plans for implementing ThinLTO upstream. Background can
be found in slides from EuroLLVM 2015:
   https://drive.google.com/open?id=0B036uwnWM6RWWER1ZEl5SUNENjQ&authuser=0
As described in the talk, we have a prototype implementation, and
would like to start staging patches upstream. This RFC describes a
breakdown of the major pieces. We would like to commit upstream
gradually in several stages, with all functionality off by default.
The core ThinLTO importing support and tuning will require frequent
change and iteration during testing and tuning, and for that part we
would like to commit rapidly (off by default). See the proposed staged
implementation described in the Implementation Plan section.

ThinLTO Overview

"ELF-wrapped bitcode" seems potentially controversial to me.

What about ar, nm, and various ld implementations adds this requirement? What about the LLVM implementations of these tools is lacking?

Alex

> "ELF-wrapped bitcode" seems potentially controversial to me.
>
> What about ar, nm, and various ld implementations adds this requirement?
> What about the LLVM implementations of these tools is lacking?

Sorry, I cannot parse your questions properly. Can you make them clearer?

David

>> "ELF-wrapped bitcode" seems potentially controversial to me.
>>
>> What about ar, nm, and various ld implementations adds this requirement?
>> What about the LLVM implementations of these tools is lacking?
>
> Sorry, I cannot parse your questions properly. Can you make them clearer?

Alex is asking what the issue is with ar, nm, ld -r and regular
bitcode that makes using elf-wrapped bitcode easier.

The issue is that generally you need to provide a plugin to these
tools in order for them to understand and handle bitcode files. We'd
like standard tools to work without requiring a plugin as much as
possible. And in some cases we want them to be handled differently from
the way bitcode files are handled with the plugin.

nm: Without a plugin, normal bitcode files are inscrutable. When
provided the gold plugin it can emit the symbols.

ar: Without a plugin, it will create an archive of bitcode files, but
without an index, so it can't be handled by the linker even with a
plugin on an -flto link. When ar is provided the gold plugin it does
create an index, so the linker + gold plugin handle it appropriately
on an -flto link.

ld -r: Without a plugin, fails when provided bitcode inputs. When
provided the gold plugin, it handles them but compiles them all the
way through to ELF executable instructions via a partial LTO link.
This is where we would like to differ in behavior (while also not
requiring a plugin) with ELF-wrapped bitcode: we would like the ld -r
output file to still contain ELF-wrapped bitcode, delaying the LTO
until the full link step.
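
A minimal sketch, not taken from the prototype, of why the wrapping matters to
a tool that has no plugin: such a tool can only dispatch on the leading bytes
of a file. The function names here are hypothetical. The signatures are the
standard ones for ELF, ar archives, raw LLVM bitcode, and the LLVM bitcode
wrapper header. A native-object-wrapped bitcode file starts with the ELF
signature, so a stock nm, ar, or ld takes its ordinary ELF path.

```python
import struct

# Well-known leading-byte signatures.
ELF_MAGIC = b"\x7fELF"
AR_MAGIC = b"!<arch>\n"
RAW_BITCODE_MAGIC = b"BC\xc0\xde"               # 'B' 'C' 0xC0 0xDE
WRAPPER_MAGIC = struct.pack("<I", 0x0B17C0DE)   # LLVM bitcode wrapper header


def classify(header: bytes) -> str:
    """Classify a file by its first bytes, as a plugin-less tool would."""
    if header.startswith(AR_MAGIC):
        return "ar archive"
    if header.startswith(ELF_MAGIC):
        return "ELF object"
    if header.startswith(RAW_BITCODE_MAGIC):
        return "raw LLVM bitcode"
    if header.startswith(WRAPPER_MAGIC):
        return "LLVM bitcode wrapper"
    return "unknown"


def classify_file(path: str) -> str:
    """Read just enough of the file at `path` to classify it."""
    with open(path, "rb") as f:
        return classify(f.read(8))
```

A file classified as "raw LLVM bitcode" is what a stock tool cannot interpret
without a plugin. A file classified as "ELF object" is handled natively,
whether or not one of its sections carries bitcode.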

Let me know if that helps address your concerns.

Thanks,
Teresa

So, what Alex is saying is that we have these tools as well and they understand bitcode just fine, as well as every object format - not just ELF. :)

-eric

> So, what Alex is saying is that we have these tools as well and they
> understand bitcode just fine, as well as every object format - not just ELF.
> :)

Right, there are also LLVM specific versions (llvm-ar, llvm-nm) that
handle bitcode similarly to the way the standard tool + plugin does.
But the goal we are trying to achieve is to allow the standard system
versions of the tools to handle these files without requiring a
plugin. I know the LLVM tool handles other object formats, but I'm not
sure how that helps here? We're not planning to replace those tools,
just allow the standard system versions to handle the intermediate
objects produced by ThinLTO.

Thanks,
Teresa

The design objective is to make ThinLTO mostly transparent to binutils tools to enable easy integration with any build system in the wild. A ‘pass-through’ mode with ‘ld -r’, instead of the partial LTO mode, is another reason.

David

I’m not sure this is a particularly great assumption to make. We have to support a lot of different build systems and tools and concentrating on something that just binutils uses isn’t particularly friendly here. I also can’t imagine how it’s necessary for any of the lto aspects as currently written in the proposal.

-eric

> I'm not sure this is a particularly great assumption to make.

Which part?

> We have to
> support a lot of different build systems and tools and concentrating on
> something that just binutils uses isn't particularly friendly here.

I think you may have misunderstood
His point was exactly that they want to be transparent to *all of* these tools.
You are saying "we should be friendly to everyone". He is saying the same thing.
We should be friendly to everyone. The friendly way to do this is to
not require all of these tools build plugins to handle bitcode.

Hence, elf-wrapped bitcode.

that is exactly the point.

thanks,

David

The end goal is the ability to turn on ThinLTO as easily as turning on optimizations like -O2 or -O3. We want friendliness, very much :)

David

The friendliest tactic would be to support all object-file formats, not just ELF?

–paulr

>> I’m not sure this is a particularly great assumption to make.

> Which part?

The binutils part :)

>> We have to
>> support a lot of different build systems and tools and concentrating on
>> something that just binutils uses isn’t particularly friendly here.

> I think you may have misunderstood
> His point was exactly that they want to be transparent to all of these tools.
> You are saying “we should be friendly to everyone”. He is saying the same thing.
> We should be friendly to everyone. The friendly way to do this is to
> not require all of these tools build plugins to handle bitcode.
>
> Hence, elf-wrapped bitcode.

Oh, I understood. I just don’t know that I agree. To do anything with the tools will require some knowledge of bitcode anyhow or need the plugin. I’m saying that as a baseline start we should look at how to do this using the tools we’ve got rather than wrapping things for no real gain.

I’ve talked to Teresa a bit offline and we’re going to talk more later (and discuss on the list), but there are some discussions about how to make this work either with just bitcode/llvm tools and so not requiring integration on all platforms. The latter is what I consider as particularly friendly :)

-eric

> I'm not sure this is a particularly great assumption to make.

Which part?

The binutils part :)

> We have to
> support a lot of different build systems and tools and concentrating on
> something that just binutils uses isn't particularly friendly here.
I think you may have misunderstood
His point was exactly that they want to be transparent to *all of* these
tools.
You are saying "we should be friendly to everyone". He is saying the same
thing.
We should be friendly to everyone. The friendly way to do this is to
not require all of these tools build plugins to handle bitcode.

Hence, elf-wrapped bitcode.

Oh, I understood. I just don't know that I agree. To do anything with the
tools will require some knowledge of bitcode anyhow or need the plugin. I'm
saying that as a baseline start we should look at how to do this using the
tools we've got rather than wrapping things for no real gain.

That doesn't seem strictly true - the ar situation (which I'm led to
believe is in use in our build system & others, one would assume). With the
symbol table included as proposed, ar can be used without any knowledge of
the bitcode or need for a plugin.

It'd be helpful to have the scenarios we're trying to support with these
tools & then weigh up the alternatives.

The friendliest tactic would be to support all object-file formats, not
just ELF?

In general it should be wrapped in native object format -- and ELF will be
a starting point.

David

For some bits, sure. Optimizing for ar seems a bit silly, why not ‘ld -r’? ;)

Agreed. The ar situation is interesting because one thing we discussed after you wandered off was just adding a ToC section to bitcode as it is and then having the tools handle that. Would seem to accomplish at least the goals as I’ve seen them up to this point without worrying too much.

At any rate, I think this aspect of the proposal needs a bit of discussion and some mapping out of the pros and cons here.

-eric

The friendliest tactic would be to support all object-file formats, not
just ELF?

In general it should be wrapped in native object format -- and ELF will be a
starting point.

Yes, sorry, I should have generalized this to Native Object File
Wrapper format, a la
http://llvm.org/docs/BitCodeFormat.html#native-object-file-wrapper-format.
I was prototyping with ELF, but the writer support should be similar
for other formats supported by LLVM.

Thanks,
Teresa

> I'm not sure this is a particularly great assumption to make.

Which part?

The binutils part :)

> We have to
> support a lot of different build systems and tools and concentrating
> on
> something that just binutils uses isn't particularly friendly here.
I think you may have misunderstood
His point was exactly that they want to be transparent to *all of* these
tools.
You are saying "we should be friendly to everyone". He is saying the
same thing.
We should be friendly to everyone. The friendly way to do this is to
not require all of these tools build plugins to handle bitcode.

Hence, elf-wrapped bitcode.

Oh, I understood. I just don't know that I agree. To do anything with the
tools will require some knowledge of bitcode anyhow or need the plugin. I'm
saying that as a baseline start we should look at how to do this using the
tools we've got rather than wrapping things for no real gain.

That doesn't seem strictly true - the ar situation (which I'm led to
believe is in use in our build system & others, one would assume). With the
symbol table included as proposed, ar can be used without any knowledge of
the bitcode or need for a plugin.

For some bits, sure. Optimizing for ar seems a bit silly, why not 'ld -r'?

But as mentioned, ld -r can work on native object wrapped bitcode
without a plugin as well.

;)

It'd be helpful to have the scenarios we're trying to support with these
tools & then weigh up the alternatives.

Agreed. The ar situation is interesting because one thing we discussed after
you wandered off was just adding a ToC section to bitcode as it is and then
having the tools handle that. Would seem to accomplish at least the goals as
I've seen them up to this point without worrying too much.

The ToC section is a way we can encode the function index/summary into
bitcode, but won't help integrate with existing tools. The main issue
we are trying to solve is integrating transparently with existing
binutils tools in use in our build system and probably elsewhere.

At any rate, I think this aspect of the proposal needs a bit of discussion
and some mapping out of the pros and cons here.

Sure, we can continue to discuss and I will try to lay out the pros/cons.

Teresa

> I'm not sure this is a particularly great assumption to make.

Which part?

The binutils part :)

I took it as the more general: "we want to simply work with native
toolchains", not as something specific to binutils.

> We have to
> support a lot of different build systems and tools and concentrating on
> something that just binutils uses isn't particularly friendly here.
I think you may have misunderstood
His point was exactly that they want to be transparent to *all of* these
tools.
You are saying "we should be friendly to everyone". He is saying the same
thing.
We should be friendly to everyone. The friendly way to do this is to
not require all of these tools build plugins to handle bitcode.

Hence, elf-wrapped bitcode.

Oh, I understood. I just don't know that I agree.

Fair enough. I just wanted to make sure there wasn't a misunderstanding here :)

To do anything with the
tools will require some knowledge of bitcode anyhow or need the plugin.

This is certainly true, but that's part of the point - the ability to
pass through native tools without them breaking, or worrying about
the bitcode there.

I'm
saying that as a baseline start we should look at how to do this using the
tools we've got rather than wrapping things for no real gain.

The gain is precisely: "People on different platforms do not have to
use all-llvm tools to have this build mode work".

I've talked to Teresa a bit offline and we're going to talk more later (and
discuss on the list), but there are some discussions about how to make this
work either with just bitcode/llvm tools and so not requiring integration on
all platforms. The latter is what I consider as particularly friendly :)

Sure, if you have a way to make this work that doesn't require
everyone in the world replace ar with llvm-ar and ld with llvm-ld,
sounds awesome :)

(I actually have no real dog in this fight, just trying to make sure
everyone is on the same page ;P)

I’m not sure this is a particularly great assumption to make.

Which part?

The binutils part :)

We have to
support a lot of different build systems and tools and concentrating
on
something that just binutils uses isn’t particularly friendly here.
I think you may have misunderstood
His point was exactly that they want to be transparent to all of these
tools.
You are saying “we should be friendly to everyone”. He is saying the
same thing.
We should be friendly to everyone. The friendly way to do this is to
not require all of these tools build plugins to handle bitcode.

Hence, elf-wrapped bitcode.

Oh, I understood. I just don’t know that I agree. To do anything with the
tools will require some knowledge of bitcode anyhow or need the plugin. I’m
saying that as a baseline start we should look at how to do this using the
tools we’ve got rather than wrapping things for no real gain.

That doesn’t seem strictly true - the ar situation (which I’m led to
believe is in use in our build system & others, one would assume). With the
symbol table included as proposed, ar can be used without any knowledge of
the bitcode or need for a plugin.

For some bits, sure. Optimizing for ar seems a bit silly, why not ‘ld -r’?

But as mentioned, ld -r can work on native object wrapped bitcode
without a plugin as well.

How? It’s not like any partial linking is going to go on inside the bitcode if the linker doesn’t understand bitcode.

Agreed. The ar situation is interesting because one thing we discussed after
you wandered off was just adding a ToC section to bitcode as it is and then
having the tools handle that. Would seem to accomplish at least the goals as
I’ve seen them up to this point without worrying too much.

The ToC section is a way we can encode the function index/summary into
bitcode, but won’t help integrate with existing tools. The main issue
we are trying to solve is integrating transparently with existing
binutils tools in use in our build system and probably elsewhere.

Right. I’m not entirely sure what use we’re going to see in the existing tools that we want to encompass here. There’s some of it for convenience (i.e. nm etc for developers), but they can use a tool that understands bitcode and we can make the existing llvm tools suffice for these needs.

I think the way of looking at this is that we can:

a) go with wrapping things in native object formats, this means

  • some tools continue to work at the cost of additional I/O and space at compile/link time
  • we still have to update some tools to work at all

b) we extend those tools/our own tools and have them be drop in replacements to the existing tools. They’ll understand the bitcode format natively, they’ll be smaller, and we’ll be able to push the state of the art in tooling/analysis a bit more in the future without having to rework thin lto.

It’s basically a set of trade-offs and for llvm we’ve historically gone the b direction.

At any rate, I think this aspect of the proposal needs a bit of discussion
and some mapping out of the pros and cons here.

Sure, we can continue to discuss and I will try to lay out the pros/cons.

Excellent.

-eric