[GSOC 2018] GNU Binutils replacement

Hello everyone,

My name is Sebastiaan Peters, an undergrad student from The Netherlands.

The project to create drop-in replacements for GNU Binutils piqued my interest. It would make my day (or my summer) to be able to contribute to such a project.

Since I have no experience with llvm besides some basic usage of clang, the first thing I tried was to find out what llvm tools are out there.
However, the more I researched, the more questions I had.
I would be very grateful if somebody could answer some of the following questions:

1. Does this project apply to all llvm tools, or just those that have intersecting functionality with the existing GNU Binutils?
For instance, the llvm-assembler still provides the same functionality, but will most likely not be a replacement for gnu-as considering the large number of features it lacks (not being able to accept or target most architectures).

2. LLD already exists as a drop-in replacement for the GNU linker for compilers[0]. Would it be desirable to make a CLI wrapper for it?

3. What happened to tools like llvm-objcopy and llvm-strip?
Documentation is hard to find, and they are not included in the llvm tools documentation page[1].

4. In the case of missing functionality between two tools, is the goal to patch up what is missing or to move on to the next tool?

5. Is there a list of preferred tools that should be prioritized?

Sorry if I have said anything incorrect; trying to get a clear picture of the entire project was quite overwhelming.

Kind Regards,

Sebastiaan Peters

[0] https://lld.llvm.org/
[1] LLVM Command Guide — LLVM 16.0.0git documentation

Hi Sebastiaan,

I’m not the creator of this GSOC project, but I might be able to help. I’m the author of llvm-objcopy, and the soon-to-be creator of llvm-strip, a symlink to llvm-objcopy (which currently doesn’t exist). The reason llvm-objcopy and llvm-strip don’t have documentation pages is that I haven’t written them yet; I’ll get on that!
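For anyone unfamiliar with the symlink trick: one binary can behave as two tools by looking at the name it was invoked under. The following is only a rough Python sketch of that general idea, not how llvm-objcopy is actually written; the function name `personality` and the "ends with strip" rule are assumptions made up for the illustration.

```python
import os


def personality(argv0: str) -> str:
    """Pick a tool personality from the name the program was invoked as.

    Hypothetical helper: the name and the matching rule are illustrative only.
    """
    name = os.path.basename(argv0)
    # Drop a Windows-style ".exe" suffix if present.
    if name.lower().endswith(".exe"):
        name = name[:-4]
    # A name ending in "strip" (e.g. "llvm-strip") selects strip behaviour;
    # anything else falls back to objcopy behaviour.
    return "strip" if name.endswith("strip") else "objcopy"
```

A real tool would call something like `personality(sys.argv[0])` at startup and then choose which set of command-line options to parse, so a symlink named llvm-strip pointing at the same binary gets the strip interface.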

  1. The goal is to work towards extending the existing interfaces so that the existing tools can be used as drop-in replacements in more projects. Most tools that have GNU binutils equivalents have some GNU binutils CLI-compatible interface. I think llvm-as could probably be extended, slowly but surely, to be a better, more practical replacement for GNU as.

  2. What do you mean? It’s already cli compatible.

  3. They exist and are still being worked on! I’ve just been lazy about documentation. llvm-objcopy is actually the least mature GNU binutils replacement and could use lots more work.

  4. I’m not familiar enough with what people commonly use in bigger builds to say for sure what tools might be missing. In my team’s experience, llvm-objcopy was the last missing piece, which is why I wrote it. If you find some projects that use a tool that is 100% missing, you could work on implementing a small subset of features needed for one of those projects. I think it’s more likely that existing tools will need to be extended, however.

  5. Well, I’m biased of course and would say llvm-objcopy. A better, less biased answer would be to find a project that currently uses GNU binutils, build it using llvm instead of binutils, and see what fails/breaks. That way you don’t go off and arbitrarily extend tool X with obscure feature Y that hardly anyone actually uses.
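As a rough sketch of how one might set up such a trial build, the Python below checks which LLVM tools are on PATH and builds a `make` command line that overrides the usual tool variables. The variable names (AR, NM, OBJCOPY, ...) are only conventions that many Makefiles follow, not something every project honours, and the tool list is an assumption about what a given install provides (llvm-strip in particular doesn't exist yet at the time of writing). `plan_overrides` and `make_command` are hypothetical helper names.

```python
import shutil

# Make variable -> LLVM tool assumed to stand in for the GNU one.
# Both the variable names and the tool list are assumptions; adjust per project.
LLVM_TOOLS = {
    "AR": "llvm-ar",
    "NM": "llvm-nm",
    "OBJCOPY": "llvm-objcopy",
    "OBJDUMP": "llvm-objdump",
    "STRIP": "llvm-strip",
    "LD": "ld.lld",
}


def plan_overrides(which=shutil.which):
    """Split the tool list into those found on PATH and those missing."""
    found, missing = {}, []
    for var, tool in LLVM_TOOLS.items():
        path = which(tool)
        if path:
            found[var] = path
        else:
            missing.append(tool)
    return found, missing


def make_command(found):
    """Build a make invocation that overrides each tool variable we found."""
    return ["make"] + [f"{var}={path}" for var, path in sorted(found.items())]
```

Running the resulting command (for example via `subprocess.run`) and reading the first failure is then the "see what breaks" step; the `missing` list tells you up front which tools the install doesn't provide at all.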

I think Eric Christopher is the proposer and confirmed mentor and might have more to say.

Best,
Jake

Is there a typo somewhere in there? "llvm-as" and "GNU as" are completely unrelated.

-Eli

Hi Jake,

Thank you for your reply! That would explain a lot.

What are some ways in which llvm-objcopy still could use work?
Depending on how much work remains, maybe it would be an interesting GSOC project?

Kind Regards,

Sebastiaan Peters

Is there a typo somewhere in there? “llvm-as” and “GNU as” are completely unrelated.

It wasn’t a typo, unfortunately, but I should have said llvm-mc. I just googled “llvm as”, saw “llvm-as”, and thought “oh, that must be the GNU CLI compatible version of llvm-mc”. That’s what making assumptions gets me. I suppose a symlink to llvm-mc or something might be better. Clang already supports assembly in a command-line-compatible way with gcc as well. I don’t really know what the proper thing to do there would be, and it would depend on how different projects are using those tools.

What are some ways in which llvm-objcopy still could use work? Depending on how much work remains, maybe it would be an interesting GSOC project?

Once the switch to TableGen for command-line arguments has landed, filling out all the llvm-strip options would be one part. I’ve been meaning to implement --only-keep-debug for a while, which is less trivial than the other existing strip options. Other options to implement include --discard-all, --discard-locals, section patterns, some missing short aliases, --set-section-flags, --dump-section, --strip-unneeded-symbols, etc…

While all this should probably be done at some point, I think it’s important to find an actual project that uses some missing features, or to work on features that people are actually requesting (for llvm-objcopy, --only-keep-debug is the only one that fits that criterion). Otherwise you fall into a trap of implementing things for which drop-in replacements don’t matter. The options listed above are things that I’ve seen here and there but are not necessarily widely or critically used.
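One cheap way to get a feel for which options a project actually relies on is to tally them from its Makefiles or a build log. The following is a deliberately rough Python sketch under that assumption: it splits lines on whitespace, so it does not understand shell quoting, line continuations, or command separators, and `count_options` is a hypothetical helper name.

```python
from collections import Counter


def count_options(lines):
    """Tally long options passed to objcopy/strip-like commands.

    Heuristic only: a token counts as the tool if, after dropping any
    directory part and Make-variable punctuation, it ends in "objcopy" or
    "strip" (e.g. "llvm-objcopy", "arm-none-eabi-strip", "$(STRIP)").
    """
    counts = Counter()
    for line in lines:
        tokens = line.split()
        for i, tok in enumerate(tokens):
            tool = tok.rsplit("/", 1)[-1].strip("$(){}").lower()
            if tool.startswith("-") or not tool.endswith(("objcopy", "strip")):
                continue
            for arg in tokens[i + 1:]:
                if arg.startswith("--") and len(arg) > 2:
                    # "--set-section-flags=.text=alloc" -> "--set-section-flags"
                    counts[arg.split("=", 1)[0]] += 1
            break  # only consider the first tool-like token on a line
    return counts
```

Feeding it the lines of a build log and printing `counts.most_common()` gives a first, imperfect ranking of which missing options would unblock that project.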

Is there a typo somewhere in there? “llvm-as” and “GNU as” are completely unrelated.

It wasn’t a typo, unfortunately, but I should have said llvm-mc. I just googled “llvm as”, saw “llvm-as”, and thought “oh, that must be the GNU CLI compatible version of llvm-mc”. That’s what making assumptions gets me. I suppose a symlink to llvm-mc or something might be better. Clang already supports assembly in a command-line-compatible way with gcc as well. I don’t really know what the proper thing to do there would be, and it would depend on how different projects are using those tools.

We haven’t had a good reason to really want a command line assembler since we pretty much just suggest using “clang” as that driver - and honestly it’s what people should do with gcc as well. :)

I had a look at implementing a gas-compatible wrapper a couple of years ago when we started seriously looking at removing GPL’d code from FreeBSD. I decided it wasn’t worth it for a few reasons:

- cc is required by POSIX, but as is not, so we don’t need it for conformance.
- gas has a *huge* number of command-line options so complete compatibility is very difficult
- Most things that care about the more complex options don’t actually use gas, they use nasm or yasm

I don’t wish to discourage anyone from providing a drop-in replacement for gas (in particular, teaching clang -cc1as to recognise all of the gas options) - there’s always going to be a long tail of things that will want to use all of the weird things that their old toolchain supported.

David