llvm-objcopy proposal

LLVM already implements its own version of almost all of binutils. The
exceptions to this rule are objcopy and strip. This is a proposal to implement
an llvm version of objcopy/strip to complete llvm’s binutils.

Several projects only use gnu binutils because of objcopy/strip. LLVM itself
uses objcopy in fact. Chromium and Fuchsia currently use objcopy as well. If you
want to distribute your build tools this is a problem due to licensing. It’s
also a bit of a blemish on LLVM because LLVM could be made more self sufficient
if there was an llvm version of objcopy. Additionally Chromium is one of the
popular benchmarks for LLVM so it would be nice if Chromium didn’t have to use
binutils. Using elftoolchain solves the licensing issue for Fuchsia but is
ELF specific and only solves the issue for Fuchsia. I propose implementing
llvm-objcopy to be a minimum viable replacement for objcopy.

I’ve gone through the sources of LLVM, Clang, Chromium, and Fuchsia to try and
find the major use cases of objcopy. Here is a list of use cases I have found
and which projects use them. This list includes some use cases not found in
these 4 projects.

  1. Use Case: Stripping debug information of an executable to a file
    Who uses it: LLVM, Fuchsia, Chromium
objcopy --only-keep-debug foo foo.debug
objcopy --strip-debug foo foo

Example use
When it is useful:
This reduces the size of the file for distribution while maintaining the debug
information in a file for later use. Anyone distributing an executable in
any way could benefit from this.

  2. Use Case: Stripping debug information of a relocatable object to a file
    Who uses it: None of the 4 projects considered
objcopy --only-keep-debug foo.o foo.debug
objcopy --strip-debug foo.o foo.o

When it is useful:
In distribution of an SDK in the form of an archive it would be nice to strip
this information. This allows debug information to be distributed separately.

  3. Use Case: Stripping debug information of a shared library to a file
    Who uses it: None of the 4 projects
objcopy --only-keep-debug foo.so foo.debug
objcopy --strip-debug foo.so foo.so

When is it Useful:
Same benefits as the previous case. If you want to distribute a library this
option allows you to distribute a smaller binary while maintaining the ability
to debug.

  4. Use Case: Stripping an executable
    Who uses it: None of the 4 projects
objcopy --strip-all foo foo

When is it useful:
Anytime an executable is being distributed and there is no reason to keep
debugging information. This makes the executable smaller than simply
stripping debug info and doesn’t produce an extra file.

  5. Use Case: “Complete stripping” an executable
    Who uses it: None of the 4 projects
eu-strip --strip-sections foo

When is it useful:
This is an extreme form of stripping that even strips the section headers
since they are not needed for loading. This is useful in the same contexts as
stripping but some tools and dynamic linkers may be confused by it. This is
possibly only valid on ELF unlike general stripping which is a valid option on
multiple platforms.

  6. Use Case: DWARF fission
    Who uses it: Clang, Fuchsia, Chromium
objcopy --extract-dwo foo foo.debug
objcopy --strip-dwo foo foo

Example use 1
Example use 2

When is it useful:
DWARF fission can be used to speed up large builds. In some cases builds can
be too large to be handled and DWARF fission makes this manageable. DWARF
fission is useful in almost any project of sufficient size.

  7. Use Case: Converting an executable to binary
    Who uses it: Fuchsia
objcopy -O binary magenta.elf magenta.bin

Example use

When is it useful:
For kernels and embedded applications that need just the raw segments.

  8. Use Case: Adding a gdb index
    Who uses it: Chromium
gdb -batch foo -ex "save gdb-index dir" -ex quit
objcopy --add-section .gdb_index="dir/foo.gdb-index" \
--set-section-flags .gdb_index=readonly foo foo

Example use

When is it useful:
Adding a gdb index reduces startup time for debugging an application. Any
sufficiently large program with a sufficiently large amount of debug
information can potentially benefit from this.

  9. Use Case: Converting between formats
    Who uses it: Fuchsia (only in Magenta GCC build)
objcopy --target=pei-x86-64 magenta.elf magenta.pe

Example use

When is it useful:
This is primarily useful when you can’t directly target a needed format.

  10. Use Case: Removing symbols not needed for relocation
    Who uses it: Chromium
objcopy --strip-unneeded foo foo

Example use

When is it useful:
This is useful when shipping an SDK or some relocatable binaries.

  11. Use Case: Removing local symbols
    Who uses it: LLVM
objcopy --discard-all foo foo

Example use
(hidden in definition of “strip_command” using strip instead of objcopy and
using -x instead of --discard-all)

When is it useful:
Anytime you don’t need locals for debugging this can be useful.

  12. Use Case: Removing a specific unwanted section
    Who uses it: LLVM
objcopy --remove-section=.debug_aranges foo foo

Example use

When is it useful:
This is useful when you know that you have an unwanted section that isn’t
removed by one of the other stripping options. This can also be used to
remove an existing section for replacement by a new section.

We would like to build this up incrementally by solving specific use cases
as they come up. To start with we would like to tackle the use cases
important to us. We primarily care about fully linked executables and not
relocatable files. I plan to implement conversion from ELF to binary first.
After that I plan on implementing stripping for ELF executables.
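
To make the ELF-to-binary step concrete, here is a minimal sketch of what "just the raw segments" could look like in code. It assumes the raw image is simply the loadable contents placed by address, starting at the lowest address, with gaps zero-filled. `LoadableChunk` and `flattenToBinary` are hypothetical names for illustration and are not part of LLVM or of the patch under discussion.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical, simplified view of one loadable chunk of an ELF file.
struct LoadableChunk {
  uint64_t Addr;              // load address of the first byte
  std::vector<uint8_t> Bytes; // file contents to place at Addr
};

// Lay the chunks out by address, starting at the lowest one, and zero-fill
// any gaps between them. Assumes the chunks do not overlap.
std::vector<uint8_t> flattenToBinary(std::vector<LoadableChunk> Chunks) {
  std::vector<uint8_t> Image;
  if (Chunks.empty())
    return Image;
  std::sort(Chunks.begin(), Chunks.end(),
            [](const LoadableChunk &A, const LoadableChunk &B) {
              return A.Addr < B.Addr;
            });
  const uint64_t Base = Chunks.front().Addr;
  for (const LoadableChunk &C : Chunks) {
    const uint64_t Offset = C.Addr - Base;
    assert(Offset >= Image.size() && "chunks must not overlap");
    Image.resize(Offset, 0); // zero-fill the gap, if any
    Image.insert(Image.end(), C.Bytes.begin(), C.Bytes.end());
  }
  return Image;
}
```

Note that the output carries no address information at all: the base address is dropped, which is why this form suits kernels and boot images whose load address is fixed elsewhere.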

I’ve thought about building an llvm-objcopy for a long time and the approach you’ve outlined is the same one that I would have suggested (analyzing a set of critical use cases, triaging them, and then incrementally building them). In other words, this approach SGTM. I’ve CC’ed a couple other people who might have some comments (but I’ve talked with them about objcopy before in one way or another and I don’t get the feeling that they would disagree with the overall approach).

A couple specific suggestions about the more concrete code design.

IIRC, when I looked at GNU objcopy I saw why it was called objcopy: it basically looked like it was originally a program that copied an object file without modification. Then command line argument parsing was added and tons of flags appeared that triggered a mess of random if statements that would modify the copying process. I don’t think we want to have an implementation like that, especially since we don’t have anything even remotely similar to the “writing” side of the BFD library (libObject’s object format agnostic interface is only for reading).

  1. It seems that (besides the format conversion operations) everything is ELF. It will dramatically simplify the implementation to make it ELF-only at first. I would even recommend against using libObject’s object-format agnostic reading implementation. One of the things we have learned while working on LLD is that abstracting across object formats is very difficult to get right. There are just too many subtle semantic differences that penetrate very deep into the program. As an example, LLD/ELF (which is ELF-only) and LLD/COFF (which is COFF-only) are each about 1/3 (or less) the size of the previous linker design that attempted to handle all 3 formats (MachO is the third format) together (and they are actually much more complete than the previous design was before we switched to the new design; normalizing for the difference in features, 1/6 the size is probably more accurate). Unless you also have as a goal (I don’t think you do) to make progress towards an LLVM-based analog of the GNU BFD library as you work on objcopy, sticking to object-format specific code is probably preferable. It’s a lot easier to look at format-specific implementations and see what can be shared vs making a mistake about the abstractions used across object formats and require untangling the incorrect abstraction.

  2. I would really suggest making sure that there is a very, very clear separation between the objcopy-compatible command line parsing and the internals that actually do the work. In fact, it may be reasonable to have the separation be so profound that the tool is called llvm-objtool (with subcommands like llvm-objtool formatconvert ...) and have the objcopy-compatible command line parsing essentially dispatch into one of them (with such parsing being triggered by looking at argv[0]). Regardless of whether it makes sense to go that far, it’s best to err on the side of having separate implementations even if it seems to require duplicating some code. For example, if you have the same for loop in two different “subcommands”, it may be best to make an iterator encapsulating it (or a helper function that takes a lambda) rather than adding a bool parameter to the function containing that loop.

  3. (This is just a “keep an eye out” type thing. No specific suggestion.) As the implementation of objcopy progresses, especially if the object writing code is incrementally factored out between shared routines (as we try to avoid one huge writing routine taking 17 arguments controlling what it does), we may want to look at it together with other object file writing code in the LLVM project (LLD, llvm-dwp, MC) to see what can be unified. llvm-dwp is probably the most similar and most likely to be able to share code.

– Sean Silva
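
The "helper function that takes a lambda" idea from suggestion 2 can be sketched roughly as follows. This is only an illustration: `Section`, `copySectionsIf`, `stripDebug`, and `onlyKeepDebug` are hypothetical names, and real stripping involves far more than filtering a list of sections.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical, simplified section record; not an LLVM type.
struct Section {
  std::string Name;
  bool IsDebug;
};

// The shared loop: copy every section that the caller's predicate keeps.
std::vector<Section>
copySectionsIf(const std::vector<Section> &In,
               const std::function<bool(const Section &)> &Keep) {
  std::vector<Section> Out;
  for (const Section &S : In)
    if (Keep(S))
      Out.push_back(S);
  return Out;
}

// Each "subcommand" states its own policy instead of toggling a bool
// parameter on one large copying routine.
std::vector<Section> stripDebug(const std::vector<Section> &In) {
  return copySectionsIf(In, [](const Section &S) { return !S.IsDebug; });
}

std::vector<Section> onlyKeepDebug(const std::vector<Section> &In) {
  return copySectionsIf(In, [](const Section &S) { return S.IsDebug; });
}
```

Adding a third policy means adding a third small function, not another flag threaded through the loop.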

I think use-case #7, converting an executable to a ROMable/flashable binary image is important to a large number of people who are fairly invisible in the LLVM community. All the more so because LLVM is the easiest compiler to get up and running for a new or custom ISA, most of which are used in embedded applications, not in general purpose PCs.

Of course these people are not hurting too badly, because binutils exists, but it would be good not to need it.

yeah, something that people toss around from time to time - certainly if it’s useful enough to you to motivate the work/time/effort, great!

Seconding your comments & Sean’s: Implement features as needed (in a binutils compatible interface/behavior - but doesn’t have to have all the features. Could start out with clear errors for all the features “this isn’t supported” & fill out features as they’re needed by folks with the motivation to implement them).
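
As a rough sketch of the "clear errors for unsupported features" idea: the tool can recognize the full binutils flag set up front and distinguish "known but not implemented yet" from "unknown". The split between the two sets below is purely hypothetical, as are the `FlagResult` and `checkFlag` names and the error wording; only the flag spellings come from the use cases listed earlier in the thread.

```cpp
#include <cassert>
#include <set>
#include <string>

// Outcome of looking at one command-line flag.
struct FlagResult {
  bool Ok;
  std::string Error; // empty when Ok is true
};

// Hypothetical split of objcopy flags into "implemented" and "recognized
// but not implemented yet"; the real tool would grow the first set over time.
FlagResult checkFlag(const std::string &Flag) {
  static const std::set<std::string> Implemented = {"--strip-all",
                                                    "--strip-debug"};
  static const std::set<std::string> KnownButUnsupported = {
      "--only-keep-debug", "--extract-dwo", "--strip-dwo",
      "--strip-unneeded", "--discard-all"};
  if (Implemented.count(Flag))
    return {true, ""};
  if (KnownButUnsupported.count(Flag))
    return {false, "error: option '" + Flag + "' is not supported yet"};
  return {false, "error: unknown option '" + Flag + "'"};
}
```

A user who hits the first error knows the flag is valid and simply waiting for someone motivated to implement it.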

The LLVM use of objcopy for DWARF Fission’s probably a marginal one - it’d be nice to remove that dependency and produce the separate files directly, at least when using the integrated assembler, but that’s a fair bit more work. Having an LLVM objcopy would provide the opportunity to address a couple of the issues that exist already:

  1. Windows support (I’m not sure what that really looks like - whether it would actually work with COFF files, etc, but I remember David Majnemer looking at this at one point)
  2. Object file size regression (the LLVM object size optimization of having two of the object file string tables (strtab and shstrtab) in one section instead of two saves a bunch of file size - but binutils objcopy doesn’t know that trick, so turning on fission undoes that improvement)
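
To illustrate the string table point in item 2: if section names and symbol names are added to one table, identical strings are stored once and there is only one table to emit. The sketch below is a simplified stand-in, not LLVM's actual string table code; `StringTable` and its members are hypothetical names, and it only deduplicates exact matches.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// One string table that can back both symbol names and section names.
class StringTable {
  // ELF string tables begin with a NUL byte, so offset 0 is the empty string.
  std::string Data = std::string(1, '\0');
  std::map<std::string, uint32_t> Offsets;

public:
  // Returns the offset of S in the table, adding it only if it is new.
  uint32_t add(const std::string &S) {
    if (S.empty())
      return 0;
    auto It = Offsets.find(S);
    if (It != Offsets.end())
      return It->second;
    const uint32_t Off = static_cast<uint32_t>(Data.size());
    Data += S;
    Data.push_back('\0');
    Offsets.emplace(S, Off);
    return Off;
  }

  const std::string &data() const { return Data; }
};
```

A tool that rewrites the file with two separate tables has to store shared names twice, which is one way the size saving described above can be lost.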

A bit of info from the FreeBSD perspective. In FreeBSD we use ELF Tool
Chain versions of most binutils (addr2line, c++filt, objcopy, nm,
size, strings, strip, readelf) and bespoke versions of other tools
(ar, elfdump, ranlib). The exceptions are as (for which we have no
replacement), ld.bfd (we've switched to LLD for arm64 and are working
on other architectures), and objdump (investigating llvm-objdump,
although it still has some limitations).

That said, I would very much like to see LLVM equivalents for all of
the tools so that we can compare and benchmark against other
implementations, and so that we have an alternative available if it
becomes necessary. I would be happy for llvm-objcopy to exist.

1. Use Case: Stripping debug information of an executable to a file
6. Use Case: DWARF fission
8. Use Case: Adding a gdb index

It seems like these ought to just be done by the linker. We don't yet
use (6) in FreeBSD because our toolchain is still using some ancient
components on most architectures but it is something I very much wish
to start doing.

2. Use Case: Stripping debug information of a relocatable object to a file
   Who uses it: None of the 4 projects considered
5. Use Case: “Complete stripping” an executable
   Who uses it: None of the 4 projects

   eu-strip --strip-sections foo

I'd be surprised to find this being used.

3. Use Case: Stripping debug information of a shared library to a file
4. Use Case: Stripping an executable
7. Use Case: Converting an executable to binary
9. Use Case: Converting between formats [ELF->PE]
12. Use Case: Removing a specific unwanted section

We use these cases in FreeBSD.

One additional use case for you: converting from a binary to an ELF object file

objcopy -I binary -O elf64-x86-64 foo.bin foo.o

This is sometimes used for embedding binary files for use by drivers and such.

Having llvm-objcopy would be great! A really small version of it was already implemented:

https://github.com/RodAtDISA/llvm-objcopy
https://github.com/tpimh/llvm-objcopy (this fork can be compiled with LLVM master)

The functions of this implementation of objcopy are very, uhm... limited.

I think this topic has already come up several times on this mailing list, so we can probably get some information from the previous discussions.

It would be great to have all the binutils in LLVM, so no external toolchain is needed.

Regards,
Dmitry

Hello Jake,

I don’t have any experience with objcopy but have some with the classic UNIX strip, especially on darwin for Mach-O. I even prototyped an llvm-strip and got it working well enough to do the default stripping on a hello world executable and get a “bit for bit” match on a darwin Mach-O file against the existing strip(1) tool.

As Sean points out, there is nothing currently in llvm’s libObject code that writes a binary. I agree with Sean that, to get this correct, it is best not to try to make a unified bit of code that writes the three formats llvm currently cares about (ELF, COFF and Mach-O).

Also, my experience suggests that creating tools that write “modified binaries” from fully linked binaries is quite different from writing binaries from an assembler or linker, as you have very limited degrees of freedom when slicing and dicing a fully linked file while still ending up with a correctly formed file. That is, you usually can’t change any addresses, etc., and you have to update all the references to things like indexes into the symbol table, string table, etc. from other tables in the object file. So while it might be good to “keep an eye out” for what could be shared, if you push too hard on that I think your design may not turn out all that clean.

That said, I do think there is value in sharing the “object file reader” code so that all the error checking can be in one place. While I’m not a big fan of libObject, it did prove workable for reading in object files in my llvm-strip prototype. But I did as Sean suggested and went with a totally object format dependent bit of code to write a modified linked object. I did this a bit cleaner than what I did with the darwin cctools open source code I wrote many decades ago. But I feel it is best to have an object format dependent bit of code to put back together the modified parts of a linked Mach-O file, as that is easy to get wrong and a pain to debug when one does get it wrong. My thinking was to have a bit of library code that the darwin tools like install_name_tool(1), bitcode_strip(1), etc. could share and use to reconstruct their modified fully linked binaries.

My thoughts,
Kev
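
The bookkeeping described above, where removing entries from one table forces every index into that table to be rewritten, can be sketched in a format-neutral way. This is not Mach-O specific and not taken from any existing tool: `Symbol`, `Reloc`, and `removeSymbols` are hypothetical names, and the sketch assumes no relocation refers to a symbol that is being removed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified records; real formats carry many more fields.
struct Symbol {
  std::string Name;
  bool Keep;
};
struct Reloc {
  uint64_t Offset;
  uint32_t SymIndex; // index into the symbol table
};

// Drops symbols with Keep == false and rewrites every relocation so that it
// still points at the same symbol in the compacted table.
void removeSymbols(std::vector<Symbol> &Syms, std::vector<Reloc> &Relocs) {
  std::vector<uint32_t> NewIndex(Syms.size());
  std::vector<Symbol> Kept;
  for (std::size_t I = 0; I < Syms.size(); ++I) {
    NewIndex[I] = static_cast<uint32_t>(Kept.size());
    if (Syms[I].Keep)
      Kept.push_back(Syms[I]);
  }
  for (Reloc &R : Relocs) {
    assert(R.SymIndex < Syms.size() && Syms[R.SymIndex].Keep &&
           "relocation refers to a removed symbol");
    R.SymIndex = NewIndex[R.SymIndex];
  }
  Syms = std::move(Kept);
}
```

In a real file the same remapping has to be repeated for every table that stores symbol indexes, which is part of why this code is easy to get wrong.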

Hi Jake,

I’m basically going to echo Sean here and don’t have a lot to add to what he said. Being able to get the writing aspect into a decent place is good. You’ll want to take a look at dsymutil as an approach if not necessarily what a final approach should look like. Perhaps sharing code with lld would work too.

Anyhow, feel free to send messages with what you need and any help or support as I’m particularly interested in this.

Thanks!

-eric

Yea, unfortunately the command-line you actually end up needing is more
like:
  objcopy -I binary -Bi386:x86-64 -Oelf64-x86-64 \
    --rename-section .data=.rodata,alloc,load,readonly,data,contents \
    --add-section .note.GNU-stack=/dev/null

Having to manually invoke objcopy and know what to specify for the -B and
-O options, and to know you need the .note.GNU-stack section, and how to
move it into rodata...it's really all quite terrible. Nobody should have to
do that. :(

There's also the "-b binary" flag to GNU ld (both bfd and gold). But, you
typically need to do a dedicated "link" for that. You do:
  ld -r -b binary picture.jpg -o foo.o
How does ld know what output format to use here? It's gotta just choose the
default, which is kinda poor...or the user needs to know how to spell an
"emulation" and output format...

You could imagine trying to use -Wl to put it with the compile command, but
what do you use to switch back to the normal object format?
  gcc main.c -Wl,-b -Wl,binary -Wl,picture.jpg -Wl,-b -Wl,<<something to undo binary mode?>>

So, anyways, while this is _possible_ with objcopy, it'd sure be nice if
you never needed to use it for that...

(BTW, Apple ld actually has an option "-sectcreate SEGNAME SECTNAME
INPUT_FILE", and the clang driver will pass it through to the linker.)

There's also the "-b binary" flag to GNU ld (both bfd and gold). But, you
typically need to do a dedicated "link" for that. You do:
  ld -r -b binary picture.jpg -o foo.o
How does ld know what output format to use here? It's gotta just choose
the default, which is kinda poor...or the user needs to know how to spell
an "emulation" and output format...

One way to hack around this might be to pass in one of the other object
files in your project, and have the output .o file replace it. Still pretty
hacky and brittle (and hard to integrate into a build system I would think).

The other approaches I've seen or can imagine are:

- Assembler `.incbin` directive (could use it from an inline asm).
- Use a "bin2h" type program which takes a binary and spits out a C file
with a giant uint8_t literal in it, then include that in one of your
normal .c files. In theory a C++11 raw string literal could bypass most of
the parsing overhead of a big array literal, but the people that care about
including a binary in their program probably don't care about that.

-- Sean Silva
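
For reference, the "bin2h" approach mentioned above amounts to very little code. The sketch below renders a byte buffer as C source; `bytesToCArray` and the `<Name>_size` naming are hypothetical choices for illustration, not the output format of any particular existing tool.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

// Renders Bytes as a C array definition called Name, followed by a size
// constant. An empty input is rejected because an empty initializer list is
// not valid in older C standards.
std::string bytesToCArray(const std::string &Name,
                          const std::vector<uint8_t> &Bytes) {
  assert(!Bytes.empty() && "nothing to embed");
  std::ostringstream OS;
  OS << "const unsigned char " << Name << "[] = {";
  for (std::size_t I = 0; I < Bytes.size(); ++I) {
    if (I != 0)
      OS << ", ";
    OS << static_cast<unsigned>(Bytes[I]);
  }
  OS << "};\n";
  OS << "const unsigned long " << Name << "_size = " << Bytes.size() << ";\n";
  return OS.str();
}
```

The generated text is compiled like any other source file, so no -B or -O target names are needed; the cost is that the compiler has to parse one integer literal per byte.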

Fantastic! Thanks for all of the input! I’ll be considering all of it going forward. The plan right now is just to worry about ELF executables and nothing else. I’m very sympathetic to the “llvm-objtool” change. If everyone is cool with it I’ll change the name in the next CL to “llvm-objtool”.

To start out I implemented a very basic ELF64LE specific bit of code. I’m currently looking for reviewers on it. The phabricator link is here: https://reviews.llvm.org/D33964. I’d like to find people willing to review this as I work on this going forward as well. I haven’t bothered worrying about it but I imagine that this will template fairly easily to support ELF32LE, ELF32BE, and ELF64BE.

Would anyone be willing to let me set them as a reviewer going forward for future CLs?

Please add me as a reviewer.

- Michael Spencer

To start out I implemented a very basic ELF64LE specific bit of code. I'm
currently looking for reviewers on it. The phabricator link is here:
https://reviews.llvm.org/D33964. I'd like to find people willing to
review this as I work on this going forward as well. I haven't bothered
worrying about it but I imagine that this will template fairly easily to
support ELF32LE, ELF32BE, and ELF64BE.

Yep. If you haven't found it, take a look at our "ELFT" infrastructure
which should allow easily templating this. A really simple example is the
ELF part of yaml2obj (tools/yaml2obj/yaml2elf.cpp). LLD is another example
that uses ELFT to work across all 4 combinations.
ELFT is so easy to use that going forward you probably won't find yourself
needing to write an initial version for a specific {endian,is64bit}
combination.

Also, one thing to keep in mind is that types like llvm::elf::Elf64_Word
will have the host endianness, which may cause output differences across
different host platforms if they sneak into the output buffer (we do have
some big endian bots, and making sure that tool output is deterministic
across host endianness is a goal of LLVM tools and such differences are
considered bugs). So you may find yourself wanting to use ELFT even in the
initial patch. By using ELFT everywhere, you make sure that things are
guaranteed correct. It's then fairly easy to remove it as needed.

That is exactly what happened in LLD/ELF. We started with everything ELFT
so there was no chance of bugs, then later on once the project was
stabilizing we detemplated it in many places to make code simpler when there
wasn't any risk of getting it wrong (for example, in many places you can
just use uint64_t instead of a type that is 32 or 64 bits depending on
ELFT). Also, at the point where we were removing the ELFT templating, we
already had tons of test coverage. AFAIK, thanks to ELFT and that
methodology, LLD/ELF has had zero (really, *zero*; I can't think of a
single one) bugs due to endianness/64bit-ness mixups, despite being a tool
that natively supports all 4 combinations simultaneously and operates on
endian/64bit dependent values read from object files on almost every single
line of code. It's very impressive, and big thanks to Michael for all the
packed_endian_specific_integral / ELFT infrastructure (now if only he would
get packed_endian_specific_integral into the C++ standard :P).

-- Sean Silva
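
The host-endianness concern can be shown with a small stand-alone sketch. It does not use LLVM's ELFT or packed_endian_specific_integral; `Endian` and `writeInt` are hypothetical names, and the point is only that bytes are produced from shifts on the value, never by copying the host's in-memory representation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

enum class Endian { Little, Big };

// Appends Value to Out as Size bytes in the requested byte order. Only
// shifts are used, so the bytes written are the same on any host.
template <Endian E>
void writeInt(std::vector<uint8_t> &Out, uint64_t Value, std::size_t Size) {
  assert(Size >= 1 && Size <= 8 && "unsupported integer width");
  for (std::size_t I = 0; I < Size; ++I) {
    const std::size_t Shift = (E == Endian::Little) ? I : (Size - 1 - I);
    Out.push_back(static_cast<uint8_t>(Value >> (8 * Shift)));
  }
}
```

By contrast, copying a plain host-order integer into the output buffer with memcpy would produce different bytes on little endian and big endian build machines, which is the kind of difference described above as a bug.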

Yes, templating with ELFT works pretty well in LLD. It might be worth summarizing the best practices for using ELFT that we’ve found during LLD development. So here they are.

  • If a function or a data structure handles or corresponds to on-disk ELF files, template it with ELFT.
  • Integral types such as Elf{32,64}_{XWord,Word,Addr,Offset} are not useful and are better avoided. We are using uint{8,16,32,64}_t instead. It seems to improve readability. (I honestly haven’t memorized the real types behind these ELFT types.)
  • ELFT::uint, whose size is 32/64 depending on ELF32/64, isn’t useful. Always use uint64_t to represent a value that can be 32 or 64 bits. The waste of doing this is negligible, and it can drastically simplify types because if your function uses only ELFT::uint, you can de-template that function by using uint64_t.
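
The last bullet can be sketched as follows. `Fake32`, `Fake64`, and `SymbolOnDisk` are hypothetical stand-ins for the real ELFT machinery, and byte order is ignored here to keep the example short: only the code that touches the on-disk layout is templated, and it widens to uint64_t immediately so that everything downstream needs no template parameter.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for the two ELF classes; not LLVM's real ELFT types.
struct Fake32 {
  using Addr = uint32_t;
};
struct Fake64 {
  using Addr = uint64_t;
};

// Mirrors an on-disk record, so its field widths depend on the ELF class.
template <class ELFT> struct SymbolOnDisk {
  typename ELFT::Addr Value;
  typename ELFT::Addr Size;
};

// Templated because it reads the class-dependent layout; it widens both
// fields to uint64_t before doing any arithmetic.
template <class ELFT> uint64_t symbolEnd(const SymbolOnDisk<ELFT> &S) {
  return static_cast<uint64_t>(S.Value) + static_cast<uint64_t>(S.Size);
}

// Not templated: it only ever sees uint64_t values.
bool containsAddress(uint64_t Start, uint64_t End, uint64_t Addr) {
  return Start <= Addr && Addr < End;
}
```

Had `containsAddress` taken a class-dependent integer type instead, it would need to be a template as well, along with every caller.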