Extending llvm-objcopy to support COFF

Currently llvm-objcopy only supports ELF files, and most of its command line flags are ELF/DWARF-specific and don’t make any sense for COFF files. So a useful set of options for COFF would be largely disjoint, with maybe 1-2 overlapping options. What would be the best way to add this to llvm-objcopy? I can think of 3 options:

  1. Rewrite the existing CLI of llvm-objcopy to use subcommands, and put the current set of options behind an ELF subcommand. To me this is the cleanest approach, but it’s also the most disruptive, as existing users of llvm-objcopy would have to retrain themselves to use this new subcommand, and tools / scripts may have to be updated as well.

  2. Throw all of the COFF options into the current llvm-objcopy, and just have them be mixed with the ELF options. I think this makes the tool more difficult to use and more confusing, but it is admittedly the simplest approach.

  3. Make a new tool called llvm-coffcopy / llvm-objcopy-coff, or something to that effect.

Anyone have any thoughts or strong preferences?

Hi Zach!

I’ve been thinking a bit about this for a while now and I’m still of two opinions:

Currently llvm-objcopy only supports ELF files, and most of its command line flags are ELF/DWARF-specific and don’t make any sense for COFF files. So a useful set of options for COFF would be largely disjoint, with maybe 1-2 overlapping options. What would be the best way to add this to llvm-objcopy? I can think of 3 options:

  1. Rewrite the existing CLI of llvm-objcopy to use subcommands, and put the current set of options behind an ELF subcommand. To me this is the cleanest approach, but it’s also the most disruptive, as existing users of llvm-objcopy would have to retrain themselves to use this new subcommand, and tools / scripts may have to be updated as well.

I really like this option. I like orderly commands and having an ELF subcommand is really nice, however…

  2. Throw all of the COFF options into the current llvm-objcopy, and just have them be mixed with the ELF options. I think this makes the tool more difficult to use and more confusing, but it is admittedly the simplest approach.

This is the sort of thing that people expect from using GNU objcopy, and so I’m reluctant to have a tool with no way to get the “command line expected syntax”.

Mostly what I want is 1 with a shim that gets me 2.
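
A minimal sketch of what such a shim could look like, written in Python for brevity (llvm-objcopy itself is C++). The subcommand names `elf` and `coff`, the function `to_subcommand_argv`, and the choice of ELF as the default are all hypothetical; nothing in this thread defines them.

```python
# Hypothetical subcommand names; the thread does not fix any.
SUBCOMMANDS = {"elf", "coff"}


def to_subcommand_argv(args, default_format="elf"):
    """Rewrite a flat, GNU-style argument list into subcommand form.

    `args` excludes the program name. If the first argument already names
    a known subcommand, the list is returned unchanged; otherwise the
    default subcommand is prepended so old command lines keep working.
    """
    if args and args[0] in SUBCOMMANDS:
        return list(args)
    return [default_format] + list(args)
```

With this, an existing invocation such as `--strip-all in.o out.o` would be treated as `elf --strip-all in.o out.o`, while `coff --strip-all in.obj out.obj` would pass through untouched. One obvious wrinkle with this naive version is that an input file literally named `elf` or `coff` would be mistaken for a subcommand.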

Thoughts?

-eric

While the subcommand approach is tempting, I don’t think we can break compatibility with existing tools, since llvm-objcopy is intended to be a replacement. However, we could do something like what we did with readobj and readelf. Note that objcopy from binutils also works with COFF, so that should be supported too.

I think we are thinking more or less the same thing.

Hi,

It’s not clear to me what you mean by CLI “subcommands”. Would you mind giving a brief example?

Up to now, we’ve been trying to keep llvm-objcopy as close as possible to GNU objcopy, to make transitioning between them easier (I’m thinking in particular of things like DWO generation). There are a small number of edge cases/unusual behaviours that we have chosen not to support whilst doing this. I guess the obvious question from me is: do we need the same drop-in replacement for COFF support?

Also, Jake has just posted up a fairly major refactor of the existing command-line interface on Phabricator:

https://reviews.llvm.org/D44236

I don’t know how your proposal would interact with this, but if it looks like the approach we’re taking there would just make life harder, please shout!

James

Hey everyone,

Sorry to jump in on this so late. My two cents is that it should most likely remain GNU objcopy compatible. Command-line compatibility was always vaguely a desire, but over time it has turned out to be a crucial feature, and it should be one of the top priorities. You can’t just go into a giant build system and swap out all the uses of GNU objcopy with their llvm-objcopy equivalents. This is the whole reason I’m adding the llvm-strip interface. We know of multiple instances where GNU strip is used in large systems, and it’s non-trivial to replace every last individual usage of GNU strip with the corresponding llvm-objcopy use. We know of even more instances where GNU objcopy is super hard to replace. Most of the people who use or want to use llvm-objcopy need it to be a drop-in replacement.

That said, GNU objcopy hasn’t generally been used on Windows platforms, so this might be a moot point. I suppose to convince me otherwise you’d have to convince me that supporting the GNU objcopy options in any reasonable way was detrimental to support for COFF in llvm-objcopy. COFF-specific options are fine; we just need to also support the non-ELF-specific GNU objcopy options. I’m sure there are several options that llvm-objcopy already supports that just don’t make sense on other formats. In short, you can add extra options, but you’d have to convince me that not supporting most of the GNU objcopy options wasn’t a bad idea.

Separate thoughts on how to structure handling other formats:

When I started on this project it was recommended to me that I not try to abstract the operations out over different binary formats. I now firmly believe that was good advice. I’ve since imagined setting things up in an LLD-like fashion, with ELF, COFF, and MachO directories that would more or less function separately. There are some details to be hashed out, however, on how exactly that should happen and how the main executable should dispatch to the different implementations.
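
One way the dispatch could work is to sniff the leading bytes of the input and hand off to a per-format backend. The sketch below is in Python for brevity and is only an illustration under stated assumptions: the function names and the `BACKENDS` table are hypothetical, and the COFF check only recognises a few machine types, since COFF object files start with a machine field rather than a dedicated magic number.

```python
ELF_MAGIC = b"\x7fELF"

# Mach-O magic numbers as they appear on disk, in both byte orders.
MACHO_MAGICS = {
    b"\xfe\xed\xfa\xce", b"\xce\xfa\xed\xfe",  # 32-bit
    b"\xfe\xed\xfa\xcf", b"\xcf\xfa\xed\xfe",  # 64-bit
}

# A few COFF machine types (little-endian 16-bit field at offset 0).
COFF_MACHINES = {0x014C, 0x8664, 0xAA64}  # i386, AMD64, ARM64


def detect_format(header: bytes) -> str:
    """Guess the object format from the first bytes of a file."""
    if header.startswith(ELF_MAGIC):
        return "elf"
    if header[:4] in MACHO_MAGICS:
        return "macho"
    if len(header) >= 2 and int.from_bytes(header[:2], "little") in COFF_MACHINES:
        return "coff"
    raise ValueError("unrecognised object file format")


# Hypothetical per-format entry points; each would live in its own directory.
def copy_elf(path):
    return f"ELF backend handles {path}"


def copy_coff(path):
    return f"COFF backend handles {path}"


def copy_macho(path):
    return f"MachO backend handles {path}"


BACKENDS = {"elf": copy_elf, "coff": copy_coff, "macho": copy_macho}


def dispatch(header: bytes, path: str) -> str:
    """Route the file to the backend for its detected format."""
    return BACKENDS[detect_format(header)](path)
```

The point of the design is that the main executable only knows how to identify a format and pick a backend; everything else stays inside the format-specific code, so nothing is abstracted across formats.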

Please assume that tablegen will be used going forward despite the fact that it isn’t now. From there, there are two options:

  1. Have a different .td file for every llvm-{objcopy, strip} and {ELF, COFF, MachO} combination with heavy use of common .td files
  2. Only have different .td files for llvm-objcopy and llvm-strip and have ELF, COFF, and MachO specific options emit errors on non-supported formats

I’m in favor of option 2 but I could be convinced of option 1 as well, especially if you also convinced me that GNU objcopy was too ELF centric and that supporting the GNU objcopy options wasn’t worth it.
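
Option 2 could be sketched roughly as a single option table in which each option records the formats it applies to, with an error emitted when an option is used on any other format. This is a Python illustration only, not tablegen. The `Option` class, `check_options`, and the error wording are hypothetical, and the format sets assigned to each flag are assumptions made for the example rather than statements about what each flag really supports.

```python
from dataclasses import dataclass

ALL_FORMATS = frozenset({"elf", "coff", "macho"})


@dataclass(frozen=True)
class Option:
    name: str
    formats: frozenset  # formats on which the option is accepted


# Illustrative table; the per-flag format sets are assumptions.
OPTIONS = {
    opt.name: opt
    for opt in (
        Option("--strip-all", ALL_FORMATS),
        Option("--only-keep-debug", ALL_FORMATS),
        Option("--split-dwo", frozenset({"elf"})),
    )
}


def check_options(flags, fmt):
    """Return a list of error strings for flags not usable on `fmt`."""
    errors = []
    for flag in flags:
        opt = OPTIONS.get(flag)
        if opt is None:
            errors.append(f"error: unknown option '{flag}'")
        elif fmt not in opt.formats:
            errors.append(
                f"error: option '{flag}' is not supported for {fmt} files"
            )
    return errors
```

Under these assumptions, `check_options(["--strip-all", "--split-dwo"], "coff")` would report one error for `--split-dwo` and accept `--strip-all`, so a shared GNU-style command line keeps working and only the format-specific flags are rejected.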

Best,
Jake

Hi Jake,

I think we’re in agreement on having a GNU objcopy command line replacement. I was also hoping for a “better” command line interface tool at the same time. So, from the original email we need a #1 to do drop-in replacements, but we can also design a tool with a “better” command line interface, with some generic command line options and separated-out per-object-file ones, that sort of thing.

Thoughts?

-eric

FWIW, here's a case where GNU objcopy is used for Windows: extras/package/win32/package.mak in vlc.git on git.videolan.org.

Here, the options --only-keep-debug, --strip-all and --add-gnu-debuglink are used, in addition to $(STRIP).

And elsewhere, for the strip tool, I've seen the option "-x" used.

// Martin