Extending llvm-objcopy to support Mach-O

Hey everyone! Objcopy is a powerful tool for modifying object files in various ways, for example by editing symbols and symbol tables or by copying or removing particular parts of a binary. It also serves as a basis for the strip tool.
Currently, llvm-objcopy only supports ELF files, while binutils’ objcopy can handle Mach-O files as well. Besides extending the existing tool to support Mach-O binaries, this would enable us to build LLVM-based replacements for cctools’ install_name_tool (for changing rpaths, the identification name, etc.) and lipo / libtool (for manipulating “fat” binaries), similarly to how llvm-strip was implemented on top of llvm-objcopy. Regarding the code organization, in this case we would probably have separate folders, ELF and MachO, and maybe a few top-level files (ObjcopyOpts.td, StripOpts.td). Any thoughts, concerns, or strong preferences?

Kind regards,
Alex

This organization is what I’ve had in mind for a while. I’d like to see a good proposal for how to reorganize HandleArgs to work for different architectures so that CopyConfig and friends can be shared across different file formats. That can be worked out in review though.
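To make the idea of sharing one config across file formats concrete, here is a minimal sketch, not the actual llvm-objcopy code. The CopyConfig fields, detectFormat, and executeObjcopy are hypothetical names used only for illustration; the magic numbers are the standard ELF and Mach-O ones.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical format-independent options, parsed once from the command line.
struct CopyConfig {
  std::vector<std::string> SectionsToRemove;
  bool StripDebug = false;
};

enum class FileFormat { ELF, MachO, Unknown };

// Read the first four bytes as a big-endian 32-bit value.
static uint32_t readBE32(const std::vector<uint8_t> &Buf) {
  assert(Buf.size() >= 4 && "caller must check the length");
  return (uint32_t(Buf[0]) << 24) | (uint32_t(Buf[1]) << 16) |
         (uint32_t(Buf[2]) << 8) | uint32_t(Buf[3]);
}

// Identify the container format from the leading magic bytes.
FileFormat detectFormat(const std::vector<uint8_t> &Buf) {
  if (Buf.size() < 4)
    return FileFormat::Unknown;
  switch (readBE32(Buf)) {
  case 0x7f454c46: // "\x7fELF"
    return FileFormat::ELF;
  case 0xfeedface: // Mach-O 32-bit
  case 0xfeedfacf: // Mach-O 64-bit
  case 0xcefaedfe: // Mach-O 32-bit, opposite byte order
  case 0xcffaedfe: // Mach-O 64-bit, opposite byte order
    return FileFormat::MachO;
  default:
    return FileFormat::Unknown;
  }
}

// Every backend receives the same CopyConfig; only the backend differs.
// The returned string stands in for actually running the backend.
std::string executeObjcopy(const CopyConfig &Config,
                           const std::vector<uint8_t> &Buf) {
  std::string Action = Config.StripDebug ? "strip-debug" : "copy";
  switch (detectFormat(Buf)) {
  case FileFormat::ELF:
    return Action + " via ELF backend";
  case FileFormat::MachO:
    return Action + " via MachO backend";
  case FileFormat::Unknown:
    break;
  }
  return "unsupported file format";
}
```

The point of the sketch is only that argument parsing fills in CopyConfig once, and the per-format folders each provide a backend that consumes it.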

I’m in full support.

This sounds reasonable to me. I have no objection to this being
pursued now. I may have some comments about the design once we
have a first version of a patch up for review, though, as it'll be
easier to visualise.

James

I’d give some consideration to moving the objcopy support itself into a library inside llvm (possibly lib/Object as that makes the most sense) and then the tool is just a thin wrapper on top of it.

-eric

That’s something I want to do as well for several reasons. That’s an orthogonal issue however.

I suppose I can take this time to specify which of my mistakes I’d like to see corrected as part of making this into a library.

  1. Expected/Error instead of hard fail.
  2. As others have stated, we should keep section definitions self-contained.
  3. Rather than carefully hand-maintaining a fragile init and finalize order, there should be a generic dependency manager for this sort of thing. This could also be used for option handling.
  4. Each option should be handled by a single bit of code, not by an if statement in a giant function.
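Points 1, 3 and 4 above can be sketched together as follows. This is only an illustration under stated assumptions, not a proposed design: Object, Step, schedule, and runAll are hypothetical names, and std::optional<std::string> stands in for llvm::Error so that the example compiles without LLVM headers (C++17).

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for the object being edited.
struct Object {
  std::vector<std::string> Log; // records which steps ran, in order
};

// Empty optional means success; otherwise it carries an error message.
using StepResult = std::optional<std::string>;

// One option or init/finalize action, handled by a single piece of code.
struct Step {
  std::string Name;
  std::vector<std::string> DependsOn; // names of steps that must run first
  std::function<StepResult(Object &)> Run;
};

// Order steps so that each runs after its dependencies (Kahn's algorithm).
// Returns std::nullopt if a dependency is unknown or the graph has a cycle.
std::optional<std::vector<const Step *>>
schedule(const std::vector<Step> &Steps) {
  std::map<std::string, std::size_t> Index;
  for (std::size_t I = 0; I < Steps.size(); ++I)
    Index[Steps[I].Name] = I;

  std::vector<std::size_t> Pending(Steps.size(), 0);
  std::vector<std::vector<std::size_t>> Dependents(Steps.size());
  for (std::size_t I = 0; I < Steps.size(); ++I) {
    for (const std::string &Dep : Steps[I].DependsOn) {
      auto It = Index.find(Dep);
      if (It == Index.end())
        return std::nullopt;
      Dependents[It->second].push_back(I);
      ++Pending[I];
    }
  }

  std::vector<std::size_t> Ready;
  for (std::size_t I = 0; I < Steps.size(); ++I)
    if (Pending[I] == 0)
      Ready.push_back(I);

  std::vector<const Step *> Order;
  while (!Ready.empty()) {
    std::size_t I = Ready.back();
    Ready.pop_back();
    Order.push_back(&Steps[I]);
    for (std::size_t J : Dependents[I]) {
      assert(Pending[J] > 0 && "dependency counted twice");
      if (--Pending[J] == 0)
        Ready.push_back(J);
    }
  }
  if (Order.size() != Steps.size())
    return std::nullopt; // leftover steps form a cycle
  return Order;
}

// Run every step in dependency order, returning the first error instead of
// aborting the process.
StepResult runAll(const std::vector<Step> &Steps, Object &Obj) {
  auto Order = schedule(Steps);
  if (!Order)
    return std::string("unknown or cyclic dependency");
  for (const Step *S : *Order)
    if (StepResult Err = S->Run(Obj))
      return Err;
  return std::nullopt;
}
```

With something along these lines, the order in which steps are registered no longer matters, and a failing step reports an error to the caller rather than exiting.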

With some consideration to the public interface we could make it a library and add unit testing. We could also expose the strip predicates and other details so other tools could know what the final state of an executable would be without actually running objcopy. Things like that.

I’m hoping to get a good chunk of time to do that soon. I’ll send out a proposal for each, but I’d welcome those contributions.

Currently, llvm-objcopy only supports ELF files while binutils' objcopy can
handle Mach-O files as well.

binutils' objcopy also supports COFF

Any thoughts, concerns, or strong preferences ?

None in particular, except that I'm considering having a go at starting to implement COFF codepaths for this as well (mainly for a strip tool, and for the --only-keep-debug and --add-gnu-debuglink actions). I might (no guarantees though) start working on that within a few months, unless there's already someone else working on that. (I remember seeing some other discussions about that but nothing that ever made it into patches.)

// Martin

It both is and isn’t. Looking at expanding and reorganizing the tool is a good time to move things out to make the overall effort smaller in the future.

Hey, many thanks for the replies,
yeah, in general I agree with what has been said. Making a library makes sense to me, though right now it seems non-trivial (coming up with a good library interface plus the code reorganization), and it's largely, if not completely, orthogonal to the proposed Mach-O effort. For now I would prefer to move step by step with incremental changes; I’m planning to start sending patches in this direction soon (within the next ~1-2 weeks).