[llvm-objcopy] update-section incompatible with GNU objcopy

Dear LLVM-community,

In my organization we are currently shifting heavily towards an LLVM-based toolchain. In this context we came across an incompatibility between llvm-objcopy and GNU objcopy regarding the “update-section” command.

Use Case
We want to modify a loadable section of an ELF file and replace its content with some other content coming from a file. The file with the new content is larger than the original section which shall be replaced.

GNU objcopy behavior
Section content is replaced AND the offset and VMA of subsequent sections within the parent segment are updated accordingly.

llvm-objcopy behavior
Ends with an error message stating:

/llvm/lib/ObjCopy/ELF/ELFObject.cpp:
“cannot fit data of size %zu into section ‘%s’ with size %zu that is part of a segment”.

What are the reasons for preventing a section’s content from being replaced with larger content? Would it require a major design change (e.g. introducing a new pass over the object file) that is considered a bad tradeoff compared to the benefit?

Or would it be possible without major design changes, and it simply has not been implemented yet because there was no use case for it? If so, could you give me a pointer on where to start?

I would be very grateful for hearing the opinions of the community especially from the people deeply involved with llvm-objcopy (e.g. @jh7370 ).

Best regards,
Viktor

@leonardchan implemented the --update-section flag in D112116 [llvm-objcopy] Add --update-section, and said this about the case:

We are opting to not support this case. GNU’s objcopy was able to do this because the linker and objcopy are tightly coupled enough that segment reformatting was simpler. This is not the case with llvm-objcopy and lld where they like to be separated.

In general we try to have LLVM utilities keep as much compatibility as possible with GNU binutils, except when it doesn’t make sense. It sounds like it might be technically possible to have llvm-objcopy support this, but the use case isn’t yet compelling enough to take on the design change.

@MaskRay also has knowledge here (and in LLD), maybe one of them can offer a workaround for your use case.

The requirement is mainly for executables and shared objects. objcopy has a requirement that it generally cannot update addresses (unless --change-section-address is used, which has unclear semantics). Non-SHT_NOBITS sections in a PT_LOAD program header need to satisfy sh_addr = sh_offset (mod p_align). For the section updated by --update-section, the difference between the current section’s address and the next section’s address sets an upper bound on the section size. If you put too much content into the section, objcopy/llvm-objcopy cannot adjust sh_offset in a way that keeps the addresses meaningful.
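
The two conditions in this paragraph can be sketched in a few lines of Python. This is only an illustration, not llvm-objcopy code: the `Section` class, the helper names and all addresses are made up, and the two sections are assumed to sit back to back in the same PT_LOAD segment.

```python
from dataclasses import dataclass


@dataclass
class Section:
    # Hypothetical stand-in for an ELF section header (not an LLVM type).
    name: str
    sh_addr: int
    sh_offset: int
    sh_size: int


def is_congruent(sec: Section, p_align: int) -> bool:
    """Check sh_addr = sh_offset (mod p_align) for a non-SHT_NOBITS section."""
    return sec.sh_addr % p_align == sec.sh_offset % p_align


def max_size_without_moving(sec: Section, next_sec: Section) -> int:
    """Upper bound on sec's size if next_sec has to keep its address."""
    return next_sec.sh_addr - sec.sh_addr


# Made-up layout: .data is followed directly by .more in the same segment.
p_align = 0x1000
data = Section(".data", sh_addr=0x2000, sh_offset=0x1000, sh_size=0x100)
more = Section(".more", sh_addr=0x2100, sh_offset=0x1100, sh_size=0x40)
assert is_congruent(data, p_align) and is_congruent(more, p_align)

# .data can hold at most this many bytes before it runs into .more.
limit = max_size_without_moving(data, more)

# Growing .data by 0x80 bytes would push .more's file offset forward while
# its address stays fixed, so the congruence no longer holds.
shifted = Section(".more", sh_addr=0x2100, sh_offset=0x1180, sh_size=0x40)
assert not is_congruent(shifted, p_align)
```

In this made-up layout the gap equals the old section size, so any larger content would require moving `.more`, which is exactly what objcopy generally cannot do.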

llvm-objcopy just uses the simple heuristic that the new content cannot have a larger size.
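
As a rough sketch of that heuristic (in Python, not the actual C++ in ELFObject.cpp): `check_update` and its parameters are hypothetical names, and treating sections outside any segment as unrestricted is an assumption inferred from the wording of the error message quoted above.

```python
from typing import Optional


def check_update(name: str, old_size: int, new_size: int,
                 in_segment: bool) -> Optional[str]:
    """Return an error string if the update would be rejected, else None.

    Hypothetical helper: it only compares sizes and never looks at the
    gap to the next section.
    """
    if in_segment and new_size > old_size:
        return (f"cannot fit data of size {new_size} into section '{name}' "
                f"with size {old_size} that is part of a segment")
    return None


# Larger content for a section inside a segment is rejected.
rejected = check_update(".data", old_size=16, new_size=32, in_segment=True)

# Same-size or smaller content passes.
accepted = check_update(".data", old_size=16, new_size=16, in_segment=True)

# Assumption: a section outside any segment is not restricted.
unrestricted = check_update(".comment", old_size=16, new_size=32,
                            in_segment=False)
```

The check is deliberately conservative: it can reject an update that would still fit into padding before the next section, in exchange for never having to reason about the layout.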

Thank you for your replies. Still have some more questions:

I guess you are speaking about llvm-objcopy, right? GNU objcopy seems to offer several options to modify section addresses. So what’s the reason for having such a restrictive requirement in llvm-objcopy compared to GNU objcopy?

I’m not really familiar with GNU objcopy and how it is implemented. However, in principle, changing the section address of a fully-linked program (i.e. one where addresses have already been assigned) would require essentially rerunning the relocation part of the link. This is because all the relocations targeting the sections whose addresses have been modified would need to be reapplied to ensure they target the right place. Since those relocations have usually been discarded, in practice this would require llvm-objcopy to disassemble the code and identify the references that need updating. This is a hard task, so we choose not to do it.
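
A toy model in Python shows why a discarded relocation matters. Nothing here is real ELF: the "image" is just a byte string holding one absolute pointer followed by two sections, and `BASE`, `read_at` and `follow_pointer` are made-up names for the illustration.

```python
import struct

BASE = 0x1000  # made-up load address of the toy image

# Layout: [pointer (8 bytes)][sec_a (8 bytes)][sec_b (8 bytes)]
sec_a = b"AAAAAAAA"
sec_b = b"BBBBBBBB"

# The "linker" resolved this absolute address of sec_b and then threw the
# relocation record away, so nothing marks these 8 bytes as an address.
ptr_to_b = BASE + 8 + len(sec_a)
image = struct.pack("<Q", ptr_to_b) + sec_a + sec_b


def read_at(img: bytes, addr: int, n: int) -> bytes:
    """Read n bytes at a virtual address of the toy image."""
    off = addr - BASE
    return img[off:off + n]


def follow_pointer(img: bytes) -> bytes:
    """Load the pointer stored at the start and read 8 bytes at its target."""
    (target,) = struct.unpack_from("<Q", img, 0)
    return read_at(img, target, 8)


# Before the update the pointer lands on sec_b.
before = follow_pointer(image)

# Replace sec_a with larger content, which shifts sec_b by 4 bytes, without
# touching the pointer (there is no relocation left to tell us to).
bigger_a = b"A" * 12
patched = image[:8] + bigger_a + sec_b

# The pointer is now stale: it lands in the tail of the grown sec_a.
after = follow_pointer(patched)
```

Fixing `after` would require knowing that the first 8 bytes are an address at all, which is the information the discarded relocation used to carry.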

In theory, it would probably be reasonable to allow address changes in some limited cases, specifically where no references to the modified sections (i.e. those with address changes) exist. However, again, it is typically impossible for llvm-objcopy to know that this is the case without the aforementioned instruction disassembling. If there were a sufficient need, we could probably add a switch that disables the update-section check, but with the caveat that it wouldn’t attempt to update references, so it may not be a safe thing to do.