ELF linkers GNU ld, gold, and ld.lld provide --compress-debug-sections=[zlib|zstd] to compress .debug_* sections.
This functionality can be extended to arbitrary sections. I have developed a prototype for lld ("[RFC][ELF] Add --compress-sections", Pull Request #1 in MaskRay/llvm-project on GitHub) with ~40 lines of code, and filed a GNU ld feature request back in 2021 (bug 27452, "ld: Support compressing arbitrary sections (generalized --compress-debug-sections=)").
ld.lld --compress-sections <sections-glob>=[zlib|zstd]
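To make the option syntax concrete, here is a minimal Python sketch of how a `<glob>=<format>` argument could be parsed and matched against section names. It is not the lld implementation: the helper names `parse_compress_sections` and `select_format` are hypothetical, and "the last matching rule wins" is an assumption made for the example.

```python
import fnmatch

# Formats named in the option syntax above.
SUPPORTED_FORMATS = {"zlib", "zstd"}


def parse_compress_sections(arg):
    """Split '<glob>=<format>' into a (glob, format) pair (hypothetical helper)."""
    glob, sep, fmt = arg.rpartition("=")
    if not sep or not glob:
        raise ValueError(f"expected <glob>=[zlib|zstd], got {arg!r}")
    if fmt not in SUPPORTED_FORMATS:
        raise ValueError(f"unknown compression format {fmt!r}")
    return glob, fmt


def select_format(section_name, rules):
    """Return the format of the last rule whose glob matches, or None.

    'Last match wins' is an assumption of this sketch.
    """
    chosen = None
    for glob, fmt in rules:
        if fnmatch.fnmatchcase(section_name, glob):
            chosen = fmt
    return chosen


rules = [
    parse_compress_sections(".debug_*=zlib"),
    parse_compress_sections("__llvm_covmap=zstd"),
]
for name in (".debug_info", "__llvm_covmap", ".text"):
    print(name, select_format(name, rules))
```

With these two rules, .debug_info is selected for zlib, __llvm_covmap for zstd, and .text is left uncompressed.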
In recent years, metadata sections have gained more uses.
Some designers may want to implement compression within the format.
However, if multiple metadata sections adopt this approach, the result is duplicated code, even though a generic feature already exists at the object file format level.
For example, if we have --compress-sections, we can straightforwardly support the compression proposal for the code coverage section __llvm_covmap.
See "Consider using mergeable section for filenames in coverage mapping" (llvm/llvm-project issue #48499 on GitHub).
One question that naturally arises is what should be done if <section-glob> matches a SHF_ALLOC section.
In my analysis, I find that the linker can simply ignore the distinction between SHF_ALLOC and non-SHF_ALLOC, and the resulting behavior appears reasonable to me.
Compressed sections that are not SHF_ALLOC, similar to .debug_* sections, do not require any special treatment. On the other hand, compressed sections with the SHF_ALLOC flag will be included as part of a PT_LOAD segment. During loading, the dynamic loader needs no special handling because it ignores section headers.
When the program is executed, the runtime library responsible for the metadata section will allocate memory and decompress the content of the section. For instance, the section's content may contain references to .text, but there will be no references from .text to the decompressed section. The uncompressed metadata section may start with 4 zero bytes to be distinguished from the Elf{32,64}_Chdr header with a non-zero ch_type.
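The check on the first 4 bytes can be sketched in Python as follows. This is an illustration rather than any existing runtime library: `compress_section` and `load_metadata` are hypothetical names, the layout is a little-endian Elf64_Chdr (ch_type, ch_reserved, ch_size, ch_addralign), only ELFCOMPRESS_ZLIB is handled, and the convention that uncompressed content begins with 4 zero bytes is the one suggested above.

```python
import struct
import zlib

ELFCOMPRESS_ZLIB = 1  # ch_type value defined by the generic ABI

# Little-endian Elf64_Chdr: ch_type, ch_reserved, ch_size, ch_addralign (24 bytes).
CHDR64 = struct.Struct("<IIQQ")


def compress_section(data, addralign=1):
    """Roughly what a linker would emit: an Elf64_Chdr followed by a zlib stream."""
    header = CHDR64.pack(ELFCOMPRESS_ZLIB, 0, len(data), addralign)
    return header + zlib.compress(data)


def load_metadata(section):
    """Hypothetical runtime-library entry point returning uncompressed content."""
    (first_word,) = struct.unpack_from("<I", section, 0)
    if first_word == 0:
        # Assumed convention: uncompressed content starts with 4 zero bytes,
        # so it cannot be mistaken for a header with a non-zero ch_type.
        return bytes(section)
    ch_type, _reserved, ch_size, _addralign = CHDR64.unpack_from(section, 0)
    if ch_type != ELFCOMPRESS_ZLIB:
        raise ValueError(f"unsupported ch_type {ch_type}")
    data = zlib.decompress(section[CHDR64.size:])
    if len(data) != ch_size:
        raise ValueError("decompressed size does not match ch_size")
    return data


raw = b"\x00\x00\x00\x00" + b"payload" * 100
packed = compress_section(raw)
assert load_metadata(raw) == raw      # uncompressed input is returned as-is
assert load_metadata(packed) == raw   # compressed input round-trips
```

In a real runtime the input would be the bytes between the section's encapsulation symbols, and the decompressed copy would live in freshly allocated memory, which is why nothing in .text can refer to it by address.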
However, I have concerns regarding non-compliance with the ELF standard due to the incompatibility of SHF_ALLOC|SHF_COMPRESSED sections, as stated in the current generic ABI documentation (Sections):
SHF_COMPRESSED - This flag identifies a section containing compressed data. SHF_COMPRESSED applies only to non-allocable sections, and cannot be used in conjunction with SHF_ALLOC. In addition, SHF_COMPRESSED cannot be applied to sections of type SHT_NOBITS.
Therefore, I made a generic-abi proposal to remove the SHF_ALLOC incompatibility from the wording: https://groups.google.com/g/generic-abi/c/HUVhliUrTG0
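The effect of the wording change can be shown as a small predicate over section header fields. This Python sketch is illustrative only: `compressed_flag_allowed` and its `allow_alloc` parameter are hypothetical, and it assumes the SHT_NOBITS restriction stays in place after the proposal. The constants are the standard ELF values.

```python
SHT_PROGBITS = 1
SHT_NOBITS = 8
SHF_ALLOC = 0x2
SHF_COMPRESSED = 0x800


def compressed_flag_allowed(sh_type, sh_flags, allow_alloc=False):
    """Return True if the use of SHF_COMPRESSED is permitted (hypothetical check).

    allow_alloc=False models the current generic ABI wording;
    allow_alloc=True models the proposed relaxation.
    """
    if not sh_flags & SHF_COMPRESSED:
        return True  # nothing to check
    if sh_type == SHT_NOBITS:
        return False  # no file content to compress
    if sh_flags & SHF_ALLOC and not allow_alloc:
        return False  # forbidden by the current wording
    return True


flags = SHF_ALLOC | SHF_COMPRESSED
print(compressed_flag_allowed(SHT_PROGBITS, flags))                    # current wording
print(compressed_flag_allowed(SHT_PROGBITS, flags, allow_alloc=True))  # proposed wording
```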
I have some replies to questions raised on the generic-abi thread. Rephrased below:
Q: Is it wasteful to have both the compressed and uncompressed copies in memory at runtime?
The tradeoff between compressed debug sections and using SHF_ALLOC|SHF_COMPRESSED is quite similar.
When a symbolizer or debugger loads the compressed debug information, it needs to allocate a memory chunk to hold the decompressed content instead of memory mapping the content from the disk.
Why do some people accept this tradeoff? Well, they may prioritize file size and consider debugging as an infrequent operation, or they simply accept this inefficiency.
I understand that SHF_ALLOC|SHF_COMPRESSED sections create an additional copy in the memory image, which can be seen as wasteful.
However, this portion is read-only and accessed on-demand. It’s not significantly different from when a program has an internal symbolizer that performs introspection (opening itself, parsing section headers, finding debug sections); sanitizers support such an internal symbolizer.
Q: Why not switch to non-ALLOC SHF_COMPRESSED?
I believe that SHF_ALLOC has two primary use cases:
- replace runtime introspection (opening its own file, parsing section headers, parsing section content) with inspecting the content between the encapsulation symbols
- prevent strip/llvm-strip from stripping the sections
Q: Is there any restriction for SHF_ALLOC|SHF_COMPRESSED sections?
The runtime library's decompression and "relocation" operations impose certain limitations on use cases. For instance, it would not be possible to define a symbol relative to an input section if its absolute address is significant, as the "relocation" of the section would invalidate the absolute address. However, label differences within the output section would still be permissible.
In my prototype, I try to compute the output section size once, expect that it does not change, and give an error if it does change due to certain linker script constructs.
Q: What should PE/COFF, Mach-O, wasm, XCOFF do?
I wish they had a generic compression feature as well :) Based on my observation, there are ELF users who have exceptionally large executables and prioritize compression. In the long term, I hope that object file format vendors whose users care about compression will provide it at the object file format level, rather than adding more compression code to various compiler instrumentation features.