[RFC] CREL: A compact relocation format for ELF

I’ve heard interests from a few users caring about relocatable file sizes and linker memory usage.

Some psABI groups (including RISC-V and s390x) are receptive to including SHT_CREL in the generic range.
They might be open to adopting the feature once the toolchain support is finalized and user demand picks up.

I’ve sent a subset of https://github.com/MaskRay/llvm-project/tree/demo-crel as [MC,llvm-readobj,yaml2obj] Support CREL relocation format by MaskRay · Pull Request #91280 · llvm/llvm-project · GitHub for review. lld/llvm-objdump/obj2yaml changed will follow.


Specification:

In https://www.sco.com/developers/gabi/latest/ch4.sheader.html, make the following changes.

In Figure 4-9: Section Types,sh_type, append a row

SHT_CREL | 20

Add text:

SHT_CREL - The section holds compact relocation entries with explicit addends. An object file may have multiple relocation sections. See ‘‘Relocation’’ below for details.

In Figure 4-16: Special Sections, append

.crelname | SHT_CREL | see below

Change the text below:

.relname, .relaname, and .crelname

These sections hold relocation information, as described in ‘‘Relocation’’. If the file has a loadable segment that includes relocation, the sections’ attributes will include the SHF_ALLOC bit; otherwise, that bit will be off. Conventionally, name is supplied by the section to which the relocations apply. Thus a relocation section for .text normally would have the name .rel.text, .rela.text, or .crel.text.

In Figure 4-23: Relocation Entries, add:

typedef struct {
  Elf32_Addr r_offset;
  Elf32_Word r_symidx;
  Elf32_Word r_type;
  Elf32_Sword r_addend;
} Elf32_Crel;

typedef struct {
  Elf64_Addr r_offset;
  Elf64_Word r_symidx;
  Elf64_Word r_type;
  Elf64_Sxword r_addend;
} Elf64_Crel;

Add text above “A relocation section references two other sections”:

A SHT_CREL section holds compact relocation entries that decode to Elf32_Crel or Elf64_Crel depending on the object file class (32-bit or 64-bit).
Its content begins with a ULEB128-encoded value count * 8 + addend_bit * 4 + shift (35-bit or 67-bit unsigned), where:

  • count: Relocation count (32-bit or 64-bit unsigned).
  • addend_bit: 1 indicates that relocation entries encode addends. 0 indicates implicit addends (stored in the location to be modified).
  • shift: The shift value (0 to 3) applies to delta_offset in relocation entries.

Relocation entries (which encode r_offset, r_symidx, r_type, and r_addend) follow the header.
Note: r_info in traditional REL/RELA formats has been split into r_symidx and r_type, allowing uint32_t relocation types for ELFCLASS32 as well.

  • Delta offset and flags (ULEB128): Holds delta_offset * (addend_bit ? 8 : 4) + flags (35-bit or 67-bit unsigned), where:
    • delta_offset: Difference in r_offset from the previous entry (truncated to Elf32_Addr or Elf64_Addr), right shifted by shift.
    • flags: 0 to 7 if addend_bit is 1; otherwise 0 to 3.
    • flags & 1: Indicate if delta symbol index is present.
    • flags & 2: Indicate if delta type is present.
    • flags & 4: Indicate if delta addend is present.
  • Delta symbol index (SLEB128, if present): The difference in symbol index from the previous entry, truncated to a 32-bit signed integer.
  • Delta type (SLEB128, if present): The difference in relocation type from the previous entry, truncated to a 32-bit signed integer.
  • Delta addend (SLEB128, if present): The difference in addend from the previous entry, truncated to a 32-bit or 64-bit signed integer depending on the object file class.

ULEB128 or SLEB128 encoded values use the canonical representation (i.e., the shortest byte sequence).
For the first relocation entry, the previous offset, symbol index, type, and addend members are treated as zero.

Encoding/decoding delta offset and flags does not need multi-precision arithmetic. We can just unroll and special case the first iteration.
The header can be encoded/decoded in a similar way. An implementation can assume that the relocation count cannot be larger than 2**61 and simplify the code.

Example C++ encoder:

// encodeULEB128(uint64_t, raw_ostream &os);
// encodeSLEB128(int64_t, raw_ostream &os);

const uint8_t addendBit = config->isRela ? 4 : 0, flagBits = config->isRela ? 3 : 2;
Elf_Addr offsetMask = 8, offset = 0, addend = 0;
uint32_t symidx = 0, type = 0;
for (const Elf_Crel &rel : relocs)
  offsetMask |= rel.r_offset;
int shift = std::countr_zero(offsetMask)
encodeULEB128(relocs.size() * 8 + addendBit + shift, os);
for (const Elf_Crel &rel : relocs) {
  Elf_Addr deltaOffset = (rel.r_offset - offset) >> shift;
  uint8_t b = (deltaOffset << flagBits) + (symidx != rel.r_symidx) +
              (type != rel.r_type ? 2 : 0) + (addend != rel.r_addend ? 4 : 0);
  if (deltaOffset < (0x80 >> flagBits)) {
    os << char(b);
  } else {
    os << char(b | 0x80);
    encodeULEB128(deltaOffset >> (7 - flagBits), os);
  }
  if (b & 1) {
    encodeSLEB128(static_cast<int32_t>(rel.r_symidx - symidx), os);
    symidx = rel.r_symidx;
  }
  if (b & 2) {
    encodeSLEB128(static_cast<int32_t>(rel.r_type - type), os);
    type = rel.r_type;
  }
  if (b & 4 & addendBit) {
    encodeSLEB128(std::make_signed_t<Elf_Addr>(rel.r_addend - addend), os);
    addend = rel.r_addend;
  }
}

Example C++ decoder:

// uint64_t decodeULEB128(uint8_t *&p);
// int64_t decodeSLEB128(uint8_t *&p);

const auto hdr = decodeULEB128(p);
const size_t count = hdr / 8, flagBits = hdr & 4 ? 3 : 2, shift = hdr % 4;
Elf_Addr offset = 0, addend = 0;
uint32_t symidx = 0, type = 0;
for (size_t i = 0; i != count; ++i) {
  const uint8_t b = *p++;
  offset += b >> flagBits;
  if (b >= 0x80)
    offset += (decodeULEB128(p) << (7 - flagBits)) - (0x80 >> flagBits);
  if (b & 1)
    symidx += decodeSLEB128(p);
  if (b & 2)
    type += decodeSLEB128(p);
  if (b & 4 & hdr)
    addend += decodeSLEB128(p);
  rels[i] = {offset << shift, symidx, type, addend};
}

Both encoder and decoder can be simplified if the desired addend_bit is hardcoded, making flagBits an integer literal.

In Figure 5-10: Dynamic Array Tags, d_tag, add:

DT_CREL | 38 | d_ptr | optional | optional

Add text below:

  • DT_CREL - This element is similar to DT_REL, except its table uses the CREL format. The relocation count can be inferred from the header.

Update DT_PLTREL and DT_PLTRELSZ:

  • DT_PLTRELSZ: This element holds the total size, in bytes, of the relocation entries associated with the procedure linkage table. If an entry of type DT_JMPREL is present and the DT_PLTREL entry value is DT_REL or DT_RELA, a DT_PLTRELSZ must accompany it.
  • DT_PLTREL: This member specifies the type of relocation entry to which the procedure linkage table refers. The d_val member holds DT_REL, DT_RELA, or DT_CREL, as appropriate. All relocations in a procedure linkage table must use the same relocation type.