RFC: Support for Memory Regions in ELF

Background

In embedded systems, it’s common for different parts of the memory to have different performance or power characteristics. These systems also typically don’t have an MMU and use a single address space, and usually rely on linker scripts to map the code and the data to appropriate locations.

To achieve this, at the C/C++ level, the applications typically utilize the section attribute:

__attribute__((section(".special"))) int buffer[1024];

At the linker script level, they would then match the section and place it in an appropriate memory region:

MEMORY {
  SPECIAL(rwxi) : ORIGIN = 0x00000000, LENGTH = 64k
  ...
}
SECTIONS {
  .special : { *(.special); } > SPECIAL
  ...
}

While this approach achieves the desired goal, it has several shortcomings.

First of all, it breaks linker garbage collection (that is, --gc-sections). When the -ffunction-sections and -fdata-sections options are used, the compiler places each symbol into its own section, and the linker can then discard unused sections by tracing the relocations.

When we use the section attribute, all symbols are placed into a single section and as such cannot be discarded individually.

To work around this issue, we can manually replicate the compiler behavior and use a separate section for each symbol:

__attribute__((section(".special.buffer"))) int buffer[1024];

This approach is used for example by Pico SDK.

The second shortcoming concerns zero-initialized variables. For these (whether explicitly or implicitly zero-initialized), compilers typically use a separate .bss section to reduce binary size, since such variables need no storage in the binary and can be allocated and zeroed during initialization. A variable placed in a custom section with the section attribute does not get this treatment by default.

We can achieve the same effect with yet another dedicated section, an approach also used by Pico SDK, but this further increases code complexity and adds maintenance overhead, since these attributes must be kept up to date as the code changes.

Proposal

What developers are really trying to achieve is annotating certain symbols as living in a special part of the memory without changing the existing rules for section allocation and placement. To support that, our proposal is to introduce support for memory regions as a new memory attribute:

__attribute__((memory("special"))) int buffer[1024];

In LLVM IR this would be represented as:

@buffer = dso_local global [1024 x i32] zeroinitializer, memory "special", align 16

To represent the mapping of symbols to regions at the object file level, we propose introducing a new custom section type, SHT_LLVM_MEMORY. This section contains an array of fixed-size entries, where each entry consists of two uint32_t integers: a section number and a string table offset that points to the region name. sh_link should point to the associated string table section, as it does in SHT_SYMTAB sections.
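As a rough illustration of the entry layout described above, here is a minimal C sketch of how a tool might look up the region name for a section. The names MemoryEntry and region_for_section are hypothetical and not part of the proposal, and the sketch assumes the entries have already been read into memory in host byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed in-memory form of one SHT_LLVM_MEMORY entry (hypothetical names). */
typedef struct {
    uint32_t section_index; /* index into the section header table */
    uint32_t name_offset;   /* offset of the region name in the sh_link string table */
} MemoryEntry;

/* Returns the region name recorded for section `index`, or NULL if the
 * section has no entry or the entry's string table offset is malformed. */
const char *region_for_section(const MemoryEntry *entries, size_t count,
                               const char *strtab, size_t strtab_size,
                               uint32_t index) {
    assert(strtab != NULL);
    assert(entries != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        if (entries[i].section_index != index)
            continue;
        uint32_t off = entries[i].name_offset;
        /* Reject offsets outside the table or names without a terminator. */
        if (off >= strtab_size ||
            memchr(strtab + off, '\0', strtab_size - off) == NULL)
            return NULL;
        return strtab + off;
    }
    return NULL;
}
```

A linear scan keeps the sketch short; a real linker would more likely build an index-to-name table once per object file.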

To support the use of memory regions from linker scripts, we propose introducing a new matcher INPUT_SECTION_MEMORY (akin to INPUT_SECTION_FLAGS):

SECTIONS {
  .special : { INPUT_SECTION_MEMORY(special) *(.text .text.*) } > SPECIAL
  .text : { INPUT_SECTION_MEMORY(!special) *(.text .text.*) } > RAM
}

The matcher would use exact matching (plus negation), but could be extended to support fnmatch-style patterns (plus negation), as for section names, if there’s a valid use case.
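The exact-match-plus-negation rule can be sketched in a few lines of C. The function name memory_matches is hypothetical. The sketch also assumes that a section with no region entry matches a negated pattern, which seems to be what the .text rule in the example above relies on, though the RFC does not state it explicitly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical matcher: `pattern` is "name" or "!name"; `region` is the
 * section's region name, or NULL when the section has no region entry. */
bool memory_matches(const char *pattern, const char *region) {
    assert(pattern != NULL);
    bool negate = pattern[0] == '!';
    if (negate)
        pattern++; /* skip the leading '!' */
    bool equal = region != NULL && strcmp(pattern, region) == 0;
    return negate ? !equal : equal;
}
```

Extending this to fnmatch-style patterns would mostly mean replacing the strcmp call with a pattern match.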

Initially, we plan to implement this feature only for the ELF format, which is the one most commonly used in embedded systems, but there’s no fundamental reason why this idea couldn’t be extended to other object formats in the future.

We are open to suggestions for alternative names as memory may be overly generic. Some of the other ideas we considered were memory_name or section_category.

Alternatives

Instead of introducing a new attribute and LLVM IR construct, we could reuse address_space, but this has some downsides. Most notably, address_space has type system implications (pointers to the same type in different address spaces are incompatible) as well as target-dependent semantics, which may be undesirable. Furthermore, address spaces are identified by an integer, which hurts usability. Support for string keys for address spaces has been proposed before, but there hasn’t been any progress in implementing that proposal, and it would likely require significant changes throughout Clang and LLVM.

AMD also described the concept of memory_spaces as part of the proposed DWARF Extensions For Heterogeneous Debugging. These appear to have the same constraints as address_spaces though.

Some initial thoughts based on first reading.

Does __attribute__((memory(name))) require -fdata-sections, or does it turn it on implicitly for those symbols?

I assume __attribute__((memory(name))) is incompatible with __attribute__((section(name))). Or would one win over the other?

As I understand it there are two separate parts to this RFC:

  1. A many to one mapping from section to memory region name. The intent here is to avoid named sections with the same name being merged into the same section.
  2. A way of avoiding named sections with generic names such as .bss.named_section being matched by patterns like *(.bss) before a later more specific pattern like *(.bss.named_section).

I think 2, or something like it, would be very useful, particularly for those who place variables over memory-mapped registers. These currently need names like .bss.foo or a NOLOAD section.

As specified in the RFC, the biggest drawback I can think of is that tools like objcopy, particularly GNU objcopy, that don’t understand SHT_LLVM_MEMORY could easily break the section number to name mapping if they remove or add a section and don’t update the mapping.
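To make the concern concrete, this is roughly the fix-up a tool would need to perform when it deletes a section. It is only a sketch: MemoryEntry and fixup_after_removal are hypothetical names, and the entry layout is the two-uint32_t form described in the RFC.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed in-memory form of one SHT_LLVM_MEMORY entry (hypothetical names). */
typedef struct {
    uint32_t section_index;
    uint32_t name_offset;
} MemoryEntry;

/* Updates `entries` in place after section `removed` has been deleted from
 * the section header table. Returns the new number of entries. */
size_t fixup_after_removal(MemoryEntry *entries, size_t count, uint32_t removed) {
    assert(entries != NULL || count == 0);
    size_t kept = 0;
    for (size_t i = 0; i < count; i++) {
        MemoryEntry e = entries[i];
        if (e.section_index == removed)
            continue;          /* its section is gone, so drop the entry */
        if (e.section_index > removed)
            e.section_index--; /* later sections shift down by one */
        entries[kept++] = e;
    }
    return kept;
}
```

A tool that skips this step would leave every entry after the removed section pointing at the wrong section.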

The other possible drawback is that, as it requires compiler and linker support, it may be harder to get GCC/binutils support, which may limit the use of the attribute for code that needs to build with both GCC and LLVM. I guess macros may come to the rescue here for the compiler side, with named sections used in GCC and memory in Clang.
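A macro along these lines might look as follows. This is a sketch under assumptions: IN_REGION and check_buffer are hypothetical names, memory is the attribute proposed in this RFC (no released compiler provides it, so today the section fallback is always taken), and the fallback section name assumes an ELF target.

```c
#include <assert.h>

/* __has_attribute is available in Clang and in GCC 5 and later. */
#ifndef __has_attribute
#define __has_attribute(x) 0
#endif

#if __has_attribute(memory)
/* Proposed attribute from this RFC: section naming stays with the compiler. */
#define IN_REGION(region, sym) __attribute__((memory(region)))
#else
/* Fallback: one named section per symbol, e.g. ".special.buffer". */
#define IN_REGION(region, sym) __attribute__((section("." region "." #sym)))
#endif

IN_REGION("special", buffer) int buffer[1024];

/* Usage check: the variable behaves like any other zero-initialized global. */
void check_buffer(void) {
    assert(buffer[0] == 0);
    buffer[0] = 42;
    assert(buffer[0] == 42);
}
```

The symbol name has to be passed to the macro only for the fallback path, which is the kind of duplication the proposed attribute would remove.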

Will have to think a bit harder to see if there’s any alternative way to make this work without needing a new section.

One option would be an attribute to prevent named sections from being merged. For example, an old Arm proprietary compiler had a zero_init attribute that could be combined with __attribute__((section(name))) to get a SHT_NOBITS named section that didn’t start with .bss (see the Arm Developer documentation).

There would need to be something in the linker script that serves the same purpose as INPUT_SECTION_MEMORY to stop the named section matching though.

I’ve wanted something similar to this too. The ESP-IDF also has something similar. I’m prototyping a noxip attribute to do this for the specific execute-in-place (XIP) problem where code must be able to run when XIP memory is temporarily disabled.

There are three other aspects I’ve been thinking about:

  1. How does this interact with LTO? In my noxip attribute prototype I made the caller inherit the attribute from an inlined callee.
  2. Is there a step to propagate this attribute to referenced functions and data? This is essential to getting the noxip case correct because referencing any other memory can cause a fault. The ESP-IDF documentation discusses this.
  3. Do you want to allow a negation of this as well? Init code may need to be in any region except those that it initializes. If you want to support this, then would it be even better to add enable/disable attributes to analyze memory regions available through a code flow?

I don’t think I’d use annotations for performance related information because the size of the fast memory may vary. The compiler with PGO can also be smarter to include hot basic blocks in fast memory but then split out cold basic blocks into slow memory. (Assuming that it isn’t critical for XIP for example.)

I’m @tannewt on Discord as well so feel free to ping me if you want to talk more interactively about this.

This seems like a root cause of the problem, although I may be missing other issues. If the user could control the section name for a symbol without breaking -fdata-sections/-ffunction-sections and garbage collection, that would achieve the same goal in a more generic way while keeping the same link flow. For example,
– allow the compiler to issue multiple sections with the same name, which are garbage collected separately.
– have something like __attribute__((section_prefix(".bss.special"))), where the produced section name for each symbol with this attribute is a concatenation of the specified prefix, possibly a dot, and the symbol name.