[lld] RFC: Allow custom sections to be under GNU_RELRO

Hey,

We would like to propose an idea that would help security harden applications that define custom sections.

Motivation and Background
In Chromium we have a garbage collector that implements some RTTI machinery in the form of a table. This table is used by the collector to trace and finalize garbage-collected objects. We’re thinking of using __attribute__((section(…))) so that the table can be created and merged at link time. We also use -fPIC and therefore rely on the dynamic linker to process relocations in this table after the program is loaded. At the same time, we want the table to be read-only after relocations are applied, in the same way that e.g. .got sections are write-protected after eager binding (with -z,relro,-z,now). We can’t simply mprotect the custom section ourselves, because it can live in the same PT_LOAD segment as other modifiable data (e.g. .data).
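To make the setup concrete, here is a minimal C sketch of a table that is assembled at link time from entries placed in a custom section. The section name gc_rtti, the struct layout, and the helper functions are assumptions made for illustration, not Chromium's actual code. The sketch relies on ELF linkers such as GNU ld and lld defining __start_<name> and __stop_<name> symbols for sections whose names are valid C identifiers, so a name ending in ".rel.ro" would not get these symbols.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical per-type record: two pointers, both of which need dynamic
   relocations when the code is built as position independent. */
struct gc_info {
    const char *name;
    void (*finalize)(void *object);
};

static void finalize_noop(void *object) { (void)object; }

/* Hypothetical section name. "used" keeps the entries alive even though
   nothing references them by name. */
#define GC_RTTI_ENTRY __attribute__((used, section("gc_rtti")))

GC_RTTI_ENTRY static const struct gc_info node_info = { "Node", finalize_noop };
GC_RTTI_ENTRY static const struct gc_info text_info = { "Text", finalize_noop };

/* Bounds of the merged section, provided by the linker (ELF targets only). */
extern const struct gc_info __start_gc_rtti[];
extern const struct gc_info __stop_gc_rtti[];

/* Number of entries the linker merged into the table. */
size_t gc_rtti_count(void) {
    return (size_t)(__stop_gc_rtti - __start_gc_rtti);
}

/* Linear lookup by type name; returns NULL when no entry matches. */
const struct gc_info *gc_rtti_find(const char *name) {
    for (const struct gc_info *it = __start_gc_rtti; it != __stop_gc_rtti; ++it)
        if (strcmp(it->name, name) == 0)
            return it;
    return NULL;
}
```

With position-independent code the entries need relocations, so the section ends up writable and is placed alongside ordinary data, which is exactly the situation the proposal wants to improve.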

At the moment, all big 3 ELF linkers hardcode names of read-only-after-relocation sections (.data.rel.ro, .bss.rel.ro, .ctors, .eh_frame, …). We would like to propose extending this for custom sections that end with “.rel.ro”.

What do you think? Would this be useful to you?

Hello Anton,

At the moment, all big 3 ELF linkers hardcode names of read-only-after-relocation sections (.data.rel.ro, .bss.rel.ro, .ctors, .eh_frame, ...). We would
like to propose extending this for custom sections that end with ".rel.ro".

What do you think? Would this be useful to you?

In principle, if you can get a convention agreed by the ELF linkers then I don't see too much of a problem. There are two places I can see where you may get some push back.

The first is the use of a custom suffix. All other linker conventions that I know of use prefixes, as these are much easier and faster to match against names. This can be important in large programs compiled with -ffunction-sections, as there can be millions of sections to match.
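A rough C sketch of the two kinds of check, using hypothetical helper names. Real linkers use their own string types and matching code, so this only illustrates the shape of the work: a prefix test can stop at the first mismatching byte, while a suffix test on a NUL-terminated name first has to find the end of the name.

```c
#include <stdbool.h>
#include <string.h>

/* Prefix test: compares at most strlen(prefix) bytes from the start of the
   name and stops at the first mismatch. */
bool has_prefix(const char *name, const char *prefix) {
    return strncmp(name, prefix, strlen(prefix)) == 0;
}

/* Suffix test: the full length of the name must be known before the
   comparison can start at the right offset. */
bool has_suffix(const char *name, const char *suffix) {
    size_t name_len = strlen(name);
    size_t suffix_len = strlen(suffix);
    return name_len >= suffix_len &&
           memcmp(name + name_len - suffix_len, suffix, suffix_len) == 0;
}
```

Note that ".data.rel.ro.foo" matches the ".data.rel.ro" prefix but not the ".rel.ro" suffix, which is one reason the two conventions are not interchangeable.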

The second is that you may be able to accomplish what you need with a linker script. I'm guessing you don't want to use the existing .data.rel.ro.* convention directly, so that you can take advantage of linker-generated symbols. The following example (not tested) assumes a naming convention of .data.rel.ro.RTTI.* in your program.

  .data.rel.ro : {
    PROVIDE_HIDDEN (__RTTI_start = .);
    *(.data.rel.ro.RTTI.*)
    PROVIDE_HIDDEN (__RTTI_end = .);
    *(.data.rel.ro.local* .gnu.linkonce.d.rel.ro.local.*) *(.data.rel.ro .data.rel.ro.* .gnu.linkonce.d.rel.ro.*)
  }

This would have your table contiguous in the .data.rel.ro OutputSection and can be found with the __RTTI_start and __RTTI_end symbols. It may not work so well if you are using SHF_LINK_ORDER for the RTTI sections as some linkers tend to handle these better when every InputSection in the OutputSection has SHF_LINK_ORDER.

Peter

Peter,

Thanks for the great feedback!

The first is the use of a custom suffix. All other linker conventions that I know of use prefixes, as these are much easier and faster to match against names.
This can be important in large programs compiled with -ffunction-sections, as there can be millions of sections to match.

I understand the reason for having these conventions in linkers. On the other hand, there already exists a format with the fixed “.rel.ro” suffix for .data and .bss. Custom suffixes would also mean that user code would depend more on implementation-specific things, i.e. prefixes. For example, one would wonder whether to annotate data with __attribute__((section(…))) as “.data.rel.ro.my_section” or “.bss.rel.ro.my_section”.

The second is that you may be able to accomplish what you need with a linker script
We discussed the possibility of using a custom linker script to achieve that. The main problem is that we’re planning to ship the garbage collector as a library. Requiring users to plug in the linker script would be overkill for them and for us. Another problem is that in Chromium we continuously update our toolchain. If there is any change in the default linker script (or the logic behind it), we would have to rebase our custom script on top of it.

In principle, if you can get a convention agreed by the ELF linkers then I don’t see too much of a problem.
LLD is the primary linker we use for most of the platforms we officially support. Would the lld folks be happy if this feature were first implemented as an lld extension and then ported to other linkers as well?

On Fri, 27 Mar 2020 at 10:06, Peter Smith <Peter.Smith@arm.com> wrote:

This can be important in large programs compiled -ffunction-sections as there can be millions of sections to match.

I understand the reason of having these conventions in linkers. On the other hand, there already exists a format with the fixed ".rel.ro" suffix for
.data and .bss. Custom suffixes would also mean that the user code would depend more on implementation-specific things, i.e. prefixes. For
example, one would wonder, should they annotate their data with __attribute__((section(...))) for ".data.rel.ro.my_section" or
".bss.rel.ro.my_section"?

As implemented, .data.rel.ro and .bss.rel.ro are two separate prefixes rather than a single shared .rel.ro suffix (or an infix, if we count .data.rel.ro.* and .bss.rel.ro.*).

Making both prefixes and suffixes significant could also lead to situations where a recognised prefix and a recognised suffix appear on the same section name. I suspect that this won't happen for most use cases, but the tools ought to guard against it, which does require more complexity.
Anyhow, that's just my opinion on the use of a suffix; there may be others without my concerns.
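The kind of guard meant here could look roughly like the following C sketch. The enum, the function names, and the choice of ".text" as the competing prefix are all hypothetical; no linker is claimed to implement it this way, and the prefix check is simplified (it would also accept a name such as ".textual").

```c
#include <stdbool.h>
#include <string.h>

enum section_kind { KIND_OTHER, KIND_TEXT, KIND_RELRO, KIND_CONFLICT };

static bool starts_with(const char *s, const char *prefix) {
    return strncmp(s, prefix, strlen(prefix)) == 0;
}

static bool ends_with(const char *s, const char *suffix) {
    size_t n = strlen(s);
    size_t m = strlen(suffix);
    return n >= m && memcmp(s + n - m, suffix, m) == 0;
}

/* Classify a section name when both a prefix rule and a suffix rule are
   significant. A name matching both rules is reported as a conflict rather
   than silently resolved in favour of one of them. */
enum section_kind classify_section(const char *name) {
    bool is_text = starts_with(name, ".text");
    bool is_relro = ends_with(name, ".rel.ro");
    if (is_text && is_relro)
        return KIND_CONFLICT;
    if (is_text)
        return KIND_TEXT;
    if (is_relro)
        return KIND_RELRO;
    return KIND_OTHER;
}
```

With prefixes only, the conflict branch is unnecessary as long as the recognised prefixes do not overlap.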

Given a clean slate, I'd try to do this via a section flag. If all InputSections in an OutputSection have the SHF_RELRO flag, then the OutputSection is SHF_RELRO. This isn't ideal for a flag, as we have to actively clear it if there is a mix of SHF_RELRO and non-SHF_RELRO sections, but it would at least solve the naming problem. The difficulty here is getting support for a new ELF section flag across ELF toolchains.
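The propagation rule described above can be sketched in C as follows. SHF_RELRO does not exist in the ELF specification; the macro name and its bit value below are placeholders chosen purely for illustration, and the function name is hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* HYPOTHETICAL flag: not part of any ELF specification. The bit value is an
   arbitrary placeholder, not a proposed assignment. */
#define SHF_RELRO_HYPOTHETICAL UINT64_C(0x00100000)

/* An output section keeps the flag only if every input section carries it.
   A single input section without the flag clears it for the whole output
   section. An output section with no inputs is treated as not RELRO, which
   is a choice made for this sketch. */
bool output_section_is_relro(const uint64_t *input_flags, size_t count) {
    if (count == 0)
        return false;
    for (size_t i = 0; i < count; ++i)
        if ((input_flags[i] & SHF_RELRO_HYPOTHETICAL) == 0)
            return false;
    return true;
}
```

This is the "actively clear it" behaviour: unlike most section flags, which are OR-ed together when input sections are merged, this one would have to be AND-ed.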

In principle, if you can get a convention agreed by the ELF linkers then I don't see too much of a problem.

LLD is the primary linker we use for most of the platforms we officially support. Would the lld folks be happy if this feature were first
implemented as an lld extension and then ported to other linkers as well?

Personally I'd prefer trying to come up with a solution in both communities first. There is a danger that one community won't accept an extension that they haven't been involved in the design process for. There are some efforts such as https://www.openwall.com/lists/libc-coord/2020/01/30/1 set up to try and do more cross community design. Again this is just my opinion, I can't claim to speak for all of the LLD developers.

Peter

This can be important in large programs compiled -ffunction-sections as there can be millions of sections to match.

I understand the reason of having these conventions in linkers. On the other hand, there already exists a format with the fixed ".rel.ro" suffix for
.data and .bss. Custom suffixes would also mean that the user code would depend more on implementation-specific things, i.e. prefixes. For
example, one would wonder, should they annotate their data with __attribute__((section(...))) for ".data.rel.ro.my_section" or
".bss.rel.ro.my_section"?

As implemented, .data.rel.ro and .bss.rel.ro are two separate prefixes rather than a single shared .rel.ro suffix (or an infix, if we count .data.rel.ro.* and .bss.rel.ro.*).

Making both prefixes and suffixes significant could also lead to situations where a recognised prefix and a recognised suffix appear on the same section name. I suspect that this won't happen for most use cases, but the tools ought to guard against it, which does require more complexity.
Anyhow, that's just my opinion on the use of a suffix; there may be others without my concerns.

This may require some logic in GNU ld's orphan section placement, which
may be more difficult given that it is linker script driven. gold and
lld may implement such rules more easily, but I am just speculating.

Given a clean slate, I'd try to do this via a section flag. If all InputSections in an OutputSection have the SHF_RELRO flag, then the OutputSection is SHF_RELRO. This isn't ideal for a flag, as we have to actively clear it if there is a mix of SHF_RELRO and non-SHF_RELRO sections, but it would at least solve the naming problem. The difficulty here is getting support for a new ELF section flag across ELF toolchains.

Florian Weimer also mentioned we can use a new section flag
https://sourceware.org/pipermail/binutils/2020-March/110428.html

FWIW, some people have expressed willingness to maintain the ELF spec again
(it has been unmaintained since ~2015)
https://groups.google.com/forum/#!topic/generic-abi/cfOCv5Y0-B4

Non-GNU perspectives would be nice :)

In principle, if you can get a convention agreed by the ELF linkers then I don't see too much of a problem.

LLD is the primary linker we use for most of the platforms we officially support. Would the lld folks be happy if this feature were first
implemented as an lld extension and then ported to other linkers as well?

Personally I'd prefer trying to come up with a solution in both communities first. There is a danger that one community won't accept an extension that they haven't been involved in the design process for. There are some efforts such as https://www.openwall.com/lists/libc-coord/2020/01/30/1 set up to try and do more cross community design. Again this is just my opinion, I can't claim to speak for all of the LLD developers.

+1 for a topic on libc-coord