[RFC] Execute-only code support for runtime libraries on AArch64

Hi all,

I’ve been working on enabling support for execute-only code generation on AArch64. This RFC aims to gather feedback about enabling this feature for LLVM’s runtime libraries, specifically compiler-rt, libc++, libc++abi, and libunwind.

Background

The old approach to marking code segments as execute-only is to use LLD’s --execute-only flag. This flag unconditionally marks all executable segments --x (as opposed to r-x). The problem is that this is incompatible with -fsanitize=function, which reads 8 bytes of metadata in front of the called function’s address on each indirect function call, and that read causes a segmentation fault when the segment is execute-only. (A similar issue happens with -fsanitize=kcfi.)

To solve the incompatibility, the SHF_AARCH64_PURECODE section flag was introduced in the AArch64 ELF spec, mirroring the existing SHF_ARM_PURECODE section flag in the 32-bit Arm ELF spec. Using the new section flag, object files can communicate to the linker whether they allow their code sections to be put into executable-only segments. The linker is then able to check whether all input sections have the SHF_AARCH64_PURECODE flag set, and if yes, mark the output section as execute-only. This makes sure that the output is marked execute-only if and only if all input sections allow it.

In the new approach, users must opt in to execute-only support when compiling their code in the compiler front-end and in assembly code, as opposed to when linking. In Clang, for C and C++ code, this is done using the -mexecute-only or -mpure-code compiler flags, and in assembly, the “y” section flag should be used:

.section .text,"axy",@progbits,unique,0
//              ^^^ = SHF_ALLOC | SHF_EXECINSTR | SHF_AARCH64_PURECODE

Implementation status

Over the past few months, the necessary changes have been made in LLVM, Clang and LLD to enable compiling and linking code for execute-only output. I am not aware of any other compilers that have implemented or started on this yet, so enabling execute-only code generation in the runtime libraries should probably be restricted to a runtimes build for now.

RFC

With the move from a linker flag to a front-end option, execute-only code generation needs to be explicitly enabled for any pre-compiled component of a compiler toolchain.

The intended use case for this feature is to try to re-enable execute-only code in Android. For that, Android’s LLVM toolchain needs to support generating execute-only code, including LLVM’s runtime libraries (compiler-rt, libc++, libc++abi, libunwind). Because of the limited initial scope, I think using this feature should be opt-in via a CMake parameter at build time.

Build configuration will need to set two things:

  1. Add -mexecute-only (or -mpure-code) to C and C++ flags;
  2. Add a define for assembly files indicating that they should use the “y” section flag for code sections.

As far as I can tell, the second point only concerns compiler-rt and libunwind, as libc++ and libc++abi don’t have any assembly source files.
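As a rough sketch of those two steps for a single project (libunwind here), something like the following could work. The option name LIBUNWIND_EXECUTE_ONLY_CODE is an assumption for illustration, and the define name _LIBUNWIND_EXECUTE_ONLY_CODE is the one the libunwind change discussed further down uses. The actual patches may wire this up differently.

```cmake
# Sketch only: LIBUNWIND_EXECUTE_ONLY_CODE is an assumed option name.
option(LIBUNWIND_EXECUTE_ONLY_CODE
  "Compile libunwind as execute-only code (AArch64 ELF targets)" OFF)

if (LIBUNWIND_EXECUTE_ONLY_CODE)
  # Step 1: pass -mexecute-only to the C and C++ compilations only.
  add_compile_options($<$<COMPILE_LANGUAGE:C,CXX>:-mexecute-only>)

  # Step 2: define a macro that the assembly sources can test before
  # emitting code sections with the "y" flag.
  add_compile_definitions(_LIBUNWIND_EXECUTE_ONLY_CODE)
endif()
```

The define is added for all languages here to keep the sketch simple; it is harmless for C and C++ sources that never test it.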

For CMake configuration I see two possible approaches:

  1. Use one variable for all runtime libraries, e.g. LLVM_EXECUTE_ONLY_CODE
    • Only one variable to indicate the same feature across different libraries;
    • Doesn’t necessarily apply to all runtime libraries (e.g. LLVM libc for now).
  2. Use separate variables for each sub-project, i.e. COMPILER_RT_EXECUTE_ONLY_CODE, LIBCXX_EXECUTE_ONLY_CODE, etc.
    • It’s clear which libraries have execute-only code generation support;
    • The variable needs to be translated in some cases, for example MSan and TSan build their own versions of libc++ and libc++abi, so COMPILER_RT_EXECUTE_ONLY_CODE needs to be propagated as its LIBCXX_* and LIBCXXABI_* equivalent in those cases.
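To illustrate the translation step in the second approach, here is a minimal sketch. The list name EXAMPLE_CUSTOM_LIBCXX_ARGS and the LIBCXXABI_EXECUTE_ONLY_CODE spelling are assumptions; how the arguments actually reach the nested libc++/libc++abi build depends on compiler-rt’s existing machinery, which is not shown.

```cmake
# Sketch only: names other than COMPILER_RT_EXECUTE_ONLY_CODE and
# LIBCXX_EXECUTE_ONLY_CODE are assumptions made for illustration.
set(EXAMPLE_CUSTOM_LIBCXX_ARGS "")

if (COMPILER_RT_EXECUTE_ONLY_CODE)
  # Translate the compiler-rt setting into the equivalent settings of
  # the sub-projects that the sanitizers build for themselves.
  list(APPEND EXAMPLE_CUSTOM_LIBCXX_ARGS
    -DLIBCXX_EXECUTE_ONLY_CODE=ON
    -DLIBCXXABI_EXECUTE_ONLY_CODE=ON)
endif()

# EXAMPLE_CUSTOM_LIBCXX_ARGS would then be appended to the arguments of
# the nested libc++/libc++abi configure step.
```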

Personally I prefer the second option, but I’d like to get feedback on what others think is the right solution, which may be different from the two I mentioned above.

This mostly covers my design approach and questions for this feature, but I don’t have a lot of experience with how CMake configuration should be handled in LLVM. I also couldn’t find a comparable configurable feature, which I could have based my approach on, so if you have any thoughts, suggestions or questions, I’d like to hear them all.

Could there be a case for both options, with the LLVM_EXECUTE_ONLY_CODE expanding to COMPILER_RT_EXECUTE_ONLY_CODE, LIBCXX_EXECUTE_ONLY_CODE etc?

To get the build right, someone would need to know, or derive, all the individual flags for all the runtime libraries. These flags often aren’t discoverable and can easily be missed. Doing both would cover the most common case and would permit an advanced user to override or choose individually.

I’m assuming that in the majority of cases a user will want either all or none of the libraries built as execute-only.

That sounds good to me, we can set the default values of COMPILER_RT_EXECUTE_ONLY_CODE et al. to LLVM_EXECUTE_ONLY_CODE to make this work.
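A minimal sketch of what that defaulting could look like, using the option names proposed in this thread (none of them is assumed to exist in the tree yet):

```cmake
# One switch for the common "all or nothing" case.
option(LLVM_EXECUTE_ONLY_CODE
  "Build the runtime libraries as execute-only code" OFF)

# Per-project options take their default from the common switch, but can
# still be overridden individually with -D<OPTION>=ON/OFF.
option(COMPILER_RT_EXECUTE_ONLY_CODE
  "Build compiler-rt as execute-only code" ${LLVM_EXECUTE_ONLY_CODE})
option(LIBCXX_EXECUTE_ONLY_CODE
  "Build libc++ as execute-only code" ${LLVM_EXECUTE_ONLY_CODE})
```

One thing to keep in mind with this pattern is that option() only applies its default when the variable is not already set, so flipping LLVM_EXECUTE_ONLY_CODE in an existing build directory would not change per-project values that are already in the cache.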

I agree. The only one I could think of was if someone wanted to use the function sanitizer for libc++ for example, but it looks like the configuration scripts add -fsanitize=undefined -fno-sanitize=function in all cases anyways, so it doesn’t seem like a common or well-supported use case.

For reference, here are the PRs I’ve seen so far:

In the same vein as @philnik’s question in #140552, I wonder if it really makes sense to make this an option as opposed to something that is always enabled. Is there a reason why this isn’t controlled on a target-triple basis instead?

Also, have you looked into requesting native CMake support for this option? Sorry if this is naive, but it’s the first time I’ve read about this feature and I don’t fully understand how it fits into the grand scheme of things.

In #140552 you mention that this should be set consistently across all runtime components of the toolchain. This makes me lean towards having just a single RUNTIMES_EXECUTE_ONLY_CODE CMake setting if we must have one (e.g. because CMake doesn’t support it yet and setting it per target-triple doesn’t make sense).

For context, the angle I’m approaching this from is to try and avoid increasing the complexity of our CMake by adding yet another configuration option. Sometimes it’s absolutely necessary, but in general we try to avoid adding new boolean flags which basically double the number of configurations that we should theoretically maintain and test.

It needs to be an option for the same reason the compiler flag exists in the first place: it conflicts with anything that actually needs to access the text segment, in particular certain sanitizers. (See llvm/llvm-project pull request #134799 by Il-Capitano, “[AArch64][Docs] Add release note for execute-only support on AArch64”.)

Another PR is llvm/llvm-project pull request #140555 by Il-Capitano, “[compiler-rt] Add CMake option to enable execute-only code generation on AArch64”. I can’t edit my original post to add these.
After these 4 have been merged, my intention is to add a common CMake option to control the default values of the individual flags, as discussed in this thread previously.

Another reason for making this an option is if a new toolchain wants to experiment with enabling this feature, they would simply need to add new CMake flags to their build as opposed to a local patch file that needs to be applied.

This could be a workable solution, but currently there aren’t any targets that would enable this. For experimenting with the feature they would need a local patch again.
Also, execute-only code isn’t tied to a specific target triple. For example someone might want to assemble an aarch64-linux-musl or aarch64-none-elf toolchain with execute-only support for a specific use case. Currently the only requirement for execute-only code generation on AArch64 is that it needs to be an ELF target.

No, I haven’t. Simply setting -DCMAKE_C_FLAGS="-mexecute-only" would achieve the same result, so I’m not sure there’s a need for native CMake support. The first time I tried enabling execute-only code generation in the runtime libraries I used CMAKE_C_FLAGS, but there are some parts of compiler-rt, e.g. CRT startup objects, where this value isn’t propagated, so I assume a native CMake option wouldn’t propagate either.

Using a single CMake variable is also an option. Currently my intention is to have a common LLVM_EXECUTE_ONLY_CODE option that controls the defaults of the individual variables, so one could enable them with -DRUNTIMES_<triple>_LLVM_EXECUTE_ONLY_CODE=ON -DBUILTINS_<triple>_LLVM_EXECUTE_ONLY_CODE=ON during build configuration. I’m not sure if there is a valid use case for enabling it for certain runtime libraries and disabling it for others, so I’m fine with using a single variable for everything. I’ll ping people in the compiler-rt PR to see if they have any thoughts on this.
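For illustration, the per-target form could also live in a CMake cache file instead of on the command line. This is only a sketch: LLVM_EXECUTE_ONLY_CODE is the option proposed here and not an existing one, and the triple aarch64-unknown-linux-gnu and the file name are example choices.

```cmake
# execute-only.cmake (example name), passed with: cmake -C execute-only.cmake ...
# The triple below is only an example; substitute the target being built.
set(LLVM_BUILTIN_TARGETS "aarch64-unknown-linux-gnu" CACHE STRING "")
set(LLVM_RUNTIME_TARGETS "aarch64-unknown-linux-gnu" CACHE STRING "")

# Proposed option, forwarded to the builtins and runtimes builds of that
# target through the usual BUILTINS_<triple>_ / RUNTIMES_<triple>_ prefixes.
set(BUILTINS_aarch64-unknown-linux-gnu_LLVM_EXECUTE_ONLY_CODE ON CACHE BOOL "")
set(RUNTIMES_aarch64-unknown-linux-gnu_LLVM_EXECUTE_ONLY_CODE ON CACHE BOOL "")
```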

Thank you for your insights, I hope I was able to answer everything.

But doesn’t Clang already know whether the code it emits is incompatible with this or not? Why does the user need to tell it that? That is: why doesn’t Clang enable -mexecute-only by default unless there are weird instrumentation features enabled (-fsanitize=function or whatever else)?

Ah, actually that’s interesting. For libc++, libc++abi and libunwind, these patches seem to be more or less equivalent to just passing new CMAKE_CXX_FLAGS. For libunwind, additionally we want to tweak the assembly in two places to add .section .text,"axy",@progbits,unique,0.

If we had a way to know that the compiler is generating execute-only code from the sources (such as a __has_feature check), I think we could avoid changing libc++ and libc++abi’s CMake and simplify the libunwind patch to something like

#if __has_feature(__execute_only_code_whatever__)
.section .text,"axy",@progbits,unique,0
#endif

Something along those lines. That would have the benefit of not increasing the complexity of CMake for just passing an additional compiler flag.

The compiler-rt build is challenging so I’m not going to speak to that, but I’d be much more in favour of an approach where we don’t have to add new CMake options to libc++/libc++abi/libunwind and reuse something more open-ended like passing CMAKE_CXX_FLAGS. We even have {LIBCXX,LIBCXXABI,LIBUNWIND}_ADDITIONAL_COMPILE_FLAGS if you ever have to pass it only to some but not all of the runtimes.
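To make that concrete, here is a sketch of a cache file that passes the flag through the existing per-project variables, without any new option. The file name is an example, and this does not cover the assembly side in libunwind or anything in compiler-rt.

```cmake
# execute-only-flags.cmake (example name), passed with: cmake -C execute-only-flags.cmake ...
# Pass -mexecute-only only to the runtimes that should get it.
set(LIBCXX_ADDITIONAL_COMPILE_FLAGS "-mexecute-only" CACHE STRING "")
set(LIBCXXABI_ADDITIONAL_COMPILE_FLAGS "-mexecute-only" CACHE STRING "")
set(LIBUNWIND_ADDITIONAL_COMPILE_FLAGS "-mexecute-only" CACHE STRING "")
```

If every runtime should get the flag, setting CMAKE_C_FLAGS and CMAKE_CXX_FLAGS for the runtimes build would be the more open-ended equivalent.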

I’m curious to hear your thoughts on that.

Yes, definitely, thanks for your thorough reply!

ObjC autorelease return value markers are incompatible with execute-only text sections.

If we go ahead with combining the project-specific options into a single LLVM_EXECUTE_ONLY_CODE CMake option, then this simplifies configuration a bit. We can just do something like the following in runtimes/CMakeLists.txt:

if (LLVM_EXECUTE_ONLY_CODE)
  set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -mexecute-only")
  set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -mexecute-only")
endif()

This eliminates the need for the libc++ and libc++abi changes, and simplifies libunwind and compiler-rt too.

-mexecute-only doesn’t apply to the assembler, so there’s no way to do this without changing how the flag is handled currently. I think detecting this in the assembler doesn’t really make sense because you have to explicitly create execute-only sections anyways, so the flag wouldn’t change anything in how the code is assembled. Also I don’t see how this is different from my current change, which is to define _LIBUNWIND_EXECUTE_ONLY_CODE and use that in the #if.

I’m not aware of a planned use-case which requires project-level control over the configuration, so I’m not against simply using CMAKE_CXX_FLAGS for setting the -mexecute-only flag, and keeping the assembly changes as-is in libunwind and compiler-rt. If there’s no push back against combining the project-level CMake options into a single global option, I’ll update my PRs to use your idea for simplifying the config. Thank you for the suggestion!

The compiler can’t really assume how the object file is going to be used later. For example I see a potential issue arising if an executable compiled with -fsanitize=function indirectly calls a function defined in a DSO. If the DSO was compiled without sanitizers and Clang enabled -mexecute-only by default, that could lead to a segmentation fault because the sanitizer check reads from .text. The linker can handle mismatched execute-only and non-execute-only code sections, but it can’t do anything about DSO boundaries.

One more point is that we’re still experimenting with execute-only code generation, and until the feature is shipped and tested in a distro or platform, it’s definitely too early to make it the default.

Hm… then it seems to me that the new compiler option isn’t actually going to do much good? In both the ObjC and sanitizer cases, code is reading from other functions’ text segments, which could be in a different DSO. The new compiler flag lets us specify compatibility on a per-object basis, but if that only affects the way a single DSO is mapped, and not the whole program, then it seems to me it cannot actually ensure the required compatibility.

So perhaps the DSO loader does need to be taught about this property, and only enable execute-only mode if all DSOs in the process are compatible?