Hi all,
I’ve been working on enabling support for execute-only code generation on AArch64. This RFC aims to gather feedback about enabling this feature for LLVM’s runtime libraries, specifically compiler-rt, libc++, libc++abi, and libunwind.
Background
The old approach to marking code segments as execute-only is LLD's --execute-only flag, which unconditionally marks all executable segments --x (as opposed to r-x). This approach is incompatible with -fsanitize=function, which reads 8 bytes of metadata in front of the called function’s address on each indirect function call; with execute-only code enabled, that read causes a segmentation fault. (A similar issue occurs with -fsanitize=kcfi.)
To solve the incompatibility, the SHF_AARCH64_PURECODE section flag was introduced in the AArch64 ELF spec, mirroring the existing SHF_ARM_PURECODE section flag in the 32-bit Arm ELF spec. With the new section flag, object files can tell the linker whether their code sections may be placed into execute-only segments. The linker then checks whether all input sections have the SHF_AARCH64_PURECODE flag set and, if so, marks the output section as execute-only. This ensures that the output is marked execute-only if and only if all input sections allow it.
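To make the "if and only if" rule concrete, here is a rough sketch of the flag merging. It is not LLD's actual code: the InputSection struct and the mergeSectionFlags name are made up for illustration, and the numeric value used for SHF_AARCH64_PURECODE is what I believe the spec assigns, so treat it as an assumption.

```cpp
#include <cstdint>
#include <vector>

// ELF section flag bits. The PURECODE value is assumed here to match
// the one used for SHF_ARM_PURECODE.
constexpr std::uint64_t kShfAlloc = 0x2;
constexpr std::uint64_t kShfExecInstr = 0x4;
constexpr std::uint64_t kShfAArch64PureCode = 0x20000000;

// Hypothetical stand-in for a linker's view of one input section.
struct InputSection {
  std::uint64_t flags;
};

// Ordinary flags are OR-ed together, but the pure-code flag survives
// only if every input section carries it.
std::uint64_t mergeSectionFlags(const std::vector<InputSection> &inputs) {
  std::uint64_t merged = 0;
  bool allPure = !inputs.empty();
  for (const InputSection &s : inputs) {
    merged |= s.flags & ~kShfAArch64PureCode;
    allPure = allPure && (s.flags & kShfAArch64PureCode) != 0;
  }
  if (allPure)
    merged |= kShfAArch64PureCode;
  return merged;
}
```

A single input section without the flag (for example an object built without -mexecute-only) is enough to make the merged output readable again, which is why every pre-compiled component has to opt in.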
In the new approach, users must opt in to execute-only support when compiling C and C++ code and when writing assembly, rather than at link time. In Clang, for C and C++ code, this is done using the -mexecute-only or -mpure-code compiler flags, and in assembly, the “y” section flag should be used:
.section .text,"axy",@progbits,unique,0
// ^^^ = SHF_ALLOC | SHF_EXECINSTR | SHF_AARCH64_PURECODE
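As a small illustration of how the three flag letters in that directive map onto ELF flag bits, here is a sketch of a decoder. The function name is hypothetical, it only handles the three letters used above (an assembler accepts more), and the SHF_AARCH64_PURECODE value is again an assumption on my part.

```cpp
#include <cstdint>
#include <stdexcept>
#include <string>

// ELF section flag bits; the PURECODE value is an assumption.
constexpr std::uint64_t kShfAlloc = 0x2;
constexpr std::uint64_t kShfExecInstr = 0x4;
constexpr std::uint64_t kShfAArch64PureCode = 0x20000000;

// Hypothetical helper: translates the letters of a .section flag string
// ("a", "x", "y") into the corresponding ELF flag bits.
std::uint64_t parseSectionFlags(const std::string &letters) {
  std::uint64_t flags = 0;
  for (char c : letters) {
    switch (c) {
    case 'a':
      flags |= kShfAlloc;
      break;
    case 'x':
      flags |= kShfExecInstr;
      break;
    case 'y':
      flags |= kShfAArch64PureCode;
      break;
    default:
      throw std::invalid_argument("flag letter not handled by this sketch");
    }
  }
  return flags;
}
```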
Implementation status
Over the past few months, the necessary changes have been made in LLVM, Clang and LLD to enable compiling and linking code for execute-only output. I am not aware of any other compilers that have implemented or started on this yet, so enabling execute-only code generation in the runtime libraries should probably be restricted to a runtimes build for now.
RFC
With the move from a linker flag to a front-end option, execute-only code generation needs to be explicitly enabled for any pre-compiled component of a compiler toolchain.
The intended use case for this feature is to try to re-enable execute-only code in Android. For that, Android’s LLVM toolchain needs to support generating execute-only code, including LLVM’s runtime libraries (compiler-rt, libc++, libc++abi, libunwind). Because of the limited initial scope, I think using this feature should be opt-in via a CMake parameter at build time.
Build configuration will need to set two things:
- Add -mexecute-only (or -mpure-code) to C and C++ flags;
- Add a define for assembly files indicating that they should use the “y” section flag for code sections.
As far as I can tell, the second point only concerns compiler-rt and libunwind, as libc++ and libc++abi don’t have any assembly source files.
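For the second point, the define could drive a preprocessor switch along these lines. This is only a sketch: RUNTIMES_EXECUTE_ONLY_CODE and TEXT_SECTION_FLAGS are hypothetical names that I made up, and in a real .S file the macro would be used directly in the .section directive rather than in a C++ function. The function is just there to show what the preprocessed directive would look like.

```cpp
#include <string>

// Hypothetical define; in a real build it would be passed by the build
// system (e.g. as -DRUNTIMES_EXECUTE_ONLY_CODE) instead of set here.
#define RUNTIMES_EXECUTE_ONLY_CODE 1

// Pick the flag letters for code sections based on the define.
#if defined(RUNTIMES_EXECUTE_ONLY_CODE)
#define TEXT_SECTION_FLAGS "axy" // alloc, exec, pure code
#else
#define TEXT_SECTION_FLAGS "ax" // alloc, exec
#endif

// Returns the directive an assembly source would end up with after
// preprocessing; adjacent string literals are concatenated.
std::string textSectionDirective() {
  return ".section .text,\"" TEXT_SECTION_FLAGS "\",@progbits";
}
```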
For CMake configuration I see two possible approaches:
- Use one variable for all runtime libraries, e.g. LLVM_EXECUTE_ONLY_CODE
  - Only one variable to indicate the same feature across different libraries;
  - Doesn’t necessarily apply to all runtime libraries (e.g. LLVM libc for now).
- Use separate variables for each sub-project, i.e. COMPILER_RT_EXECUTE_ONLY_CODE, LIBCXX_EXECUTE_ONLY_CODE, etc.
  - It’s clear which libraries have execute-only code generation support;
  - The variable needs to be translated in some cases, for example MSan and TSan build their own versions of libc++ and libc++abi, so COMPILER_RT_EXECUTE_ONLY_CODE needs to be propagated as its LIBCXX_* and LIBCXXABI_* equivalent in those cases.
Personally I prefer the second option, but I’d like to get feedback on what others think is the right solution, which may be different from the two I mentioned above.
This mostly covers my design approach and questions for this feature, but I don’t have a lot of experience with how CMake configuration should be handled in LLVM. I also couldn’t find a comparable configurable feature, which I could have based my approach on, so if you have any thoughts, suggestions or questions, I’d like to hear them all.