LLVM Pointer Authentication sync-ups

We’ve been running monthly sync-up calls on pointer authentication ABIs and implementation in LLVM for about a year now.

With this post, I wish to move the meeting minutes from a Google doc to a Discourse discussion forum. I hope that there will be at least the following advantages of doing so:

  1. LLVM project information is scattered slightly less across multiple services.
  2. Give a bit more visibility about this online sync-up to the wider community.
  3. By using the Discourse Event plugin, potentially make it a bit easier to see when this online sync-up occurs and for people to indicate if they plan on joining.
  4. Also by using the Discourse Event plugin, hopefully show meeting times automatically in the reader’s default time zone.
  5. Make meeting minutes show up in normal Discourse searches.

I’m not sure if other online sync-ups already use Discourse in this way, so we may have to experiment a bit as we gain experience on how to best do this using Discourse rather than a Google doc.
A calendar invite is still available as an ics link at Getting Involved — LLVM 15.0.0git documentation. If you’d like to receive a calendar invite directly, please DM me with which email address you’d like to receive the invite on.

The next sync-up call is planned for next week Monday at 6pm CET/5pm BST/9am PT.

This reply copy-pastes the historical meeting minutes from the Google doc to here, so that the minutes can be searched using normal Discourse search and everything becomes available in one place.

When: Mon April 25th 2022. 9am PST/5pm BST/6pm CET; length: 1 hour.

Agenda:

  • Kristof: Any concerns if we’d move this document to a topic in Discourse somehow and keep the minutes/announcements of next sync-up etc there?

Minutes:

  • Status of upstreaming
    • Ahmed busy rebasing; across opaque pointers stuff and more.
    • patch with intrinsics also progressed - still needs to be updated upstream.
  • Adding support for PAuthABI to Rust requires clang and the Rust compiler to agree on how to encode function types.
    • Requires removing some C-specific things from the signing schema.
    • Thinking about using type encoding for software-based CFI too.
    • Leading to a discussion on how to support interoperability across multiple/many languages, without having to go through an FFI that only supports the subset supported by C.
  • Moving to Discourse for the minute taking and meeting invites sounds good.

When: Mon March 28th 2022. 9am PST/5pm BST/6pm CET; length: 1 hour.

Minutes:

  • Peter C has a question about interoperability with Rust and the role the function signing schema plays in that. We had too many people missing to discuss in depth.

When: Mon February 28th 2022. 9am PST/5pm BST/6pm CET; length: 1 hour.

Minutes:

  • Upstreaming progress
    • Bundles
    • 2 IR things that need review, constants, indirect branches need IR changes. Otherwise boilerplate stuff.
  • Relative offset vtables
    • Darwin can use relative offsets for Objective C and Swift, managed by the runtime. Less of an ABI break but still painful.
    • Gave up for C++ as too locked in.
    • Good to do, but still painful ABI break.
  • Signing both vtable entry and vtable pointers
    • Must do the vtable pointer
    • Shared inheritance trees with one root object, many functions will receive a pointer to the base object that would unlock the whole hierarchy.
    • General policy of erring on the side of signing everything that can be signed, as it is difficult to know what the security properties will be.

When: Mon January 24th 2022. 9am PST/5pm BST/6pm CET; length: 1 hour.

Minutes:

  • Further discussion on getting pauthabi support reviewed and committed into mainline.
  • Changes in mainline LLVM/clang cause downstream conflicts with the pauthabi implementation. Hopefully this is just an unfortunate one-time occurrence.
  • First let’s land patch for bundle support. Then instruction selection can build on that?
  • There’s a late pass for pseudo-expansion.
  • There’s one very large clang patch - that has been split up already in Ahmed’s version.
  • Upstreaming is near top priority, but not absolutely top of priority list.
  • Some discussion on how -fptrauth_intrinsics should control whether the ptrauth intrinsics are available. Should the intrinsics be available by default when targeting an architecture version that supports Armv8 PAuth?

When: Mon November 22nd 2021. 9am PST/5pm BST/6pm CET; length: 1 hour.

Minutes:

  • Upstreaming: progressing and getting traction with reviewers.
  • A discussion was started about command-line options: how to combine regularly-used options in a less verbose format than the current one. For example, we could have a PAC-ABI option with a finite number of options, or could reuse the field of the target triple when the ABI is more stable, or could add aliases for the PAC ABI options into the existing -mbranch-protection option… each with its pros and cons.
  • Decision to cancel the meeting in December due to holiday season. The next one will be end of January, but feel free to arrange an ad-hoc one if needed.

When: Mon October 25th 2021. 9am PST/4pm UTC/5pm BST/6pm CET; length: 1 hour.

Minutes:

  • Upstreaming: patches should be split so that all C support should be in the first patches; C++ support in later patches.
  • There are ways to tell the back end to auth/sign only the return addresses. That feature already exists upstream. The patches for arm64e have similar functionality, but it’s enabled using different option names. That duplication should be cleaned up at some point.
  • The constexpr issue discussed last month persists still.
  • Make open phabricator reviews linked to each other so it’s clear how patches depend on each other.
    First patches to review: C intrinsics, LLVM-IR intrinsics, ISel.
    Ahmed will mention this on the LLVM dev mailing lists.
  • There will be a sync-up over email with regular participants next week Monday.
  • Kristof mentions a new open source book, “Low-Level Software Security for Compiler Developers” (GitHub: llsoftsec/llsoftsecbook).

When: Mon September 27th 2021. 9am PST/4pm UTC/5pm BST/6pm CET; length: 1 hour.

Minutes:

  • Ahmed just pushed an updated diff, see https://github.com/ahmedbougacha/llvm-project/commits/eng/arm64e-upstream-llvmorg
  • All the clang parts are split now.
    Looking into splitting i64 discriminators into 2, so that the backends “see” 2 components that make up the discriminator.
    In the backend, it produces a number of fixed hardened sequences, e.g. guaranteeing that the discriminator does not get spilled (which makes it easier to break the hardening).
    Passing the discriminator components separately to the backend ensures more of the fixed, hardened, sequences are produced.
  • 2 open problems: (1) splitting discriminators (see above bullet point); (2) better constexpr.
  • Bruno: is there a github branch with more ELF fixes? Ana suggests trying Peter Collingbourne’s github (https://github.com/pcc/llvm-project/commit/e8c6684c875fd9019610fe0ff323606b6dad694d).
  • Bruno: ABI suggests a need for start/ending symbols. It seems Bruno’s loader does not need that. Peter suggests that probably these are needed only for static linking in a deep embedded system that does not have the ELF file at “load” time.
  • Most people experimenting are currently doing static linking; dynamic linking will be needed. Some experimental patches are in the works for an Android platform.
  • Upstreaming to llvm.org depends mostly on a decision regarding needing to change the intrinsics or not (see i64 discriminator above).
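
To make the discriminator-splitting discussion in the minutes above more concrete, here is a minimal Python sketch. It assumes the two components are an address (for address diversity) and a small integer constant, and that blending them places the constant in the top 16 bits of a 64-bit value, in the spirit of the arm64e documentation. The names `Discriminator`, `blend` and `split` are hypothetical and are not LLVM APIs.

```python
from typing import NamedTuple

MASK_48 = (1 << 48) - 1  # assumption: low 48 bits carry the address part


class Discriminator(NamedTuple):
    """Hypothetical two-component discriminator, kept separate for the backend."""
    address: int   # storage address of the signed pointer (address diversity)
    constant: int  # small integer chosen by the signing schema


def blend(disc: Discriminator) -> int:
    """Combine both components into a single 64-bit discriminator value."""
    if not 0 <= disc.constant < (1 << 16):
        raise ValueError("constant discriminator must fit in 16 bits")
    return (disc.constant << 48) | (disc.address & MASK_48)


def split(blended: int) -> Discriminator:
    """Recover the two components from a blended 64-bit value."""
    return Discriminator(address=blended & MASK_48, constant=blended >> 48)
```

Passing a `Discriminator` around, rather than the already-blended integer, mirrors the idea that the backend can only emit its fixed hardened sequences when it can still see both components.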

When: Mon August 23rd 2021. 9am PST/4pm UTC/5pm BST/6pm CET; length: 1 hour.

Minutes:

  • Ana: have done a lot of validation of PAuth patches on a wide variety of code. Now also testing including type of variable into discriminator.
    Casting operations on function pointers sometimes seem to result in a scheme where no function pointer type is included in the discriminator.
    Ahmed: For example, if a function pointer in the source code gets cast to a void*, that pointer will get the signing schema for void*. That is intentional.
    void* is always signed without diversity.
    When a variable gets cast from one function pointer type to another function pointer type, the intention is that the signing schema used should be the one of the “destination” function pointer type.
    On SPEC vortex, function pointer type in discriminators fails because it passes a function pointer address, and different function pointer types are assumed in the place where it stores vs where it loads it.
  • Ana: how to speed up the upstreaming?
    Ahmed: A lot of work still going into updating from old branch. 1 year of downstream changes is being integrated into Ahmed’s set of patches.
    The intrinsics patch: only one minor outstanding comment not addressed yet.
    Ana: focusing on clang C function pointer functionality would be helpful for them.
    Ahmed: another patch that causes a lot of merge conflicts is a patch that enables adding an attribute to a structure to indicate a specific signing schema for that structure.
    Ana: the more automation the compiler can support (i.e. developers not needing to hand-insert calls to intrinsics), the better. Not all developers will be experts, and more automation should lead to fewer human mistakes.
  • ELF marker to indicate PAuth signing scheme: Initial thought was to put metadata info into a GNU note section. But that’s up for discussion - instead use something different in object files that is more platform agnostic than a GNU note section.
  • PAuthABI: currently alpha-quality specification. The relocation codes still need to be renumbered before the specification can move to beta or final quality. When would be a good time to make that update to the spec?
  • Still looking into whether the C-level intrinsics would be good to be specified as part of the Arm ACLE.
  • LLVM dev conference - should we do a round table, a lightning talk or something similar?
  • Performance impact of enabling PAuth? It really depends, between not even noticeable and single digit percents. Some signing on vtbl in C++ can be significant - and some optimizations were done to reduce that overhead.
    With 8.6 pauth it becomes even cheaper because then you don’t have to do software checks to force a trap on a failing authentication check.
  • Ana: should there be general advice to users who want to enable PAuth that they should enable some specific clang warnings? Such as assignments between incompatible function pointer types?
    Ahmed: there are a number of lower-hanging fruits to fix first, before dropping to a weaker signing scheme due to implicit casting becomes the biggest issue. The lower-hanging issues are: assembly code not signing some code addresses; and, in some cases, spill/fill code generation possibly introducing raw pointer stores from register to memory.
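
The cast behaviour discussed in these minutes can be illustrated with a toy Python model. This is only a sketch under stated assumptions: the discriminator is derived by hashing a type string to 16 bits (a stand-in, not clang's actual type encoding), a "signed pointer" is just an (address, discriminator) pair, and the names `type_discriminator`, `sign`, `authenticate` and `cast` are hypothetical.

```python
import hashlib


def type_discriminator(type_name: str) -> int:
    """Hypothetical 16-bit discriminator derived from a type string."""
    if type_name == "void*":
        return 0  # void* is signed without diversity
    digest = hashlib.sha256(type_name.encode("utf-8")).digest()
    return int.from_bytes(digest[:2], "little")


def sign(address: int, type_name: str) -> tuple:
    """Model a signed pointer as (address, discriminator used to sign it)."""
    return (address, type_discriminator(type_name))


def authenticate(signed: tuple, type_name: str) -> int:
    """Return the raw address if the expected type matches the signing type."""
    address, discriminator = signed
    if discriminator != type_discriminator(type_name):
        raise ValueError("authentication failed: signing schema mismatch")
    return address


def cast(signed: tuple, source_type: str, destination_type: str) -> tuple:
    """A cast authenticates with the source schema and re-signs with the destination's."""
    return sign(authenticate(signed, source_type), destination_type)
```

In this model, a pointer that goes through an explicit `cast` always authenticates under its destination type. A pointer that is stored under one function pointer type and loaded under another without a cast, as in the SPEC vortex case, fails authentication whenever the two types produce different discriminators.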

When: Mon July 26th 2021. 9am PST/4pm UTC/5pm BST/6pm CET; length: 1 hour.

Minutes:

  • QuIC keen on getting the toolchain work upstream. Ahmed would still like a bit more feedback on the intrinsics patch, which is gating the rest of the patches.
  • There is a proposed change to the marking scheme in the ABI.
  • Discussed the possibility to add the intrinsics to the ACLE. Could Arm help here?
  • Discussed the reasons behind using B-key for PAC-ret and A-key for the rest. On Apple, it is a matter of separating process-specific keys from general-keys. This may be an entirely different case for other platforms though. Regardless, there seems to be no reason behind using specifically the A or B key for one or another case. Android could end up using A-key for PAC-ret as opposed to iOS which uses B-key.

When: Mon June 28th 2021. 9am PST/4pm UTC/5pm BST/6pm CET; length: 1 hour.

Minutes:

  • Plans to have pauth enabled on Android? Maybe as part of a new armv9 abi (Android U+). But not set in stone.
    • Could either require pauthabi; or make it opt-in.
  • Ana’s planning to use it in some non-Android images.
  • It would be good to have a better strategy for getting the arm64e/Android patch into llvm.org
    • Both Ana and Peter rebase the patch to ToT llvm.org from time to time and the rebases are somewhat painful.
    • clang patch touches 176 files. Always lots of merge conflicts (even though most are easy to resolve).
  • tools for checking/verifying hardening?
    • llvm-cfi-verify checks CFI code sequences in your code.
    • ROPGadget: not very useful as is, as it reports lots of “false positives”; i.e. it just reports all sequences ending in RET, not all of which are exploitable.
  • Enabling pauthabi on the Android platform but not for apps: use prctl mode to select whether keys are enabled or not. Could be used for 2 zygotes: one with keys enabled; one with keys disabled. This is already in the Android prototype.

When: Mon May 24th 2021. 9am PST/4pm UTC/5pm BST/6pm CET; length: 1 hour.

Minutes:

  • Arm (ABI relocation code reservation merged)
  • Ana (Team have done a lot of testing, would like to help Ahmed get the changes upstream). Architecture neutral bits of highest importance.
  • Ahmed (Working on intrinsics)
  • Intrinsics are likely to be architecture neutral.
  • Likely that intrinsics won’t need anything special in the ACLE. We can probably reference documentation for clang if needed.
  • Arm (check to see if there is any interest in GCC community for things that would need to be common, such as intrinsics).

When: Mon April 26th 2021. 9am PST/4pm UTC/5pm BST/6pm CET; length: 1 hour.

Minutes:

  • Peter Smith shares the link to abi-aa/pauthabielf64 at main · ARM-software/abi-aa · GitHub, the low-level ELF ABI specification.
  • Relocation code reservations (in pauthabielf64 specification pointed to above)? Peter S: not yet approved.
  • What is the status of upstreaming arm64e support to llvm.org? No major news since last time. There are a few arm64e abi changes.
    • Ahmed will reply to comments made on first patches on phabricator to support arm64e.
  • v8m pacbti: is that very different/not much overlap with AArch64 PAuth?
    • Peter: signing of arbitrary pointers with v8m pacbti is limited. For now, this enables pac-ret (i.e. signing return addresses; not other code pointers).
    • Specification is now public. Arm will aim to upstream implementation over the next few months.

When: Round table at LLVM dev meeting

Wed. Oct 7, 2020 8:55 AM - 9:30 AM

Intro:

The Arm instruction set has introduced a “Pointer Authentication” extension in the Armv8.3-A architecture. All cores implementing the Armv8.3 or later architecture implement this extension.

The extension enables putting a cryptographic hash of the address and other “salt” info into the upper bits of the pointer, effectively “signing” the pointer. Later in the execution, just before the pointer is used, the pointer can be authenticated, i.e. the cryptographic hash stored in the upper bits of the pointer can be verified, checking with high probability that the pointer has not been tampered with by an attacker.
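
As a rough illustration of the sign/authenticate cycle described above, here is a toy Python model. It is a sketch only: it assumes a 48-bit address with all 16 upper bits available for the hash, and uses HMAC-SHA256 as a stand-in for the hardware's keyed hash. Real PAC sizes depend on the virtual address configuration, and the function names here are hypothetical.

```python
import hashlib
import hmac

ADDRESS_BITS = 48                       # assumed virtual address width
ADDRESS_MASK = (1 << ADDRESS_BITS) - 1
PAC_BITS = 64 - ADDRESS_BITS            # toy model: all upper bits hold the hash


def compute_pac(address: int, salt: int, key: bytes) -> int:
    """Keyed hash of (address, salt), truncated to PAC_BITS bits."""
    message = address.to_bytes(8, "little") + salt.to_bytes(8, "little")
    digest = hmac.new(key, message, hashlib.sha256).digest()
    return int.from_bytes(digest[:8], "little") >> (64 - PAC_BITS)


def sign(pointer: int, salt: int, key: bytes) -> int:
    """Place the hash in the upper bits of the pointer."""
    address = pointer & ADDRESS_MASK
    return (compute_pac(address, salt, key) << ADDRESS_BITS) | address


def authenticate(signed_pointer: int, salt: int, key: bytes) -> int:
    """Recompute the hash and return the plain address only if it matches."""
    address = signed_pointer & ADDRESS_MASK
    if signed_pointer >> ADDRESS_BITS != compute_pac(address, salt, key):
        raise ValueError("pointer authentication failed")
    return address


def strip(signed_pointer: int) -> int:
    """Drop the upper bits without checking them (what a debugger might do)."""
    return signed_pointer & ADDRESS_MASK
```

With only 16 bits of hash in this model, a forged pointer still passes with probability about 1 in 65536, which is why the check holds "with high probability" rather than with certainty.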

Protecting return addresses (i.e. hardening backwards control flow integrity, i.o.w. hardening against ROP attacks) can be done without breaking ABI. This has been implemented in clang/llvm/gcc and is further being implemented in other parts of the system software stack.

Protecting indirect calls/jumps (i.e. hardening forward control flow integrity, i.o.w. hardening against JOP attacks) can NOT be done without breaking ABI.

We are starting to define a new ELF ABI to enable using the Armv8.3 PAuth extension. The topic of this round table is to discuss the design of that ABI and its implementation in LLVM.

This ELF ABI is similar to the Apple arm64e ABI which has similar goals for the Darwin platform (see 2019 LLVM Developers’ Meeting: A. Bougacha & J. McCall “arm64e: An ABI for Pointer Authentication ” - YouTube and llvm-project/PointerAuthentication.rst at a63a81bd9911f87a0b5dcd5bdd7ccdda7124af87 · apple/llvm-project · GitHub).

The draft ELF ABI is being developed at abi-aa/pauthabielf64.rst at main · ARM-software/abi-aa · GitHub.

There is a pull-request to take the document to version 0.2 open at the moment. To view 0.2 of the document please use abi-aa/pauthabielf64.rst at ed1151099f52c8caf5e575e8d8c00450d43dcbc2 · ARM-software/abi-aa · GitHub

Feel free to comment on the pull-request Make encoding of signing schema and evaluation of relocation concrete by smithp35 · Pull Request #41 · ARM-software/abi-aa · GitHub

Topics we could discuss:

  • Clarify there are different levels of Pointer Authentication ABIs: language level ABI (e.g., how C/C++ ptrs are signed/auth), and executable/relocatable obj format level ABI (for ELF, MacOS etc.).
  • Versioning of the ABI:
    • What mechanism could we use to record versions?
    • When does a new version number need to be used? Whenever there is even the smallest change in default signing schema?
    • Would it be feasible and worthwhile to try and have some level of compatibility between different versions?
  • Currently there is no salt on function pointers in arm64e; Peter Collingbourne is investigating using a salt similar to Clang’s CFI implementation (i.e. based on function signature?)
    • What are the difficulties and tradeoffs at play here?
      • Function declarations don’t always match the definition (common source of CFI warnings). Straightforward fix, if you can modify source.
    • Do these difficulties extend to other default signing schema aspects?
  • Could a kernel/libc/other shared system libs support multiple versions? How hard would that be in practice?
  • How do we deal with dlsym and dlopen? Options include: no diversity, diversity per function type, record in .symauth/.dynauth
  • Could everyone live with a RELRO GOT?
  • Would it make sense to use different signing schemas for exported and internal functions, to make sharing libraries easier?
  • Where there is a choice that affects compiler, static linker and dynamic linker, for example is the GOT signed? How do we record that in the binary? The .note.gnu.property can be extended but alternatives are possible.
  • What about pointer equivalence? Do two pointers to the same entity but signed with a different schema compare the same?
    • Can two function pointers be comparable with authenticated pointers? I believe so but I am confirming.
  • Is work needed in LLDB/debuggers to support this ABI?
  • Discuss available tools to use to detect/report missed opportunities to protect pointers. E.g., Safely derived checks in Clang, ROPgadget analyzer, other suggestions?
  • How do we handle optional parts of the ABI? For example, platforms may not need all of the functionality (RELRO GOT and PLT GOT).
  • Discuss how the Pointer Authentication ABI will look with MTE at the same time.
  • Does the linker allow static linking, or does the model need to be dynamic only?
  • Can pointer authenticated objects be mixed with other objects, for example compiled third-party libraries? How do we distinguish objects that need pointer authentication?
  • Is there a new dynamic tag to distinguish dynamic libraries that need pointer authentication?
  • Dynamic loading: how to differentiate between shared libraries used at link time vs load time? Users could have used an authenticated library at link time, but when it’s loaded, they may override it.
  • Should we have a somewhat regular call on this topic?

Minutes:

  • Breaking the abi, we’ll need multiple iterations to design this new abi. How can we record version information for the different variants of this ABI a binary file assumes?
    • current proposal - a note section in the ELF file with a (vendor, version) tuple.
    • As a mitigation - it’s needed to record version information. Indeed, ABI is expected to evolve as we understand the tradeoffs (security and others) over time.
    • No better solution known than just assuming every different variant is fully incompatible with every other variant.
    • Also good to help make sure that all binary objects are actually built with the right abi variant.
    • Versioning at the ELF object level may not be needed, as it’s “just” indications/relocations describing how to sign specific pointers? The ELF part stabilises early but the language level mapping to it does not. That implies the compiler should encode the signing scheme it uses in ELF object files?
    • Within even a single OS, there probably are going to be different ABI variants - e.g. firmware using a more strict variant (but harder to use/deploy) than user space software.
    • version info probably should contain platform (e.g. “FreeBSD 2.0 signing scheme”)
  • Are there use cases for deeply bare-metal systems, or is this mostly an OS-level concern?
    • A bare-metal system may not support RELRO, lazy-binding cannot be RELRO either. What do we do about the GOT? Ideally it is RELRO so does not need to be signed.
    • You don’t need to specify how to construct a signed GOT - basically the compiler constructs fragments of a “GOT” without using GOT generating relocations.
    • complexity of the draft spec comes mostly from a signed GOT. would be good to get rid of the complexity.
    • You should expect a section of the GOT to be signed; another section not signed. All these presumably would need to go on different pages? If you use RELRO, non-lazy binding for all GOTs they can all live on the same set of pages. But maybe on ELF this turns out to not be a big problem? Typically in AArch64 ELF the GOT and PLT GOT are separate sections with the GOT placed in the RELRO segment and a lazy binding enabled PLT GOT in RW segment.
  • Lazy binding needs the PLT GOT to be writable, which may be a security issue if it is not signed as an attacker may modify the pointers in there to redirect function calls.
    • A signed PLT GOT is a contract between static linker and dynamic linker, we already have a dynamic tag and a signing schema defined for this. LLD and ld.bfd implement support but there is as yet no support in dynamic linkers.
  • dlopen/dlsym - what to do about it?
    • arm64e found that from process launch either the process is “signed” or “unsigned”. so, a signed application loading a shared library needs that library to be signed. Loading a signed library from a non-signed application works though.
    • Could we use fat binaries in ELF for packing a signed and non-signed version in a single library?
    • dlsym: whether you get a signed pointer or not depends on whether you are in a signed segment in arm64e?
      • Peter Collingbourne has implemented a prototype where an extra side table is present to have the extra per-symbol signing information.
      • Another approach is to have the compiler re-sign pointers when casting from void* to a function type.
  • Tools to measure “how much security did you get” by using a particular abi version? e.g. post-processing tools.
    • front-end diagnostics (some of them have been implemented)?
    • in the backend there are cases where a compiler bug might cause an issue. Optimization remarks in the backend very late could help catch these.
    • There’s a whole community of security folks that are interested and that have their own tools (non-LLVM) like IDA, Ghidra etc.
    • The binary analysis part (last part) is actually really important. patterns are easy to recognize if there is something problematic. (e.g. catching corner cases in the compiler; or bad assembly code).
    • the compiler can emit standard code patterns to make spotting these easier.
  • upstreaming for arm64e is starting.
  • debug these things: major issue (Ahmed): downstream lldb arm64e has enough support to debug this.
  • Could introduce new DWARF extensions to make the debugging experience better.
    • A spec for this? Would be a separate document for ELF AArch64 at least.
  • Could the debugger just strip the PAC codes? It gets complicated (similar to TBI) because the size of the PAC may be different based on whether it is a data or code pointer, and the debugger doesn’t always know if something is a code or data pointer?
  • In its UI, LLDB tries to print both the signed pointer and the pointer without PAC bits.
  • Since debugger runs as different process, it cannot do anything with PAC bits (different keys)
  • LLDB uses cross-process JIT to evaluate expressions. Different processes have different crypto keys. How to make that work?
  • Kristof will look into setting up a monthly zoom call to discuss this topic further in the future.
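
The versioning approach from the minutes above (a (vendor, version) tuple per binary, with every different variant treated as fully incompatible) can be sketched as follows. This is a hypothetical Python model, not the ELF note layout from the draft specification; `AbiTag`, `compatible` and `check_objects` are illustrative names.

```python
from typing import NamedTuple, Sequence


class AbiTag(NamedTuple):
    """Hypothetical (vendor, version) tuple recorded in each binary."""
    vendor: str
    version: int


def compatible(a: AbiTag, b: AbiTag) -> bool:
    """Every different variant is assumed incompatible, so only equality passes."""
    return a == b


def check_objects(tags: Sequence[AbiTag]) -> AbiTag:
    """Return the common tag, or raise if any object was built for another variant."""
    if not tags:
        raise ValueError("no objects to check")
    first = tags[0]
    for tag in tags[1:]:
        if not compatible(first, tag):
            raise ValueError(f"ABI variant mismatch: {first} vs {tag}")
    return first
```

A linker or loader following this policy would reject any mix of tags, which is also how the tuple helps confirm that all objects were built with the intended ABI variant.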

Meeting minutes for May 23rd meeting

  • There is a rebase available on Ahmed’s github on top of top-of-trunk from this morning. Still a few test failures. Adapting the code to opaque pointers is a lot of work.
  • Best way to make progress?
    • Would it be helpful to create a “pointer authentication” group in phabricator?
  • Will enabling opaque pointers by default in LLVM in July result in another big rebasing effort?
  • Ana’s team will prioritize helping with pointer authentication upstreaming when patches get posted for review.

I’m sorry I haven’t been attending regularly for a while. Peter, would you like to just talk through what you’re thinking about the schemas being C-specific?