[RFC] Pointer authentication for arm64e

Hello, LLVM. Apple would like to upstream our implementation of pointer authentication in LLVM, Clang, and Swift. Pointer authentication is a language technology which mitigates the security impact of certain kinds of memory corruption. Principally, it provides a control-flow integrity (CFI) check which can be implemented at low enough cost that it is feasible to protect all indirect control flow in the language ABI, making exploit techniques such as ROP/JOP substantially more difficult.

There are three closely-related terms that readers should know:

  • Pointer authentication is a general language technology for signing and authenticating pointers, protecting against arbitrary replacement by an attacker. It is not inherently tied to specific hardware or software support.
  • ARMv8.3 is a revision of ARM’s AArch64 architecture that describes efficient hardware support for pointer authentication. Apple chips have included this technology since the Apple A12 used in (e.g.) the iPhone XR and XS.
  • arm64e is a specific ABI for pointer authentication in C, C++, Objective-C, and Swift. It is used on iOS 12+ on systems that provide ARMv8.3, as well as several other Apple OSes. This ABI is not yet considered “stable” for third-party use because some of the details are still evolving. Indeed, part of our goal in open-sourcing this work is to receive feedback to help improve this ABI.

Pointer authentication is described in detail in the language documentation, which can temporarily be found here:
https://github.com/apple/llvm-project/blob/a63a81bd9911f87a0b5dcd5bdd7ccdda7124af87/clang/docs/PointerAuthentication.rst
We encourage contributors who are deeply interested in this subject to read that document, which goes in depth into the basic mechanisms of pointer authentication, their theory of operation, and how they are used (and exposed for customization) in the four supported programming languages. For the convenience of reviewers, we’ll also briefly summarize the main points right now:

  • Signing a pointer means computing a cryptographic signature of the pointer value (not the pointed-to contents) which will be passed along with the pointer. Signing is generally done when constructing the pointer, e.g. when taking the address of a function.
  • Authenticating a pointer means checking that the passed-along cryptographic signature is correct. Authentication is generally done when using a pointer, e.g. when calling an opaque function pointer.
  • In ARMv8.3, the signature is stored in the unused high bits of a 64-bit pointer. The number of bits can be adjusted by the operating system depending on the address-space needs of the system.
  • Pointer authentication assumes that an attacker cannot reliably duplicate the signing step to forge a signed pointer. In ARMv8.3, this is achieved by incorporating data from private key registers that only the kernel can access.
  • Preventing signature forgery is not enough to prevent reliable exploitation because attackers can still replace validly-signed pointers with other signed pointers. This can be blocked by signing pointers with a discriminator value that is, ideally, unique for the specific purpose at hand.
  • Under pointer authentication, all function pointers are signed, as well as specific data pointers that are important for protecting the control flow of the program.
  • Language ABIs specify which key register to use and how to compute the discriminator for any particular signed pointer.
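
The points above can be modelled in plain C. This is only a sketch under loud assumptions, not the arm64e implementation: the 48/16 bit split, the `toy_*` names, the constant standing in for a kernel-private key, and the XOR-based mixer (which is not cryptographic) are all hypothetical. It shows a signature carried in the high bits of the pointer value, and a discriminator that makes a validly signed pointer useless where a different discriminator is expected.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: 48 address bits, 16 signature bits in the unused top. */
#define TOY_ADDR_BITS 48
#define TOY_ADDR_MASK ((UINT64_C(1) << TOY_ADDR_BITS) - 1)

/* Hypothetical stand-in for a key register that user code cannot read. */
static const uint64_t toy_key = UINT64_C(0x9E3779B97F4A7C15);

/* XOR the four 16-bit chunks of x together. NOT a cryptographic MAC. */
static uint64_t toy_fold16(uint64_t x) {
    return (x ^ (x >> 16) ^ (x >> 32) ^ (x >> 48)) & 0xFFFF;
}

/* Sign: derive a 16-bit signature from (address, key, discriminator)
   and store it in the high bits of the pointer value. */
static uint64_t toy_sign(uint64_t addr, uint16_t discriminator) {
    addr &= TOY_ADDR_MASK;
    uint64_t sig = toy_fold16(addr ^ toy_key) ^ discriminator;
    return addr | (sig << TOY_ADDR_BITS);
}

/* Authenticate: recompute the signature and compare. On success, store
   the raw address in *raw_out and return 1; otherwise return 0. */
static int toy_auth(uint64_t signed_ptr, uint16_t discriminator,
                    uint64_t *raw_out) {
    uint64_t addr = signed_ptr & TOY_ADDR_MASK;
    if (toy_sign(addr, discriminator) != signed_ptr)
        return 0;
    *raw_out = addr;
    return 1;
}

/* Usage sketch: the three cases described in the summary. */
static void toy_demo(void) {
    uint64_t raw = UINT64_C(0x0000123456789ABC), out = 0;
    uint64_t p = toy_sign(raw, 0x1234);

    /* Intended use: same discriminator at sign and auth time. */
    assert(toy_auth(p, 0x1234, &out) && out == raw);
    /* Validly signed pointer substituted into another context. */
    assert(!toy_auth(p, 0x4321, &out));
    /* Signature bits corrupted. */
    assert(!toy_auth(p ^ (UINT64_C(1) << 63), 0x1234, &out));
}
```

Calling `toy_demo()` from a `main` exercises all three cases. In the real design the comparison is done by hardware and a failed authentication does not return a status code; the status return here is a simplification of the model.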

We will also be giving a talk on this work at the LLVM Developers’ Meeting.

At the LLVM level, pointer authentication requires several different things:

  • basic code-generation support for the ARMv8.3 instructions, which was contributed to LLVM several years ago by ARM;
  • intrinsics to sign, resign, and (unsafely) authenticate pointers, as well as several other operations;
  • the ability to perform an authenticated call, which we’re representing with a special operand bundle on call sites; the optimizer must be made aware of this bundle at various points;
  • the ability to perform other authenticated operations, such as loads and stores, which we’re representing using separate auth intrinsics;
  • signed pointer constants that can be stored in global memory; and
  • special changes to various aspects of AArch64 code-generation, such as switch lowering and tail calls, in order to avoid creating opportunities for exploits with over-aggressive scheduling.

Some of the LLVM representations we’re currently using are problematic, usually because they allow the compiler to break up critical sequences of code and potentially introduce vulnerabilities. We’d like to figure out better representations to avoid these problems, and we intend to start conversations on llvm-dev about them as they come up in the individual patches. In the short term, however, we’d like to land what we’ve currently got. These problems are discussed in depth in the Clang language documentation and are also covered in our LLVM talk.
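
One way to see why such sequences are critical is a software model of resigning. This is a hedged sketch, not the LLVM intrinsics: the `toy_*` helpers, the bit layout, the key constant, and the non-cryptographic mixer are hypothetical. The point it illustrates is that authenticate-then-sign should behave as a single step. If the raw pointer produced by authentication were written to attacker-writable memory before being signed again, the signing step could end up signing an attacker-chosen value.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_ADDR_BITS 48
#define TOY_ADDR_MASK ((UINT64_C(1) << TOY_ADDR_BITS) - 1)

/* Hypothetical stand-in for a private key. */
static const uint64_t toy_key = UINT64_C(0x9E3779B97F4A7C15);

/* Non-cryptographic 16-bit mixer standing in for the real signature. */
static uint64_t toy_sig(uint64_t addr, uint16_t disc) {
    uint64_t x = addr ^ toy_key;
    return ((x ^ (x >> 16) ^ (x >> 32) ^ (x >> 48)) & 0xFFFF) ^ disc;
}

static uint64_t toy_sign(uint64_t addr, uint16_t disc) {
    addr &= TOY_ADDR_MASK;
    return addr | (toy_sig(addr, disc) << TOY_ADDR_BITS);
}

/* Resign: authenticate under old_disc and, only if that succeeds, sign
   under new_disc. The raw address is never exposed through the
   interface. A real implementation must additionally keep it out of
   attacker-writable memory between the two steps, which this model
   cannot express. Returns 1 on success, 0 on authentication failure. */
static int toy_resign(uint64_t signed_ptr, uint16_t old_disc,
                      uint16_t new_disc, uint64_t *resigned_out) {
    uint64_t addr = signed_ptr & TOY_ADDR_MASK;
    if (toy_sign(addr, old_disc) != signed_ptr)
        return 0; /* refuse to sign unauthenticated input */
    *resigned_out = toy_sign(addr, new_disc);
    return 1;
}

static void toy_resign_demo(void) {
    uint64_t raw = UINT64_C(0x7000), out = 0;
    uint64_t p = toy_sign(raw, 0x0011);

    /* A validly signed pointer is re-signed for the new context. */
    assert(toy_resign(p, 0x0011, 0x0022, &out));
    assert(out == toy_sign(raw, 0x0022));
    assert(out != p);

    /* An unsigned, attacker-chosen value is rejected, not signed. */
    uint64_t forged = UINT64_C(0x8000), untouched = 0;
    assert(!toy_resign(forged, 0x0011, 0x0022, &untouched));
    assert(untouched == 0);
}
```

Splitting `toy_resign` into a separate authenticate call and sign call would give the same results in this model; the difference only matters once something can interfere between the two steps.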

At the Clang level, pointer authentication builds on that work:

  • Several new builtins have been added to expose the underlying intrinsic operations; this includes a new <ptrauth.h> header.
  • Abstractions for representing a pointer signature have been introduced into IRGen.
  • IRGen’s FunctionPointer abstraction now requires signing information for indirect calls.
  • Various places throughout IRGen have been improved to sign and authenticate pointers.
  • There is a new __ptrauth type qualifier to make working with signed pointers easier. The qualifier can include the storage address in the discriminator; when added to a struct field, this forces the struct to be passed indirectly and requires copying the struct to be non-trivial. Basic support for handling C structs with these restrictions was already introduced into Clang in order to support structs with Objective-C __strong and __weak fields.
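
The effect of including the storage address in the discriminator can be sketched in software. Everything below is an assumption made for illustration rather than Clang's implementation: the `toy_*` helpers, the way `toy_blend` combines an address with a constant, the `handler` struct, and the `CALLBACK_DISC` value are hypothetical, and the mixer is not cryptographic. The sketch shows why a bitwise copy is not enough for such a struct: the stored signature is bound to the field's old address, so a copy has to authenticate at the source and sign again for the destination.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_ADDR_BITS 48
#define TOY_ADDR_MASK ((UINT64_C(1) << TOY_ADDR_BITS) - 1)

/* Hypothetical stand-in for a private key. */
static const uint64_t toy_key = UINT64_C(0x9E3779B97F4A7C15);

/* Non-cryptographic stand-in for signing with a 64-bit discriminator. */
static uint64_t toy_sign(uint64_t addr, uint64_t disc) {
    addr &= TOY_ADDR_MASK;
    uint64_t x = addr ^ toy_key ^ disc;
    uint64_t sig = (x ^ (x >> 16) ^ (x >> 32) ^ (x >> 48)) & 0xFFFF;
    return addr | (sig << TOY_ADDR_BITS);
}

static int toy_auth(uint64_t signed_ptr, uint64_t disc, uint64_t *raw_out) {
    uint64_t addr = signed_ptr & TOY_ADDR_MASK;
    if (toy_sign(addr, disc) != signed_ptr)
        return 0;
    *raw_out = addr;
    return 1;
}

/* Combine a storage address with a small constant into one discriminator. */
static uint64_t toy_blend(uint64_t storage_addr, uint16_t constant) {
    return (storage_addr & TOY_ADDR_MASK) |
           ((uint64_t)constant << TOY_ADDR_BITS);
}

#define CALLBACK_DISC 0x5A17 /* hypothetical per-field constant */

struct handler {
    uint64_t signed_callback; /* signed against the address of this field */
};

static uint64_t field_disc(const struct handler *h) {
    return toy_blend((uint64_t)(uintptr_t)&h->signed_callback, CALLBACK_DISC);
}

static void handler_store(struct handler *h, uint64_t raw_callback) {
    h->signed_callback = toy_sign(raw_callback, field_disc(h));
}

static int handler_load(const struct handler *h, uint64_t *raw_out) {
    return toy_auth(h->signed_callback, field_disc(h), raw_out);
}

/* Non-trivial copy: authenticate at the source address, then sign
   again for the destination address. Returns 0 if the source is invalid. */
static int handler_copy(struct handler *dst, const struct handler *src) {
    uint64_t raw;
    if (!handler_load(src, &raw))
        return 0;
    handler_store(dst, raw);
    return 1;
}

static void handler_demo(void) {
    struct handler a, b;
    uint64_t out = 0;
    handler_store(&a, UINT64_C(0x4000));
    assert(handler_copy(&b, &a));
    assert(handler_load(&b, &out) && out == UINT64_C(0x4000));
}
```

A plain `b = a` would instead carry over a signature computed for `a`'s field address, which `handler_load(&b, ...)` is designed to reject.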

We also expect substantial future developments in Clang, both to improve the default pointer-signing rules for various language features and to add new language features to allow programmers to opt in to stronger protections for their code.

All of these changes have been broken down into a relatively fine-grained sequence of patches which we’ll be submitting individually for review. If you’d like to “skip ahead”, the entire patch sequence can be found here:
https://github.com/apple/llvm-project/pull/14

John.

Hey folks,

We're getting ready to land the initial main patches:
- IR intrinsics: https://reviews.llvm.org/D90868
- Clang builtins: https://reviews.llvm.org/D112941
And from there, onwards to the changes John described above, which we're starting to submit for review in earnest.

There have been a lot of changes since the initial RFC, and several of us in the community have been meeting regularly around this (see https://llvm.org/docs/GettingInvolved.html#online-sync-ups). Still, the above remains an excellent summary of all the bits involved.

If you're interested, and have comments or questions, please join us in the review threads (or even in our monthly sync-ups!)

Thanks,

-Ahmed

Does this implementation require the PAC/BTI extensions? Or is it more generic?

-Tom

The full AArch64 support requires PAC (and where relevant supports BTI on top of that). Note that Armv8.1-M PACBTI is very different from Armv8.3-A/v8.5-A PAC/BTI; we're doing the latter here.

Everything above that, at the IR level and in clang, is mostly generic. For instance, our initial patch series does have prototype support for software emulation (see https://github.com/apple/llvm-project/pull/14/commits/e004081624dd3c76239f4b05f1d2b410f5d0cf82, https://github.com/apple/llvm-project/pull/14/commits/af247a2114947590b95145098feab8f922946b03).

-Ahmed