[RFC] Upstream target support for CHERI-enabled architectures

Authors: Owen Anderson, Jessica Clarke, Alex Richardson, David Chisnall

This RFC is a proposal to gain consensus on upstreaming target support for the CHERI-enabled architectures to the LLVM project. This is an “entire project” RFC, as CHERI support touches many parts of the toolchain: primarily LLVM, Clang, and LLD, with other components such as runtime libraries or LLDB potentially being touched as well.

Upstreaming many of these sub-components of CHERI support will likely merit their own area-specific RFCs. The purpose of this RFC is to seek consensus on the directional goal of upstreaming CHERI support, not to get into the details of the individual parts.

One item of note is that we do propose upstreaming support for the RISC-V CHERI platforms as a component of this work, as we believe it is important to have at least one end-to-end functional toolchain for available hardware [3] in upstream LLVM for testing purposes.

Please see [14] for a previous early-stage discussion of this effort from 2022.

Background

CHERI [2] is a capability architecture developed over more than 10 years as a research project out of the University of Cambridge and SRI, which enables hardware-enforced memory safety as well as significant security and isolation improvements. It has been embodied in several host architectures, including MIPS, RISC-V, and Armv8-A.

In terms of hardware and emulator availability:

  • CHERIoT is a CHERI-enabled 32-bit RISC-V embedded platform [1]. It was initially developed by Microsoft [9] but is now developed by an open-source community [10] [12]. CHERIoT development boards [7] developed by lowRISC are available today, and SCI Semi will ship commercial CHERIoT hardware in 2026 [3].

  • Codasip’s X730 implements CHERI on a 64-bit RISC-V core [11].

  • Arm has implemented a CHERI prototype for Armv8-A as the Morello project, for which development boards are available [8] as well as support in their FVP emulators.

  • CHERI-QEMU supports Morello and CHERI-RISC-V.

  • Various open-source research cores.

Why upstream?

The CHERI toolchain was originally developed out-of-tree because, as an active research project, the details of all components of the platform were rapidly changing, and it would have been burdensome on upstream developers to deal with a shifting and inaccessible target.

With the stabilization of CHERI platforms and the improved access to hardware, end-users of CHERI platforms now want to be able to develop for these platforms using the latest and greatest toolchains. That means using the latest Clang/LLVM, up to and including HEAD. We are also seeing interest in, and intend to support, efforts to build a CHERI-enabled Rust toolchain, which would be greatly facilitated by enabling CHERI in upstream LLVM.

Additionally, many of the core components of CHERI are shared by the various hardware embodiments of the architecture. By moving these common components upstream, compiler developers for all the platforms that share a CHERI heritage will be more easily able to collaborate on improvements to them. As a specific example, work on a standardized form of CHERI for RISC-V is being carried out as the Y extension [6], and many parties are interested in collaborating on support for that extension.

Finally, CHERI exercises LLVM in interesting ways by virtue of having non-integral pointers, non-default address spaces, and various other features. While these are nominally supported by upstream today, regressions and missing support are common due to lack of an upstream target that tests these areas. Upstreaming this support will help prevent the accidental introduction of bugs with respect to IR semantics in this area.

What has changed since the previous discussion? [14]

The primary changes since the 2022 discussion have been an increase in the number of interested parties collaborating on CHERI support in open source and the increased availability of hardware CHERI implementations. The former has increased the need for upstream integration of CHERI support to facilitate collaboration and avoid a proliferation of downstream forks of LLVM. The latter has eased the burden associated with validating CHERI correctness upstream.

What does upstreaming entail?

The core of this proposal is to upstream the required elements from CTSRD-CHERI/llvm-project and CHERIoT-Platform/llvm-project into llvm/llvm-project to make targeting RISC-V CHERI platforms with an upstream build of Clang/LLVM possible. By upstreaming the backend components of at least one CHERI-enabled architecture along with general CHERI support, we will be able to ensure that the upstream codebase is tested end-to-end.

Based on the current state of the downstream CHERI and CHERIoT forks of LLVM, we currently estimate that this will result in approximately 40KLOC of code being added to upstream LLVM. This is lower than discussed in [14] partially because untyped pointers defined away some downstream changes.

In keeping with standard LLVM development practices, we propose to upstream these changes in self-contained, incremental pieces, and welcome feedback on everything from code quality to overall architecture.

The major components to be upstreamed include:

  • Support for CHERI annotations and intrinsics from the C/C++ source level through the IR level

  • RISC-V backend support for CHERI / CHERIoT types and instructions

  • Optimizer changes required to safely optimize CHERI types, annotations, and intrinsics, including bug fixes related to use of non-integral pointers, address spaces, etc.

  • Linker support for CHERI and CHERIoT RTOS ABIs

  • Testsuites throughout

Many of these areas are large enough that we expect to present more detail-oriented RFCs in those areas down the road before proceeding. At this point we are only looking for directional consensus on the upstreaming overall.

Note that this does not include upstreaming support for Morello. Whilst it is currently the most capable CHERI system, it is a prototype, not a commercial product, and only a limited number of systems have been produced. We may propose upstream changes that facilitate Morello, but we will maintain Morello support downstream until it is no longer useful for us and the wider CHERI community. Similarly, research on CHERI will continue for the foreseeable future, and so any research extensions we are working on, even for CHERI-RISC-V, will be maintained downstream, until such time as they go through the standardisation process.

This RFC also does not concern CHERI as a host architecture (outside of runtimes), which is orthogonal to target support.

Ongoing Support

We are committed to ongoing support and development of CHERI and CHERIoT support in LLVM. While CHERI is not strictly a backend in LLVM terms, we are open to feedback on what kinds of CI support we can provide to support upstream.

We are also committed to continuing to track the evolution of CHERI in the RISC-V ecosystem, including future development of an official RISC-V CHERI extension [6] [13] as appropriate.

FAQ About CHERI

Where can I learn about CHERI?

Here are a few resources:

  • “An Introduction to CHERI” [16] provides a general introduction to CHERI.

  • The CHERI ISA v9 [17] specifies the architecture for current CHERI implementations.

  • The CHERIoT Programmers’ Guide [18] presents a software-oriented introduction to the implementation of CHERI in CHERIoT.

  • The CHERI C/C++ Programming Guide [19] provides a programmer-oriented guide for pure-capability CHERI C/C++.

  • “Formal Mechanised Semantics of CHERI C: Capabilities, Provenance, and Undefined Behaviour” [20] is a more in-depth exploration and discussion of CHERI C/C++’s semantics.

  • The draft RISC-V CHERI specification is available [21].

What are some of the challenges of supporting CHERI in LLVM?

  • CHERI replaces pointers with capabilities, which are non-integral and non-forgeable. This violates historical assumptions throughout Clang and LLVM that pointers could be losslessly reinterpreted as integers, and vice-versa. This also extends to type-punning through memory. This has been improved in recent releases due to work on non-integral pointers, as well as in-progress work such as [22].

  • CHERI targets in “pure-capability” (i.e. all pointers are capabilities) ABIs do not use address space zero as the default address space. This is to allow implementation reuse between hybrid (where both pointers and capabilities exist, the former in address space zero) and pure-capability ABIs.

What are areas of active development and known issues in CHERI support for LLVM?

  • RISC-V CHERI is in the process of standardization as the Y extension, but existing implementations use the older RISC-V XCheri (and XCheriot) vendor extensions.

  • LLVM’s DataLayout does not have a concept of “default address space corresponding to void *”, only address spaces for specific purposes, instead assuming that void * is address space 0, but pure-capability CHERI code uses a non-zero address space for void *.

  • atomicrmw does not support operating on pointers (except for xchg); instead they are lowered using a pointer-sized iN. Downstream this is supported, but by stuffing the integer to add/or/etc into a pointer and using that, with the later generated code turning the pointer operand back into an integer. Ideally the value would be the index type in this case, and a separate type added (like load has) to use as the in-memory type, that would be permitted to be a pointer. This is also useful outside of CHERI [15].

  • Relocations used to initialise capabilities need to support SymA + (SymB - SymA) + Const in order to express a capability with bounds of SymA offset to point at SymB + Const (e.g. a landing pad SymB within a function SymA). This would be folded to SymB + Const for an integer address, but for a capability the assembler has to disable this fold to preserve provenance, just as is the case for IR. MCExpr can represent this, but MCValue cannot (it can only represent SymA - SymB + Const) if linker relaxation is in use (if it is not, the entire offset can be constant-folded to give SymA + FoldedConst). As a result, linker relaxation is not supported for any CHERI implementations, and the relocations that would be required to encode such constructs are not specified.
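
To make the relocation point concrete, here is a minimal sketch in C++ of why the fold is harmless for an integer but not for a capability. The `Symbol` and `CapInit` types, the function names, and the example addresses are all hypothetical and exist only for illustration; they are not MC or linker data structures, and the landing pad symbol is assumed to have size zero.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for a linker symbol and an initialised capability.
struct Symbol {
    std::uint64_t address;
    std::uint64_t size;
};

struct CapInit {
    std::uint64_t base;    // lower bound, taken from the bounding symbol
    std::uint64_t length;  // size of the bounding symbol
    std::uint64_t address; // where the capability points
};

// Integer view: SymA + (SymB - SymA) + Const is just SymB + Const.
constexpr std::uint64_t fold_integer(const Symbol &a, const Symbol &b,
                                     std::uint64_t c) {
    return a.address + (b.address - a.address) + c;
}

// Capability view: the bounds come from SymA, the address from SymB + Const.
// Folding SymA away would lose the information needed for base and length.
constexpr CapInit init_capability(const Symbol &a, const Symbol &b,
                                  std::uint64_t c) {
    return CapInit{a.address, a.size, a.address + (b.address - a.address) + c};
}

inline void relocation_example() {
    const Symbol func{0x4000, 0x200};  // SymA: the enclosing function
    const Symbol landing_pad{0x4080, 0}; // SymB: a label inside it (size assumed 0)

    const CapInit cap = init_capability(func, landing_pad, 4);
    // The address matches the folded integer result...
    assert(cap.address == fold_integer(func, landing_pad, 4));
    // ...but the bounds still describe the whole function.
    assert(cap.base == 0x4000 && cap.length == 0x200);
}
```

In this model, deriving the capability from SymB alone (`init_capability(landing_pad, landing_pad, 4)`) would give the same address but bounds that no longer cover the function, which is the loss of provenance the assembler has to avoid.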

External Links

[1] https://cheriot.org/cheriot-sail/cheriot-architecture.pdf

[2] https://www.cl.cam.ac.uk/research/security/ctsrd/pdfs/201406-isca2014-cheri.pdf

[3] https://www.scisemi.com/news-1/press-release-iceni-family/

[4] https://github.com/CTSRD-CHERI/llvm-project

[5] https://github.com/CHERIoT-Platform/llvm-project

[6] https://riscv.github.io/riscv-cheri/

[7] https://www.sunburst-project.org/

[8] https://www.morello-project.org/

[9] https://www.microsoft.com/en-us/research/publication/cheriot-rethinking-security-for-low-cost-embedded-systems/

[10] https://cheriot.org

[11] https://codasip.com/solutions/riscv-processor-safety-security/cheri/

[12] https://cheriot.org/rtos/sail/2024/07/31/moving-to-the-cheriot-org.html

[13] https://cheriot.org/isa/roadmap/2024/10/31/isa-roadmap.html

[14] https://discourse.llvm.org/t/is-it-time-to-start-upstreaming-the-cheri-support-to-llvm/60032

[15] https://github.com/llvm/llvm-project/issues/120837

[16] https://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-941.pdf

[17] https://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-987.pdf

[18] https://cheriot.org/book/

[19] https://ctsrd-cheri.github.io/cheri-c-programming/

[20] https://www.cl.cam.ac.uk/~pes20/asplos24spring-paper110.pdf

[21] https://riscv.github.io/riscv-cheri/

[22] https://github.com/llvm/llvm-project/pull/105735

Tagging @jrtc27 @arichardson and @davidchisnall for visibility.

I support upstreaming of CHERI support.

From a middle-end perspective, I think pretty much all changes that are needed for CHERI support are changes we’d like to see anyway, but CHERI often provides a more obvious motivation for them.

To add a little bit of background:

Our first CHERIoT chips are currently at the fab and we will be shipping in volume in early 2026. This means that the CHERIoT v1 ISA is baked and something that we need to support long term (and would prefer to support upstream rather than in our branch). We expect a CHERIoT v2 to be based on the RISC-V Y base architecture and so to share more common code in the toolchain with other RISC-V CHERI subtargets, but that will not remove our need to support the v1 ISA (we’re targeting devices with a 10+ year lifespan, so this isn’t something we can simply abandon).

I think this is a good direction, assuming that the forks are in a healthy state for upstreaming. Some points for discussion:

  1. I would like to second @nikic’s sentiment about middle-end changes we’d like to see anyway, around pointer provenance and non-default address spaces. Can you briefly describe how you handle ptrtoint?
  2. How do you plan to upstream the CHERI RISC-V backend support? Will it be a separate target that shares code with the RISCV target, or do you propose to directly modify the RISCV target? If the latter, would you worry about burdening existing RISCV backend devs with CHERI details?
  3. Will Alive2 need to be modified to check middle-end LLVM IR with this upstreaming? If so, would you be willing to dedicate some resources to make the necessary changes to Alive2?
  4. If I understand correctly, CHERIoT and CHERI are different forks: why is this the case, and would you not prefer to merge the forks before upstreaming?

  1. ptrtoint returns the address component of a CHERI capability, which on all RISC-V CHERI implementations is a sub-register extraction. Turning that back into a capability will produce an invalid capability; specifically, the valid tag will not be set, preventing the capability from being dereferenced.
  2. CHERI support lives within the existing RISC-V backend in a manner similar to other vendor extensions. We have to make a small number of changes to common TableGen code, namely adding a not-CHERI predicate to normal RISC-V memory instructions. Otherwise, we do not expect to burden RISC-V developers beyond the normal burden of any other extension.
  3. I’m unclear on what changes you’re anticipating in Alive2. Can you elaborate? Perhaps @arichardson can comment further?
  4. The CHERI and CHERIoT forks share a common ancestor, but move at different paces, with different release cadences, etc. The CHERI fork supports a wider range of CHERI implementations, but is slower in tracking upstream LLVM (currently on LLVM 17). The CHERIoT fork supports a narrower range of CHERI implementations, but is more aggressively tracking upstream LLVM (stable branch on LLVM 20, next-release branch based on LLVM 21 ready to go, tracking branch merges with LLVM upstream approximately daily). As called out in the proposal, one of the intentions of upstreaming is to move the common subset of the two forks into a location where it will be easy for all interested parties to collaborate on them.
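
The ptrtoint behaviour described in answer 1 can be sketched with a toy model in C++. The `Capability` struct, its field names and widths, and the helper functions are assumptions made for illustration only; they do not reflect any real capability encoding or LLVM API.

```cpp
#include <cassert>
#include <cstdint>

// Toy capability: an address plus bounds and a validity tag.
struct Capability {
    std::uint64_t address; // the integer address component
    std::uint64_t base;    // lower bound of the accessible region
    std::uint64_t length;  // size of the accessible region in bytes
    bool tag;              // set only on capabilities derived from valid ones
};

// Models ptrtoint: only the address component survives.
constexpr std::uint64_t ptr_to_int(const Capability &c) { return c.address; }

// Models turning the integer back into a capability: the address is
// restored, but there are no bounds and the tag is clear.
constexpr Capability int_to_ptr(std::uint64_t addr) {
    return Capability{addr, 0, 0, false};
}

// A dereference is only permitted through a tagged, in-bounds capability.
constexpr bool can_dereference(const Capability &c) {
    return c.tag && c.address >= c.base && c.address - c.base < c.length;
}

inline void round_trip_example() {
    const Capability original{0x1008, 0x1000, 0x100, true};
    assert(can_dereference(original));

    const Capability rebuilt = int_to_ptr(ptr_to_int(original));
    assert(rebuilt.address == original.address); // same address...
    assert(!can_dereference(rebuilt));           // ...but unusable
}
```

The point of the sketch is that the round trip preserves the numeric address while discarding everything that made the original pointer usable.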

This is one of the motivations for the ptrtoaddr instruction (PR 139357). In our current code, we have three intrinsics:

  • Get address (extracts the address from the pointer)
  • Set address (takes a pointer and an address, sets the address of the pointer to the provided address).
  • Add to address (does address displacement)

Add to address could be implemented as get address, add, set address, but it’s a sufficiently common sequence that it’s worth having a separate intrinsic (and instruction) for. The Rust strict-provenance APIs have a similar shape: addr() is equivalent to our get-address intrinsic, with_addr() to our set-address intrinsic, and add() is equivalent to our add-address intrinsic. Each can be lowered first to this intrinsic and then to the corresponding instruction.

Somewhat counter-intuitively, work to add the last of these to LLVM started first: ptradd is proposed as a replacement for GEP in the untyped-pointer world. This is provenance-preserving (at least if inbounds).

PR 139357 will add the ptrtoaddr instruction as an explicitly non-provenance-preserving operation to extract an address. I would also like to see a ptrsetaddr instruction to fill in the remaining gap, with the sequence ptrtoaddr, add, ptrsetaddr being canonicalised to in-bounds ptradd. A comment on PR 139357 suggests not adding ptrsetaddr and doing the canonicalisation the other way around so that every set-address operation requires a get address and a subtraction, but I believe that is likely to make life far too painful for mid-level optimisers.
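
To illustrate the two canonicalisation directions discussed above, here is a small C++ sketch of the three operations on a toy capability. The `Cap` struct and all function names are hypothetical; the model keeps bounds and tag unchanged when the address moves and ignores any effects real hardware may apply when an address moves far outside its bounds. Displacements are unsigned and wrap modulo 2^64, so a negative displacement is expressed as its two's-complement value.

```cpp
#include <cassert>
#include <cstdint>

// Toy capability: bounds and tag travel with the pointer, the address can move.
struct Cap {
    std::uint64_t base;
    std::uint64_t length;
    std::uint64_t addr;
    bool tag;
};

constexpr bool same(const Cap &x, const Cap &y) {
    return x.base == y.base && x.length == y.length && x.addr == y.addr &&
           x.tag == y.tag;
}

// "Get address" (ptrtoaddr): the result is a plain integer with no provenance.
constexpr std::uint64_t get_address(const Cap &c) { return c.addr; }

// "Set address" (the suggested ptrsetaddr): keep bounds and tag, replace addr.
constexpr Cap set_address(const Cap &c, std::uint64_t a) {
    return Cap{c.base, c.length, a, c.tag};
}

// "Add to address" (ptradd): displace the address, keep bounds and tag.
constexpr Cap add_address(const Cap &c, std::uint64_t delta) {
    return Cap{c.base, c.length, c.addr + delta, c.tag};
}

// Direction 1: the get, add, set sequence that would fold into one add.
constexpr Cap add_via_get_set(const Cap &c, std::uint64_t delta) {
    return set_address(c, get_address(c) + delta);
}

// Direction 2: without a set operation, every set needs a get and a subtract.
constexpr Cap set_via_get_sub_add(const Cap &c, std::uint64_t a) {
    return add_address(c, a - get_address(c));
}

inline void canonicalisation_example() {
    const Cap p{0x1000, 0x100, 0x1010, true};
    assert(same(add_address(p, 8), add_via_get_set(p, 8)));
    assert(same(set_address(p, 0x1040), set_via_get_sub_add(p, 0x1040)));
}
```

Both directions produce the same result in this model; the difference is only in how many operations an optimiser has to recognise and reason about.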

+100!

As someone who has followed this odyssey for a decade, I think native CHERI-like support in upstream LLVM is long overdue. The folks behind this are very long-term LLVM contributors, so I have no doubt maintenance will continue. And with actual hardware being available (CHERIoT, Morello) and more support coming, there’s enough real-world applicability for the code to be tested.

And who knows, hopefully one day, all architectures will have CHERI-like features…

I support this RFC; I see the benefits of doing this work upstream.

Will you be providing a post-commit CI bot to help catch regressions specific to CHERI needs?

I’m interested in opinions on what’s required here. At the very least, we have a large body of LIT tests for clang, llvm, and lld that we expect to be able to bring along as we upstream. Some of those are written against CHERI-MIPS today, but I expect to port those to RISC-V, as nobody is particularly interested in upstreaming CHERI-MIPS support.

For CHERIoT, which I’m personally most familiar with, the LLVM post-commit test suite doesn’t make sense, as it does not have a POSIX-like execution environment.

For “normal” RISC-V CHERI, I believe post-commit testing with the LLVM test-suite would hypothetically be possible on CheriBSD under QEMU, but I’m unsure what the throughput would be. Perhaps @jrtc27 or @arichardson can comment here?

I don’t have hard requirements in mind, personally. I was thinking more generally in terms of the requirements for adding a new target. I think you’re looking to upstream something that’s an official target and those should come with some sort of build bot to help ensure the community doesn’t accidentally regress the implementation. But if you’re looking for this to be an experimental target (one where the rest of the community is not responsible for keeping it working, generally speaking), then that would be good to know.

Practically speaking, those are the fundamental ones. CHERI-like support in Clang/LLVM/LLD doesn’t necessarily have a particular target in mind, just the general idea around pointer safety. How this manifests on actual hardware would need a (perhaps separate) set of tests that focus on the particular execution environments we have available today.

Having said that, it would be good to have at least one environment that runs some “relevant workload” and makes sure the expectations are held. QEMU sounds like a nice first step, if it already implements RISC-V’s CHERI extensions. But I personally think it would be ok to have that some time later.

It would be good to understand the requirements here. Previously the only times I’ve been involved with adding new buildbots have been when adding a new target that can self host and therefore needs to provide testing for LLVM building and running on that target. Our existing LLVM tests follow the upstream model and are not executable. These do not require any other developer to have access to any specialised hardware.

We also have an executable test suite for CHERIoT that we test with new compiler releases. We run this in a couple of simulators, which are public: we provide a CHERIoT dev container that includes a simulator from the formal model, a cycle-accurate Verilator simulation of the reference implementation of the core, and a much faster MPact simulator contributed by Google.

I would expect any regressions that are not covered by the in-tree test suite to be our responsibility to fix (and add tests for!).

That’s my understanding of the minimal requirements, yes.

This fulfills the “relevant workload” testing I mention. I don’t think we’d need more than this, at least not at this stage.

Exactly!

I strongly support the addition of a CHERI-like architecture to the target-independent parts of LLVM.

The downstream work on CHERI has already been improving LLVM’s semantics around pointers, provenance, and address spaces – which has all been useful for non-CHERI users as well. Getting a hardware capability pointer model like CHERI supported upstream will be great to really put an emphasis on the need to be serious about getting all this stuff nailed down, specified, and working properly.

The proposal to add the non-standard RISC-V vendor extensions, and maintain them for the long-term even after “Y” is standardized, worries me. But I’m happy to leave it up to the RISC-V maintainer group to decide how they feel about that.

+1, I think that would meet my expectations as well.

I am curious: from a WG21/WG14 perspective, will this limit the choices Clang can make with respect to various UBs, etc.? CHERI has come up in various discussions over the years, in WG21 at least, and it would be good to know whether this changes our flexibility, and whether it limits or expands our choices in some areas.

The intent behind the work being done as part of PR #105735, “[DataLayout][LangRef] Split non-integral and unstable pointer properties” [22], which is one of the initial steps to enable CHERI upstream, is that IR can identify where it needs to be more careful (due to being something like a CHERI capability) and where it can be more fast-and-loose (due to just being an address). We do not intend to add any new restrictions on what optimisations can be made just because of CHERI, unless you are dealing with CHERI (see the note below).

Note: Most of the time this is “is the type in question a pointer with property X”, but occasionally it will be “does the DataLayout say there are any legal pointer types with property X” (especially in the case of something like memcpy inlining). Either way, if your target is not a CHERI one you can continue as before.
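
As a rough illustration of the two kinds of query mentioned above, here is a C++ sketch. `ToyDataLayout` and both decision functions are hypothetical and are not LLVM's DataLayout API; the address space number 200 is just an example value, and the memcpy rule reflects my reading that copying bytes as plain integers would not preserve capabilities.

```cpp
#include <cassert>
#include <set>

// Hypothetical stand-in for the parts of a data layout that matter here.
struct ToyDataLayout {
    std::set<unsigned> capability_address_spaces;

    // Per-type query: is a pointer in this address space a capability?
    bool is_capability_pointer(unsigned address_space) const {
        return capability_address_spaces.count(address_space) != 0;
    }

    // Whole-layout query: does the target have any capability pointers at all?
    bool has_any_capability_pointer() const {
        return !capability_address_spaces.empty();
    }
};

// Per-type decision: a ptrtoint/inttoptr round trip can only be treated as a
// no-op when the pointer type in question is a plain address.
bool may_fold_int_round_trip(const ToyDataLayout &dl, unsigned address_space) {
    return !dl.is_capability_pointer(address_space);
}

// Whole-layout decision: the bytes being copied could hold a pointer of any
// type, so the check cannot be tied to one pointer type.
bool may_inline_memcpy_as_integers(const ToyDataLayout &dl) {
    return !dl.has_any_capability_pointer();
}

void data_layout_example() {
    const ToyDataLayout conventional{};
    const ToyDataLayout cheri{{200}};

    // A non-CHERI target continues exactly as before.
    assert(may_fold_int_round_trip(conventional, 0));
    assert(may_inline_memcpy_as_integers(conventional));

    // A CHERI target is only restricted where capabilities are involved.
    assert(may_fold_int_round_trip(cheri, 0));
    assert(!may_fold_int_round_trip(cheri, 200));
    assert(!may_inline_memcpy_as_integers(cheri));
}
```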

I would expect the impacts to be minimal. CHERI already comes up when WG14 or WG21 discuss certain topics (things like pointer provenance, or how large pointers are, etc), so now Clang will be mentioned in those same kinds of topics. But otherwise, I expect this is just like any other extension; mostly something the standards bodies don’t have to care about until they want to make changes that would break the extension in significant ways.

This seems reasonable to me.