Following on from the broader CHERI upstreaming RFC, I would like to open discussion on adding support for CHERIoT to LLVM’s target triple. For context, CHERIoT is a full system architecture (both HW and SW) for a CHERI-based RISCV32E microcontroller.
While adding target triple enums would not normally merit a full RFC, CHERIoT requires more discussion because the proposal is to add it as a subarchitecture of riscv32, for which there are no extant examples.
What exactly is being proposed?
I am proposing adding support for the riscv32cheriotv1-unknown-unknown and riscv32cheriotv1-unknown-cheriotrtos triples to LLVM, with the cheriotv1 component captured in Triple via the SubArch field. I also propose to parse riscv32cheriot-… as an alias for riscv32cheriotv1-…
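To make the proposed parsing concrete, here is a minimal standalone sketch of how the first triple component could be split into a base architecture and a sub-architecture, with riscv32cheriot treated as an alias for riscv32cheriotv1. It is not the code from CHERIoT clang and does not use llvm::Triple; the names ParsedTriple, SubArch, parseArch and parseTriple are hypothetical, and the sketch only handles three-component triples.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical, simplified stand-in for the relevant parts of llvm::Triple.
enum class SubArch { None, CHERIoTv1 };

struct ParsedTriple {
  std::string Arch;            // base architecture, e.g. "riscv32"
  SubArch Sub = SubArch::None; // sub-architecture carried by the first component
  std::string Vendor;          // e.g. "unknown"
  std::string OS;              // e.g. "unknown" or "cheriotrtos"
};

// Split the first component into base architecture and sub-architecture.
// "riscv32cheriot" is accepted as an alias for "riscv32cheriotv1".
static bool parseArch(std::string_view Comp, ParsedTriple &Out) {
  constexpr std::string_view Base = "riscv32";
  if (Comp.substr(0, Base.size()) != Base)
    return false;
  std::string_view Suffix = Comp.substr(Base.size());
  if (Suffix.empty())
    Out.Sub = SubArch::None;
  else if (Suffix == "cheriot" || Suffix == "cheriotv1")
    Out.Sub = SubArch::CHERIoTv1;
  else
    return false; // any other suffix is out of scope for this sketch
  Out.Arch = std::string(Base);
  return true;
}

// Parse "<arch><subarch>-<vendor>-<os>"; returns nullopt on anything else.
std::optional<ParsedTriple> parseTriple(std::string_view Str) {
  std::size_t A = Str.find('-');
  if (A == std::string_view::npos)
    return std::nullopt;
  std::size_t B = Str.find('-', A + 1);
  if (B == std::string_view::npos)
    return std::nullopt;
  ParsedTriple T;
  if (!parseArch(Str.substr(0, A), T))
    return std::nullopt;
  T.Vendor = std::string(Str.substr(A + 1, B - A - 1));
  T.OS = std::string(Str.substr(B + 1));
  return T;
}
```

With this sketch, riscv32cheriot-unknown-unknown and riscv32cheriotv1-unknown-unknown both parse to the same base architecture and sub-architecture, while plain riscv32-unknown-unknown carries no sub-architecture.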
What makes CHERIoT special in this regard?
Relative to a RISCV32E + CHERI baseline, CHERIoT extends (and restricts) the instruction set, changes calling conventions, changes the capability and permission formats, exposes additional source-level annotations, adds relocations, and defines a custom (non-Unix-like) linkage model. All of these divergences are documented in the CHERIoT Architecture Specification.
Critically for this discussion, these divergences are not supported independently of each other, so it is not desirable to toggle them individually. Moreover, CHERIoT users want to target CHERIoT with clang by passing a single flag that identifies CHERIoT as the target, rather than a long series of individual feature flags.
Could you not just use the OS field of the triple?
While some elements are captured by the OS field (CHERIoT uses OS unknown for bare metal code, and cheriotrtos for code running inside the CHERIoT RTOS), many of the divergences apply across both, with the custom linkage model being the primary exception.
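As a rough illustration of why the OS field alone is not enough, the sketch below keys most decisions on the sub-architecture and only the linkage model on the OS. The struct and function names are hypothetical, and the assignment of the capability format and calling convention to the SubArch side is my assumption for illustration; only the linkage model being the OS-dependent piece comes from the description above.

```cpp
#include <string_view>

// Hypothetical model holding just the two triple fields relevant here.
struct TripleFields {
  bool IsCHERIoT;      // true when SubArch is cheriotv1
  std::string_view OS; // "unknown" (bare metal) or "cheriotrtos"
};

// Divergences that apply to both bare metal and RTOS code depend only on
// the sub-architecture (assumed examples: capability format, calling convention).
bool usesCHERIoTCapabilityFormat(const TripleFields &T) { return T.IsCHERIoT; }
bool usesCHERIoTCallingConvention(const TripleFields &T) { return T.IsCHERIoT; }

// The custom linkage model is the primary piece that also depends on the OS.
bool usesCHERIoTLinkageModel(const TripleFields &T) {
  return T.IsCHERIoT && T.OS == "cheriotrtos";
}
```

If CHERIoT were identified only by the OS field, the first two predicates would have nothing to key on for bare metal code, where the OS is unknown.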
Should this not be a new Arch rather than SubArch?
While CHERIoT has many divergences compared to the baseline, it is still ultimately a derivative of RISCV32E + CHERI, and the number of checks for CHERIoT SubArch in CHERIoT clang is tiny compared to the number of checks for RISCV(32E) and CHERI. As such, splitting it into its own Arch would create significantly more code churn and complexity.
Where can I see the code?
The SubArch as described is implemented in CHERIoT clang today, requiring fairly minimal code changes. The core parsing is here, with other minor pieces throughout Triple.cpp.