For those who aren’t aware, Embecosm have been asked to aid in the process of attempting to get CHERI support in LLVM, and eventually the Rust compiler, to a state in which it could be upstreamed.
Following on from a round table discussion held at EuroLLVM, we think the ability to discuss the tasks ahead, potential challenges and progress with the community is vital. And so to help us do this we have decided to host a series of public sync-ups surrounding this topic.
The meetings will be on the second Wednesday of each month at 3 PM (UK time), which we hope will allow the most people to join across timezones. This is flexible if there’s a need to change it.
Keep an eye on this thread for agendas and meeting minutes for each call!
I missed this thread when it was announced. It would be good to tag the relevant folks who wrote most of the original code (@jrtc27, @arichardson) and make sure that they’re aware of it.
Tomorrow is the next of our monthly sync-ups. The agenda from my end is still quite short, so if anyone has topics to bring up, please let me know and we can discuss them in the meeting.
Agenda for tomorrow’s sync-up:
Progress on subdividing CHERI LLVM work into smaller packages
Documentation - what documentation of the CHERI LLVM work is out there to help understand the changes that have been made (I’m aware the relevant people working on the original code will likely have some resources)
Open discussion for possible issues and concerns in upstreaming CHERI LLVM support
There are two main aspects to CHERI LLVM: greater support for non-zero address spaces and adding CHERI support. I don’t think the latter can really be split up in any meaningful way (assuming you’re only adding CHERI support to one backend) other than the usual Clang/LLVM/LLD divide (you can subdivide further but you’d want such patch series to land all at once as the subdivided patches wouldn’t be useful in isolation, just like how new backends get landed), and we don’t want it to be either (see later). The former is what can and ideally would be upstream; we’ve upstreamed some things but there are still a bunch that we haven’t (and I couldn’t tell you what that list is, it’s just looking at the diff and figuring out what’s not CHERI-specific). This gets into your issues and concerns point, though.
There isn’t. Various old documents exist in places, but a lot of it will be outdated or not very complete. Most of the changes can only be understood by inferring from knowing how CHERI works or trawling commit messages (or being very familiar with CHERI LLVM already, i.e. being myself or Alex, or possibly David still, though there have been a lot of changes since).
I wrote down summaries of some known long-standing issues a month or so ago in the CTSRD-CHERI/llvm-project issue tracker on GitHub. Some of those require discussion with upstream, with the address space one potentially changing LangRef, and the atomics one (hopefully) requiring adding a second type to AMOs, something we wouldn’t want to have as a downstream diff. Other concerns would be that CHERI-RISC-V’s ISA and ABI are not stable; Arm do not want Morello upstreamed; Morello LLVM has hacks over and above CHERI LLVM that we don’t want even in CHERI LLVM, let alone upstream LLVM; CHERI-LLVM continues to see changes and we would want to continue working in our fork; we would not want to be responsible for maintaining upstream LLVM’s CHERI support; and we do not have a true linkage implementation (even Morello), though that one is documented in the issue tracker.
Thanks for the information, Jessica. I think the issue of documentation is an important one - regardless of potential upstreaming or not - because a huge chunk of changes has been made, and trying to understand it all is a bit daunting for someone who wants to help the CHERI work mature further. Hopefully, as I look through all these changes and my own understanding improves, I can help improve this situation.
Regarding upstreaming, I would say those are really good points which certainly mean having this work upstreamed would be difficult. In the general case though, as the CHERI LLVM work becomes more visible it’s inevitable that interested parties will want to see development take place not in a downstream branch, but will want it upstreamed as soon as they think it’s feasible. And they will provide resources to help guide it in that direction by tackling some of those barriers that you’ve already identified. At least that’s what we have seen. Of course, effort that goes into tackling the barriers to upstreaming is not necessarily wasted effort if upstreaming doesn’t happen.
Today is the next of our monthly sync-ups. If anyone has any topics to bring up let me know and we can discuss it in the meeting.
Agenda for today’s sync-up:
Contributing a version of the CHERI changes with a merge of upstream to CTSRD-CHERI
Any other items
I had also meant to make a comment regarding moving this meeting to 4pm UK time in order to better accommodate the US Pacific coast, but since I didn’t do that this one will remain at 3pm UK time.
Any lurkers wondering about CHERI / LLVM progress:
There were two CHERI demos at Cyber UK last week. The Morello demo showed FreeBSD, Wayland, KDE, and a few GUI applications (and all OS userspace services) running with the memory-safe model (currently with spatial memory safety only; temporal safety is off by default and should be enabled soon, but requires only libc / kernel features, with no impact on the compiler). This gave an impressively usable GUI and was an amazing showcase of how LLVM can support novel architectures.
At the opposite extreme, my team demonstrated a clean-slate RTOS co-designed with a RISC-V CHERI extension, reusing a load of third-party embedded code (FreeRTOS network stack, mBedTLS, the Microvium JavaScript VM) on a device with 384 KiB of RAM (lots of pretty debug messages for the demo bloated the code so we don’t quite fit in 256 KiB). All of this runs with spatial and temporal memory safety and, unlike most of the CHERI work, uses a 32-bit address space.
It’s really amazing to see how far the CHERI LLVM toolchain has come on since I stopped being the only person working on it and I look forward to seeing this work upstreamed soon.
Since next week’s sync-up would overlap with the EuroLLVM dev meeting, I am going to propose moving the call to the Friday instead of the Wednesday. I’ve moved this in the calendar, but I’m not sure how shared calendars work, so I don’t know whether that is reflected for other people. If not, the link and time of day remain the same, just Friday instead of Wednesday.
Feel free to let me know if this is unworkable and perhaps we can have it on a different day instead.