RFC: Forming an LLVM Working Group on Memory Safety

Hi LLVM Community,

We propose forming a new working group within the LLVM project to focus on developing (and using) memory safety bug-prevention techniques. This would bring together LLVM contributors and users to focus on things like runtime hardening, language extensions, and general improvements to LLVM’s C++ toolchain (Clang, libc++, LLVM, etc.).

We tentatively plan to have the first meeting on Wednesday, February 19, 2025 (exact time and other details will follow) and are eager to hear your feedback on the proposal.

Goals and non-goals

This group will discuss safety features and techniques that are readily available or can be implemented with relatively low effort, and those that can be applied to an existing codebase at low cost. We want the group to be pragmatic and look at incrementally deployable mitigations that incur a minimal performance hit, are otherwise practical, and can be used today.

Examples of topics that are in-scope for this group:

  • Memory safety features in Clang that are actively deployed or currently developed:
    • Language extensions (-fbounds-safety) and warnings (-Wunsafe-buffer-usage) for spatial safety.
    • Hardened libc++ for spatial safety.
    • Typed allocators for temporal safety.
    • Pointer authentication for temporal and spatial safety.
    • [[lifetimebound]] and similar annotations for temporal safety. Other extensions for lifetimes.
  • Cataloging existing safety features, providing documentation, establishing and instilling best practices and good defaults for safety mitigation.
  • Sharing and generalizing experience of deploying existing safety features.
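
To make one of the in-scope items concrete, here is a minimal sketch of a lifetime annotation in use. The names `shorter`, `demo`, and the `LIFETIMEBOUND` macro are hypothetical and not part of this proposal; the macro is only there so that the snippet still compiles on compilers that do not know the Clang attribute.

```cpp
#include <cassert>
#include <string>

// Assumption: expand to nothing when the Clang attribute is unavailable.
#if defined(__has_cpp_attribute)
#  if __has_cpp_attribute(clang::lifetimebound)
#    define LIFETIMEBOUND [[clang::lifetimebound]]
#  endif
#endif
#ifndef LIFETIMEBOUND
#  define LIFETIMEBOUND
#endif

// Hypothetical helper: returns a reference to whichever argument is shorter.
// The annotation tells the compiler that the result may refer to `a` or `b`,
// so it must not outlive either of them.
inline const std::string& shorter(const std::string& a LIFETIMEBOUND,
                                  const std::string& b LIFETIMEBOUND) {
  return a.size() <= b.size() ? a : b;
}

// Safe usage: both arguments are lvalues that outlive the returned reference.
inline bool demo() {
  std::string x = "ab";
  std::string y = "abc";
  const std::string& s = shorter(x, y);
  assert(&s == &x);  // the result aliases `x`, no copy is made
  return s == "ab";
  // Unsafe usage (left as a comment on purpose):
  //   const std::string& bad = shorter(x, std::string("z"));
  // The temporary dies at the end of the full-expression, so `bad` may dangle.
  // With the annotation, Clang is expected to diagnose this at compile time.
}
```

The point of the sketch is that the annotation is incremental: it changes no behavior and can be added one function at a time to an existing codebase.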

We also propose to start by focusing specifically on memory safety. We have observed that this is where the majority of energy is being invested across the industry. We still anticipate this group getting into other areas over time (e.g. thread safety), once the initial topics run out of steam, while acknowledging that there may be some overlap already.

We should attempt to prioritize techniques that eliminate entire bug classes or make them unexploitable with high assurance, over probabilistic measures, where feasible.

Non-goals

This is not a security or language research group. We would like to keep the group focused on pragmatic and practical outcomes and avoid getting into areas that require significant research or too much engineering investment.

Example topics that are out of scope:

  • Borrow checker, garbage collection, other large and non-incremental language overhauls.
  • Efforts requiring major backwards-incompatible language extensions.

Relation to LLVM Security Group

As mentioned above, this group does not aim to find or address security vulnerabilities in LLVM itself. The LLVM Security Group is already responsible for that and we do not intend to overlap with their responsibilities. If the Safety Working Group discovers vulnerabilities in LLVM itself, we will follow the procedures for vulnerability disclosure and inform the LLVM Security Group.

As much of the work being done here pertains to security, there will be discussions about how we handle bypasses or newly introduced weaknesses. We will coordinate with the LLVM Security Group to ensure we’re following best practices and have the ability to notify users of important fixes.

All communications of this group will stay open and anyone is welcome to participate. We would also like to create a separate low-frequency communication channel for sharing important news, e.g. when a new hardening technique ships or critical fixes to existing hardening are available (likely through Discourse, but concrete details are TBD).

Participants

We encourage anyone interested from the LLVM community to participate in the group. The meetings and communications will be public. We have circulated the idea beforehand and identified the following initial members of the LLVM Community that would be interested in participating in the kick-off:

* Arlie Davis <ardavis@microsoft.com> / Microsoft
* Brian Gaeke <bgaeke@nvidia.com> / Nvidia
* Dan Liew <dan@su-root.co.uk> / Apple
* Devin Coughlin <dcoughlin@apple.com> / Apple
* Ilya Biryukov <ibiryukov@google.com> / Google
* Joe Bialek <jobialek@microsoft.com> / Microsoft
* Jon Bauman <jonbauman@rustfoundation.org> / Rust Foundation
* Josh Stone <cuviper@gmail.com> / Red Hat
* Max Shavrick <mxms@google.com> / Google
* Miguel A. Arroyo <miguel@arroyo.me>
* Ravi Kandhadai Madhavan <rkandhadaimadhav@apple.com> / Apple
* Vivek Kale <vlkale@sandia.gov> / Sandia National Labs
* Yael Meller <yael@nvidia.com> / Nvidia
* Yeoul Na <yeoul_na@apple.com> / Apple
* Yitzhak Mandelbaum <yitzhakm@google.com> / Google

Organizational details

Regular Meetings. We propose to start meeting every 2 weeks for 1 hour. After 4 meetings, we will reflect on progress and adjust the long-term schedule, structure, etc.

Communications. We will keep public meeting notes, Discourse topics, and a Discord channel for general discussions. We will also create a low-frequency topic on Discourse to share important progress and news.

Initial discussion topics

We propose to start by focusing on a few topics that we expect to produce the most fruitful outcomes in the short term. We will adjust as needed after a few initial rounds, once we see where the discussions lead.

  • Formulating best practices and sharing experiences across the industry for available mitigations.
    • Bug prevalence.
    • Deployed mitigation techniques and their effectiveness.
    • Performance and other environmental constraints, trade-offs they entail.
  • Discussing and coordinating actively developed mitigations.
    • Spatial and temporal safety techniques from examples in the Goals and non-goals section.
    • Tooling for automated warning mitigation and code migration.

(Sorry about the weird formatting for the list of participants. As a “new user” on this Discourse I cannot post more than 5 links and each email counts as a link) :person_shrugging:

This is something we are very excited about! There is a bunch of very valuable improvements we can make here and we look forward to working with you all on it!

This is great to see! For my part, I’m mostly interested in this WG insofar as it may impact Rust, which mainly means changes to LLVM, but perhaps also Clang changes that could affect interoperability. I’ll also check if any of my Red Hat peers would like to join with more of a C and C++ focus.

I am interested in participating in the kick-off. How do I join?

I’d like to join as well.

I will schedule the meeting and post the details with an invite in this thread over the next week.
The first meeting is planned for Feb. 19.

Saving the date! This is great :slight_smile:

Sounds great! Please send me an invitation, if you don’t mind, although I can’t be sure I will be able to attend. Thanks.

@ilya-biryukov, this is great! I would like to attend the kick-off when possible. Can you send me an invitation please? I attended @rapidsna’s presentation at EuroLLVM’23 and liked it a lot. I’m interested to hear about the current status and ideas for the future.

I’m interested in attending if practical.

(FWIW, Wednesdays aren’t great from a CHERI Project perspective as we commonly have meetings for much of the morning Pacific time)

I’m also quite interested in helping with implementation or documentation! I’ve worked on various memory safety techniques and deployment for the last 10 years or so, and have been wanting to get back into working on LLVM proper.

I am also interested in participating.

The first meeting will take place next Thursday, February 20 at 10am PT.
Feel free to add yourself to the shared calendar below if you use Google calendar or simply dial into the Google Meet video call.

Google Meet video call: link
Meeting notes: link
Shared Google Calendar: link

It was great to meet everyone and hear about the commonalities and differences between the problems and approaches in different environments and companies! @ilya-biryukov perhaps next time we can have a presentation from Apple folks about the approach we have been taking on memory safety?

Thanks everyone for joining the first meeting. I second Devin’s comment: it was great to meet everyone!

next time we can have a presentation from Apple folks about the approach we have been taking on memory safety

That sounds great, let’s set this up. I’ve reached out over email to discuss the details.

As a reminder, the next meeting is on Thursday, March 6 at 10am PT.
We will continue our discussions about the different perspectives on memory safety that various groups have for another meeting or two. After that, we will start talking about what outcomes we can get from the WG.

I am very excited to have such a forum to discuss memory-related topics. However, the current meeting time is not very friendly to participants in Asian time zones (the current meeting time is 2 or 3 am for Asia). I wonder if a meeting time friendly to Asian time zones can be found. I would be very grateful!

I wonder if a meeting time friendly to the Asian time zone can be found. I would be very grateful!

Thanks for calling this out!
It would be tough to find a single meeting time that fits every timezone, as we already have people in the PT and CET timezones. We could, however, alternate between two different times.

We’ll need to poll people on their timezones and meeting times for this.
Feel free to poke at the meeting notes for now, at least (they are not great, but have some discussion threads).

Just a reminder that the next meeting is happening this Thursday.

We also have a Discord channel now; you can find the link in the meeting notes linked above (I don’t want to post it here, in the hope that Google Docs are less likely to be scanned by scammers).

For people who attend these meetings: please respond to the survey to help us make sense of the time zone distribution of the participants. We’d like to make the meeting accessible to folks in Asia and need this information to see which options we have on the table.