[RFC] LLVM Security Group and Process

Hello compiler enthusiasts,

The Apple LLVM team would like to propose that a new security process and an associated private LLVM Security Group be created under the umbrella of the LLVM project.

A draft proposal for how we could organize such a group and what its process could be is available on Phabricator. The proposal starts with a list of goals for the process and Security Group, repeated here:

The LLVM Security Group has the following goals:

  1. Allow LLVM contributors and security researchers to disclose security-related issues affecting the LLVM project to members of the LLVM community.
  2. Organize fixes, code reviews, and release management for said issues.
  3. Allow distributors time to investigate and deploy fixes before wide dissemination of vulnerabilities or mitigation shortcomings.
  4. Ensure timely notification and release to vendors who package and distribute LLVM-based toolchains and projects.
  5. Ensure timely notification to users of LLVM-based toolchains whose compiled code is security-sensitive, through the CVE process.

We’re looking for answers to the following questions:

  1. On this list: Should we create a security group and process?
  2. On this list: Do you agree with the goals listed in the proposal?
  3. On this list: At a high level, what do you think should be done differently, and what do you think is exactly right in the draft proposal?
  4. On the Phabricator code review: Going into specific details, what do you think should be done differently, and what do you think is exactly right in the draft proposal?
  5. On this list: To help understand where you’re coming from with your feedback, it would be helpful to state how you personally approach this issue:
      1. Are you an LLVM contributor (individual or representing a company)?
      2. Are you involved with security aspects of LLVM (if so, which)?
      3. Do you maintain significant downstream LLVM changes?
      4. Do you package and deploy LLVM for others to use (if so, to how many people)?
      5. Is your LLVM distribution based on the open-source releases?
      6. How often do you usually deploy LLVM?
      7. How fast can you deploy an update?
      8. Does your LLVM distribution handle untrusted inputs, and what kind?
      9. What’s the threat model for your LLVM distribution?

Other open-source projects have security-related groups and processes. They structure their groups very differently from one another, and this proposal borrows from some of those projects’ processes.

I’ll go first in answering my own questions above:

  1. Yes! We should create a security group and process.
  2. We agree with the goals listed.
  3. We think the proposal is exactly right, but would like to hear the community’s opinions.

Here’s how we approach the security of LLVM:

      1. I contribute to LLVM as an Apple employee.
      2. I’ve been involved in a variety of LLVM security issues, from automatic variable initialization to security-related diagnostics, as well as deploying these mitigations to internal codebases.
      3. We maintain significant downstream changes.
      4. We package and deploy LLVM, both internally and externally, for a variety of purposes, including the Clang, Swift, and mobile GPU shader compilers.
      5. Our LLVM distribution is not directly derived from the open-source release. In all cases, all non-upstream public patches for our releases are available in repository branches at https://github.com/apple.
      6. We have many deployments of LLVM whose release schedules vary significantly. The LLVM build deployed as part of Xcode historically has one major release per year, followed by roughly one minor release every two months. Other releases of LLVM are also security-sensitive and don’t follow the same schedule.
      7. This depends on which release of LLVM is affected.
      8. Yes, our distribution sometimes handles untrusted input.
      9. The threat model is highly variable depending on the particular language front-ends being considered.

Apple is involved with a variety of open-source projects and their disclosures. For example, we frequently work with the WebKit community to handle security issues through their process.

Thanks,

JF

Hey JF,

Thanks for putting this RFC together. LLVM security issues are very important, and I’m really glad someone is focusing attention here.

I’m generally in agreement with much of what you have proposed. I do have a few thoughts I’d like to bring up.

Having the group appointed by the board seems a bit odd to me. Historically the board has not involved itself in technical processes. I’m curious what the board’s thoughts are on this level of involvement in project direction (I know you wanted proposal feedback on Phabricator, but I think the role of the board is something worth discussing here).

My other meta thought is about focus and direction for the group. How do you define security issues?

To give you a sense of where I’m coming from: one of the big concerns I have at the moment is about running LLVM in secure execution contexts, where we care about bugs in the compiler that could influence code generation, not just the code generation itself. Historically, I believe, the security focus of LLVM has primarily been on generated code. Do you see this group tackling both sides of the problem?

Thanks,
-Chris

Hi Chris!

Hey JF,

Thanks for putting this RFC together. LLVM security issues are very important, and I’m really glad someone is focusing attention here.

I’m generally in agreement with much of what you have proposed. I do have a few thoughts I’d like to bring up.

Having the group appointed by the board seems a bit odd to me. Historically the board has not involved itself in technical processes. I’m curious what the board’s thoughts are on this level of involvement in project direction (I know you wanted proposal feedback on Phabricator, but I think the role of the board is something worth discussing here).

I consulted the board before sending out the RFC, and they didn’t express concerns about this specific point. I’m happy to have another method to find the right seed group, but that’s the best I could come up with 🙂

My other meta thought is about focus and direction for the group. How do you define security issues?

To give you a sense of where I’m coming from: one of the big concerns I have at the moment is about running LLVM in secure execution contexts, where we care about bugs in the compiler that could influence code generation, not just the code generation itself. Historically, I believe, the security focus of LLVM has primarily been on generated code. Do you see this group tackling both sides of the problem?

That’s indeed a difficult one!

I think there are two aspects to this: for a non-LLVM contributor, it doesn’t really matter. If they think it’s a security thing, we should go through this process. They shouldn’t need to try to figure out what LLVM thinks is its security boundary, that’s the project’s job. So I want to be fairly accepting, and let people file things that aren’t actually security related as security issues, because that has lower risk of folks doing the wrong thing (filing security issues as not security related).

On the flip side, I think it’s up to the project to figure it out. I used to care about what you allude to when working on PNaCl, but nowadays I mostly just care about the code generation. If we have people with that type of concern in the security group then we’re in a good position to handle those problems. If nobody represents that concern, then I don’t think we can address them, even if we nominally care. In other words: it’s security related if an LLVM contributor signs up to shepherd this kind of issue through the security process. If nobody volunteers, then it’s not something the security process can handle. That might point out a hole in our coverage, one we should address.

Does that make sense?

I think it’s great to make a policy for reporting security bugs.

But first, yes, we need to be clear as to what sorts of things we consider as “security bugs”, and what we do not. We need to be clear on this, both for users to know what they should depend on, and for LLVM contributors to know when they should be raising a flag, if they discover or fix something themselves.

We could just keep on doing our usual development process, and respond only to externally-reported issues with the security-response routine. But I don’t think that would be a good idea. Creating a process whereby anyone outside the project can report security issues, and for which we’ll coordinate disclosure and create backports and such is all well and good…but if we don’t then also do (or at least try to do!) the same for issues discovered and fixed within the community, is there even a point?

So, if we’re going to expand what we consider a security bug beyond the present “effectively nothing”, I think it is really important to be a bit more precise about what it’s being expanded to.

For example, I think it should generally be agreed that a bug in Clang which allows arbitrary-code-execution in the compiler, given a specially crafted source-file, should not be considered a security issue. A bug, yes, but not a security issue, because we do not consider the use-case of running the compiler in privileged context to be a supported operation. But also the same for parsing a source-file into a clang AST – which might even happen automatically with editor integration. Seems less obviously correct, but, still, the reality today. And, IMO, the same stance should also apply to feeding arbitrary bitcode into LLVM. (And I get the unfortunate feeling that last statement might not find universal agreement.)

Given the current architecture and state of the project, I think it would be rather unwise to pretend that any of those are secure operations, or to try to support them as such with a security response process. Many compiler crashes seem likely to be security bugs, if someone is trying hard enough. If every time such a bug was fixed, it got a full security-response triggered, with embargos, CVEs, backports, etc…that just seems unsustainable. Maybe it would be nice to support this, but I think we’re a long way from there currently.

However, all that said – based on timing and recent events, perhaps your primary goal here is to establish a process for discussing LLVM patches to work around not-yet-public CPU errata, and issues of that nature. In that case, the need for the security response group is primarily to allow developing quality LLVM patches based on not-yet-public information about other people’s products. That seems like a very useful thing to formalize, indeed, and doesn’t need any changes in LLVM developers’ thinking. So if that’s what we’re talking about, let’s be clear about it.

One problem with defining away “arbitrary code execution in Clang” as “not security relevant” is that you are inevitably making probably-wrong assumptions about the set of all possible execution contexts.

Case in point: Sony, being on the security-sensitive side these days, has an internal mandate that we incorporate CVE fixes into open-source products that we deliver. As it happens, we deliver some GNU Binutils tools with our PS4 toolchain. There are CVEs against Binutils, so we were mandated to incorporate these patches. “?” I said, wondering how some simple command-line tool could have a CVE. Well, it turns out, lots of the Binutils code is packaged in libraries, and some of those libraries can be used by (apparently) web servers, so through some chain of events it would be possible for a web client to induce Bad Stuff on a server (hopefully no worse than a DoS, but that’s still a security issue). Ergo, security-relevant patch in GNU Binutils.

For my product’s delivery, the CVEs would be irrelevant. (Who cares if some command-line tool can crash if you feed it a bogus PE file; clearly not a security issue.) But, for someone else’s product, it would be a security issue. You can be sure that the people responsible for Binutils dealt with it as a security issue.

So, yeah, arbitrary code-execution in Clang, or more obviously in the JIT, is a potential security issue. Clangd probably should worry about this kind of stuff too. And we should be ready to handle it that way.

–paulr

One problem with defining away “arbitrary code execution in Clang” as “not security relevant” is that you are inevitably making probably-wrong assumptions about the set of all possible execution contexts.

Case in point: Sony, being on the security-sensitive side these days, has an internal mandate that we incorporate CVE fixes into open-source products that we deliver. As it happens, we deliver some GNU Binutils tools with our PS4 toolchain. There are CVEs against Binutils, so we were mandated to incorporate these patches. “?” I said, wondering how some simple command-line tool could have a CVE. Well, it turns out, lots of the Binutils code is packaged in libraries, and some of those libraries can be used by (apparently) web servers, so through some chain of events it would be possible for a web client to induce Bad Stuff on a server (hopefully no worse than a DoS, but that’s still a security issue). Ergo, security-relevant patch in GNU Binutils.

For my product’s delivery, the CVEs would be irrelevant. (Who cares if some command-line tool can crash if you feed it a bogus PE file; clearly not a security issue.) But, for someone else’s product, it would be a security issue. You can be sure that the people responsible for Binutils dealt with it as a security issue.

So, yeah, arbitrary code-execution in Clang, or more obviously in the JIT, is a potential security issue. Clangd probably should worry about this kind of stuff too. And we should be ready to handle it that way.

The reality is that clang is a long way from being hardened in that way (pretty much every crash/assertion failure on invalid input is probably a path to arbitrary code execution if someone wanted to try hard enough), and I don’t think the core developers are currently able, interested, or motivated to do the work to meet that kind of need. So I tend to agree with James that it’s better that this is clearly specified as a non-goal than to suggest some kind of “best effort” behavior here.

  • Dave

One problem with defining away “arbitrary code execution in Clang” as “not security relevant” is that you are inevitably making probably-wrong assumptions about the set of all possible execution contexts.

Case in point: Sony, being on the security-sensitive side these days, has an internal mandate that we incorporate CVE fixes into open-source products that we deliver. As it happens, we deliver some GNU Binutils tools with our PS4 toolchain. There are CVEs against Binutils, so we were mandated to incorporate these patches. “?” I said, wondering how some simple command-line tool could have a CVE. Well, it turns out, lots of the Binutils code is packaged in libraries, and some of those libraries can be used by (apparently) web servers, so through some chain of events it would be possible for a web client to induce Bad Stuff on a server (hopefully no worse than a DoS, but that’s still a security issue). Ergo, security-relevant patch in GNU Binutils.

For my product’s delivery, the CVEs would be irrelevant. (Who cares if some command-line tool can crash if you feed it a bogus PE file; clearly not a security issue.) But, for someone else’s product, it would be a security issue. You can be sure that the people responsible for Binutils dealt with it as a security issue.

So, yeah, arbitrary code-execution in Clang, or more obviously in the JIT, is a potential security issue. Clangd probably should worry about this kind of stuff too. And we should be ready to handle it that way.

The reality is that clang is a long way from being hardened in that way (pretty much every crash/assertion failure on invalid input is probably a path to arbitrary code execution if someone wanted to try hard enough), and I don’t think the core developers are currently able, interested, or motivated to do the work to meet that kind of need. So I tend to agree with James that it’s better that this is clearly specified as a non-goal than to suggest some kind of “best effort” behavior here.

I’d rephrase this: it’s not currently something that LLVM developers have tried to address, and it’s known to be insecure. Were someone to come in and commit a significant amount of work, it would definitely be something we can support.

I don’t want to say “non-goal” without explaining why that’s the case, and what can be done to change things. In other words, if the security group is willing to call something security-related, then it is. Whoever is in that group has to put in the effort to address an issue. Until such people are part of the group, the group should respond to issues of this kind as “out of scope because ”.

I agree we should document those reasons as we encounter them! I just don’t think we should try to enumerate them right now. We’ll have a transparency report, and that’s a great opportunity to revisit what we think is / isn’t in scope, and call it out.

I think that’s a problematic way to go about things, because the security group has limited membership and the discussions are private and limited – even if there’s limited visibility after-the-fact. That is certainly a necessary and desirable property when working to resolve undisclosed vulnerabilities, but it is not when making general decisions about what we as a project want to claim to support. Of course, we all will need to trust the people on the security group to make certain decisions on a case-by-case basis, but the discussion about what we want to be security supported should be – must be – public.

This is not simply about deciding how to resolve an issue that’s reported externally, it’s about the entire process. The project needs to be on the same page as to what our security boundaries are; otherwise the security group will just end up doing CVE-issue-response theater.

And I do agree that if someone were to come in and put in the significant amounts of work to make LLVM directly usable in security-sensitive places, then we could support that. But none of that should have anything to do with the security group or its membership. All of that work and discussion, and the decision to support it in the end, should be done as a project-wide discussion and decision, just like anything else that’s worked on.

One problem with defining away “arbitrary code execution in Clang” as “not security relevant” is that you are inevitably making probably-wrong assumptions about the set of all possible execution contexts.

Case in point: Sony, being on the security-sensitive side these days, has an internal mandate that we incorporate CVE fixes into open-source products that we deliver. As it happens, we deliver some GNU Binutils tools with our PS4 toolchain. There are CVEs against Binutils, so we were mandated to incorporate these patches. “?” I said, wondering how some simple command-line tool could have a CVE. Well, it turns out, lots of the Binutils code is packaged in libraries, and some of those libraries can be used by (apparently) web servers, so through some chain of events it would be possible for a web client to induce Bad Stuff on a server (hopefully no worse than a DoS, but that’s still a security issue). Ergo, security-relevant patch in GNU Binutils.

For my product’s delivery, the CVEs would be irrelevant. (Who cares if some command-line tool can crash if you feed it a bogus PE file; clearly not a security issue.) But, for someone else’s product, it would be a security issue. You can be sure that the people responsible for Binutils dealt with it as a security issue.

So, yeah, arbitrary code-execution in Clang, or more obviously in the JIT, is a potential security issue. Clangd probably should worry about this kind of stuff too. And we should be ready to handle it that way.

The reality is that clang is a long way from being hardened in that way (pretty much every crash/assertion failure on invalid input is probably a path to arbitrary code execution if someone wanted to try hard enough), and I don’t think the core developers are currently able, interested, or motivated to do the work to meet that kind of need. So I tend to agree with James that it’s better that this is clearly specified as a non-goal than to suggest some kind of “best effort” behavior here.

I’d rephrase this: it’s not currently something that LLVM developers have tried to address, and it’s known to be insecure. Were someone to come in and commit a significant amount of work, it would definitely be something we can support.

I don’t want to say “non-goal” without explaining why that’s the case, and what can be done to change things. In other words, if the security group is willing to call something security-related, then it is. Whoever is in that group has to put in the effort to address an issue. Until such people are part of the group, the group should respond to issues of this kind as “out of scope because ”.

I agree we should document those reasons as we encounter them! I just don’t think we should try to enumerate them right now. We’ll have a transparency report, and that’s a great opportunity to revisit what we think is / isn’t in scope, and call it out.

I think that’s a problematic way to go about things, because the security group has limited membership and the discussions are private and limited – even if there’s limited visibility after-the-fact.

It has full visibility after the fact, not limited.

That is certainly a necessary and desirable property when working to resolve undisclosed vulnerabilities, but it is not when making general decisions about what we as a project want to claim to support. Of course, we all will need to trust the people on the security group to make certain decisions on a case-by-case basis, but the discussion about what we want to be security supported should be – must be – public.

What we’re really discussing here is: how do we go from today’s status (nothing is security) to where we want to be (the right things are security). I think we agree what we have today isn’t good, and we also agree that we eventually want to get to a point where some issues are treated as security. We also agree that the criteria for what is treated as security should be documented. I’ll gladly add a section to that effect in the documentation, it is indeed missing so thanks for raising the issue.

This is not simply about deciding how to resolve an issue that’s reported externally, it’s about the entire process. The project needs to be on the same page as to what our security boundaries are; otherwise the security group will just end up doing CVE-issue-response theater.

I definitely don’t want theater.

And I do agree that if someone were to come in and put in the significant amounts of work to make LLVM directly usable in security-sensitive places, then we could support that. But none of that should have anything to do with the security group or its membership. All of that work and discussion, and the decision to support it in the end, should be done as a project-wide discussion and decision, just like anything else that’s worked on.

Here’s where we disagree: how to get from nothing being security to the right things being security.

I want to put that power in the hands of the security group, because they’d be the ones with experience handling security issues, defining security boundaries, fixing issues in those boundaries, etc. I’m worried that the community as a whole would legislate things as needing to be secure, without anyone in the security group able or willing to make it so. That’s an undesirable outcome because it sets them up for failure.

Of course neither of us is saying that the community should dictate to the security group, nor that the security group should dictate to the community. It should be a discussion. I agree with you that, in the transition period from no security to right security, there might be cases where the security group disappoints the community, behind temporarily closed doors. There might be mistakes; an issue which should have been treated as security-related won’t be. I would rather trust the security group, expect that it’ll do outreach when it feels unqualified to handle an issue, and fix any mistakes it makes when they happen. Doing so is better than where we are today.

And again, I expect that the security group will document what is treated as security over time. The transparency report ensures this, but as I said above we should have documentation to that effect as well (I’ll add it).

Does this help mitigate your concerns?

My answers to your "on the list" questions:

1. Should we create a security group and process?
SGTM. It appears that GCC has CVEs against it; why should they have all the fun?

2. Do you agree with the goals listed in the proposal?
They also SGTM.

3. at a high-level, what do you think should be done differently, and what do you think is exactly right in the draft proposal?
The involvement of the Foundation Board to bootstrap the initial security team... seems a tad odd. Basically you're calling for volunteers and wanting some sort of vetting process, and picked the Board to do that initially for lack of any other alternatives? I agree that the Board should sign on to have a security team at all, that falls within their purview, but they don't need to be part of the initial selection process. The initial volunteers can demonstrate their appropriateness to each other, just like later nominees would.

And answers to "where you're coming from":

1. Are you an LLVM contributor (individual or representing a company)?
I am a contributor, as a Sony employee, and code-owner for the PS4 target.

2. Are you involved with security aspects of LLVM (if so, which)?
I have participated in security-related discussions that come up. I've recently done some work on the stack-smash protector pass; IIRC, Sony contributed the 'strong' flavor, which I reviewed. Some years ago a random-NOP-insertion pass (for ROP gadget removal) was proposed, which didn't stick; we recently had a summer intern work on it, but it did not reach proper quality. I'd like to revive that.
Pre-LLVM, I spent over a decade working on OS security for DEC and Tandem. I can’t say I’m still current on the topic, but it remains an interest.
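For readers less familiar with the pass: the 'strong' flavor mainly broadens which functions get a stack canary. Here is a rough illustration of my understanding of the heuristics, simplified and not a precise specification (use_ptr is a made-up external consumer that keeps the locals alive):

```cpp
// Compile each way and look for the __stack_chk_fail canary check:
//   clang++ -S -O2 -fstack-protector        example.cpp
//   clang++ -S -O2 -fstack-protector-strong example.cpp
#include <cstring>

extern void use_ptr(void *p);  // made-up external consumer

void big_char_buffer(const char *s) {
  char buf[64];                           // large char array: gets a
  std::strncpy(buf, s, sizeof(buf) - 1);  // canary under both modes
  buf[sizeof(buf) - 1] = '\0';
  use_ptr(buf);
}

void escaping_scalar() {
  int local = 0;
  use_ptr(&local);  // address of a scalar local escapes: typically a canary
}                   // only under -fstack-protector-strong
```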

3. Do you maintain significant downstream LLVM changes?
Yes.

4. Do you package and deploy LLVM for others to use (if so, to how many people)?
We package a Clang-based toolchain for app and game-development studios; I don't have exact numbers but the developer population is definitely in the thousands.

5. Is your LLVM distribution based on the open-source releases?
Yes; we do continuous integration from upstream master but we base our releases on the upstream release branches.

6. How often do you usually deploy LLVM?
Twice a year; rarely, we deploy hot fixes.

7. How fast can you deploy an update?
A new release based on a new upstream branch has a very long lead time (months). We have deployed hot fixes based on a previous release in a few weeks, but we don't like to do that.

8. Does your LLVM distribution handle untrusted inputs, and what kind?
I'm unclear what you mean by this.

9. What’s the threat model for your LLVM distribution?
I can't speak to our internal security team's thoughts; we will likely want to nominate a second Sony person, from that team, to be a non-compiler-expert "vendor representative" who can better address that question. I can say that we use the same toolchain to build our OS, as well as other sensitive software such as the browser, along with games and other apps that could engage in online transactions involving actual money.

Hi Paul,

I'm curious about what the use case for this was. In the normal course of binary distribution of programs, the addition of nops doesn't affect ROP in any significant way. (For a while, inserting a nop before a ret broke ROPgadget's [1] ability to find interesting code sequences since it was looking for fixed sequences of instructions.)

I could imagine it being used for JITted code. If that was the use case in mind, did you happen to compare it to other randomized codegen?

I'm only curious because this has historically been an area of research of mine [2,3,4], not any sort of pressing matter.

Thank you,

Steve

1. https://github.com/JonathanSalwan/ROPgadget
2. https://checkoway.net/papers/evt2009/evt2009.pdf
3. https://checkoway.net/papers/noret_ccs2010/noret_ccs2010.pdf
4. https://checkoway.net/papers/fcfi2014/fcfi2014.pdf

Hi all,

To elaborate on what Stephen said, compile-time nop insertion is only effective if the adversary and victim have different versions of the same binary. This obviously creates difficulties w.r.t. binary distribution and subsequent updates*. That said, my colleagues and I at UCI did attempt to upstream a nop insertion pass into LLVM a couple of years ago. You can find patches for LLVM 3.8.1 that allow nop insertion and many other randomizing transformations here: https://github.com/securesystemslab/multicompiler (Some of these have been forward ported to LLVM 7 as well but I don’t believe the code has been made public yet.)
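For anyone curious what such a pass looks like structurally, here is a very rough sketch, simplified and not the actual multicompiler implementation: pass registration (e.g. from the target’s addPreEmitPass hook), bundle/meta-instruction handling, and reproducible module-level seeding are all omitted, and insertNoop is only implemented by some targets:

```cpp
#include "llvm/CodeGen/MachineBasicBlock.h"
#include "llvm/CodeGen/MachineFunction.h"
#include "llvm/CodeGen/MachineFunctionPass.h"
#include "llvm/CodeGen/TargetInstrInfo.h"
#include "llvm/CodeGen/TargetSubtargetInfo.h"
#include "llvm/Support/CommandLine.h"
#include <random>

using namespace llvm;

// Hypothetical flags, named here only for the sketch.
static cl::opt<unsigned> NopPercent(
    "nop-insertion-percent", cl::init(10),
    cl::desc("Probability (0-100) of inserting a NOP before an instruction"));
static cl::opt<unsigned> NopSeed(
    "nop-insertion-seed", cl::init(1),
    cl::desc("Seed for the NOP-insertion RNG"));

namespace {
struct NopInsertionSketch : MachineFunctionPass {
  static char ID;
  NopInsertionSketch() : MachineFunctionPass(ID) {}

  bool runOnMachineFunction(MachineFunction &MF) override {
    const TargetInstrInfo *TII = MF.getSubtarget().getInstrInfo();
    // Per-function seeding keeps the output deterministic for a given seed.
    std::mt19937 RNG(NopSeed + MF.getFunctionNumber());
    std::uniform_int_distribution<unsigned> Dist(0, 99);

    bool Changed = false;
    for (MachineBasicBlock &MBB : MF)
      for (auto MI = MBB.begin(), E = MBB.end(); MI != E; ++MI)
        if (Dist(RNG) < NopPercent) {
          TII->insertNoop(MBB, MI); // emit a target-specific NOP before MI
          Changed = true;
        }
    return Changed;
  }
};
} // end anonymous namespace

char NopInsertionSketch::ID = 0;
```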

Thanks,
Per

*We built a robust load-time randomizer that does function shuffling and works with off-the-shelf compilers and loaders; not sure if that’s of interest in your case: https://github.com/immunant/selfrando

Hi folks!

At some point I had read a paper (which appears to have gotten lost in my last move) regarding NOP insertion to disrupt gadgets. It identified gadgets in some lump of software, then rebuilt the software with random NOPs enabled, and proudly pointed to X% of the previous gadgets no longer being present, or usable, or something.

(To my mind this is not the right question; not “were previous gadgets disrupted” but “how many gadgets are there in the rebuilt software compared to the previous version?” If it’s known that the answer is “there is still an abundance of gadgets no matter what you do” then I’m answered, and thank you!)

I don’t know whether this would lead to any practical uses within Sony, but if we didn’t have a pass at all, there would be nothing to pursue. We had an intern on the compiler team who was also interested in security. I remembered the NOP insertion pass that had been committed upstream but later reverted, so we gave him that pass to play with. I was casually interested in my question above, of course, but there are plenty of software bits that we distribute online rather than on disks, so in principle there is potential for a possible use-case. I may be stating that too strongly.

In the end, the intern didn’t quite get it working well enough, and I’ve had too many other things going on to want to pick it up myself.

So that’s where things stand today: one of those spare-time things that might be worth real resources someday.

And thanks for the pointer to the multicompiler project!

–paulr

My worry is actually the inverse – that there may be a tendency to treat more issues as “security” than should be. When some bug is reported via the security process, I suspect there will be a default-presumption towards using the security process to resolve it, with all the downsides that go along with that.

What I want is for it to be clear that certain kinds of issues are currently explicitly out-of-scope. E.g. crashes/code-execution/etc resulting from parsing or compiling untrusted C source-code with Clang, or parsing/compiling untrusted bitcode with LLVM, or linking untrusted files with LLD. These sorts of things should not, currently, be treated with a “security” mindset. They’re bugs, which should be fixed, but if something’s security depends on llvm being able to securely process untrusted inputs, sorry, that’s not reasonable. (And yes, that’s maybe sad, but is the reality right now). Until someone is willing to put in the significant effort to make those processes generally secure for use on untrusted inputs, handling individual bug-reports of this kind via a special process is not going to realistically improve security.

Furthermore, if people disagree with the above paragraph, I’d like that discussion to be had in the open ahead of time, rather than in private after the first time such an issue is reported via a defined security process.

It feels like you want the security team to be two different things:

  1. A way to privately report security issues to LLVM, and a group of people to privately work on fixing such issues, for a coordinated release.
  2. A group of people working generally on defining or improving security properties of LLVM.

I don’t think these two need or should be linked, though some of the same people might be involved in both.

I agree 100% with this. LLVM is not secure in that way. Treating that kind of report as a serious security issue would just be security theatre and give the impression the project is developed with different goals than many people writing the code (including me) actually have. Moving to a world where it's reasonable to make those crashes a security issue won't even be substantially helped by that kind of whack-a-mole approach.

Cheers.

Tim.

And I do agree that if someone were to come in and put in the significant amounts of work to make LLVM directly usable in security-sensitive places, then we could support that. But none of that should have anything to do with the security group or its membership. All of that work and discussion, and the decision to support it in the end, should be done as a project-wide discussion and decision, just like anything else that’s worked on.

Here’s where we disagree: how to get from nothing being security to the right things being security.

I want to put that power in the hands of the security group, because they’d be the ones with experience handling security issues, defining security boundaries, fixing issues in those boundaries, etc. I’m worried that the community as a whole would legislate things as needing to be secure, without anyone in the security group able or willing to make it so. That’s an undesirable outcome because it sets them up for failure.

Of course neither of us is saying that the community should dictate to the security group, nor that the security group should dictate to the community. It should be a discussion. I agree with you that, in the transition period from no security to right security, there might be cases where the security group disappoints the community, behind temporarily closed doors. There might be mistakes; an issue which should have been treated as security-related won’t be. I would rather trust the security group, expect that it’ll do outreach when it feels unqualified to handle an issue, and fix any mistakes it makes when they happen. Doing so is better than where we are today.

My worry is actually the inverse – that there may be a tendency to treat more issues as “security” than should be. When some bug is reported via the security process, I suspect there will be a default-presumption towards using the security process to resolve it, with all the downsides that go along with that.

Agreed, that polarity is also a risk. I don’t see how to fix this issue either, except to trust the security group. Its members will be more competent at doing the right thing than the general LLVM community because they’ve dealt with this stuff before.

What I want is for it to be clear that certain kinds of issues are currently explicitly out-of-scope.

Yes, I want this list, but I don’t think we need it now. Once we’ve got a group of experts looking at security issues, they can incrementally figure out that list. Do you think that’s acceptable?

E.g. crashes/code-execution/etc resulting from parsing or compiling untrusted C source-code with Clang, or parsing/compiling untrusted bitcode with LLVM, or linking untrusted files with LLD. These sorts of things should not, currently, be treated with a “security” mindset. They’re bugs, which should be fixed, but if something’s security depends on llvm being able to securely process untrusted inputs, sorry, that’s not reasonable. (And yes, that’s maybe sad, but is the reality right now). Until someone is willing to put in the significant effort to make those processes generally secure for use on untrusted inputs, handling individual bug-reports of this kind via a special process is not going to realistically improve security.

Furthermore, if people disagree with the above paragraph, I’d like that discussion to be had in the open ahead of time, rather than in private after the first time such an issue is reported via a defined security process.

It feels like you want the security team to be two different things:

  1. A way to privately report security issues to LLVM, and a group of people to privately work on fixing such issues, for a coordinated release.
  2. A group of people working generally on defining or improving security properties of LLVM.

I don’t think these two need or should be linked, though some of the same people might be involved in both.

I don’t want 1 and 2 linked, though as you say it can be the same group. I’m saying that 2 should inform 1, they don’t exist in a vacuum.

Hello compiler enthusiasts,

The Apple LLVM team would like to propose that a new security process and an associated private LLVM Security Group be created under the umbrella of the LLVM project.

A draft proposal for how we could organize such a group and what its process could be is available on Phabricator. The proposal starts with a list of goals for the process and Security Group, repeated here:

The LLVM Security Group has the following goals:

  1. Allow LLVM contributors and security researchers to disclose security-related issues affecting the LLVM project to members of the LLVM community.
  2. Organize fixes, code reviews, and release management for said issues.
  3. Allow distributors time to investigate and deploy fixes before wide dissemination of vulnerabilities or mitigation shortcomings.
  4. Ensure timely notification and release to vendors who package and distribute LLVM-based toolchains and projects.
  5. Ensure timely notification to users of LLVM-based toolchains whose compiled code is security-sensitive, through the CVE process (https://cve.mitre.org/).

We’re looking for answers to the following questions:

1. On this list: Should we create a security group and process?

Probably, though we haven't seen a strong need to date.

If a group does form, we (Azul) are definitely interested in participating as a vendor.

2. On this list: Do you agree with the goals listed in the proposal?

Yes

3. On this list: At a high level, what do you think should be done differently, and what do you think is exactly right in the draft proposal?

I'm a bit uncomfortable with the board-selected initial group. I see the need for a final decision maker, but maybe require public on-list nominations before ratification by the board? If there's broad consensus, no need to appeal to the final decision maker.

4. On the Phabricator code review: Going into specific details, what do you think should be done differently, and what do you think is exactly right in the draft proposal?
5. On this list: To help understand where you're coming from with your feedback, it would be helpful to state how you personally approach this issue:
    1. Are you an LLVM contributor (individual or representing a company)?

Yes, in this email I am responding both in my capacity as an individual contributor and on behalf of my employer, Azul Systems.

    2. Are you involved with security aspects of LLVM (if so, which)?

We have responded to a couple of security relevant bugs, though we've generally not acknowledged that fact upstream until substantially later.

    3. Do you maintain significant downstream LLVM changes?

Yes.

    4. Do you package and deploy LLVM for others to use (if so, to how many people)?

Yes. Can't share user count.

    5. Is your LLVM distribution based on the open-source releases?

No. We build off of periodic ToT snapshots.

    6. How often do you usually deploy LLVM?

We have a new release roughly monthly. We backport selectively as needed.

    7. How fast can you deploy an update?

Usual process would be a week or two. In a true emergency, much less.

    8. Does your LLVM distribution handle untrusted inputs, and what kind?

Yes, for any well-formed Java input we may generate IR and invoke the optimizer. We fuzz extensively for this reason.

    9. What’s the threat model for your LLVM distribution?

In the worst case, attacker-controlled bytecode. Given that, the attacker can influence, but not entirely control, the IR fed to the compiler.

On this list: Should we create a security group and process?

Yes, as long as it is a funded mandate by several major contributors.
We can’t run it as a volunteer group.

Also, someone (this group, or another) should do proactive work on hardening the sensitive parts of LLVM; otherwise it will be whack-a-mole. Of course, we will need to decide what those sensitive parts are first.

On this list: Do you agree with the goals listed in the proposal?

In general, yes, although some details worry me. E.g., I would try to be stricter with disclosure dates.

“Public within approximately fourteen weeks of the fix landing in the LLVM repository” is too slow, IMHO; it hurts the attackers less than it hurts the project. (oss-fuzz will adhere to the 90/30 policy.)

On this list: at a high-level, what do you think should be done differently, and what do you think is exactly right in the draft proposal?

The process seems to be too complicated, but no strong opinion here.
Do we have another example from a project of similar scale?

On the Phabricator code review: going into specific details, what do you think should be done differently, and what do you think is exactly right in the draft proposal?

commented on GitHub vs crbug

On this list: to help understand where you’re coming from with your feedback, it would be helpful to state how you personally approach this issue:
Are you an LLVM contributor (individual or representing a company)?

Yes, representing Google.

Are you involved with security aspects of LLVM (if so, which)?

To some extent:

  • my team owns tools that tend to find security bugs (sanitizers, libFuzzer)
  • my team co-owns oss-fuzz, which automatically sends security bugs to LLVM

Do you maintain significant downstream LLVM changes?

no

Do you package and deploy LLVM for others to use (if so, to how many people)?

not my team

Is your LLVM distribution based on the open-source releases?

no

How often do you usually deploy LLVM?

In some ecosystems LLVM is deployed roughly every two to three weeks.
In others it takes months.

How fast can you deploy an update?

For some ecosystems we can turn around in several days.
For others I don’t know.

Does your LLVM distribution handle untrusted inputs, and what kind?

Third party OSS code that is often pulled automatically.

What’s the threat model for your LLVM distribution?

Speculating here; I am not a real security expert myself.

  • A developer getting a bug report and running clang/llvm on the “buggy” input, compromising the developer’s desktop.
  • A major open-source project is compromised and its code is changed in a subtle way that triggers a vulnerability in Clang/LLVM.
    The open-source code is pulled into an internal repo and is compiled by Clang, compromising a machine on the build farm.
  • A vulnerability in a run-time library, e.g. crbug.com/606626 or crbug.com/994957
  • (???) Vulnerability in an LLVM-based JIT triggered by untrusted bitcode. (Second-hand knowledge.)
  • (???) An optimizer introducing a vulnerability into otherwise memory-safe code (we’ve seen a couple of such in load & store widening); a minimal sketch follows after this list.
  • (???) Deficiency in a hardening pass (CFI, stack protector, shadow call stack) making the hardening ineffective.
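To make the widening item concrete, here is a hypothetical sketch (not one of the actual bugs referenced above) of how a widened access can matter even when the source is memory-safe; the struct and field names are made up for illustration:

```cpp
struct Flags {
  bool sandboxed;       // security-relevant flag, owned by another thread
  unsigned char count;  // frequently updated here
};

void bump(Flags *f) {
  // The source performs a single in-bounds, single-byte store. A buggy
  // store-widening transform that emits a 16-bit read-modify-write covering
  // both bytes can race with a concurrent writer of f->sandboxed and
  // silently undo that update; an over-wide load can likewise read bytes
  // the source never touched, which matters under redzones, memory tagging,
  // or when adjacent memory holds secrets.
  f->count = f->count + 1;
}
```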

My 2c on the policies: if we actually treat some area of LLVM as security-critical, we must not only ensure that a reported bug is fixed, but also that the affected component gets additional testing, fuzzing, and hardening afterwards. E.g., for crbug.com/994957 I’d really like to see a fuzz target as a form of regression testing.
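For illustration, such a regression fuzz target can be as small as the sketch below; ParseInput here is a hypothetical stand-in for the component under test, and the harness would be built with something like clang++ -g -fsanitize=fuzzer,address:

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the real entry point whose bug is being
// regression-tested; the original crasher gets checked in as a seed input.
static bool ParseInput(const uint8_t *data, size_t size) {
  if (size < 4 || data[0] != 'L')    // toy header check
    return false;
  unsigned sum = 0;
  for (size_t i = 1; i < size; ++i)  // touch every byte so ASan can see OOB
    sum += data[i];
  return sum != 0;
}

// libFuzzer calls this entry point repeatedly with generated inputs.
extern "C" int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size) {
  ParseInput(Data, Size);
  return 0;  // values other than 0 are reserved by libFuzzer
}
```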

–kcc

Hi JF,

Thanks for putting up this proposal.

Regarding your question, which I answer both as an individual and with an Arm hat:

Should we create a security group and process?

Yes! We believe it’s good to have such a group and a process. It may not be perfect for everyone, but that’s way better than nothing, and the current proposal has the necessary bits to evolve and adapt over time to the actual needs.

Do you agree with the goals listed in the proposal?

Yes!

At a high-level, what do you think should be done differently, and what do you think is exactly right in the draft proposal?

Dealing with security vulnerabilities is often a bit of a mess, done under time pressure, so having a “safe” place to quickly iterate / coordinate amongst interested parties and taking into account upstream LLVM is necessary.

We like that the role of this group is to deal / coordinate security-related issues, not to define an overall security roadmap for LLVM — this should happen in the open using the standard communication channels.

We think this group could work on proof-of-concept fixes or act as a proxy in case the work is done externally, providing (pre-)reviews to ensure the fixes are at the expected LLVM quality level, but the actual code reviews for committing upstream should be conducted using the standard community process (i.e., no special channel / fast lane for committing).

Our approach to this issue:

  1. Are you an LLVM contributor (individual or representing a company)?

I respond here both as an individual contributor and also on behalf of my employer, Arm.

  2. Are you involved with security aspects of LLVM (if so, which)?

I’m involved with security aspects in general, and have occasionally been involved in some LLVM specific aspects of security.

  3. Do you maintain significant downstream LLVM changes?

Yes we do, and a number of other companies using Arm also have downstream changes they maintain.

  4. Do you package and deploy LLVM for others to use (if so, to how many people)?

In our case, the situation is not as simple as “package & deploy”.

As a company, we care that all Arm users get the security fixes, whether this is through software (tools or libraries) directly or indirectly shipped by Arm, or through their own tool / library provider, or through the vanilla open-source channel.

  5. Is your LLVM distribution based on the open-source releases?

Ours isn’t, but I’m sure there are distributions / users relying on the open-source releases.

We thus believe it’s important that backports are made and shared whenever possible.

  6. How often do you usually deploy LLVM?

We usually have about half a dozen releases a year, but then our downstream users have their own constraints / agenda. This will of course be different for other people providing Arm tools & libraries.

  7. How fast can you deploy an update?

On our end, we usually need about 4 weeks, and our downstream users have their own constraints / agenda. This will of course be different for other people providing Arm tools & libraries.

  8. Does your LLVM distribution handle untrusted inputs, and what kind?
  9. What’s the threat model for your LLVM distribution?

Given our large user base and usage models, answering this precisely now and here is impossible.

Kind regards,

Arnaud

And I do agree that if someone were to come in and put in the significant amounts of work to make LLVM directly usable in security-sensitive places, then we could support that. But none of that should have anything to do with the security group or its membership. All of that work and discussion, and the decision to support it in the end, should be done as a project-wide discussion and decision, just like anything else that’s worked on.

Here’s where we disagree: how to get from nothing being security to the right things being security.

I want to put that power in the hands of the security group, because they’d be the ones with experience handling security issues, defining security boundaries, fixing issues in those boundaries, etc. I’m worried that the community as a whole would legislate things as needing to be secure, without anyone in the security group able or willing to make it so. That’s an undesirable outcome because it sets them up for failure.

Of course neither of us is saying that the community should dictate to the security group, nor that the security group should dictate to the community. It should be a discussion. I agree with you that, in the transition period from no security to right security, there might be cases where the security group disappoints the community, behind temporarily closed doors. There might be mistakes; an issue which should have been treated as security-related won’t be. I would rather trust the security group, expect that it’ll do outreach when it feels unqualified to handle an issue, and fix any mistakes it makes when they happen. Doing so is better than where we are today.

My worry is actually the inverse – that there may be a tendency to treat more issues as “security” than should be. When some bug is reported via the security process, I suspect there will be a default-presumption towards using the security process to resolve it, with all the downsides that go along with that.

Agreed, that polarity is also a risk. I don’t see how to fix this issue either, except to trust the security group. Its members will be more competent at doing the right thing than the general LLVM community because they’ve dealt with this stuff before.

Again, I find it entirely reasonable to place trust in a small subset of the members of the LLVM community to do the right thing in response to security issues which must remain temporarily secret. It’s infeasible to allow the entire community to participate. I just don’t want to entrust anything else to the Security Group, as an organization, because it’s unnecessary (despite that they would likely be entirely worthy of that trust).

What I want is for it to be clear that certain kinds of issues are currently explicitly out-of-scope.

Yes, I want this list, but I don’t think we need it now. Once we’ve got a group of experts looking at security issues, they can incrementally figure out that list. Do you think that’s acceptable?

We know now, even before any issues have been reported through this process, what some of the areas of concern are going to be. Some have been mentioned before on this thread, and others likely have not. I would like to see it explicitly called out, up front, how we expect to treat certain issues without waiting for them to be reported.

Why do I want that? Because I want the security group’s mission statement and mandate from the community to be clear. If there’s disagreement about which sorts of things should or should not be treated as security issues (which I suspect there may well be), I’d like that to be hashed out in the open now, rather than delaying any such debate until such a time as it must be hashed out in private by the Security Group in response to a concrete private vulnerability report.

However, I agree it’s not necessary for you to define this immediately. If you’d like to attempt to find other volunteers to author those policies, rather than doing it yourself, I see absolutely no problem with that. But I would still like to see such a document get proposed and reviewed via the project’s usual open discussion forum (mailing lists, code reviews on new policy docs, etc), as soon as possible, in order to reduce surprises as much as possible. (Recognizing that it cannot and should not attempt to cover every eventuality.)