[RFC] Improve binary security

FOR THE LATEST PROPOSAL SEE MY COMMENT BELOW

Original post

In the forum topic about changing the criteria to gain commit access to LLVM, there has been extensive discussion about how to improve security for the LLVM project in general and how to avoid supply chain attacks on our releases and attached assets. This RFC is my attempt at proposing changes that I think are necessary to become less of an obvious target for these kinds of attacks. This RFC doesn’t try to solve every single security problem in the LLVM project. It focuses very narrowly on the release binaries and sources and tries to make a dent in that problem, not remove it completely.

The current situation

Currently when we make a release of LLVM, a release manager runs an export script that outputs the split source archives and signs them with their own PGP key. The public parts of these keys can be found on releases.llvm.org. A tag is created on GitHub; this tag is also signed with one of the keys above.

The release testers download the source or check out the tag from GitHub and run the test-release script. This creates a binary distribution of LLVM, which is then signed (not all testers do this) and uploaded to the release on GitHub.

What is a supply chain attack

Most of the discussion around this came out of the recent XZ/LZMA attack, where a long-time contributor was able to insert backdoor code into the release source tarballs. The backdoor was not added to Git, only to the release artifacts. It’s not known whether the contributor was playing a long game with the goal of being able to insert the backdoor, or whether he was himself compromised in order to gain access to insert it. I think both scenarios are worth keeping in mind while crafting a proposal to fight an attack like this.

There are many more supply chain attacks and I won’t spend too much time on that here; you can read up on a lot more attacks on Wikipedia.

The problems

Becoming a release tester

Currently there is no process for becoming a release tester. Most likely the release testers are people who want to make sure that their platform is tested for each release, so they step up in the forum, start running the test script and upload binaries. To upload to a release you only need to have commit access to LLVM. This means that in a very short time someone could start posting (legitimate or not) release binaries directly to the LLVM release page without any scrutiny.

GitHub permissions

GitHub offers no ACL or limitations on who can upload or create a release for a repository. This means that anyone with commit access to llvm-project can create a release and upload assets. Not only can they upload new artifacts, they can also overwrite older assets.

To add insult to injury, there is no way to disable this functionality in GitHub. I really hope that recent events will make GitHub think about prioritizing these issues. If you work at GitHub or know anyone there, please forward this post to them to stress how important it is for bigger open source communities to have more control over the release assets.

The exotic platform problem

Some of the binaries uploaded to the assets are for “exotic” platforms that are not accessible to the release managers or the community at large. This means that verifying these binaries would be almost impossible.

Becoming a release manager

Currently there are two release managers for LLVM: @tstellar and me. While Tom has been around for quite some time in the community, I was invited to do releases by Tom in a private message on Discord and then accepted to be a release manager for every other release. By now I have been at EuroLLVM, met with people and hopefully shown that I am here for good reasons, but it shows that it would be far from impossible for people with bad intentions to get elevated like this.

Proposal

Stop publishing release assets on the main repo

Since GitHub doesn’t allow any ACL for creating assets I think our only option is to remove the assets from the main repository. The release assets should be moved to a new repository on GitHub where we can control who has commit access.

We will continue to publish releases on the main repo, but they will only link people to the second repo in order to do the actual download.

We would need to create a GitHub workflow to delete any assets uploaded to the main repo and delete any releases that are not tagged with the release managers’ PGP keys.

Only publish approved assets

The release assets that are uploaded to the release repo should be created by the CI, and the workflow to create these binaries and source packages should only be accessible to the release managers. The repo should be open to PRs from third parties that want to contribute, but we should apply heavy scrutiny before accepting updates to these workflows. All assets should be signed with a trusted key.

Third-party binaries should not be published to any official LLVM release page

I think the best solution for third-party binaries would be that each release on the main repo would leave a link to a Discourse post where third-party binaries could be linked. These posts should make it REALLY clear that we can’t verify the validity of these binaries or their contents.

Implement a verification routine for Release Managers

Since we are removing the “official” status of release testers and no longer having them upload binaries to the release page, I think we can be liberal about who can post third-party binaries. The community should report bad / malicious binaries and we can then ban these people from posting to our forums.

But we need some kind of new routine for accepting a release manager; these people will hold the keys to the kingdom, so to speak. An in-person meeting (or Zoom or similar) with current release managers or the foundation board would be a minimum. But I will leave this a bit more open to feedback for now since we don’t accept new release managers that often.

When to implement

The LLVM 18.x release window is almost over, and while I think this is an issue we should address quickly, I think it makes more sense to implement most of the above for LLVM 19. Meanwhile I think all recent release pages should be updated with a disclaimer that the release binaries shouldn’t be trusted and that users should only trust packages signed with the release managers’ keys.

Alternatively, if the community feels like this is not enough for the threat, we could delete all release binaries from the release pages and ask people to post third-party binaries to a forum post that we set up. But I fear this approach will have negative consequences.

Potential downsides

Implementing these changes will lead to some negative consequences:

  • People rely on the third-party binaries for their own use of LLVM. We will never be able to serve the same breadth of official binaries as we currently offer with the help of third parties. But I think relying on these binaries has always been risky.
  • More work for release managers. I was worried about this at first, but I think that if we can get the automation to work it shouldn’t add too much to the burden of the RMs. On the other hand, we would need community help to maintain the workflows.
  • Confusing that no binaries are available on the official repo. I am not sure we can work around this until GitHub implements ACL for this. When they do, the binaries can migrate back.

All in all - I think the improved security is worth the downsides listed above.

Thanks

Thanks to everyone who has engaged in this topic in various forums. I hope we can reach a consensus on this quickly so that we can move forward and make the necessary changes to improve security.

Thanks Tobias! It’s about time we have this effort off the ground!

YES!

This is not necessary as long as the release managers are the code owners and the only approvers, and you forbid merging changes without approval.

YES!

Trust is complicated. Don’t create security theatre, or we’ll relax and create opportunities for bad actors.

We all know you, @tstellar, @akorobeynikov, @kbeyls, @tonic for many many years and I personally “trust” you to vouch for people in this capacity.

I’d make it much simpler: make (some of) you owners of the release repo and make you responsible (and accountable) for vetting and removing the people that can write to that repo. Force you to have two-factor authentication on Github.

Sure, your accounts can be compromised and used in malicious ways, but anything more complicated escalates the infra, and complex infra is ripe for security issues (and theatre).

Disagree. There is no requirement for anyone at the foundation board to be (more of) an expert than anyone else in our community. This is a fallacy known as “appeal to a false authority”.

We trust the people who have been doing this for ages. There is no need to bring in some named group. How you vet it (video call) is up to you.

I’d do this with the current release managers we haven’t had contact with yet (through conferences, etc).

YES!

I’d do this regardless. Continuing to host binaries that could have been compromised, when we have no way of confirming that they weren’t, is asking for trouble. If we’re trying to secure our platform, we must do it right. No more hosting unsigned, unsanctioned, untestable binaries anywhere officially belonging to the LLVM community.

Doesn’t have to be now, but it has to be soon.

Yep. Build your own binaries. Our release scripts are pretty good nowadays.

By the time we get the CI building, it will be less work.

Confusing at first, much clearer in the long run.

The recipe is simple: install from packages (we can add examples) or stores (screenshots), instructions on how to build (link to the release script), etc.

It’s not because we have been doing it for so many years (for the wrong reasons) that we need to continue doing it (“sunk cost fallacy”).

We could also move the releases back to the web site.

At least for Windows, the binaries we’ve been releasing are the de facto official LLVM Windows binaries. I don’t think they should be considered third-party, and I think we should continue publishing them on the official release page.

Yes. This is also a viable solution if we can automate it. The reason we moved off that was that the amount of work for the RMs to maintain and add new binaries was just too much and very error-prone. I am not opposed to this, but in that case we need the people with admin access to this machine to help us automate it.

I agree with this. I was planning to bring it up, but I forgot in my novella writing. I think ideally the Windows binaries could be built and published with the CI as well; if they can’t, or we can’t get it done by the time of LLVM 19, I think they should be the exception.

As long as they’re built by CI and released on the separate repo, I see no problem with this either.

I can’t help contrasting with e.g. cmake which provides prebuilt binaries for Windows (3 targets), MacOS (2 targets), and Linux (2 targets).

OTOH there is also gcc which provides links but disclaims any responsibility. I guess we are taking this more as our model.

I think there are two separate topics. Testing RCs on exotic platforms is great. Providing binaries in any form: No thanks!

This is IMO an unfortunate (but necessary) drawback. Hosting the binaries anywhere with LLVM’s name implies some endorsement and we just can’t do that.

a link to a Discourse post where third-party binaries could be linked. These posts should make it REALLY clear that we can’t verify the validity of these binaries or their contents.

This seems like a great compromise. So an individual tester could link to their “Google Drive” or some other similar storage location but no space for attachments is provided on discourse.llvm.org? That should put an appropriate amount of fear into those who might want to use the binaries provided. If they trust the tester’s reputation, they can use the binary as they did before.

Some of the binaries uploaded to the assets are for “exotic” platforms that are not accessible to the release managers or the community at large. This means that verifying these binaries would be almost impossible.

The release assets that are uploaded to the release repo should be created by the CI and the workflow to create these binaries and source packages should only be accessible by the release managers.

I wonder just where we’d draw the line for “exotic” / which ones would be in the CI?
I’m in favor of this RFC as-is but curious what our starting point for supported platforms would be when initially executed.

Or this could backfire when they come to us saying: “I downloaded because YOU linked to it”. Not many people read the fine print.

I would strongly recommend we don’t link to any individual file anywhere, and we don’t curate external content. No one knows what will be in that folder tomorrow.

Creating a user-curated space outside of the LLVM umbrella with a massive warning would be a reasonable compromise. Then we can always point out that we’re not the ones creating the content and that usage is at “your own risk”.

Saying “at your own risk” while you’re the one hosting/linking the files is dodgy.

I’d say “exotic” are the ones that we don’t have CI. No manual builds allowed.

The builds in question are going to be running on cloud hosting (probably Google?). So fundamentally limited to whatever we can build/test in a reasonable amount of time on those hosts. And we need to require that the build doesn’t download dodgy packages (or else we end up back in same situation of untrusted binaries). And we need someone to volunteer to maintain each recipe.

Not sure if we need to impose additional restrictions (for example, do we want to allow build recipes using qemu?).

This can further be improved by making the builds fully reproducible, meaning that anyone can independently recreate the builds at a later point in time and verify that they are bit-by-bit identical.

With non-reproducible builds a single bit-flip may be caused by a harmless non-determinism issue, or by a backdoor. And most end users will have a hard time determining whether a bit-flip in a binary is harmless or harmful.

Reproducible builds will also mean that there can be more than a single trusted signature, because anyone can reproduce the build and create a signature over the (hopefully) same checksum.

I am mentioning it because CI machines may at any time be (unintentionally) tampered with by the CI provider, by a dependent CI action script, or by a package installed on the machine. For example, if xz was used during the build and a backdoored version of the package was only supplied on the day a release was built, it would be harder to find out with non-reproducible builds.
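
As a rough illustration of the point about multiple signatures, here is a minimal Python sketch of comparing independently produced builds. The builder names and the `reproducibility_report` helper are hypothetical, and real artifacts would be read from disk rather than passed in as bytes. It only shows the idea that a bit-for-bit reproducible build yields a single SHA-256 digest no matter who built it.

```python
import hashlib
from collections import Counter


def reproducibility_report(builds: dict[str, bytes]) -> dict:
    """Compare artifacts built independently from the same source.

    `builds` maps a (hypothetical) builder name to the raw bytes of the
    artifact that builder produced.
    """
    if not builds:
        raise ValueError("need at least one build to compare")

    # One SHA-256 digest per builder.
    digests = {name: hashlib.sha256(blob).hexdigest()
               for name, blob in builds.items()}

    # The digest most builders agree on. This is only a hint about where
    # to look first; a majority does not prove that the digest is "good".
    majority_digest, _ = Counter(digests.values()).most_common(1)[0]

    outliers = sorted(name for name, digest in digests.items()
                      if digest != majority_digest)

    return {
        "reproducible": len(set(digests.values())) == 1,
        "digests": digests,
        "outliers": outliers,
    }
```

If every rebuilder arrives at the same digest, each of them can sign that digest independently. Any outlier is a reason to investigate, whether it turns out to be non-determinism or tampering.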

I’m not in favor of this change. Even if we do move the binaries to another repo, it will still be possible to upload binaries to the main repo, and this is where most people are going to look first for binaries. I think we’d be better off adding an audit job that runs periodically (once per hour?) to verify that all the release uploads meet our criteria.

What’s the process for signing binaries/tarballs created by CI?

I agree with this, and I think we can re-create the most popular binaries in GitHub actions. We just may need to increase our GitHub Actions budget for this.

I think in the past, we’ve talked about doing key signings at the LLVM dev meetings; that seems like a pretty good option.

I did consider this, but I was worried about the lag between an upload and our workflow running to delete any bad artifacts. There is also the question of how to correctly verify the files. The best way to do this would be to check the artifact against its .sig file and make sure the signature is from a list of verified keys. I believe a job like this could take a while to run, and any problems with the script could leave a bad asset up for too long. By moving the artifacts from the main GitHub page to a secondary page (or releases.llvm.org) we teach our users not to trust anything from the main GitHub repo. It just has a lower threshold for problems, IMHO.
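
To make that check a bit more concrete, here is a minimal Python sketch of just the policy part of such an audit job. It assumes the signature verification itself has already been done by some other tool and is handed in as a mapping from `.sig` file name to the fingerprint of the key that produced a valid signature. The function name, file names and fingerprints are all made up for illustration.

```python
def audit_release(assets, verified_signers, trusted_keys):
    """Return the asset names that should be flagged for removal.

    assets:           names of all files attached to one release
    verified_signers: maps a ".sig" file name to the fingerprint of the
                      key whose signature verified (assumed to come from
                      a separate verification step, not shown here)
    trusted_keys:     set of fingerprints we accept
    """
    names = set(assets)
    flagged = []
    for name in sorted(names):
        if name.endswith(".sig"):
            # Signature files are only looked at via the asset they cover.
            continue
        sig_name = name + ".sig"
        if sig_name not in names:
            flagged.append(name)   # no detached signature uploaded
        elif verified_signers.get(sig_name) not in trusted_keys:
            flagged.append(name)   # bad signature or unknown signer
    return flagged
```

For example, with `trusted_keys = {"RM-KEY-1"}`, an asset whose `.sig` verified against `"RM-KEY-1"` passes, while an asset with no `.sig`, or with a `.sig` from any other key, is returned in the flagged list. The sketch says nothing about the lag problem; it only decides what to flag once the job does run.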

I am no expert in how to do this securely. At a former workplace we had a PGP key specifically for the CI that was signed by the release managers’ keys. It’s imperfect, but I think it’s pretty good.
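
A minimal sketch of that setup, assuming a one-level chain of trust: a signer is accepted if it is a release manager key, or if it is a key (for example the CI key) that a release manager key has certified. The data layout, function name and fingerprints are hypothetical, and a real implementation would take the certifications from a verified keyring rather than from a plain dictionary.

```python
def is_trusted_signer(fingerprint, rm_keys, certified_by):
    """Decide whether a signing key is acceptable.

    fingerprint:  fingerprint of the key that signed an artifact
    rm_keys:      set of release manager key fingerprints
    certified_by: maps a key fingerprint to the set of fingerprints
                  that have certified (signed) that key
    """
    if fingerprint in rm_keys:
        return True
    # Accept a non-RM key only if at least one RM key has certified it.
    certifiers = certified_by.get(fingerprint, set())
    return not rm_keys.isdisjoint(certifiers)
```

With `rm_keys = {"RM-A", "RM-B"}` and `certified_by = {"CI-KEY": {"RM-A"}}`, both `"RM-A"` and `"CI-KEY"` are accepted, while a key that nobody on the list has certified is not.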

Yeah, it’s probably a good idea, but it relies on the RMs showing up at the meetings from time to time, so we probably need another option if that’s not possible. I think face-to-face meetings are the best way to avoid some of the trust issues, but considering how expensive it can be to travel and attend these conferences and the fact that many companies are now hesitant to spend on that (at least my employer is), it’s a limiting factor.

Yes, certainly, but this is not an easy task. I work with reproducible builds at my $dayjob and it requires a lot of engineering effort to make sure they are not broken, and we have a lot of guards around that. It will take a very dedicated group of people in the community to get it done, and I am not sure we have that considering that we already have trouble making sure the CI is maintained.

I agree that this requires effort and I understand that the resources may not be available right now. Looking at LLVM in Debian, it appears to be reproducible right now. I am not sure how similar the Debian build script is to the build script in this repo, but it may be possible to do reproducible builds on a “best-effort” basis (“Community patches welcome, if something is broken”).

I think that (the audit job) is a good stopgap solution, while in parallel lobbying GitHub to improve their permissions model. I know previous attempts to make workflow improving suggestions haven’t been fruitful, but hopefully with the heightened interest in security there’s more hope of improvements here.

Not necessarily.

GitHub Actions allows us to add external builders, so in theory we can add CI for any platform that we want to plug into CI. In practice, adding a self-hosted builder where clang is replaced with a malicious tool that injects a backdoor into the binaries is actually quite easy.

So not only do the recipes need to be vetted (via GitHub PR review and approval), but so do the self-hosted builders (and their systems, packages, etc). This is not a trivial infrastructure to maintain. (I believe this is what you were trying to say.)

In the end, the story is the same: liability. If this is not your core expertise, and you’re not going to spend the effort to make it safe, it’s best not to do it at all.

My recommendation is still that we do not host any binary, not even Windows, on LLVM-owned infrastructure. I’ll repeat what I said over 10 years ago in a similar discussion: We are not distro people. We do not know how to build that stuff, and more importantly, we cannot differentiate between genuine platform volunteers (like @sylvestre, @bero, @hansw2000) and bogus (incompetent or malicious) ones.

I have done this before, and I definitely fall into the bogus (incompetent) category. I would not like a mistake I made to end up in the news as malicious. But I’m sure the community would accept my contribution if I offered. (Well, probably not anymore, now :smile:).

The release process should be just a validation and release of the sources as tarballs, a robust build script and documentation. Beyond that, we offer links to third-party projects that build and host LLVM on their own infrastructure (at your own risk, reviewed every release) and support for building your own.

Windows builds are in a class of their own (OS, stdlib, tools, env) and I’d make it a different project altogether. I wouldn’t shy away from asking Microsoft to help us here, and put that on their store.

(Edited to add: Windows has had this for a long time: Chocolatey Software | LLVM 18.1.2)

Maybe I was unclear about what I meant. If discourse can’t host the links then that’s a good thing because – like you say, it would create a user-curated space and we don’t want that. But did you mean that we would bar testers from posting links to their own hosted toolchains in discussions? A moderator would strike their post if it showed up, for example?

I wouldn’t go that far, no, but I see why you were confused. I would only “moderate” links in forums if they’re shown to be malicious / dangerous, not as a rule of thumb.

But I would not put those links there as an official LLVM release, because that’s endorsing them and taking on liability for them, and we’re not ready for that.

Note that this isn’t true. GitHub Actions only supports Windows/macOS/Linux on x86 and AArch64 (About self-hosted runners - GitHub Docs), mostly because the runner binary has a .NET dependency, from what I understand. Quite a few of the releases put up by volunteers currently do not fall into this list of supported platforms.

I don’t think this changes arguments too much, but is something to consider.