PSA: the future of compiler-rt’s Scudo

Greetings,

compiler-rt hosts a hardened usermode memory allocator, named Scudo (https://llvm.org/docs/ScudoHardenedAllocator.html). It aims at providing additional mitigation against heap-based vulnerabilities, while maintaining good performance. It leverages sanitizer_common code, and provides allocation primitives via the usual C/C++ functions.

Up until now, Scudo was mostly used (as far as I can tell) by linking the library (dynamically or statically) to binaries, thus overriding the platform’s C/C++ library allocation functions.

A new usage scenario has emerged: replacing the actual libc allocator on a platform (namely Fuchsia). The current organization of the code, and some design choices, do not fit the requirements for such a use case (as expressed by Fuchsia, but legitimate points for all uses): the code should be “production ready”, ideally small and self-contained, and carefully reviewed for potential performance impact and, obviously, for security as well. With no disrespect intended towards sanitizer_common, this can’t be the case with such a dependency.

After multiple discussions with the stakeholders, a standalone (i.e., no sanitizer_common dependency) version of Scudo appeared to be the solution that would allow us to move forward. This means rewriting the parts of sanitizer_common that are currently used by Scudo (thus: some code duplication).

An early plan was to move to our own googlesource repository (or the like), but Chandler suggested we stay in compiler-rt, as a separate directory that could be a slice of the new git monorepo. This appeared to be acceptable to everybody involved (Chandler, Kostya S., Petr, Roland, Julia), and is now the plan of record. Once the standalone version is in, the non-standalone Scudo will likely be deprecated, although this part hasn’t been formalized yet.

We recognize the full implications of the decision in terms of feature sharing with sanitizer_common (or lack thereof), potential further duplication, etc., but the benefits outweigh the disadvantages.

We are time constrained, and I would like to start committing code as soon as possible, but I am open to hearing opinions and feedback about the plan.

Thank you for reading,

Kostya

So now we have a project that is loosely coupled to LLVM, can be built without any other parts of LLVM, can be built as part of another shared library that takes no dependencies on LLVM, will have contributors that do not contribute to the rest of LLVM, but requires a clone of the entire LLVM project to be able to access?

This is precisely the use case that caused many objections to the everything-in-the-monorepo model.

David

FWIW, my reasoning for suggesting keeping it in compiler-rt is that I somewhat wish we had a more general structure like runtimes with all of our runtimes in it.

Maybe we will get there if/when we can restructure things easily (post git migration, in whatever form it takes).

For now, it just didn’t seem worth the cost of adding a repository sibling to compiler-rt, libcxx, libcxxabi just for Scudo, so I’d toss it under compiler-rt. If that causes problems, we can always create a repo for it. Mostly, I suspect a separate repo would complicate the GitHub migration scripts.

-Chandler

I don’t think this is the thread to re-litigate this. We have plenty of examples in both directions, and we should instead focus on evaluating how effective the technical approaches to minimizing the cost for standalone runtime libraries like this are with the prototype monorepo, and whether that is adequate.

Yep, I’d rather have Scudo in compiler-rt, as Kostya K. suggests, than not have it at all.
It’s unfortunate that we have to essentially fork parts of compiler-rt into a separate directory (still in compiler-rt),
but none of the alternatives we looked at are any better.

FWIW, partial-clone support in git is under active development.
git clone --filter was introduced in 2.19 with further improvements in
2.20. It will take a while for it to be fully fleshed out and for support
to be added to public servers, but in time this should address concerns
about the large clones needed for small sub-projects.
<https://www.git-scm.com/docs/partial-clone>
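
As a rough sketch of what that could look like (an illustration only, not something tried against the LLVM servers): the script below builds a tiny stand-in repository locally, makes a blob-less partial clone of it, and populates only one sub-directory. The repository layout, the file names, and the two uploadpack settings on the stand-in "server" are all assumptions made for the demo.

```shell
set -eu
work=$(mktemp -d)

# Stand-in "monorepo" (hypothetical layout, only for this demo).
git init -q "$work/server"
cd "$work/server"
mkdir -p compiler-rt/lib/scudo llvm
echo "allocator" > compiler-rt/lib/scudo/README
echo "everything else" > llvm/README
git add .
git -c user.name=demo -c user.email=demo@example.com commit -q -m "initial import"
# The serving side has to opt in to filtered fetches.
git config uploadpack.allowFilter true
git config uploadpack.allowAnySHA1InWant true

# Partial clone: commits and trees are fetched now, blobs only on demand.
git clone -q --filter=blob:none --no-checkout "file://$work/server" "$work/client"
cd "$work/client"

# Restrict the working tree to the sub-project before populating it.
git config core.sparseCheckout true
mkdir -p .git/info
echo "compiler-rt/lib/scudo/" > .git/info/sparse-checkout
git read-tree -mu HEAD

ls compiler-rt/lib/scudo
```

The full history is still downloaded, but file contents outside the chosen directory are not, which is where most of the size of a large repository tends to be.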

Best,

Alex