Round Table about CAS and Compiler Caching in 2022 LLVM Dev Mtg

We had a pretty good turnout for the round table, which followed the tech talk about how we use a CAS in clang to do compilation caching, along with other examples and prototypes. Attendees were from Apple, Google, Sony, Nintendo, and Meta.

Object format

There was a lot of interest in the CAS object file format presented in the talk. In addition to the information from the talk, we discussed several prototypes of CAS ObjectFile formats that use different abstractions and show similar results. The object file format presented in the talk is produced from the MC layer and only works for MachO objects, but we also have prototypes built on JITLinkGraph that support most of ELF. You can find all of them in our experimental branch.
We have also done many experiments with debug information, which we didn’t have time to talk about. The long-term goals are:

  • Possibility for an LLVM CAS ObjectFile that is optimized for size and speed for all targets (since only the linked product needs to be in MachO/ELF/COFF)
  • Optimized debug info for CAS ObjectFile
  • Teach all tools to understand CAS artifacts (reduced disk traffic, only need to materialize necessary bits)

Clang Caching

  • Q: How does it handle something like the __TIME__ macro? A: Because we have compiler integration, we can skip caching the output object file when the lexer hits the macro. This is a more user-friendly approach than forbidding use of the macro entirely. If you define it to a fixed value, as some projects do, the output will be cached as expected.
  • Q: How does it compare with Bazel, or work with Bazel for distributed builds? A: Bazel internally hashes everything, and it would prefer that no worker can see the entire source tree. Q: How does this work with the dependency scanning model we presented? A: We designed the system with build system integration in mind. We expose an API for the build system to drive the scanning. In our example we chose accurate dependency scanning with the full source tree, which also provides a more accurate dependency graph for distributed builds. It is definitely possible to work with a less accurate but more distributed dependency scanning model; the tradeoff is fewer cache hits. This dependency scanning model is also what we use for clang modules, so it is a similar problem to solve.
  • Q: How do we verify our cached result is correct? A: First of all, we want the cache design to be sound by default: if the code follows some very basic rules (for example, all file accesses go through the file manager and nothing reads environment variables during clang -cc1), it can’t have caching problems. We materialize the entire compilation from an immutable source, so if some inputs are not captured, either you will hit an error or those inputs will have no effect on the output. We do have some ideas for more expensive guardrails to catch violations of those rules, such as using a sandbox. Secondly, we depend on the compiler being deterministic. We have run into some nondeterminism problems and will have to address them one by one.
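
To make the answers above concrete, here is a minimal Python sketch of the idea, not the actual LLVM implementation. Every name in it (ToyCAS, Snapshot, CompileCache, toy_compile) is hypothetical, and a substring check for __TIME__ stands in for the lexer noticing the macro. It shows two properties: the compile step can only read inputs that were captured in an immutable snapshot, and a compilation that touches __TIME__ is simply not stored in the cache.

```python
import hashlib


class ToyCAS:
    """Toy content-addressable store: an object's ID is the hash of its bytes."""

    def __init__(self):
        self._objects = {}

    def store(self, data: bytes) -> str:
        object_id = hashlib.sha256(data).hexdigest()
        self._objects[object_id] = data
        return object_id

    def load(self, object_id: str) -> bytes:
        return self._objects[object_id]


class Snapshot:
    """Immutable view of the captured inputs (path -> CAS ID)."""

    def __init__(self, cas, files):
        self._cas = cas
        self._ids = {path: cas.store(data) for path, data in sorted(files.items())}

    def read(self, path: str) -> bytes:
        # An input that was never captured is an error, not a silent miss.
        if path not in self._ids:
            raise FileNotFoundError(f"{path} was not captured as an input")
        return self._cas.load(self._ids[path])

    def cache_key(self, command: str, main_file: str) -> str:
        described = repr((command, main_file, list(self._ids.items())))
        return hashlib.sha256(described.encode()).hexdigest()


def toy_compile(snapshot, main_file):
    """Stand-in for a compile job. Returns (output bytes, cacheable flag)."""
    source = snapshot.read(main_file)
    cacheable = b"__TIME__" not in source  # assumption: models the lexer check
    return b"obj(" + source + b")", cacheable


class CompileCache:
    def __init__(self, cas):
        self.cas = cas
        self._results = {}  # cache key -> CAS ID of the output
        self.hits = 0

    def run(self, command, files, main_file):
        snapshot = Snapshot(self.cas, files)
        key = snapshot.cache_key(command, main_file)
        if key in self._results:
            self.hits += 1
            return self.cas.load(self._results[key])
        output, cacheable = toy_compile(snapshot, main_file)
        if cacheable:
            self._results[key] = self.cas.store(output)
        return output


cache = CompileCache(ToyCAS())
plain = {"a.c": b"int x;"}
timed = {"t.c": b"const char *built = __TIME__;"}
for _ in range(2):
    cache.run("cc -c", plain, "a.c")  # the second run is a cache hit
    cache.run("cc -c", timed, "t.c")  # never stored, so never a hit
```

In this sketch the cache key is derived only from the snapshot and the command, so anything the compile step can observe is already part of the key.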

Future

We have a goal to move towards a finer-grained caching model. We are not talking about a caching model like ThinLTO caching, where we run a hasher on all the inputs that we know will affect the output. That approach is very error prone, because any input the hasher misses can lead to an incorrect cache hit. Instead, we would like to model different parts of the compiler as pure computations that we can cache. For example, for ThinLTO, we can materialize the LTOBackend/ThinLTOCodeGenerator and all of its inputs from the CAS, so we can make sure the cache key we use is always sound. The same mechanics can be applied to other kinds of computation too.
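
As a rough illustration of the "pure computation" model, the Python sketch below caches an action whose only inputs are objects loaded from a toy CAS. All names (cas_store, run_cached, toy_backend) are hypothetical and do not correspond to LLVM APIs; toy_backend merely stands in for something like a ThinLTO backend step. Because the function receives nothing except the bytes behind the listed IDs, the key built from those IDs covers every input by construction.

```python
import hashlib

objects = {}       # toy CAS: object ID -> bytes
action_cache = {}  # action key -> object ID of the result
executions = []    # names of actions that actually ran (for illustration)


def cas_store(data: bytes) -> str:
    object_id = hashlib.sha256(data).hexdigest()
    objects[object_id] = data
    return object_id


def run_cached(name, fn, input_ids):
    """Run fn on inputs materialized from the CAS, caching by input IDs."""
    key = hashlib.sha256("\0".join([name, *input_ids]).encode()).hexdigest()
    if key not in action_cache:
        inputs = [objects[i] for i in input_ids]  # the only data fn can see
        executions.append(name)
        action_cache[key] = cas_store(fn(*inputs))
    return objects[action_cache[key]]


def toy_backend(module: bytes, summary: bytes, options: bytes) -> bytes:
    """Hypothetical stand-in for a backend/codegen step."""
    return b"|".join([b"codegen", options, module, summary])


module = cas_store(b"module A")
summary = cas_store(b"import summary for A")
first = run_cached("backend", toy_backend, [module, summary, cas_store(b"-O2")])
again = run_cached("backend", toy_backend, [module, summary, cas_store(b"-O2")])
other = run_cached("backend", toy_backend, [module, summary, cas_store(b"-O0")])
```

Here the second call is served from the cache, while changing the options object produces a different key and a fresh execution.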

Roadblocks

We need code reviewers for this new feature. It can be hard, as we are painting a very big picture here and the first step is adding a new library to LLVM. We have tried to split up the patches and provide good documentation. Please ping us if you are interested and want to know more!

If you are interested in helping with code reviews:
VirtualOutputBackend: D133504 Support: Add vfs::OutputBackend and OutputFile to virtualize compiler outputs
CAS (initial patch): D133716 [CAS] Add LLVMCAS library with InMemoryCAS implementation

If you want to try our prototypes:
