Hello Clang and MLIR folks, this RFC proposes CIR, a new IR for Clang.
TL;DR: We have been working on an MLIR based IR for Clang, currently called CIR (ClangIR, C/C++ IR, name-it). It’s open source by inception and we’d love to upstream it sooner rather than later. Our current (and initial) goal is to provide a framework for improved diagnostics for modern C++, meaning better support for coroutines and checks for idiomatic uses of known C++ libraries. The design has grown out of implementing a lifetime analysis/checker pass for CIR, based on the C++ lifetime safety paper. C++ high level optimizations and lowering to LLVM IR are highly desirable but are a secondary goal for now, unless, of course, we get early traction and interested members in the community to help :).
Motivation
In general, Clang’s AST is not an appropriate representation for dataflow analysis and reasoning about control flow. On the other hand, LLVM IR is too low level: it exists at a point in which we have already lost vital language information (e.g. scope information, loop forms and type hierarchies are invisible at the LLVM level), forcing a pass writer to attempt reconstruction of the original semantics. This leads to inaccurate results and inefficient analysis, not to mention the Sisyphean maintenance work given how fast LLVM changes. Clang’s CFG is supposed to bridge this gap but isn’t ideal either: it is a parallel lowering path for dataflow diagnostics that (a) is discarded after analysis, (b) has lots of known problems (check out Kristóf Umann’s great survey regarding “dataflowness”) and (c) has testing coverage for CFG pieces not quite up to LLVM’s standards.
We also have the prominent recent success stories of Swift’s SIL and Rust’s HIR and MIR. These two projects have leveraged high level IRs to improve their performance and safety. We believe CIR could provide the same improvements for C++.
Case study: C++ Coroutines
Coroutines are a complex C++ feature and exemplify quite well the consequences of lacking a higher level IR between Clang’s AST and LLVM IR. The interesting code generation parts are done at the LLVM IR pass level, where necessary correctness work occurs; this is non-ideal, but there’s no better abstraction layer for this work yet! For a quick example, proper handling of symmetric transfer requires the presence of lifetime intrinsics which might not be available in a given LLVM IR bitcode file. It’s also common to hit subtle bugs given that these coroutine passes in LLVM are intermixed with other transformations which do not necessarily understand coroutine intrinsics or properly consider frame allocation.
On the coroutines code analysis and diagnostics axis, we’ve attempted to combine both clang-tidy’s AST matchers and Clang’s CFG to reason about control-flow and lifetime. However, we have run into CFG liveness accuracy issues, limited interprocedural capabilities and aliasing problems. This raised the question of whether improving the current tools is enough to cover the problems we care about within our codebase. An example of C++ coroutines usage (using folly::coro) we’d like to diagnose:
folly::coro::Task<int> byRef(const std::string& s) {
  // do something with 's' ...
  co_return 0;
}
folly::coro::Task<void> sillycoro() {
  std::optional<folly::coro::Task<int>> task;
  {
    std::vector<std::string> v = {"foo", "bar", "baz"};
    task = byRef(v[0]);
  } // vector 'v' is destroyed here; references to it are dangling.
  // do something with 's' effectively runs here.
  folly::coro::blockingWait(std::move(task.value()));
  co_return;
}
Additionally, something like CIR has been mentioned multiple times in hall chats, round tables, etc. The community interest in such an IR is a big motivator for us, and perhaps we can build something together. Two recent examples of such discussions are the Discourse threads on HLSL support and Polygeist incubation.
Goals
The situation described above prompted us to look into other solutions and revisit existing limitations of our Clang based tooling. This led to two main goals:
- Enable better diagnostics for correctness, security and performance.
  - Security / Bugs: The Google Chrome team notes that around 70% of their high-severity security bugs are memory unsafety problems, half of which are use-after-free bugs. Using std::optional to illustrate, CIR could introduce instructions for optional derefs (cir.std.optional.deref) and diagnose them as harmful if they are not dominated by the check on whether the object contains a value (cir.std.optional.has_value).
  - Performance-driven diagnostics: expensive and potentially unintended C++ copies could be diagnosed using CIR by a compiler pass that consumes profile information and emits remarks over interesting copy ctor usage.
  - Privacy: CIR could be used to check const-ness out of selected code paths, and to provide rich dataflow information on data access.
- Pave the way to CIR high-level transformations for optimizations.
  - Recognizing idiomatic C++ usage could allow tools to suggest more elaborate source code modifications, e.g. CIR based code modification tools could suggest rewriting a range-based for into a loop form to better fit existing vectorizers. This is already a step towards using CIR transformations for optimization purposes.
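To make the std::optional case from the list above a bit more concrete, here is a small C++ sketch of the pattern such a check would target. The function names are made up for illustration, and the cir.std.optional.* operations are only proposed above, so the comments describe what a pass could do rather than what exists today.

```cpp
#include <cstddef>
#include <optional>
#include <string>

// Hypothetical helper: the dereference is NOT dominated by a has_value()
// check, so a CIR-based pass could flag it. Calling this with an empty
// optional is undefined behavior.
std::size_t unchecked_size(const std::optional<std::string> &name) {
  return name->size(); // optional deref with no dominating check
}

// Hypothetical helper: every path that reaches the dereference goes through
// the has_value() test first, so no diagnostic would be expected here.
std::size_t checked_size(const std::optional<std::string> &name) {
  if (!name.has_value())
    return 0;
  return name->size(); // deref dominated by the has_value() check
}
```

Only checked_size is safe to call with an empty optional; the point of the diagnostic would be to surface the first form at compile time.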
How do we get there:
- Provide a way to express a contract between libraries and the compiler. Compiler passes operating on CIR can rewrite parts of code with more domain specific CIR operations and types, allowing them to naturally recognize idioms and apply C++ aware code analysis.
- Cross translation unit (CTU) analysis. We’ve had simple bugs in production that could have been avoided by CTU analysis capabilities. Even though there are AST based approaches, we believe CIR is also a more natural place to move forward with this type of technology. It’s feasible to imagine something like ThinLTO summaries for CIR, enabling the propagation of lifetime, initialization, etc for both improved diagnostics and performance.
We are currently putting most of our effort into C/C++ (mostly C++) given our codebase demands. We plan to work on ObjC in the future, and would be happy to collaborate with contributors on this.
Related work
Several of the ideas presented in the Goals section are not new and some have even already been implemented, including an MLIR based IR for Clang and similar. Not only have these projects moved the needle with tooling and overall compiler quality for C++, but they were also important in showing what tooling, features and bug mitigation the C++ community cares about. Let’s go over some of them and explain why we still think we need CIR:
- CIL: “a common MLIR dialect for Fortran/C/C++”, presented at the LLVM Developers’ Meeting in 2020 and open sourced in early 2021. This project seems promising, but it’s focused on optimizations (it’s not clear how much it does for diagnostics). To the best of our knowledge, CIL is designed around unstructured control flow and lacks a git history and upstreaming efforts, which are non-starters for us. Unfortunately, when we tried to reach out to clarify, we didn’t hear back.
- Polygeist: “is a new C and C++ frontend and compilation flow that connects the MLIR compiler infrastructure to cutting edge polyhedral optimization tool”. This project emits lower level dialects as well as its own custom dialect, polygeist. ClangIR slots above this dialect in the lowering hierarchy and we believe these two projects could complement each other. Polygeist is in the process of being incubated under the LLVM umbrella; perhaps CIR can be a bridge between Clang’s AST and polygeist goodness.
- Clang Dataflow framework: “a new static analysis framework for implementing Clang bug-finding and refactoring tools (including ClangTidy checks) based on a dataflow algorithm”. We are interested in similar goals and checks (e.g. stuff along the lines of the std::optional example), but it is not a complete fit since part of our goal is to use the same representation to apply transformations and codegen LLVM at some point.
- Clang’s Cross Translation Unit (CTU) Analysis is AST based (can be used with PCHs) and is a perfect fit for the current usage of analysis tools in Clang. As mentioned before, the AST representation limits the analysis potential.
C++ is hard. We are not going to solve all problems in the first year. But we do strongly believe that this is the way of the future.
Design decisions
Why MLIR?
MLIR provides a solid and tested framework to build custom IR and run passes. Among several other capabilities, it’s also already in tree and used by Flang to build FIR.
High level language semantics
High level C/C++ semantics are better represented with custom and specific operations. A lifetime checker for C++, based on the lifetime safety paper (P1179) by Herb Sutter, is a great example of an analysis that can greatly benefit from these richer operations. The design choices for the current form of CIR are mostly driven by elements that make a lifetime checker easy to express in the compiler.
Two examples to illustrate how such operations help:
- Scopes: cir.scope defines a new MLIR region in CIR, which closely represents opening a new scope in C/C++. This means that:
  - New local (scope) variables (allocated by cir.alloca) are always found in the closest cir.scope region’s entry block. The lifetime of resources ends at the end of the enclosing cir.scope region.
  - A points-to analysis on structured control-flow can rely on these properties to reason about a resource’s lifetime in a more natural way than a CFG with lifetime intrinsics. Further dialect lowering could unwrap cir.scopes if desirable (e.g. before emitting the LLVM IR dialect).
  - In the example below, note how x is declared under the equivalent cir.scope. Check our implemented lifetime checker pass around CIR for more information.
int *may_explode() {
  int *p = nullptr;
  {
    int x = 0;
    p = &x;
    *p = 42;
  }
  *p = 42; // oops...
  ...
}

func @may_explode() -> !cir.ptr<i32> {
  %p_addr = cir.alloca !cir.ptr<i32>, cir.ptr <!cir.ptr<i32>>, ["p", cinit]
  ...
  cir.scope {
    // int x = 0;
    %x_addr = cir.alloca i32, cir.ptr <i32>, ["x", cinit]
    ...
    // p = &x;
    cir.store %x_addr, %p_addr : !cir.ptr<i32>, cir.ptr <!cir.ptr<i32>>
    ...
    // *p = 42
    cir.store %forty_two, %x_addr : i32, cir.ptr <i32>
    %p = cir.load deref %p_addr : cir.ptr <!cir.ptr<i32>>, !cir.ptr<i32>
    ...
  } // 'x' lifetime ends, 'p' is bad.
  // *p = 42
  %forty_two = cir.cst(42 : i32)
  %dead_x_addr = cir.load deref %p_addr : cir.ptr <!cir.ptr<i32>>, !cir.ptr<i32>
  // attempt to store 42 to the dead address
  cir.store %forty_two, %dead_x_addr : i32, cir.ptr <i32>
}
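As a rough illustration of why scope-structured allocas make this kind of check natural, here is a toy C++ model of the may_explode example. It is only a sketch under simplifying assumptions: the Op, Kind and find_dangling_derefs names are invented here, each pointer tracks a single pointee, variable names are assumed unique, and none of this reflects how the actual CIR lifetime checker pass is implemented.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy operations, invented for illustration (not CIR's data structures).
enum class Kind { ScopeBegin, ScopeEnd, Alloca, StoreAddr, Deref };

struct Op {
  Kind kind;
  std::string a; // Alloca: variable; StoreAddr and Deref: pointer
  std::string b; // StoreAddr: variable whose address is stored
};

// Returns the pointers that are dereferenced while pointing at a variable
// whose enclosing scope has already ended.
std::vector<std::string> find_dangling_derefs(const std::vector<Op> &ops) {
  std::vector<std::vector<std::string>> scopes(1); // function-level scope
  std::set<std::string> dead;
  std::map<std::string, std::string> points_to;
  std::vector<std::string> bad;
  for (const Op &op : ops) {
    switch (op.kind) {
    case Kind::ScopeBegin:
      scopes.emplace_back();
      break;
    case Kind::ScopeEnd:
      if (scopes.size() > 1) {
        // Every variable allocated in the closing scope dies here.
        for (const std::string &v : scopes.back())
          dead.insert(v);
        scopes.pop_back();
      }
      break;
    case Kind::Alloca:
      scopes.back().push_back(op.a);
      break;
    case Kind::StoreAddr:
      points_to[op.a] = op.b;
      break;
    case Kind::Deref: {
      auto it = points_to.find(op.a);
      if (it != points_to.end() && dead.count(it->second))
        bad.push_back(op.a);
      break;
    }
    }
  }
  return bad;
}

// The may_explode example expressed with the toy operations.
std::vector<Op> may_explode_ops() {
  return {
      {Kind::Alloca, "p", ""},     // int *p
      {Kind::ScopeBegin, "", ""},  // {
      {Kind::Alloca, "x", ""},     //   int x
      {Kind::StoreAddr, "p", "x"}, //   p = &x
      {Kind::Deref, "p", ""},      //   *p = 42 (fine, 'x' is alive)
      {Kind::ScopeEnd, "", ""},    // }
      {Kind::Deref, "p", ""},      // *p = 42 (dangling)
  };
}
```

Running find_dangling_derefs(may_explode_ops()) reports only the second dereference of p, since x is marked dead when its scope closes.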
- Loops: cir.loop represents loops from C/C++. The loop form (for/while/do-while, and soon range-based for) must be explicitly provided to the operation. It also encompasses three different regions (condition, step and body) and must be enclosed by a cir.scope, where all possible init-statement declarations hold their cir.allocas. This representation has interesting effects on diagnostics quality:
  - Accuracy: the form dictates the order in which the regions are executed when implementing MLIR’s RegionBranchOpInterface, allowing different MLIR based passes to retrieve the appropriate order for the regions/blocks that are relevant for a particular loop. For example, the order in which regions are processed is different between a do-while and a while (body then condition versus condition then body).
  - Extra analysis capabilities: the same mechanism that allows better accuracy can be combined with SCCP information to find more constants and reduce the number of regions to be analyzed.
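The point about loop forms dictating region order can be sketched in a few lines of C++. This is a toy model: LoopKind and first_iteration_order are hypothetical names, and the real cir.loop operation and its RegionBranchOpInterface implementation may look quite different.

```cpp
#include <string>
#include <vector>

// Hypothetical loop forms, mirroring the ones listed above.
enum class LoopKind { For, While, DoWhile };

// Order in which a pass would visit the loop regions for the first
// iteration. Init-statements are assumed to live in the enclosing scope.
std::vector<std::string> first_iteration_order(LoopKind kind) {
  switch (kind) {
  case LoopKind::DoWhile:
    return {"body", "cond"}; // body runs before the condition is tested
  case LoopKind::While:
    return {"cond", "body"}; // condition guards the first iteration
  case LoopKind::For:
    return {"cond", "body", "step"};
  }
  return {};
}
```

An analysis that walks regions in this order sees a do-while body as unconditionally executed at least once, which a while body is not.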
Structured control flow
Clang’s current codegen for CIR assumes a mostly structured control flow: gotos are only supported intra-scope right now. C/C++ return statements are represented with cir.return, while break and continue are special forms of cir.yield, an operation that represents returning control to the parent of a region. This has some advantages for code analysis (like lifetime) since it makes control-flow simple.
No inherent property of CIR prevents unstructured control flow from being used. It can be done by implementing a pass that flattens the CFG by merging scopes and moving allocas back to the function entry block. This effectively makes all gotos intra-scope now that the function-level scope is the only one present. Any transformations or analysis that then prefer to work on such representations (like lowering to the LLVM IR dialect) can add this pass as required. This is probably the route we are taking when we get there.
Dialect transformations
One great aspect of using MLIR is the ability to easily write transformations. This allows CIR to be morphed into an even higher level CIR (let’s say, when recognizing idioms for C++ containers) or lower CIR (when merging scopes for LLVM IR dialect generation). We mentioned in the Goals section the plan to improve diagnostics on top of C++ library usage, and CIR dialect transformations are the way to get there.
Clang’s codegen from AST to CIR is straightforward, using a subset of CIR, and it’s up to compiler options or extra tools to set up the necessary pass pipelines to achieve the desired CIR form for analysis. For instance, consider C++ lambdas: the naive codegen uses a method call to the appropriate internal struct callable. For lifetime analysis, a required transformation could inject a cir.lambda operation at the original definition location, making it easier for the lifetime check pass to reason about captures and scope.
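For a feel of the kind of capture problem a lifetime pass would want to see at the lambda’s definition site, here is a small C++ sketch. The function names are invented for illustration; the first function shows the pattern to diagnose and must not have its result called, the second is the safe by-value variant.

```cpp
#include <functional>

// Pattern to diagnose: the closure captures 'counter' by reference, but
// 'counter' dies when this function returns. Invoking the returned
// callable would be undefined behavior.
std::function<int()> make_dangling_counter() {
  int counter = 0;
  return [&counter] { return ++counter; };
}

// Safe variant: the closure owns its own copy of 'counter'.
std::function<int()> make_counter() {
  int counter = 0;
  return [counter]() mutable { return ++counter; };
}
```

With the naive struct-and-method-call lowering, the link between the reference capture and the local’s scope is harder to recover, which is the motivation given above for a dedicated operation.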
Verifiers
MLIR mechanisms for operation verification have been useful to codify the semantics of CIR. The implemented verifiers for CIR operations cover things like operand/result type matching and placement of certain operations (e.g. breaks need to be dominated by cir.loop or cir.switch). We have tests to exercise invalid constructs and the verifiers.
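The placement rule for breaks can be illustrated with a toy check in C++. This is a sketch only: Node and breaks_are_enclosed are hypothetical, the tree stands in for nested regions, and "enclosed by" is used as a simplification of the placement requirement; the real verifiers are written against MLIR’s verification hooks.

```cpp
#include <string>
#include <vector>

// Toy operation tree: children are the ops nested in this op's regions.
struct Node {
  std::string name; // e.g. "cir.loop", "cir.scope", "break"
  std::vector<Node> children;
};

// Returns true when every "break" has an enclosing cir.loop or cir.switch.
bool breaks_are_enclosed(const Node &node, bool inside_breakable = false) {
  if (node.name == "break" && !inside_breakable)
    return false;
  bool breakable = inside_breakable || node.name == "cir.loop" ||
                   node.name == "cir.switch";
  for (const Node &child : node.children)
    if (!breaks_are_enclosed(child, breakable))
      return false;
  return true;
}
```

A break nested under a cir.scope inside a cir.loop passes, while a break directly under a function-level cir.scope is rejected.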
Status and Plans
CIR started towards the end of 2021; information about the GitHub repo, building instructions and more can be found at clangir.org.
The project is in early stages and we’re currently working on (a) the heavy lifting required to codegen CIR out of C++ sources from our codebase and (b) completing the lifetime checker, focusing on coroutines linting. Once we are ready to build full C++ projects, we also intend to measure compile time and memory usage, and check how it compares to other existing tools.
Open source & Upstream
This RFC proposes to incorporate CIR into the llvm-project as soon as possible, and we are ready to make any needed changes to make that happen. The early stages of development are especially attractive since the project could benefit from multiple eyes and encourage potentially interested parties to join the effort early. This is highly dependent on community buy-in and we also understand if some level of maturity is desired first.
Project Layout
The layout so far is divided into a few main pieces:
- Clang codegen bits in clang/lib/CIR: this mostly mimics the file/class layout of LLVM codegen. To leverage years of codegen improvements and fixes, we carefully try to track currently out-of-scope features with assertions and feature guarding whenever we don’t have needed data to assert. This has proven helpful when incrementally adding codegen pieces.
- The CIR dialect in mlir/lib/Dialect/CIR. This is different from what FIR does (dialect is part of Flang). We’d prefer it to live alongside its other dialect buddies but we can see pros/cons either way, and we are open to discussion.
- Testing: All tests are currently under clang/test/CIR, and divided into CodeGen, IR, IRGen and Transforms. There are several flavors right now: C++ to warnings, C++ to CIR, CIR to CIR and CIR load/store to memref.
Technical Debt
If the community thinks it’s a good idea to incorporate this sooner rather than later, here is some known technical debt we need to tackle:
- CMake: the project is currently hardcoded to link mlir into clang. This should be an extra build flag to compile CIR optionally.
- Dependencies: there’s a cycle between clang and CIR, not really needed and easy to break.
- Testing: some tests should probably live inside mlir/test.
Other possible initial improvements:
- AST helpers: many LLVM codegen helper functions rely on AST queries that do not depend on LLVM IR; these can be factored out into common helpers.
- MLIR: the canonicalizer is a bit aggressive for code analysis usage since it may remove operations we might want to keep around for diagnostic purposes. We currently work around this by adding our own rewrites, but perhaps we can add a new GreedyRewriteConfig mode for that.
- Support a CIR version of Clang’s analysis based warnings, such as -Wunreachable, -Wunused, etc.
Thank you for reading,
Bruno Cardoso Lopes <bruno.cardoso@gmail.com> (@bcardosolopes)
Nathan Lanza <nathanlanza@gmail.com> (@lanza)
Special thanks to Nadav Rotem, Eric Garcia and Shoaib Meenai for the support. And thank you for early feedback from Jez Ng, Nicholas Ormrod, Wenlei He, Puyan Lotfi, Ivan Murashko, Han Zhu, Matthias Braun and Yedidya Feldblum.