tl;dr: Move the language-independent parts of CGAtomic.cpp to a new LLVM library LLVMFrontendAtomic, so Flang can use it as well. Do so by first NFC refactor (Draft PR), then gradually use the same functions that LLVM’s AtomicExpandPass already uses.
Motivation
We need to implement lowering to IR for atomic access in Flang. Unfortunately it is not sufficient to just emit an IR instruction such as cmpxchg
. In particular, this fails on aggregate types such as complex.
Clang’s IR-gen of atomic construct is located in CGAtomic.cpp. Instead of cmpxchg
, it may fallback to __atomic_compare_exchange
. It is implemented in either GNU libatomic or compiler-rt.
Looking into how CGAtomic handles this, the conditions are a lot more complicated. Non-integer types may get coerced to integer values. Non-power-of-two-sized types are not allowed with cmpxchg
. Types with padding. Types larger then 16 always use libcall to avoid excessively large value types. Atomic operations that are composits of multiple other operations, e.g. #pragma omp atomic update
.
Whatever CGAtomic does not lower to a libcall may still be lowered by the AtomicExpandPass or AtomicLower.cpp if the target does not have a machine instruction it can be lowered. One way to look at what CGAtomic does is pre-lowering anything that processors don’t have instructions for anyway.
Proposal
Instead of rediscovering all the performance and correctness pitfalls for Flang, the idea is to move the language-independent parts of CGAtomic into LLVMFrontend, a module explicitly created for language frontends. This means other front-ends may be able to use it. The LLVM Front-end hints page already suggests to emit IR as similar to Clang as possible.
CGAtomic itself might benefit from a refactoring. A class called AtomicInfo
was already refactored-out in a8ec7e, but it does not handle all the atomic accesses. Specifically, AtomicExpr
have their own lowering logic separate and different from AtomicInfo
.
The libcall function calls and declarations emitted by CGAtomic use Clang’s generic ABI lowering mechanisms whereas AtomicExpandPass uses BuildLibCalls.cpp. While the libcall is defined by the C-ABI, this leads to differences in the function attributes, e.g. whether a call is marked as noundef
, convergent
, signextend
, zeroextend
, etc. IMHO such implementation-detail differences whether it was lowered by CGAtomic or AtomicExpandPass should be avoided. If they diverge, only one can be correct, lets make that into LLVMFrontend so non-C frontends can use it as well. BuildLibCalls.cpp also knows about about each individual libcall function and applies nocapture
, noalias
etc. attributes whereas CGAtomic only applies generic attributes for runtime functions.[1]
Approach
I sketched a refactor in Draft extracting CGAtomic to LLVMFrontend by Meinersbur · Pull Request #2 · Meinersbur/llvm-project · GitHub, but only for AtomicCompareExchange so far. It creates a new module LLVMFrontendAtomic
with the main class AtomicInfo
.
To not introduce correctness problems with the refactor, it is meant to be NFC.
Admittedly, this makes it ugly, but in further steps it can be improved:
-
Clang needs to forward individual information about the platform’s ABI (e.g. Calling convention) since Clang uses its own classes such as
clang::TargetInfo
that cannot be moved into LLVM. In contrast, BuildLibCalls.cpp usesllvm::TargetLibraryInfo
. Clang creates it only after its CodeGen phase when running the LLVM pass pipeline, but that could be changed. Creating theTargetLibraryInfo
object has already been refactored in LLVMFrontendDriver, also with Flang in mind. -
CGAtomic uses different codes to generate the
llvm::Value
for use withcmpxchg
compared to when needing a pointer to that value for__compare_atomic_exchange
. I currently implemented that as a lambda that materializes the one actually needed inEmitAtomicCompareExchange
, and a return value that can also be either the value or the pointer. I think for all cases, a pointer should be sufficient and therefore canonical. The raising to a register is a job for Mem2Reg/SROA. -
NFC is by best-effort only, some untested options/targets might still introduce different function/argument attributes. Since the goal is to replace it with BuildLibCalls.cpp anyway, I think this is sufficient.
[EDIT: llvm::AtomicInfo
is not a template anymore. The only customization Clang did was customizing the inserter, which persists in IRBuilderBase
through the IRBuilderDefaultInserter
virtual base class]
Would the community support the goal? If yes, is this the right approach?
LLVM has yet another codebase for handling libcall in RuntimeLibcallUtil.h, but for MachineIR. ↩︎