This is a proposal for some changes to LLVM’s instruction movement/insertion APIs to better describe the movement of debug-info. In order to remove debug intrinsics from IR we need to first improve the expressiveness of the instruction-moving APIs. This RFC proposes a way to achieve that, with a couple of remaining question marks.
A while back in “Prototyping a not-an-instruction dbg.value” I suggested a pathway towards storing variable-location debug-info in something that wasn’t an intrinsic. The benefits are faster -g compiles and more robust compilation, as the presence of debug intrinsics can interfere with optimisations. We’ve got a prototype [1] that can build a clang-3.4 binary with no dbg.value intrinsics and produces identical object files [2]. It’s not a complete solution, but we reckon it exposes all of the important design problems.
Today I’d like to focus on a single issue: the fact that debug intrinsics have instruction iterators (BasicBlock::iterator) which are functionally important to debug-info, and they won’t exist if we cease using intrinsics. We currently consider any instruction to be “a position in the function”, where a debug-info intrinsic is a position in between two “real” instructions. Without debug intrinsics, we won’t be able to describe “in between” positions any more; we would have to attach the debug-info to instructions themselves, which raises more issues. To illustrate, consider this block:
block:
  dbg.value(...)
  %1 = add...
  dbg.value(...)
  %2 = sub...
  br label %exit
In today’s LLVM, each instruction has an iterator, therefore we can describe any range of “real” instructions and debug-info instructions to be spliced from one block to another. However, if the debug-info was attached to instructions, we could only have iterators that point at the add, the sub and the br. That’s strictly less expressive than what we have today, and this causes meaningful differences in debug-info.
I’ve got two problems to illustrate with examples: the movement of instructions with debug-info attached, and inserting new instructions at the start of blocks.
Instruction::moveBefore can be used to move any instruction from one place to another, without any consideration for debug-info. This isn’t a problem today: if code moves a single instruction then it usually doesn’t want debug-info to move too, while code like moveBBContents [3] just moves every instruction from one place to another, including debug instructions. However, if we attach debug-info to instructions, moveBefore would become responsible for moving the debug-info, and whether it should move or not can be ambiguous. Consider today’s IR:
bb1:
  dbg.value(...)
  %foo = add %1, %2
  %bar = sub %3, %4
Assume that we can attach the dbg.value information to the add instruction in some way. Now imagine that we called moveBefore, moving the add instruction to another block. If, for example, we are hoisting the add during CSE then the debug-info should not move with the instruction; it should stay in the source block. However, if we’re in a loop like moveBBContents [3] moving all instructions in a block, then it’s necessary for moveBefore to move the debug-info to the destination block. Currently, there’s no way in LLVM to distinguish the two use cases.
The other scenario is inserting one instruction in front of an existing instruction. Should debug-info attached to the existing instruction come before or after the new instruction? We can describe both when debug-info has iterators, but will be forced to pick one if debug-info is attached to instructions. Unfortunately there is no one-size-fits-all solution: in the majority of situations, LLVM today inserts instructions after nearby debug-info intrinsics. However, there are other situations where the opposite is needed: for example, in MergeBasicBlockIntoOnlyPred [4], where one block is spliced into another, the predecessor block’s instructions should always come before the successor’s debug-info, because the blocks are effectively concatenated. Today, this is unambiguous because getFirstInsertionPt will return iterators to debug-info intrinsics. However, if we don’t use intrinsics for debug-info, there is no means to describe inserting before the debug-info.
There’s a fairly simple solution for the first problem, which is to introduce a specialised moveBefore method that moves debug-info too, and update all the call sites in LLVM that need that behaviour. This is achievable as there are only about twenty call sites that behave in that way. In the prototype linked below the relevant method is called moveBeforePreserving, and we renamed the existing moveBefore to moveBeforeBreaking to make the differences explicit. I think it’s useful to have the developer express the “disposition” of the move they’re making: whether it preserves the flow of instructions and thus should transfer debug-info, or whether it breaks the flow.
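To make the two dispositions concrete, here is a minimal self-contained C++ sketch. It does not use LLVM’s classes: ToyInst, ToyBlock and both free functions are hypothetical stand-ins that only borrow the method names from the prototype. Re-attaching left-behind debug records to the following instruction is an assumption made for illustration, not a description of what the prototype does.

```cpp
#include <cassert>
#include <iterator>
#include <list>
#include <string>
#include <vector>

// Toy stand-in for an instruction that owns the debug records which used to
// sit directly in front of it as dbg.value intrinsics. Not an LLVM type.
struct ToyInst {
  std::string Name;
  std::vector<std::string> DbgBefore;
};

using ToyBlock = std::list<ToyInst>;

// "Preserving" move: the flow of instructions is kept (e.g. moving a whole
// block's contents), so the attached debug records travel with I.
inline void moveBeforePreserving(ToyBlock &Src, ToyBlock::iterator I,
                                 ToyBlock &Dst, ToyBlock::iterator Pos) {
  Dst.splice(Pos, Src, I);
}

// "Breaking" move: only the instruction travels (e.g. hoisting). Assumption:
// its debug records stay in the source block, placed in front of the records
// of the instruction that followed it.
inline void moveBeforeBreaking(ToyBlock &Src, ToyBlock::iterator I,
                               ToyBlock &Dst, ToyBlock::iterator Pos) {
  ToyBlock::iterator Next = std::next(I);
  assert(Next != Src.end() && "toy model: another instruction must follow I");
  Next->DbgBefore.insert(Next->DbgBefore.begin(), I->DbgBefore.begin(),
                         I->DbgBefore.end());
  I->DbgBefore.clear();
  Dst.splice(Pos, Src, I);
}
```

In this toy model a hoisting transform would call moveBeforeBreaking, while a loop that empties one block into another would call moveBeforePreserving on each instruction.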
However, it’s much harder for the insert-at-block-begin example: it’s a much more common code pattern, with between 100 and 200 call sites where this distinction has an actual effect on variable locations. We’ve explored and implemented a solution: store whether an iterator came from getFirstInsertionPt in a bit inside the BasicBlock::iterator object, aka ilist_iterator [5]. This feels bad because it’s taking a value type that’s a single pointer right now and adding more data; however, I think it’s inescapable. Iterators are exactly the right design pattern for storing information about positions in a collection. We could:
- Encode the fact that an iterator was returned by getFirstInsertionPt in the type returned; however, that feels ugly and will produce ugly compile errors,
- Add a flag parameter to instruction insertion methods, which is unergonomic and error prone,
- Have the fact that an iterator was intended to be at the start of the block (before debug-info) signalled at runtime, which requires information in the iterator object.
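The third option, carrying the information in the iterator object, can be pictured with a small toy model. Everything here is hypothetical and independent of LLVM: ToyInsertPt stands in for an iterator with one extra bit, firstInsertionPt is only loosely analogous to getFirstInsertionPt (PHI handling is ignored), and insertAt is an illustrative insertion routine rather than any real API.

```cpp
#include <cassert>
#include <list>
#include <string>
#include <utility>
#include <vector>

// Toy instruction owning the debug records that sit just in front of it.
struct ToyInst {
  std::string Name;
  std::vector<std::string> DbgBefore;
};

using ToyBlock = std::list<ToyInst>;

// A position in a block plus one extra bit: does the position lie ahead of
// the debug records attached to It (true) or after them (false)?
struct ToyInsertPt {
  ToyBlock::iterator It;
  bool BeforeDbg;
};

// Hypothetical accessor: the very start of the block, ahead of any debug
// records, with the bit already set for the caller.
inline ToyInsertPt firstInsertionPt(ToyBlock &BB) {
  return ToyInsertPt{BB.begin(), true};
}

// A position derived from an existing instruction: after its debug records.
inline ToyInsertPt beforeInst(ToyBlock::iterator I) {
  return ToyInsertPt{I, false};
}

// Insert a new instruction at Pos, honouring the bit.
inline ToyBlock::iterator insertAt(ToyBlock &BB, ToyInsertPt Pos,
                                   std::string Name) {
  ToyInst New{std::move(Name), {}};
  // Inserting after the debug records means they now precede the new
  // instruction, so ownership moves from the old instruction to the new one.
  if (!Pos.BeforeDbg && Pos.It != BB.end())
    New.DbgBefore.swap(Pos.It->DbgBefore);
  return BB.insert(Pos.It, std::move(New));
}
```

The point of the sketch is that the call site never passes a flag: it only chooses which accessor produced the position, and the insertion routine reads the bit at runtime.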
It’s the last of these that we’ve used in our prototype: a variety of instruction insertion methods have had overloads added that take iterators, and call sites that use getFirstNonPHI now pass an iterator for the position to insert instructions. This is straightforward; for additional safety the instruction-accepting insertion APIs could be removed, forcing everyone to use iterators when inserting. That’s invasive, but IMHO a reasonable price to pay for clearer management of debug-info.
We’re using two bits in the prototype: because a pair of instruction iterators is a half-open range of instructions, we use the extra bits to signal whether the range is inclusive of the debug-info attached at its start and end. Coming back to our earlier example:
bb2:
  dbg.value(...)
  %1 = add...
  dbg.value(...)
  %2 = sub...
  br label %exit
Imagine we have an iterator range from %1 to %2. If we were to splice that range into another block, today LLVM would transfer the add instruction and the following dbg.value. If we attached debug-info to instructions and didn’t have debug intrinsics, that would be the only transfer we could describe. Thus, the two iterator bits signal:
- For the “first” iterator, whether debug-info attached to the first instruction should be transferred too,
- For the “last” iterator, whether debug-info attached to the last position should be transferred too,
respectively. That allows us to describe almost all the transfers that could be performed when debug-info is an intrinsic: both dbg.values and the add instruction, or just the add instruction by itself. I think it’s noteworthy that with accessors like getFirstInsertionPt returning iterators with the bits set already, we only found two sites in LLVM where those bits need to be manually updated.
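As a rough illustration of how two bits on a half-open range could select between these transfers, here is a toy splice over a std::list. None of it is LLVM code: spliceRange, HeadBit and TailBit are hypothetical names, the bits are passed as plain booleans rather than stored in an iterator, and debug records already attached to the destination position are simply left where they are.

```cpp
#include <cassert>
#include <iterator>
#include <list>
#include <string>
#include <vector>

// Toy instruction owning the debug records that sit just in front of it.
struct ToyInst {
  std::string Name;
  std::vector<std::string> DbgBefore;
};

using ToyBlock = std::list<ToyInst>;

// Put Recs in front of I's existing records, leaving Recs empty.
inline void prependDbg(ToyInst &I, std::vector<std::string> &Recs) {
  I.DbgBefore.insert(I.DbgBefore.begin(), Recs.begin(), Recs.end());
  Recs.clear();
}

// Move the instructions in [First, Last) from Src to just before Pos in Dst.
// HeadBit: debug records attached to First travel with the range.
// TailBit: debug records attached to Last travel too, landing on Pos.
inline void spliceRange(ToyBlock &Dst, ToyBlock::iterator Pos, ToyBlock &Src,
                        ToyBlock::iterator First, bool HeadBit,
                        ToyBlock::iterator Last, bool TailBit) {
  assert(&Src != &Dst && First != Last && Last != Src.end() &&
         Pos != Dst.end() && "toy model: distinct, terminated blocks only");
  if (TailBit)
    prependDbg(*Pos, Last->DbgBefore);   // trailing records follow the range
  if (!HeadBit)
    prependDbg(*Last, First->DbgBefore); // leading records stay in Src
  Dst.splice(Pos, Src, First, Last);
}
```

With the bb2 example, HeadBit=false and TailBit=true reproduces what the text says LLVM does today (the add plus the following dbg.value), true/true moves both debug records, and false/false moves the add alone.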
In terms of performance: adding the bit to BasicBlock::iterator has a small compile-time cost for no-debug-info builds (0.05%) according to [6]. However, this is going to be paid for by eliminating calls to functions like getNextNonDebugInstruction, which recovers almost 0.3% of compile time [7]. This is definitely not the final word on compile-time performance, but I want to indicate the costs aren’t going to be overwhelming. Change in memory usage is negligible too, as we typically don’t store iterators in data structures.
As proof that this technique can work, I’d like to present our prototype [1] (based on llvm-15) as evidence. Please ignore the implementation details of what we’re doing to blocks, instructions and storage of debug-info; those are all very changeable. My major concern is how the instruction API changes, how this fans out into the rest of the compiler, and whether it’s a tolerable burden for other compiler developers. Thus, I think it’s noteworthy that we’re “only” touching 200 files, and the vast majority of changes are changing the spelling of moveBefore or passing an iterator into a splice/insertion method.
There’s an additional problem with eliminating debug-intrinsic iterators: blocks can be temporarily empty (no terminator) but still contain debug instructions, which should then go in front of any terminator added later. Our solution to that so far is to special-case the insertion of a terminator into a block, which is sufficient to fix up any out-of-place debug-info.
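A toy version of that special case might look like the following. This is only a sketch under assumptions: the TrailingDbg field, which holds debug records that have no instruction to attach to, and the addTerminator helper are both hypothetical and say nothing about how the prototype actually stores such records.

```cpp
#include <cassert>
#include <list>
#include <string>
#include <utility>
#include <vector>

// Toy instruction owning the debug records that sit just in front of it.
struct ToyInst {
  std::string Name;
  std::vector<std::string> DbgBefore;
};

// Toy block. Assumption: debug records with no following instruction (for
// example in a block that temporarily has no terminator) are parked here.
struct ToyBlock {
  std::list<ToyInst> Insts;
  std::vector<std::string> TrailingDbg;
};

// Special-cased terminator insertion: any parked debug records are attached
// in front of the new terminator, so they end up in the right place.
inline void addTerminator(ToyBlock &BB, std::string Name) {
  ToyInst Term{std::move(Name), {}};
  Term.DbgBefore.swap(BB.TrailingDbg);
  BB.Insts.push_back(std::move(Term));
}
```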
Feedback most welcome: specifically for the two changes that would be needed for the instruction API:
- The need to use a moveBefore method that explicitly states the intention of the developer wrt. whether debug-info should be moved too.
- Sticking some extra bits in BasicBlock::iterator to signal at runtime whether a position in a block comes before or after the nearby debug-info.
To provide context, the other things we’re working on are the storage model for variable-location information, and what kind of maintenance would need to be applied during optimisations. We haven’t thought about representing these things in bitcode / textual IR and I’d prefer to avoid thinking about it for now.
Cheers to @StephenTozer for ploughing through a lot of this. CC the usual debug-info folks: @adrian.prantl @dblaikie @rnk @pogo59 @cmtice @OCHyams @jryans @slinder1 , although I think this is probably relevant to the interests of all pass-authors.
[1] Publish our prototype "killing debug-intrinsics" diff by jmorse · Pull Request #1 · jmorse/llvm-project · GitHub
[2] CommandLine.cpp in clang-3.4 bakes the build time into itself, all the other files are identical.
[3] llvm-project/IROutliner.cpp at c42eda5d3692fcd67fc3d043ab37f1950ea653b9 · llvm/llvm-project · GitHub
[4] llvm-project/Local.cpp at a33f018b89c07e0728539b34c158e88a7db49982 · llvm/llvm-project · GitHub
[5] Cue gasps of horror from the audience!
[6] LLVM Compile-Time Tracker
[7] LLVM Compile-Time Tracker NOTE: the -g modes are not meaningful as all debug-info is dropped.