[RFC] Improving Clang's middle and back end diagnostics

Sorry the delays in responding here; I ended taking some time off of work recently.

I wanted to follow up on @dblaikie 's point:

you could probably do a hack-ish prototype by modifying IRBuilder to only attach the debug location to call instructions. (though all that checking might be expensive & make the comparison not representative… & maybe there’s a cheaper way to do it) - IRBuilder::AddMetadataToInst could check if the instruction is a call first, then inside the loop check if the metadata is debug info (or perhaps record the index of the debug info metadata when it’s added to MetadataToCopy - or store it in a separate member variable entirely for this experiment) and skip adding it if it’s not a call.

So I tried this diff quickly:

diff --git a/clang/include/clang/Basic/CodeGenOptions.def b/clang/include/clang/Basic/CodeGenOptions.def
index 7151fe9d6568..4c9ec1798af1 100644
--- a/clang/include/clang/Basic/CodeGenOptions.def
+++ b/clang/include/clang/Basic/CodeGenOptions.def
@@ -388,7 +388,7 @@ VALUE_CODEGENOPT(SmallDataLimit, 32, 0)
 VALUE_CODEGENOPT(SSPBufferSize, 32, 0)
 
 /// The kind of generated debug info.
-ENUM_CODEGENOPT(DebugInfo, llvm::codegenoptions::DebugInfoKind, 4, llvm::codegenoptions::NoDebugInfo)
+ENUM_CODEGENOPT(DebugInfo, llvm::codegenoptions::DebugInfoKind, 4, llvm::codegenoptions::LocTrackingCalls)
 
 /// Whether to generate macro debug info.
 CODEGENOPT(MacroDebugInfo, 1, 0)
diff --git a/clang/lib/CodeGen/CGDebugInfo.cpp b/clang/lib/CodeGen/CGDebugInfo.cpp
index cd0fece34502..275800224e04 100644
--- a/clang/lib/CodeGen/CGDebugInfo.cpp
+++ b/clang/lib/CodeGen/CGDebugInfo.cpp
@@ -597,6 +597,7 @@ void CGDebugInfo::CreateCompileUnit() {
   llvm::DICompileUnit::DebugEmissionKind EmissionKind;
   switch (DebugKind) {
   case llvm::codegenoptions::NoDebugInfo:
+  case llvm::codegenoptions::LocTrackingCalls:
   case llvm::codegenoptions::LocTrackingOnly:
     EmissionKind = llvm::DICompileUnit::NoDebug;
     break;
diff --git a/llvm/include/llvm/Frontend/Debug/Options.h b/llvm/include/llvm/Frontend/Debug/Options.h
index c490508d3793..6645de40b7dd 100644
--- a/llvm/include/llvm/Frontend/Debug/Options.h
+++ b/llvm/include/llvm/Frontend/Debug/Options.h
@@ -21,6 +21,8 @@ enum DebugInfoKind {
   /// Don't generate debug info.
   NoDebugInfo,
 
+  LocTrackingCalls,
+
   /// Emit location information but do not generate debug info in the output.
   /// This is useful in cases where the backend wants to track source
   /// locations for instructions without actually emitting debug info for them
diff --git a/llvm/include/llvm/IR/IRBuilder.h b/llvm/include/llvm/IR/IRBuilder.h
diff --git a/llvm/include/llvm/IR/IRBuilder.h b/llvm/include/llvm/IR/IRBuilder.h
index 7cc659721132..2db722c345e5 100644
--- a/llvm/include/llvm/IR/IRBuilder.h
+++ b/llvm/include/llvm/IR/IRBuilder.h
@@ -231,6 +231,9 @@ public:
 
   /// Add all entries in MetadataToCopy to \p I.
   void AddMetadataToInst(Instruction *I) const {
+    if (!isa<CallBase>(I)) {
+      return;
+    }
     for (const auto &KV : MetadataToCopy)
       I->setMetadata(KV.first, KV.second);
   }

Which looks like it will only attach metadata to call instructions. Here’s some measurements of Linux kernel builds:

w/o changes:
hyperfine --prepare 'make LLVM=1 -j128 -s clean' --runs 30 'make LLVM=1 -j128 -s'      
Benchmark 1: make LLVM=1 -j128 -s
  Time (mean ± σ):     58.783 s ±  0.175 s    [User: 4534.909 s, System: 285.754 s]
  Range (min … max):   58.376 s … 59.137 s    30 runs
 
 
with changes:
hyperfine --prepare 'make LLVM=1 -j128 -s clean' --runs 30 'make LLVM=1 -j128 -s'
Benchmark 1: make LLVM=1 -j128 -s
  Time (mean ± σ):     59.117 s ±  0.240 s    [User: 4578.583 s, System: 284.460 s]
  Range (min … max):   58.503 s … 59.838 s    30 runs

So that’s a 0.5% slowdown ((1 - 58.783/59.117)*100. Compare that with a 1.02% slowdown measured on the same codebase when always enabling Location Tracking for all LLVM IR Instructions. I did not measure the other source code bases as I did above but I’m guessing those are worse than 0.5%. Contrast this with 0.17% slowdown in my original implementation for kernel builds (and that’s only when such functions exist in a codebase (at the cost that the Note diagnostics don’t have line+col info)).

So I really think avoiding duplication provided by DILocation metadata nodes was possible, but too expensive, and that we should proceed with my original implementation using ad-hoc debug info: ⚙ D141451 [clang] report inlining decisions with -Wattribute-{warning|error}.


Regarding @efriedma-quic 's point:

Right, we already reserve 64 bits in each llvm::Instruction to specifically encode debug locations. The proposal would be to change the encoding.

I suspect this would require synchronization with other LLVM frontends, not just clang. Even if we changed the size from a pointer (likely 64b for most systems) to a SourceLocation (uint32_t), I’m not sure that would be a performance win (even after all that work).

1 Like