Sorry the delays in responding here; I ended taking some time off of work recently.
I wanted to follow up on @dblaikie 's point:
you could probably do a hack-ish prototype by modifying IRBuilder to only attach the debug location to call instructions. (though all that checking might be expensive & make the comparison not representative… & maybe there’s a cheaper way to do it) - IRBuilder::AddMetadataToInst could check if the instruction is a call first, then inside the loop check if the metadata is debug info (or perhaps record the index of the debug info metadata when it’s added to
MetadataToCopy
- or store it in a separate member variable entirely for this experiment) and skip adding it if it’s not a call.
So I tried this diff quickly:
diff --git a/clang/include/clang/Basic/CodeGenOptions.def b/clang/include/clang/Basic/CodeGenOptions.def
index 7151fe9d6568..4c9ec1798af1 100644
--- a/clang/include/clang/Basic/CodeGenOptions.def
+++ b/clang/include/clang/Basic/CodeGenOptions.def
@@ -388,7 +388,7 @@ VALUE_CODEGENOPT(SmallDataLimit, 32, 0)
VALUE_CODEGENOPT(SSPBufferSize, 32, 0)
/// The kind of generated debug info.
-ENUM_CODEGENOPT(DebugInfo, llvm::codegenoptions::DebugInfoKind, 4, llvm::codegenoptions::NoDebugInfo)
+ENUM_CODEGENOPT(DebugInfo, llvm::codegenoptions::DebugInfoKind, 4, llvm::codegenoptions::LocTrackingCalls)
/// Whether to generate macro debug info.
CODEGENOPT(MacroDebugInfo, 1, 0)
diff --git a/clang/lib/CodeGen/CGDebugInfo.cpp b/clang/lib/CodeGen/CGDebugInfo.cpp
index cd0fece34502..275800224e04 100644
--- a/clang/lib/CodeGen/CGDebugInfo.cpp
+++ b/clang/lib/CodeGen/CGDebugInfo.cpp
@@ -597,6 +597,7 @@ void CGDebugInfo::CreateCompileUnit() {
llvm::DICompileUnit::DebugEmissionKind EmissionKind;
switch (DebugKind) {
case llvm::codegenoptions::NoDebugInfo:
+ case llvm::codegenoptions::LocTrackingCalls:
case llvm::codegenoptions::LocTrackingOnly:
EmissionKind = llvm::DICompileUnit::NoDebug;
break;
diff --git a/llvm/include/llvm/Frontend/Debug/Options.h b/llvm/include/llvm/Frontend/Debug/Options.h
index c490508d3793..6645de40b7dd 100644
--- a/llvm/include/llvm/Frontend/Debug/Options.h
+++ b/llvm/include/llvm/Frontend/Debug/Options.h
@@ -21,6 +21,8 @@ enum DebugInfoKind {
/// Don't generate debug info.
NoDebugInfo,
+ LocTrackingCalls,
+
/// Emit location information but do not generate debug info in the output.
/// This is useful in cases where the backend wants to track source
/// locations for instructions without actually emitting debug info for them
diff --git a/llvm/include/llvm/IR/IRBuilder.h b/llvm/include/llvm/IR/IRBuilder.h
diff --git a/llvm/include/llvm/IR/IRBuilder.h b/llvm/include/llvm/IR/IRBuilder.h
index 7cc659721132..2db722c345e5 100644
--- a/llvm/include/llvm/IR/IRBuilder.h
+++ b/llvm/include/llvm/IR/IRBuilder.h
@@ -231,6 +231,9 @@ public:
/// Add all entries in MetadataToCopy to \p I.
void AddMetadataToInst(Instruction *I) const {
+ if (!isa<CallBase>(I)) {
+ return;
+ }
for (const auto &KV : MetadataToCopy)
I->setMetadata(KV.first, KV.second);
}
Which looks like it will only attach metadata to call instructions. Here’s some measurements of Linux kernel builds:
w/o changes:
hyperfine --prepare 'make LLVM=1 -j128 -s clean' --runs 30 'make LLVM=1 -j128 -s'
Benchmark 1: make LLVM=1 -j128 -s
Time (mean ± σ): 58.783 s ± 0.175 s [User: 4534.909 s, System: 285.754 s]
Range (min … max): 58.376 s … 59.137 s 30 runs
with changes:
hyperfine --prepare 'make LLVM=1 -j128 -s clean' --runs 30 'make LLVM=1 -j128 -s'
Benchmark 1: make LLVM=1 -j128 -s
Time (mean ± σ): 59.117 s ± 0.240 s [User: 4578.583 s, System: 284.460 s]
Range (min … max): 58.503 s … 59.838 s 30 runs
So that’s a 0.5% slowdown ((1 - 58.783/59.117)*100. Compare that with a 1.02% slowdown measured on the same codebase when always enabling Location Tracking for all LLVM IR Instructions. I did not measure the other source code bases as I did above but I’m guessing those are worse than 0.5%. Contrast this with 0.17% slowdown in my original implementation for kernel builds (and that’s only when such functions exist in a codebase (at the cost that the Note diagnostics don’t have line+col info)).
So I really think avoiding duplication provided by DILocation metadata nodes was possible, but too expensive, and that we should proceed with my original implementation using ad-hoc debug info: ⚙ D141451 [clang] report inlining decisions with -Wattribute-{warning|error}.
Regarding @efriedma-quic 's point:
Right, we already reserve 64 bits in each llvm::Instruction to specifically encode debug locations. The proposal would be to change the encoding.
I suspect this would require synchronization with other LLVM frontends, not just clang. Even if we changed the size from a pointer (likely 64b for most systems) to a SourceLocation (uint32_t), I’m not sure that would be a performance win (even after all that work).