Took me a bit longer, but I wanted to be confident the implementation is decently correct, otherwise the data is pointless.
Here is a prototype implementation: [DO NOT MERGE][DebugInfo] Implement debug_names's IDX_parent attribute (llvm/llvm-project Pull Request #75365, by felipepiovezan)
(the commits are meant to be looked at one at a time)
I built a stage 1 compiler with the commits above.
Then I built a stage 2 compiler at commit 40e2bb533084 twice: once with the parent attribute and once without (by reverting "[AsmPrinter][AccelTable] Make IDX_parent on by default" in the stage 1 compiler).
Regarding size:
debug_names size in all object files:
With parent = 475,824,752 (28% increase)
Without parent = 370,166,960
size of all object files:
With parent = 11,547,074,344 (0.92% increase)
Without parent = 11,441,527,128
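As a quick sanity check, the percentages can be recomputed from the raw totals. This is only a minimal sketch: `pct_increase` is a hypothetical helper name introduced here for illustration, not something from the patch.

```shell
# Hypothetical helper: percentage increase from a baseline ($2) to a new value ($1).
# LC_ALL=C keeps the decimal separator a '.' regardless of the user's locale.
pct_increase() {
  LC_ALL=C awk -v new="$1" -v base="$2" \
    'BEGIN { printf "%.2f%%\n", (new - base) / base * 100 }'
}

pct_increase 475824752 370166960      # debug_names sections: prints 28.54%
pct_increase 11547074344 11441527128  # all object files:     prints 0.92%
```

The absolute growth is about 105 MB in both cases, which is why the relative cost is large for the debug_names sections but small for the object files as a whole.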
Commands used to measure:
obj_files_with_parents=$(find build_stage2_Debug_assert_dwarf5 -name \*.cpp.o)
obj_files_without_parents=$(find build_stage2_Debug_assert_dwarf5_no_idx_parent -name \*.cpp.o)
echo "debug_names size in all object files:"
$stage1_build/bin/llvm-objdump --section-headers $obj_files_with_parents | \
grep __debug_names | \
awk "{s+=\"0x\"\$3} END {printf \"With parent = %'d\n\", s}"
$stage1_build/bin/llvm-objdump --section-headers $obj_files_without_parents | \
grep __debug_names | \
awk "{s+=\"0x\"\$3} END {printf \"Without parent = %'d\n\", s}"
echo "size of all object files:"
ls -la $obj_files_with_parents | \
awk "{s+=\$5} END {printf \"With parent = %'d\n\", s}"
ls -la $obj_files_without_parents | \
awk "{s+=\$5} END {printf \"Without parent = %'d\n\", s}"
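One caveat on the hex summation above: adding `"0x"$3` inside awk relies on the awk implementation converting hexadecimal strings to numbers. That is not guaranteed everywhere (gawk, for example, treats such strings as 0 by default). Below is a rough sketch of a more portable alternative that uses shell arithmetic instead; `sum_hex_sizes` is a hypothetical helper name, and it assumes one hex size per line without a `0x` prefix.

```shell
# Hypothetical helper: sum hexadecimal sizes read from stdin
# (one value per line, no 0x prefix) and print the total in decimal.
sum_hex_sizes() {
  total=0
  while read -r hex; do
    total=$((total + 0x$hex))   # shell arithmetic accepts 0x-prefixed constants
  done
  echo "$total"
}

# Example with two made-up sizes: 0x10 + 0xff = 16 + 255
printf '10\nff\n' | sum_hex_sizes   # prints 271
```

Under those assumptions it could be fed from the same pipeline, for example `... | grep __debug_names | awk '{print $3}' | sum_hex_sizes`.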
Speed improvements:
I then used the stage 1 LLDB to debug both stage 2 clangs:
$stage1_build/bin/lldb \
--batch \
-o "b CodeGenFunction::GenerateCode" \
-o run \
-o "expr Fn" \
-- \
$clang_to_use/bin/clang++ -c -g test.cpp -o /dev/null &> output
grep "Finished expr in:" output
The measurements are consistent when repeated multiple times:
Finished expr in: 1402ms
Finished expr in: 5941ms
Without -gsimple-template-name, it takes 50~60 seconds to finish. If we got similar scale of win as your estimation, it would drop to under 10 seconds
I think -gsimple-template-name is somewhat orthogonal to this experiment. For that, you might benefit from Greg’s patch, which improves CompilerDeclContext-based queries. The IDX_parent patch addresses your first performance report (the very first post), which is about DWARFDeclContext-based queries. The two patches complement each other.