[RFC] Improve DWARF 5 .debug_names Type Lookup/Parsing Speed

This took me a bit longer than expected, but I wanted to be confident the implementation is reasonably correct; otherwise the data would be pointless.

Here is a prototype implementation: [DO NOT MERGE][DebugInfo] Implement debug_names's IDX_parent attribute by felipepiovezan · Pull Request #75365 · llvm/llvm-project · GitHub
(the commits are meant to be looked at one at a time)

I built a stage 1 compiler with the commits above.
Then I built a stage 2 compiler at commit 40e2bb533084 twice: once with the parent attribute and once without (by reverting "[AsmPrinter][AccelTable] Make IDX_parent on by default" in the stage 1 compiler).

Regarding size:

debug_names size in all object files:
With parent    = 475,824,752   (28% increase)
Without parent = 370,166,960  
size of all object files:
With parent    = 11,547,074,344  (0.92% increase)
Without parent = 11,441,527,128
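
As a quick sanity check on the percentages quoted above, the relative increase can be recomputed from the raw byte counts. This is only a minimal sketch; `pct_increase` is a hypothetical helper name, not part of the patch or of the measurement commands in this post.

```shell
# Hypothetical helper: percentage increase of `new` over `old`,
# rounded to two decimal places.
pct_increase() {
  awk -v new="$1" -v old="$2" 'BEGIN { printf "%.2f\n", (new - old) / old * 100 }'
}

pct_increase 475824752 370166960      # debug_names sections
pct_increase 11547074344 11441527128  # all object files
```

The results should agree with the roughly 28% and 0.92% figures reported above.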
Commands used to measure:
obj_files_with_parents=$(find build_stage2_Debug_assert_dwarf5 -name \*.cpp.o)
obj_files_without_parents=$(find build_stage2_Debug_assert_dwarf5_no_idx_parent -name \*.cpp.o)

echo "debug_names size in all object files:"
$stage1_build/bin/llvm-objdump --section-headers  $obj_files_with_parents | \
  grep __debug_names | \
  awk "{s+=\"0x\"\$3}  END {printf \"With parent    = %'d\n\", s}"
$stage1_build/bin/llvm-objdump --section-headers  $obj_files_without_parents | \
  grep __debug_names | \
  awk "{s+=\"0x\"\$3}  END {printf \"Without parent = %'d\n\", s}"

echo "size of all object files:"
ls -la $obj_files_with_parents | \
  awk "{s+=\$5}  END {printf \"With parent    = %'d\n\", s}"
ls -la $obj_files_without_parents | \
  awk "{s+=\$5}  END {printf \"Without parent = %'d\n\", s}"
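
One caveat if you want to reproduce this elsewhere: the commands above rely on awk converting a string such as "0x1f" to a number, and awk implementations differ on that (gawk in its default mode, for example, does not treat such strings as hexadecimal). A more portable sketch is to do the hex conversion in shell arithmetic instead. `sum_hex_sizes` is a hypothetical helper name, and the sketch assumes, as the commands above do, that the section size is the third column of the `llvm-objdump --section-headers` output.

```shell
# Hypothetical helper: read one hexadecimal size per line (no "0x" prefix)
# from stdin and print the decimal total.
sum_hex_sizes() {
  total=0
  while read -r hex; do
    # POSIX shell arithmetic understands 0x-prefixed hexadecimal constants.
    total=$((total + 0x$hex))
  done
  echo "$total"
}

# Possible usage, mirroring the pipeline above (not run here):
# $stage1_build/bin/llvm-objdump --section-headers $obj_files_with_parents | \
#   awk '/__debug_names/ {print $3}' | sum_hex_sizes
```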

Speed improvements:

I then used the stage 1 LLDB to debug both stage 2 clangs:

  $stage1_build/bin/lldb \
      --batch \
      -o "b CodeGenFunction::GenerateCode" \
      -o run \
      -o "expr Fn" \
      -- \
      $clang_to_use/bin/clang++ -c -g test.cpp -o /dev/null &> output
  grep "Finished expr in:" output
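
If you want to repeat the run several times, the timing can be pulled out of the log as a plain number. This is a rough sketch: `extract_expr_ms` is a hypothetical helper name, and it assumes the log line has exactly the "Finished expr in: <N>ms" shape that the grep above looks for.

```shell
# Hypothetical helper: print the millisecond value from every
# "Finished expr in: <N>ms" line read from stdin.
extract_expr_ms() {
  sed -n 's/.*Finished expr in: \([0-9][0-9]*\)ms.*/\1/p'
}

# Possible usage after each lldb run (not run here):
# extract_expr_ms < output
```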

The measurements are consistent when repeated multiple times:

Finished expr in: 1402ms
Finished expr in: 5941ms
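
To put a single number on the difference, the ratio between the two timings can be computed directly. A minimal sketch; `speedup` is a hypothetical helper name.

```shell
# Hypothetical helper: ratio of the larger timing to the smaller one,
# rounded to two decimal places.
speedup() {
  awk -v slow="$1" -v fast="$2" 'BEGIN { printf "%.2f\n", slow / fast }'
}

speedup 5941 1402
```

That works out to a bit more than a 4x difference between the two runs.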

Without -gsimple-template-names, it takes 50~60 seconds to finish. If we got a similar scale of win as your estimation, it would drop to under 10 seconds.

I think -gsimple-template-names is somewhat orthogonal to this experiment. For that, you might benefit from Greg's patch, which improves CompilerDeclContext-based queries. The IDX_parent patch addresses your first performance report (the very first post), which is about DWARFDeclContext-based queries. The two patches are complementary.