EuroLLVM 2024 Debug info round table?

Has anyone already submitted a form for a Debug Info round table at the upcoming EuroLLVM conference?

If not and anyone is interested in attending one, I’m happy to do it. There are a few things I think would be good candidates to discuss to start with:

  • Removing debug intrinsics. Various subtopics include deprecation timeline, how this has impacted downstream projects etc.
  • Enabling encoding address spaces in LLVM debug info and DWARF (AMDGPU folks’ work, cc @slinder1)
  • The future of line tables. Attribution (of instructions to line numbers) & stepping (marking some instructions that are attributed a line number as “uninteresting” from a debugger’s perspective). Considerations for profilers (better attribution means we probably need a way to signal to profiling tools to ignore particular lines), etc.
  • I’m personally curious about others’ usage of call site info and entry_value too.

Does anyone want to be put down as a contact/co-organizer for the round table (please provide an email address)? And would anyone like to volunteer to take notes? :slight_smile:

Does anyone have anything else they’d like to discuss?

cc @adrian.prantl, @dblaikie, @StephenTozer, @jryans, @slinder1 as possibly interested parties? Please do add others if you think they’d be interested (or might have submitted one already).

Can’t say I’ll be around for EuroLLVM (I’ve never actually made it out for EuroLLVM - I really should some time…) - look forward to reading the notes, though :slight_smile:

@Michael137 and I will be there and would be interested in attending.

I’ll be there, and may be able to take notes!

Ok, thanks all. I registered one with a note asking to avoid scheduling it during the debug info & LLDB sessions (including quick talks).

I’m planning to attend (as long as it’s not the same time slot as my talk). :slightly_smiling_face:

I should be able to take notes.

Thank you for requesting a slot, @OCHyams! I’ll also be attending.

Thanks to everyone who attended. The session felt productive and… hopeful? :slightly_smiling_face:

Special thanks to @jryans and @StephenTozer for taking notes (please do share those when you get the time - it hasn’t been very long, I just wanted to say I’m happy to help if they need sifting through).

It was great to see both familiar and new people in the debug info space this year! I have included my notes below. Please let me know if you spot anything that should be edited or clarified.

Attendees

  • J. Ryan Stinnett (King’s College London)
  • Orlando Cazalet-Hyams (SN Systems / Sony)
  • Greg Bedwell (SN Systems / Sony)
  • Stephen Livermore-Tozer (SN Systems / Sony)
  • Lukáš Korenčik (Trail of Bits)
  • Artem G (Intel)
  • Djordje Todorovic (Syrmia)
  • John R (VMS)
  • Alexis Engelke (Technical University of Munich)
  • Mohamed Ismail Bennani (Apple)
  • Michael Buch (Apple)
  • Adrian Prantl (Apple)
  • Tom J (Siemens)
  • Keith Walker (Arm)
  • Walter Erquinigo (Modular)
  • Billy Zhu (Modular)
  • Matt Arsenault (AMD)
  • Hans Wennborg (Google)
  • Reid Kleckner (Google)

Notes

  • OCH: Replacing debug intrinsics with debug records
    • Very close to turning this on everywhere
    • Speed increase at compile time
    • SLT: Working everywhere at optimisation and code gen time
    • In the process of using records everywhere front to back
    • JR: Should I change to records in my frontend?
    • SLT: DIBuilder should handle this for you automatically
    • DT: Is there anything backend developers need to know?
    • OCH: If you have a downstream code gen, then perhaps, but otherwise should be okay
    • SLT: Docs for new IR syntax
    • RK: Do you have a measurement to show compile time benefit?
    • SLT: Don’t have a full benchmark yet…
    • OCH: On the order of 5% compile time improvement
    • OCH: Should generally avoid cases of debug info changing code gen (in optimisation)
    • SLT: Don’t have an equivalent of debug records in MIR yet
    • GB: Used to have so many bugs where debug info would affect code gen, but this really helps with a lot of them
  • SLT: Instruction referencing
    • MIR feature that changes how debug info works there
    • Instead of creating a virtual register, we reference the instruction that produces that value
    • Turned on for a while for x86, seems to work well
    • Is it something people are interested in for other targets?
    • OCH: Open work for multiple variables
    • AP: Apple looking at porting this to AArch64
    • SLT: Mostly target independent, but may need to know about specific value moving
  • AP: Effort to integrate CAS
    • Want to build caching compilers
    • Deduplicate data naturally and easily
    • Debug info contains tons of redundant data
    • Looking at partitioning debug info to expose redundancy
    • Found a scheme that doesn’t need many DWARF changes
    • Built drop-in replacement for ccache on top of this CAS, much better in terms of performance and cache size
    • TJ: CAS upstreaming progress
    • AP: Needs more people to review
    • AP: Very sure LLVM CAS will make it upstream, just a matter of working through the process
    • LLVM CAS is a framework that could be used for these caching and debug info features
    • OCH: Do you have any numbers?
    • AP: Could go back to previous talk, some numbers there
    • RK: Seems like great technology
    • Windows linkers use CAS for deduplicating
    • DT: Anything specific to Swift?
    • AP: No, it’s actually Clang first
  • DT: Mojo presentation about MLIR debug info
    • RK: No value tracking in all MLIR…?
    • BZ: Yes, only in LLVM dialect for now, but could be core
    • WE: Want all compiler engineers to care about debugging
    • RK: How does debug info work in MLIR?
    • BZ: Uses dbg.value, DIExpression for now
    • RK: Worried about duplication
    • SLT: On the contrary, seems a bit hopeful to find the same answers showing up in both
    • JRS: Hopefully we can see more sharing between communities
  • SLT: Location views
    • Could be helpful for a variety of line coverage improvements
    • AP: What about cases where you have one location or the other, how do you visualise it?
    • SLT: Hoisting locations out of a block, could encode that an instruction belongs to several lines
    • Could be doing much better in terms of line info with more expressivity
    • OCH: DWARF expressivity and debugger visualisation are the main obstacles
    • KW: Two level line tables not on the DWARF backlog
    • SLT: Not entirely surprised, really inflates the line tables
    • SLT: Location views should be enough with better conceptual design
    • AP: No point in LLVM supporting location views if we don’t consume them
    • RK: Would love to have it for PGO, more applications than just debugging
    • AP: Either/or example: handle it properly
      • Different cases: Hoisting from different blocks vs. multiple source lines merged into one instruction
    • SLT: Use LBR in DWARF expression to work out where you came from
    • OCH: Needs a DWARF extension to say these things need relocation
  • SLT: Misplaced instructions
    • With speculation, instructions are hoisted out of their block
    • Currently we drop debug info for line table correctness
    • Would like to extend the line table to mark instructions as misplaced, so as not to confuse profiling and debugging
    • This would at least allow stepping
    • AP: Frontier definitely being pushed forward
  • SLT: Key instructions
    • Looking at using this to give real meaning to the is_stmt flag
    • Find instructions that primarily produced the value
    • JR: Had this kind of “semantic” event in Alpha compiler
    • GB: What do debuggers do currently with is_stmt?
    • AP: Debuggers currently use complex heuristics
    • RK: Which line do you use?
    • SLT: Line that produces user-visible state change?
    • SLT: Really mean source coordinates (including column info) for sub-expressions where they also produce user-visible state changes
  • SLT: Og
    • Extend lifetimes feature
      • Inserts fake uses to keep variable alive for better debug info
      • Seems to work well
      • Decent pay off in terms of debug info gained vs. performance change
    • “O2g”
      • A mode like O2 with a few passes removed plus extend lifetimes
      • TJ: We’ll certainly use it
      • JR: We used to have something like this, and would use it too
      • Sony has exposed it downstream since ~2016 or so
      • WM: How do you measure?
      • SLT: We have the Dexter tool for integration testing
      • More lines covered, more entry value availability
      • GB: Ryan’s coverage tool could also measure
      • GB: Please add our O2g to your measurements as well
      • JRS: Will do, please share exactly what O2g is
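
As a rough illustration of the extend lifetimes idea noted under Og above, here is a small C++ sketch of the kind of code where it matters. The function names `consume` and `scale` are made up for this example, and the comments describe what an optimising build might do, not guaranteed behaviour. The fake use itself is compiler-internal, so it appears only as a comment; placing it at the end of the variable's scope is an assumption of this sketch.

```cpp
#include <cassert>

// Hypothetical helper, invented for this sketch.
int consume(int v) { return v + 3; }

// Hypothetical function showing a variable that dies early.
int scale(int x) {
    int tmp = x * 2;       // `tmp` is defined here
    int r = consume(tmp);  // last real use of `tmp`

    // In an optimised build, nothing below needs `tmp`, so its register or
    // stack slot may be reused. A debugger stopped on the next line could
    // then report `tmp` as optimized out.
    r += 1;

    // Assumption: an extend lifetimes mode would behave as if there were a
    // use of `tmp` around here (end of scope), keeping its value available
    // to the debugger until the function returns.
    return r;
}
```

For instance, `scale(5)` returns 14 either way; the feature is meant to change only how long `tmp` stays inspectable, at some cost in register pressure.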

lol. We need a “came from” instruction :smiley:
