help converting llvm metadata into dwarf tags

Dear all,

I’d like to find the memory location of certain instructions in a compiled/linked binary. During the IR phase, I tag instructions I’m interested in with LLVM 2.7’s new metadata (MDNodes with an identifiable ID). I’d now like to propagate that data to the assembly via a custom DWARF tag that I attach to each X86 instruction created from a tagged IR instruction. At assembly time this will find its way into the binary, from where I retrieve it (by locating my custom tags with a DWARF consumer and dumping the addresses of the instructions they’re attached to).

Does this sound reasonable?

I’ve completed the first part, attaching the MDNodes to IR instructions, but I’m a bit overwhelmed by all the backend stuff.
How can I identify which IR instruction an X86 instruction came from (with a view to attaching an identifying DW_TAG to it)?

I’ve found the tag definitions in include/llvm/Support/Dwarf.h and added my own.
lib/CodeGen/AsmPrinter/DwarfDebug.cpp seems to be the only place that emits DWARF data into the assembly stream. It also seems to create a DebugInfoFinder which accesses the IR instructions.

Thanks for any pointers that might slow my spinning head down,

rw

Hi Roger,

> How can I identify which IR instruction an X86 instruction came from (with a view to attaching an identifying DW_TAG to it)?

  1. LLVM IR instructions are converted into MachineInstrs during instruction selection. At this point you need to transfer your custom metadata to the MachineInstr. See how DebugLoc is transferred (in the CodeGen/SelectionDAG directory).

  2. At AsmPrint time, while emitting your assembly instructions, you have access to the corresponding MachineInstr and any custom metadata attached to it.
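To make the two steps above concrete, here is a minimal, self-contained sketch of the idea. It is a toy model only: `ToyIRInst`, `ToyMachineInst`, `lowerInst` and `lowerFunction` are hypothetical names made up for illustration and are not LLVM classes or functions. The only point it shows is that whatever tag rides on the IR instruction has to be copied onto every machine instruction produced from it, the same way DebugLoc travels.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-ins, NOT LLVM classes: just enough structure to show a tag
// travelling with an instruction through lowering.
const unsigned kNoTag = 0;  // assumption: 0 means "this instruction is untagged"

struct ToyIRInst {
    std::string opcode;
    unsigned tagId;  // hypothetical ID carried by the custom MDNode
};

struct ToyMachineInst {
    std::string opcode;
    unsigned tagId;  // copied from the originating IR instruction
};

// Hypothetical lowering: one IR instruction may expand into several machine
// instructions, and each of them inherits the IR instruction's tag.
std::vector<ToyMachineInst> lowerInst(const ToyIRInst &ir) {
    std::vector<std::string> expansion;
    if (ir.opcode == "store")
        expansion = {"LEA", "MOV"};  // pretend: address computation + store
    else
        expansion = {"MOV"};

    std::vector<ToyMachineInst> out;
    for (const std::string &op : expansion)
        out.push_back(ToyMachineInst{op, ir.tagId});
    assert(!out.empty());  // every IR instruction yields at least one MI here
    return out;
}

// Lower a whole function body, preserving instruction order.
std::vector<ToyMachineInst> lowerFunction(const std::vector<ToyIRInst> &body) {
    std::vector<ToyMachineInst> out;
    for (const ToyIRInst &ir : body) {
        std::vector<ToyMachineInst> part = lowerInst(ir);
        out.insert(out.end(), part.begin(), part.end());
    }
    return out;
}
```

In this model, a printer that walks the result of `lowerFunction` can test `tagId != kNoTag` on each machine instruction, which mirrors what step 2 describes for AsmPrint time.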

> I’ve found the tag definitions in include/llvm/Support/Dwarf.h and added my own.
> lib/CodeGen/AsmPrinter/DwarfDebug.cpp seems to be the only place that emits DWARF data into the assembly stream.

See how DwarfDebug.cpp handles the DebugLoc attached to each instruction (::beginScope() and ::endScope()).

> It also seems to create a DebugInfoFinder which accesses the IR instructions.

This path will allow you to browse the entire function and collect info which you can use later on.

Hi Devang, and thanks for the tips. I finally managed to fit all the pieces together into something that seems to work.

It’s probably not the best (or even correct!) way of doing it but here’s a brief overview for reference:

An instruction in the LLVM IR gets converted into an SDNode in the DAG then later into a MachineInstr.
I’d already attached my own MDNodes to IR instructions I was interested in. I wanted to propagate that info to the final binary.

I added a field to the SelectionDAGBuilder holding the current metadata, which I update in SelectionDAGISel::SetDebugLoc() for every IR instruction.
Next, in SelectionDAGBuilder::visit(), I transfer the current instruction’s metadata from the DAGBuilder to the instruction’s SDNode.

In InstrEmitter::EmitNode() I copy the metadata from the SDNode to the MachineInstr. DwarfDebug::endModule() creates my user-defined DIE (after defining my own DW_TAG and DW_AT IDs in Dwarf.h) and adds it to the ModuleCU (for simplicity I’m adding my DIEs to the module’s debug_info section).

I added a few lines to Dwarf.cpp for emitting the correct names for my new DW_TAG and DW_AT (useful when looking at commented assembly).
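For anyone repeating this, here is a rough standalone sketch of what the new IDs and their name lookup can look like. The lo_user/hi_user bounds are the vendor-extension ranges from the DWARF specification; everything with "toy" in its name (`DW_TAG_toy_marked_inst`, `DW_AT_toy_marker_id`, `toyTagString`, `toyAttrString`) and the values 0x4800 and 0x2800 are made up for illustration, and are not what I actually added to Dwarf.h or Dwarf.cpp. Real values should be checked against existing vendor extensions to avoid collisions.

```cpp
#include <cassert>
#include <string>

// Vendor-extension ranges defined by the DWARF specification.
const unsigned DW_TAG_lo_user = 0x4080;
const unsigned DW_TAG_hi_user = 0xffff;
const unsigned DW_AT_lo_user  = 0x2000;
const unsigned DW_AT_hi_user  = 0x3fff;

// Hypothetical custom codes, picked arbitrarily inside those ranges.
const unsigned DW_TAG_toy_marked_inst = 0x4800;
const unsigned DW_AT_toy_marker_id    = 0x2800;

static_assert(DW_TAG_toy_marked_inst >= DW_TAG_lo_user &&
              DW_TAG_toy_marked_inst <= DW_TAG_hi_user,
              "custom tag must lie in the user range");
static_assert(DW_AT_toy_marker_id >= DW_AT_lo_user &&
              DW_AT_toy_marker_id <= DW_AT_hi_user,
              "custom attribute must lie in the user range");

// True if the code falls in the vendor-extension range for tags.
bool isUserTag(unsigned tag) {
    return tag >= DW_TAG_lo_user && tag <= DW_TAG_hi_user;
}

// True if the code falls in the vendor-extension range for attributes.
bool isUserAttr(unsigned attr) {
    return attr >= DW_AT_lo_user && attr <= DW_AT_hi_user;
}

// Name lookup for the custom tag only; returns nullptr for anything else.
const char *toyTagString(unsigned tag) {
    switch (tag) {
    case DW_TAG_toy_marked_inst: return "DW_TAG_toy_marked_inst";
    default:                     return nullptr;
    }
}

// Name lookup for the custom attribute only; returns nullptr for anything else.
const char *toyAttrString(unsigned attr) {
    switch (attr) {
    case DW_AT_toy_marker_id: return "DW_AT_toy_marker_id";
    default:                  return nullptr;
    }
}
```

A name lookup like this is what lets the commented assembly show a readable name next to the raw number.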

Finally, in AsmPrinter::processDebugLoc() I grab the metadata from the MachineInstr and pass it to the DwarfWriter, which adds it to its DwarfDebug member. When the assembly is emitted, the debug_info section contains my new DWARF DIE, which I’ve managed to retrieve from the binary with a DWARF consumer.
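The retrieval side boils down to filtering DIEs by tag and dumping the address stored with each one. A minimal sketch of that filter, assuming the DIEs have already been decoded into a flat list: `ToyDie`, `kToyMarkedInstTag` and `markedAddresses` are hypothetical names for illustration, not part of any real DWARF consumer library, and a real consumer would read the address out of an attribute rather than a struct field.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Toy model of an already-decoded DIE, NOT a libdwarf or LLVM type.
struct ToyDie {
    unsigned tag;           // DW_TAG_* code of this entry
    std::uint64_t address;  // address recorded for the tagged instruction
};

// Hypothetical custom tag value, assumed to be in the DWARF user range.
const unsigned kToyMarkedInstTag = 0x4800;

// Collect the addresses of all DIEs carrying the custom tag, in input order.
std::vector<std::uint64_t> markedAddresses(const std::vector<ToyDie> &dies) {
    // Sanity check: the custom tag must sit in DW_TAG_lo_user..DW_TAG_hi_user.
    assert(kToyMarkedInstTag >= 0x4080 && kToyMarkedInstTag <= 0xffff);

    std::vector<std::uint64_t> out;
    for (const ToyDie &die : dies)
        if (die.tag == kToyMarkedInstTag)
            out.push_back(die.address);
    return out;
}
```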

cheers,
rw.

woot!