Relating IR variables to generated machine code

I’m wondering if it’s possible to relate IR variables to where they are used in the generated machine code. For instance, for a given SSA variable, knowing which assembly instruction produces or uses it. Are there any existing tools or methods that can do this?

Ah, I just remembered that stack maps exist. This seems like a good start: “Stack maps and patch points in LLVM” (LLVM 15.0.0git documentation).

I’m not familiar with stack maps, but they seem to be specialized for JITs and some other special use cases. In theory, you can use debug info to carry line numbers, variable locations, etc. from textual LLVM IR rather than from the source language, so that you can map from machine code to LLVM IR with a normal debugger. The Swift compiler actually has such a feature for debugging SIL (Swift Intermediate Language) code: swift/SILDebugInfoGenerator.cpp at main · apple/swift · GitHub.

This sounds promising but how does the metadata at the IR level get carried down to machine code? How would you recover that?

You don’t need to worry about that: the codegen pipeline already knows how to lower debug-info metadata (e.g. DILocation) into machine code, because that’s how debug information like DWARF is generated.

What I was saying in the previous comment is that, normally, this debug information carries line numbers from the original source code, like “test.c:1:10”. What you can do is change it to carry line numbers from the textual LLVM IR, like “your_ir.ll:2:3”. Then, when you attach a debugger to the generated executable, you can set breakpoints on textual LLVM IR lines, like (gdb) b your_ir.ll:2.
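To make the lookup direction concrete, here is a rough Python sketch of what you would do once the line table points at the .ll file: group machine-code addresses by the IR line they were generated from. The rows below are made-up placeholders, not output from a real binary, and the function name is my own. In practice the rows would come from whatever DWARF reader you prefer.

```python
from collections import defaultdict


def addresses_by_ir_line(line_table, ir_file):
    """Group machine-code addresses by the IR line the line table assigns them."""
    grouped = defaultdict(list)
    for address, file_name, line in line_table:
        # Ignore rows that belong to other files (e.g. headers or runtime code).
        if file_name == ir_file:
            grouped[line].append(address)
    return {line: sorted(addrs) for line, addrs in grouped.items()}


# Hypothetical (address, file, line) rows, invented for illustration only.
HYPOTHETICAL_LINE_TABLE = [
    (0x1130, "your_ir.ll", 2),
    (0x1134, "your_ir.ll", 2),
    (0x1138, "your_ir.ll", 3),
]

if __name__ == "__main__":
    mapping = addresses_by_ir_line(HYPOTHETICAL_LINE_TABLE, "your_ir.ll")
    for line, addrs in sorted(mapping.items()):
        print(line, [hex(a) for a in addrs])
```

This only gives you line granularity: it tells you which instructions came from a given IR line, not which register or stack slot holds a particular SSA value.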

Ah I see. Well, how can I make DWARF do this for IR rather than source code?

There’s the “debugify” pass, although that’s really intended for debugging-the-compiler purposes, and IIRC it only adds line info to instructions. I’m not aware of anything that would add debug-info metadata for IR variables.

In addition to debugify, you can always write a pass that modifies the debug-info metadata to contain line numbers from the textual LLVM IR (and perhaps variable locations as well).
Regarding how to count lines in textual LLVM IR, the SIL link I posted earlier might give you some inspiration: it creates a pseudo asm printer and counts line numbers as it “prints” the lines.
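As a much cruder illustration of the line-counting idea (this is not how the SIL generator does it, just a sketch working on already-printed text), the following Python scans a .ll file's contents and records the line and column at which each SSA value is defined. The function names and the example IR are my own, and the regex is a simplification: it would also match module-level type definitions such as `%struct.foo = type {...}`, and names that repeat across functions overwrite each other.

```python
import re

# Matches a line that defines an SSA value, e.g. "  %sum = add i32 %a, %b".
# Simplified: also matches "%T = type ..." lines and ignores quoted names.
DEF_RE = re.compile(r"^(\s*)(%[-a-zA-Z$._0-9]+)\s*=")


def ssa_definition_lines(ir_text):
    """Map each SSA value name to the 1-based (line, column) of its definition."""
    locations = {}
    for line_no, line in enumerate(ir_text.splitlines(), start=1):
        match = DEF_RE.match(line)
        if match:
            column = len(match.group(1)) + 1  # column of the leading '%'
            locations[match.group(2)] = (line_no, column)
    return locations


def di_location(line, column, scope_id):
    """Format a DILocation node; scope_id is a placeholder metadata number."""
    return f"!DILocation(line: {line}, column: {column}, scope: !{scope_id})"


EXAMPLE_IR = """\
define i32 @add(i32 %a, i32 %b) {
entry:
  %sum = add i32 %a, %b
  ret i32 %sum
}
"""

if __name__ == "__main__":
    for name, (line, column) in ssa_definition_lines(EXAMPLE_IR).items():
        print(name, di_location(line, column, scope_id=4))
```

A real pass would attach these locations through the LLVM API rather than by text matching, and would also need a DIFile, DICompileUnit, and DISubprogram that refer to the .ll file.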