Is there a way, using LLVM, to relate LLVM instructions to the source code statements (or line numbers) that they were generated from? If I compile some file test.c and get some binary test.bc, can I relate statements in test.bc to source code line numbers in test.c, and if so, how would I go about doing that? Thanks for your time.
The debug intrinsics are intended for that. Please see:
for the details.
Please note that this is being actively worked on by Jim Laskey at Apple.
He's working to get these intrinsics to generate DWARF output so that
LLVM generated code can be used with a debugger. However, the intrinsics
can be processed in whatever way you'd like via an LLVM pass.
If you use -g on the llvm gcc 4 command line and the C code gen from llc (-march=c) you will get this feature for free.
Thanks for your help. I took a look at http://llvm.org/docs/SourceLevelDebugging.html and it seems like this doesn’t give you much in the way of line number information. If you know what source line you are interested in then you can set a breakpoint, but suppose you want to know the line number in the source code for some arbitrary bytecode instruction. In my particular case, I have a pass that finds bytecode instructions that represent indirect calls, now I want to find out what line number and source file that indirect call came from. Is there currently not a way to get this type of information? Thanks again.
If you look at the stoppoint calls you’ll see that you can find the line number and if you follow the compile unit argument on the call you will find the file. The byte codes that follow the call would have been generated by the code on that source line.
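The relationship described above (each stoppoint labels the instructions that follow it) can be sketched as a single forward walk over a basic block. This is only a minimal illustration: `SimInst`, `SimLoc`, and `annotateBlock` are hypothetical stand-ins invented for the example, not LLVM classes, and a real pass would inspect the actual llvm.dbg.stoppoint calls instead.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for an instruction; NOT the real LLVM API.
struct SimInst {
    bool isStopPoint;   // true for a modelled llvm.dbg.stoppoint call
    std::string file;   // meaningful only when isStopPoint is true
    unsigned line;      // meaningful only when isStopPoint is true
    std::string text;   // printable form of the instruction
};

// Source location assigned to an instruction.
struct SimLoc {
    std::string file;   // empty if no stoppoint has been seen yet
    unsigned line;
};

// Walk the block top to bottom and give every instruction the location
// of the most recent stoppoint at or before it.
std::vector<SimLoc> annotateBlock(const std::vector<SimInst> &block) {
    std::vector<SimLoc> locs;
    SimLoc current{"", 0};
    for (const SimInst &I : block) {
        if (I.isStopPoint)
            current = SimLoc{I.file, I.line};
        locs.push_back(current);
    }
    return locs;
}
```

Instructions that appear before the first stoppoint in the block keep an empty file name, which is one way of signalling that no location is known for them.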
I'd suggest an approach like this:
Given an instruction in a basic block, scan up and/or down the basic block, stopping at the first llvm.dbg.stoppoint instruction. Something like this should work:
  if (DbgStopPointInst *SP = dyn_cast<DbgStopPointInst>(Inst))
    std::cerr << "Loc: " << SP->getFileName() << ":" << SP->getLine() << "\n";
If you look at IntrinsicInst, you'll see methods to get the filename, directory, line#, and col#, all packaged up into an easy to use class (some of Jim's excellent work :).
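For the indirect-call case from the original question, the upward scan might look roughly like the sketch below. It is deliberately self-contained: `MockInst`, `findStopPointAbove`, and `indirectCallSites` are hypothetical names made up for illustration and do not exist in LLVM. In a real pass the kind check would be replaced by the dyn_cast<DbgStopPointInst> test shown above.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical model of an instruction; NOT the real LLVM API.
struct MockInst {
    enum Kind { StopPoint, DirectCall, IndirectCall, Other } kind;
    std::string file;   // set only for StopPoint
    unsigned line;      // set only for StopPoint
};

// Scan upward from position idx (assumed to be a valid index) and return
// the nearest preceding stoppoint, or nullptr if the block has none above.
const MockInst *findStopPointAbove(const std::vector<MockInst> &block,
                                   std::size_t idx) {
    for (std::size_t i = idx; i-- > 0; )
        if (block[i].kind == MockInst::StopPoint)
            return &block[i];
    return nullptr;
}

// Collect a "file:line" string for every indirect call in the block.
std::vector<std::string> indirectCallSites(const std::vector<MockInst> &block) {
    std::vector<std::string> sites;
    for (std::size_t i = 0; i < block.size(); ++i) {
        if (block[i].kind != MockInst::IndirectCall)
            continue;
        if (const MockInst *SP = findStopPointAbove(block, i))
            sites.push_back(SP->file + ":" + std::to_string(SP->line));
        else
            sites.push_back("<unknown>");
    }
    return sites;
}
```

The sketch only looks within one basic block, as suggested above. An instruction with no stoppoint above it in its own block is reported as unknown rather than guessed at.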
I get it now, I can’t believe I didn’t understand that before. Thank you all for your help!
I would like to know how much effect these stoppoint calls have on the
optimization of the bytecode? Does insertion of debugging info cause
opportunities for optimization (especially interprocedural dead code
elimination and interprocedural constant propagation) to be reduced?
The -g code is not very readable, so I am not able to confirm this by my
"LLVM optimizations gracefully interact with debugging information. If they are not aware of debug information, they are automatically disabled as necessary in the cases that would invalidate the debug info."
-g currently disables many optimizations.