[JIT] Evaluating Debug-Metadata in bitcode

Hello LLVM-People,

I’m still a beginner with LLVM, but I really like the concept and the possibilities of the JIT.

Currently I compile simple functions with clang-cl into bitcode files. I then use another program to JIT these bitcode files and execute functions from them, much like lli does.
Thanks to a lot of mails and other reading, I understand that a bitcode file is in fact still IR, just in a different representation, so when I JIT such a file with my application, I’m acting like a compiler (like llc).

But what happens to the debug information? When I generate the files in human-readable form, I can easily find the metadata and understand it. So I want to extend my application so that it can debug the code that was jitted. This seems difficult. When I get the address of a debug break, I can only identify the function where the exception occurred. I don’t see a way to connect the metadata to the offset I could calculate from the exception address and the function’s start address. I also don’t know how to process the metadata when it’s in its bitcode form. So, how can I start with this task?

Kind regards
Björn Gaier


Hi Björn,

I’m not sure I understand what you are actually trying to achieve.

Do you want to be able to debug from your main code, and step into the JITed code?

Do you want to be able to set breakpoints, list callstacks, etc in the JITed code?

Do you want to, from a crash, identify where in your JITed code it went wrong?

The first two are definitely one or more order(s) of magnitude harder to solve than the third one, since you basically have to tell your debugger that “I’ve now added some more code here, come here to see the symbols”, and would require a fair amount of extra work to make any existing debugger understand this concept, never mind the exact implementation details.

A post-mortem analysis is much less complicated, since all you’d have to do is walk through the debug symbols and relate the generated locations to the actual locations in the code.
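To make "relate the generated locations to the actual locations" a bit more concrete, here is a minimal sketch of the lookup step. It assumes you have already obtained a table of (address, file, line) rows for the JITed code, for example by decoding the DWARF line information; the `LineRow` type and the `lookup_line` function are hypothetical names for illustration, not LLVM APIs.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <iterator>
#include <string>
#include <vector>

// One row of a simplified line table (hypothetical type): the machine code
// starting at `address`, up to the next row's address, came from file:line.
struct LineRow {
    std::uint64_t address;
    std::string file;
    unsigned line;
};

// Returns the row covering `pc`, or nullptr if `pc` lies outside the code.
// `rows` must be sorted by address; `end_address` is one past the last byte
// of the JITed code.
const LineRow* lookup_line(const std::vector<LineRow>& rows,
                           std::uint64_t end_address,
                           std::uint64_t pc) {
    assert(std::is_sorted(rows.begin(), rows.end(),
                          [](const LineRow& a, const LineRow& b) {
                              return a.address < b.address;
                          }));
    if (rows.empty() || pc < rows.front().address || pc >= end_address)
        return nullptr;
    // Find the first row that starts after pc; the row just before it
    // is the one that covers pc.
    auto it = std::upper_bound(rows.begin(), rows.end(), pc,
                               [](std::uint64_t value, const LineRow& row) {
                                   return value < row.address;
                               });
    return &*std::prev(it);
}
```

With such a table, the crash address from the exception record can be passed in directly as `pc`. Building the table from the real debug sections is the part that still needs a DWARF reader.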

Another aspect is local variables (and function arguments), although I’m not sure it’s that much of a complication above and beyond solving the parts above.

I suspect there are parts of llc and lld that can be reused to generate DWARF info, but you will probably still have to deal with tying that back to your generated code.

In my Pascal compiler, which generates real executable code, adding debug symbols is trivial from the perspective of making it work in a debugger, but hard in the sense of generating the debug data for line, file, function and variable information. All I need to do to get it into the executable file is append -g to the linker options when building. But since JIT-generated code isn’t an executable that you can just attach debug symbols to, this is a fair bit harder (I think; I haven’t actually tried).

> Do you want to be able to debug from your main code, and step into the JITed code?
>
> Do you want to be able to set breakpoints, list callstacks, etc in the JITed code?
>
> Do you want to, from a crash, identify where in your JITed code it went wrong?
>
> The first two are definitely one or more order(s) of magnitude harder to solve than the third one, since you basically have to tell your debugger that “I’ve now added some more code here, come here to see the symbols”, and would require a fair amount of extra work to make any existing debugger understand this concept, never mind the exact implementation details.

Conceptually this is not terribly different from having a debugger know what to do with a dynamically loaded library. The trick is having the debug info somewhere that the debugger can find it. I’m not sure our JITs even generate debug info from the metadata.
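As one concrete illustration of "having the debug info somewhere that the debugger can find it": GDB documents a JIT compilation interface in which the program keeps a linked list of in-memory object files and calls a well-known function whenever the list changes. The sketch below follows the declarations from the GDB manual; the helper `register_jit_object` is a hypothetical name of mine, and whether this fits a given setup (for example a Windows build with clang-cl) is an assumption to check. If the program also links LLVM's own GDB registration support, these symbols may already be defined there and should not be defined twice.

```cpp
#include <cassert>
#include <cstdint>

extern "C" {
typedef enum { JIT_NOACTION = 0, JIT_REGISTER_FN, JIT_UNREGISTER_FN } jit_actions_t;

// One in-memory object file (with its debug sections) known to the debugger.
struct jit_code_entry {
    jit_code_entry* next_entry;
    jit_code_entry* prev_entry;
    const char* symfile_addr;
    std::uint64_t symfile_size;
};

struct jit_descriptor {
    std::uint32_t version;       // must be 1
    std::uint32_t action_flag;   // one of jit_actions_t
    jit_code_entry* relevant_entry;
    jit_code_entry* first_entry;
};

// The debugger places a breakpoint here; it must not be inlined or removed.
void __attribute__((noinline)) __jit_debug_register_code() {
    asm volatile("" ::: "memory");
}

// The debugger reads this variable by name.
jit_descriptor __jit_debug_descriptor = {1, 0, nullptr, nullptr};
}

// Hypothetical helper: links `entry` at the head of the list and notifies
// the debugger that a new object file is available.
void register_jit_object(jit_code_entry& entry, const char* object,
                         std::uint64_t size) {
    assert(object != nullptr && size > 0);
    entry.symfile_addr = object;
    entry.symfile_size = size;
    entry.prev_entry = nullptr;
    entry.next_entry = __jit_debug_descriptor.first_entry;
    if (entry.next_entry)
        entry.next_entry->prev_entry = &entry;
    __jit_debug_descriptor.first_entry = &entry;
    __jit_debug_descriptor.relevant_entry = &entry;
    __jit_debug_descriptor.action_flag = JIT_REGISTER_FN;
    __jit_debug_register_code();
}
```

The `object` buffer would be the relocated object file produced by the JIT, debug sections included; the sketch only shows the bookkeeping, not how to obtain that buffer.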

–paulr

They definitely can. This is what happens under the hood when you type in lldb:

(lldb) expr -g -- foo()
1 {
-> foo()
3 }
(lldb)

which puts you in the debugger, for the expression you just typed.

-- adrian

Hello Mats,

> Do you want to be able to debug from your main code, and step into the JITed code?
> Do you want to be able to set breakpoints, list callstacks, etc in the JITed code?
> Do you want to, from a crash, identify where in your JITed code it went wrong?

I guess the third point best describes what I want. But I also want to dump local variables and identify the line of the crash, without needing an external debugger. I would like to write this myself to understand it better, but I don’t know how LLVM can help me with this.

> A post-mortem analysis is much less complicated, since all you’d have to do is walk through the debug symbols and relate the generated locations to the actual locations in the code.

But walking through the debug symbols is my problem: I don’t know how to interpret them. When the code gets jitted, I get sections with “debug” in the name, but I don’t know how to handle them. So I hoped that I could maybe convert the metadata from the BC files to DWARF, as you mentioned. But I don’t know whether there is a tool or some functions for this. I looked at llvm::Module and llvm::ExecutionEngine, because I use them for jitting, but I found no helpful function there…

Kind regards
Björn