MCJIT Mach-O JIT debugging


I’m finally getting back to making JIT debugging work for MCJIT. This has worked for ELF for a while in LLVM, and support in lldb was added in January (for ELF). I’m now trying to add support for Mach-O and would appreciate some feedback (I’m fighting my way through learning the format, but I’m still just a novice).

My current patchset for llvm is here: . I have a corresponding patch for lldb and I basically got this working (modulo line table information, though I’m sure I’m doing something stupid in lldb here).
The basic approach is, when a section gets allocated, to rewrite the section’s addr and update every symbol’s n_value correspondingly. This is very much in line with what is done for ELF, but I’m not sure if it’s the right approach, so I’d appreciate it if somebody who has more experience with Mach-O could look at the above patch and give some feedback. If this approach looks sane in general, I’ll finish up and post both the LLVM and the LLDB patch for formal review.
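To make the approach concrete, here is a minimal sketch of the slide arithmetic. It is not the code from the patch: `Section`, `Symbol` and `relocateSection` are hypothetical stand-ins that carry only the `section_64` and `nlist_64` fields that matter here, and the sketch assumes symbols are matched to their section through the 1-based `n_sect` ordinal.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for the relevant fields of Mach-O's section_64
// and nlist_64; real code would use the Mach-O header definitions.
struct Section {
  uint64_t addr; // address recorded in the object file
  uint64_t size;
};

struct Symbol {
  uint8_t n_sect;   // 1-based section ordinal, 0 means "no section"
  uint64_t n_value; // address for symbols defined in a section
};

// Record that section `ordinal` was allocated at `loadAddr`: rewrite the
// section's addr and move every symbol in that section by the same amount.
void relocateSection(std::vector<Section> &sections,
                     std::vector<Symbol> &symbols,
                     uint8_t ordinal, uint64_t loadAddr) {
  assert(ordinal >= 1 && ordinal <= sections.size());
  Section &sec = sections[ordinal - 1];
  // Unsigned arithmetic wraps, so this also works when loadAddr < sec.addr.
  const uint64_t slide = loadAddr - sec.addr;
  sec.addr = loadAddr;
  for (Symbol &sym : symbols)
    if (sym.n_sect == ordinal)
      sym.n_value += slide;
}
```

Calling `relocateSection(sections, symbols, 1, loadAddr)` once per allocated section leaves symbols of other sections, and symbols with `n_sect == 0`, untouched.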


The one thing you might want to look into is that the n_value only needs to be updated "if ((N_TYPE & n_type) == N_SECT)" (the symbol is in a section and therefore has an address value). Other symbols have values that usually don't need to be modified. You might also need to watch out for absolute symbols (if ((N_TYPE & n_type) == N_ABS)), as a few of them don't claim to have a valid address but actually do point to one. The symbol named "mach_header" is one such absolute symbol.
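The check described above could be sketched as follows. The mask and type values are the ones defined in `<mach-o/nlist.h>`, repeated here so the snippet builds on any platform; `SymbolAction` and `classifySymbol` are hypothetical names for illustration. Debugging (stab) entries, which reuse the same bits of `n_type`, are deliberately not handled.

```cpp
#include <cstdint>

// Values from <mach-o/nlist.h>, repeated to keep the sketch self-contained.
constexpr uint8_t kNType = 0x0e; // N_TYPE: mask for the type bits
constexpr uint8_t kNAbs  = 0x02; // N_ABS:  absolute symbol
constexpr uint8_t kNSect = 0x0e; // N_SECT: defined in section n_sect

// Hypothetical classification of what to do with a symbol's n_value.
enum class SymbolAction {
  Slide,   // in a section: n_value is an address and must be updated
  Inspect, // absolute: usually left alone, but a few point at real addresses
  Leave    // everything else: n_value is not a section address
};

SymbolAction classifySymbol(uint8_t n_type) {
  switch (n_type & kNType) {
  case kNSect:
    return SymbolAction::Slide;
  case kNAbs:
    return SymbolAction::Inspect;
  default:
    return SymbolAction::Leave;
  }
}
```

Masking with `N_TYPE` first matters because the low bit of `n_type` (`N_EXT`) marks external symbols, so a plain equality test against `N_SECT` would miss every exported function.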

If this is all new code, get it as close as you can and then we can work the kinks out once it is in the codebase.


I didn’t get to work on this more last week, but I’ll look at incorporating that suggestion.

The other question of course is how to do this in LLDB. Right now, what I’m doing is going through and adjusting the load address of every leaf in the section tree. That basically works and gets me backtraces with the correct function names and the ability to set breakpoints at functions in JITed modules. What it doesn’t get me yet is line numbers. I suspect that is because the DWARF still refers to the old addresses. I thought relocations should take care of that, but apparently they don’t, so I’ll have to look at whether to solve this in LLDB or in LLVM. Suggestions are most welcome.
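The leaf walk described above might look roughly like this. It is a generic illustration, not LLDB's API: `SectionNode`, `LoadMap` and `setLeafLoadAddresses` are hypothetical, and the JIT-reported addresses are assumed to be available keyed by section name.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical section tree: segments are containers, sections are leaves.
struct SectionNode {
  std::string name;
  uint64_t fileAddr;
  std::vector<SectionNode> children;
};

using LoadMap = std::map<std::string, uint64_t>;

// Walk the tree and record a load address for every leaf the JIT reported.
// Containers get no entry of their own; only their children do.
void setLeafLoadAddresses(const SectionNode &node, const LoadMap &jitAddrs,
                          LoadMap &out) {
  if (node.children.empty()) {
    auto it = jitAddrs.find(node.name);
    if (it != jitAddrs.end())
      out[node.name] = it->second;
    return;
  }
  for (const SectionNode &child : node.children)
    setLeafLoadAddresses(child, jitAddrs, out);
}
```

Assigning addresses per leaf rather than per container fits a JIT that allocates each section independently, so the sections of one segment need not end up contiguous in memory.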

I think I’m getting closer. The debug_info section is being relocated correctly (I think):

0x00000000: Compile Unit: length = 0x00000045 version = 0x0003 abbr_offset = 0x00000000 addr_size = 0x08 (next CU at 0x00000049)

0x0000000b: TAG_compile_unit [1] *
AT_producer( “julia” )
AT_language( DW_LANG_C89 )
AT_name( “string.jl” )
AT_stmt_list( 0x00000000 )
AT_comp_dir( “.” )
AT_APPLE_optimized( 0x01 )
AT_low_pc( 0x0000000112f5f1c0 )
AT_high_pc( 0x000006fb )

0x0000002b: TAG_subprogram [2]
AT_low_pc( 0x0000000112f5f1c0 )
AT_high_pc( 0x0000000112f5f8bb )
AT_frame_base( rbp )
AT_MIPS_linkage_name( “julia_parseint_nocheck;18749” )
AT_name( “parseint_nocheck” )
AT_external( 0x01 )
AT_accessibility( DW_ACCESS_private )

0x00000048: NULL

but lldb is still showing it at the original location:

0x7ff3afca9280: SymbolVendor
0x7ff3afcafa20: Type{0x0000002b} , name = “parseint_nocheck”, clang_type = 0x00007ff3ab548df0 void (void)
0x7ff3afca93e0: CompileUnit{0x00000000}, language = “Language(language = 0xafca93e0)”, file = ‘./string.jl’
0x7ff3afcafe20: Function{0x0000002b}, mangled = julia_parseint_nocheck;18749, type = 0x7ff3afcafa20

even though the section seems to be loaded correctly:

Sections for ‘JIT(0x7fc4230f4e00)(0x00007fc4230f4e00)’ (x86_64):
SectID Type Load Address File Off. File Size Flags Section Name

0x00000100 container [0x0000000112efccf8-0x0000000112f5f8fb)* 0x000003b0 0x00000950 0x00000000 JIT(0x7fc4230f4e00).__TEXT
0x00000001 code [0x0000000112f5f1c0-0x0000000112f5f8fb) 0x000003b0 0x0000073b 0x80000400 JIT(0x7fc4230f4e00).__TEXT.__text
0x00000009 eh-frame [0x0000000112efccf8-0x0000000112efcd68) 0x00000c90 0x00000070 0x6800000b JIT(0x7fc4230f4e00).__TEXT.__eh_frame
0x00000200 container [0x0000000000000784-0x0000000112efce75)* 0x00000aeb 0x00000160 0x00000000 JIT(0x7fc4230f4e00).__DWARF
0x00000002 dwarf-info [0x0000000112efcd68-0x0000000112efcdb1) 0x00000aeb 0x00000049 0x02000000 JIT(0x7fc4230f4e00).__DWARF.__debug_info
0x00000003 dwarf-abbrev [0x00007fc4230f5934-0x00007fc4230f595f) 0x00000b34 0x0000002b 0x02000000 JIT(0x7fc4230f4e00).__DWARF.__debug_abbrev
0x00000004 dwarf-line [0x0000000112efcdc9-0x0000000112efce75) 0x00000b5f 0x000000ac 0x02000000 JIT(0x7fc4230f4e00).__DWARF.__debug_line
0x00000005 dwarf-str [0x00007fc4230f5a0b-0x00007fc4230f5a4b) 0x00000c0b 0x00000040 0x02000000 JIT(0x7fc4230f4e00).__DWARF.__debug_str
0x00000006 dwarf-loc 0x00000c4b 0x00000000 0x02000000 JIT(0x7fc4230f4e00).__DWARF.__debug_loc
0x00000007 dwarf-ranges 0x00000c4b 0x00000000 0x02000000 JIT(0x7fc4230f4e00).__DWARF.__debug_ranges
0x00000300 container [0x0000000112efce80-0x0000000112efcec0)* 0x00000c50 0x00000040 0x00000000 JIT(0x7fc4230f4e00).__LD
0x00000008 regular [0x0000000112efce80-0x0000000112efcec0) 0x00000c50 0x00000040 0x02000000 JIT(0x7fc4230f4e00).__LD.__compact_unwind

The relocated address is:

datapointer(filter(s->s.sectname == "__debug_info",sects)[1])
Ptr{Uint8} @0x0000000112efcd68


So it seems that, despite knowing the correct load address for the __debug_info section, lldb is still somehow picking up the old addresses. I’ll keep looking, but if something springs to mind, please let me know.

We don't currently apply any relocations (that I know of) for debug info in LLDB.

We do for ELF (ObjectFileELF::RelocateSection), because LLVM doesn’t do the debug info relocation for us in that case. It currently does for Mach-O so that shouldn’t be an issue yet, the only question is whether lldb correctly loads the relocated section (which I think it should since the load address is being set correctly), or whether it loads the section directly from the object file.

Hmm, never mind, it seems to be working just fine now. I’ll clean it up and submit a patch.

Now on Phabricator as (LLVM) (LLDB)