Parsing Line Table to determine function prologue?

Zachary_Turner1 · October 7, 2018, 3:05am

While implementing native PDB support, I noticed that LLDB asks to parse an entire compile unit’s line table in order to determine whether a single address is in a function prologue or epilogue.

Is this necessary in DWARF-land? It would be nice if I could just pass the prologue and epilogue byte size directly to the constructor of the lldb_private::Function object when I construct it.

It seems unnecessary to parse the entire line table just to set a breakpoint by function name, but this is what ends up happening.

Even if we do need to parse the line table, could it be done just for the function in question? The debug info tells us the function’s address range, so is there some technical reason why it couldn’t parse the line table only for the given address range?

Leonard_Mosescu1 · October 8, 2018, 7:28pm

> Even if we do need to parse the line table, could it be done just for the function in question? The debug info tells us the function’s address range, so is there some technical reason why it couldn’t parse the line table only for the given address range?

My understanding is that there’s one DWARF .debug_line “program” per CU, and normally you’d need to “execute” the whole line number program.

jingham · October 8, 2018, 7:41pm

A single sequence in the line table needs to be run from beginning to end to make sense of it. It doesn't really have addresses, it generally has a start address, then a sequence of "increment line, increment address" instructions. So you have to run the state machine to figure out what the addresses are.
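To make the "run the state machine" point concrete, here is a minimal sketch in Python. It is a toy model, not the real DWARF encoding: the opcode names (`advance_pc`, `advance_line`, `emit_row`) and the `run_sequence` helper are made up for illustration. The point is only that no instruction carries an absolute address after the start, so a row's address is known only after running everything before it.

```python
# Toy model of a line-number program. This is NOT the real DWARF encoding;
# the opcode names below are hypothetical and chosen for readability.
def run_sequence(start_address, start_line, program):
    """Run one sequence from its start and return the (address, line) rows."""
    address, line = start_address, start_line
    rows = []
    for op, operand in program:
        if op == "advance_pc":
            address += operand          # relative bump, no absolute address
        elif op == "advance_line":
            line += operand             # relative bump of the line register
        elif op == "emit_row":
            rows.append((address, line))
        else:
            raise ValueError(f"unknown opcode: {op}")
    return rows


program = [
    ("emit_row", None),
    ("advance_pc", 4), ("advance_line", 1), ("emit_row", None),
    ("advance_pc", 8), ("advance_line", 2), ("emit_row", None),
]
rows = run_sequence(0x1000, 10, program)
for address, line in rows:
    print(hex(address), line)
```

The last row's address (0x100c) depends on both earlier `advance_pc` instructions, which is why you cannot jump into the middle of a sequence.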

However, the line table does not have to be one continuous sequence. The DWARF docs state this explicitly, and there is an "end_sequence" instruction to implement this. I can't see any reason why you couldn't get the compiler to emit line tables in per-function sequences, and have the debugger optimize reading the line table by first scanning for sequence ends to get the map of chunks -> addresses, and then reading the line table in those chunks. I don't think anybody does this, however. clang emitted the whole CU as one sequence in the few examples I had sitting around.
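A rough sketch of that "scan for sequence ends first" idea, again in Python over a toy instruction list rather than real DWARF bytes. The opcode names and the `index_sequences` / `find_sequence` helpers are hypothetical, and the sketch assumes every sequence begins with a `set_address`. The scan tracks only the address register and records where each sequence starts and ends, so a later pass could run just the chunk covering the address of interest.

```python
# Toy model only: opcode names and helpers are hypothetical, not LLDB or DWARF APIs.
def index_sequences(program):
    """Scan once, tracking only the address, and record each sequence's span.

    Returns a list of (low_address, high_address, first_instr, last_instr).
    Assumes each sequence starts with a "set_address" instruction.
    """
    index = []
    address = None
    low = None
    first = 0
    for i, (op, operand) in enumerate(program):
        if op == "set_address":
            address = operand
            if low is None:
                low = address
        elif op == "advance_pc":
            address += operand
        elif op == "end_sequence":
            # The address at end_sequence is one past the last covered byte.
            index.append((low, address, first, i))
            address, low, first = None, None, i + 1
        # Line-related opcodes are skipped during the scan.
    return index


def find_sequence(index, target):
    """Return the (first_instr, last_instr) chunk covering target, if any."""
    for low, high, first, last in index:
        if low <= target < high:
            return first, last
    return None


program = [
    ("set_address", 0x1000), ("emit_row", None),
    ("advance_pc", 0x20), ("end_sequence", None),
    ("set_address", 0x2000), ("emit_row", None),
    ("advance_pc", 0x10), ("end_sequence", None),
]
index = index_sequences(program)
chunk = find_sequence(index, 0x2004)
```

Note that this only pays off if the producer actually emits more than one sequence; with a whole CU in a single sequence the index has one entry and nothing is saved.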

Jim

Zachary_Turner1 · October 8, 2018, 7:48pm

I see. It’s not the end of the world because I can just parse the whole line table when requested. It’s just that in PDB-land the format is such that a) I know the exact address of the prologue and epilogue at the time I parse the function record, and b) when parsing the line table, I can quickly scan to the address range for the function, which makes parsing the whole table less efficient than necessary. But it’s definitely sufficient.

pogo59 · October 10, 2018, 4:07pm

If you compile with -ffunction-sections you get one sequence per function. But DWARF does let you have multiple functions described by one sequence, so you need to accommodate that. And, the end-prologue flag is an opcode in the line-number program, so if you’re looking for prolog/epilog instructions you need to parse the whole thing anyway.
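As an illustration of why the flag forces a pass over the rows, here is a small Python sketch that derives a prologue size from already-decoded line table rows. The `Row` type and `prologue_size` helper are hypothetical and not LLDB's implementation; the `prologue_end` field stands in for the register that DWARF's `DW_LNS_set_prologue_end` opcode sets on a row.

```python
from typing import List, NamedTuple, Optional


class Row(NamedTuple):
    """One decoded line table row (hypothetical, simplified)."""
    address: int
    line: int
    prologue_end: bool = False


def prologue_size(rows: List[Row], func_low: int, func_high: int) -> Optional[int]:
    """Return the prologue byte size for [func_low, func_high), or None.

    The flag lives on the rows, so the rows covering the function have to be
    produced (by running the line program) before this can be answered.
    """
    for row in sorted(rows):  # tuples sort by address first
        if func_low <= row.address < func_high and row.prologue_end:
            return row.address - func_low
    return None


rows = [
    Row(0x1000, 10), Row(0x1008, 11, True), Row(0x1010, 12),
    Row(0x2000, 20), Row(0x2004, 21, True),
]
size_a = prologue_size(rows, 0x1000, 0x1020)
size_b = prologue_size(rows, 0x2000, 0x2010)
```

By contrast, a format that stores the prologue size in the function record itself (as described for PDB above) could skip this lookup entirely.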

–paulr
