[Disclaimer: This is work I started already – at least #115730 is already committed – but it is turning out to be trickier than I expected, so I’m writing to raise awareness as much as to get feedback.]
Hello all,
I’ve recently learned that lldb has trouble dealing with functions produced by the Propeller post-link optimizer. The biggest issue is incorrect/missing backtraces, but the nature of the problem means it can manifest itself in different ways in almost any part of lldb.
Propeller optimizes functions by splitting them into smaller pieces (potentially putting every basic block into its own chunk) and then arranging these pieces according to the profile data – e.g. grouping the “hot” pieces of all functions into a single location. This means that parts of a function may end up very far from each other – many megabytes apart.
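To make the splitting a bit more concrete, here is a purely illustrative C++ function (not taken from any real workload) whose cold error path a Propeller-style build – e.g. compiling with clang's -fbasic-block-sections and letting the post-link step reorder the resulting sections – could place far away from the hot loop:

```cpp
// Illustrative only: a function with a hot loop and a cold error path.
// With a Propeller-style build, the block containing the ReportFatalError
// call can end up in its own section, placed far away from the loop itself.
#include <cstdio>
#include <cstdlib>

[[noreturn]] static void ReportFatalError(int value) {
  std::fprintf(stderr, "unexpected negative value: %d\n", value);
  std::abort();
}

long Sum(const int *values, long count) {
  long total = 0;
  for (long i = 0; i < count; ++i) { // hot: grouped with other hot code
    if (values[i] < 0)               // cold: may land megabytes away
      ReportFatalError(values[i]);
    total += values[i];
  }
  return total;
}
```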
Now, debug info (DWARF) is perfectly capable of representing functions like this (via the DW_AT_ranges attribute on the function DIE), which is probably why this wasn't thought to be an issue. However, it turns out lldb isn't – it has a fairly deeply ingrained assumption that functions are contiguous. The core problem is that the internal function representation only uses a single address range (not entirely true after #115730, but the new representation introduced there isn't used anywhere yet). Then there are a couple of other assumptions, which are sort of a natural consequence of the first one, but I'm listing them here because they all need to be revisited to make this work (a small sketch illustrating them follows the list):
- all parts of a function are in a single section (i.e., doing arithmetic on section offsets of different parts of the function is well defined). This is no longer true, because the hot parts of the function will typically end up in a different section (and possibly a different ELF segment) than the cold parts. I've found several places which subtract section offsets of various parts of a function; #116777 tries to fix some of them.
- the function entry point (its "address") is its lowest-numbered address. This isn't always true, as it depends on how the linker chooses to lay out the function. Since the function's blocks can be moved around independently, the linker could very well place the entry block after some other part of the function (e.g. because the entry block is cold and the linker places cold parts higher in the address space).
- addresses of function-scoped entities (blocks, variables, etc.) can be defined as offsets from the start of the function. This one can remain true, but we need to carefully specify what the offsets are relative to.
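To illustrate the difference between the current model and what these functions need, here's a minimal sketch (the types are hypothetical, not lldb's actual classes) of a single-range representation versus a multi-range one with an explicit entry point:

```cpp
// Minimal sketch, with hypothetical types (not lldb's actual classes), of
// the single-range model versus what a split function needs.
#include <cstdint>
#include <vector>

struct Range {
  uint64_t base; // file ("virtual") address of the first byte
  uint64_t size;
};

// Current model: one range; the entry point is implicitly the lowest address,
// and everything is assumed to live in one section.
struct SingleRangeFunction {
  Range range;
  uint64_t EntryPoint() const { return range.base; } // wrong after splitting
};

// What a Propeller-split function needs: several ranges, possibly in
// different sections/segments, plus an explicit entry point that need not be
// the lowest address of any range.
struct DiscontinuousFunction {
  std::vector<Range> ranges; // e.g. hot piece at 0x1000, cold piece at 0x900000
  uint64_t entry_point;      // may lie in the middle of the sorted ranges
};
```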
I believe the solution to the second issue is obvious – we need to decouple the notion of "the (entry) address of the function" from "the range of addresses belonging to the function". This is (among other things) what I'm trying to do in #115836. The PR is currently in draft status because I've learned that I first need to stop sorting the address ranges so that the function address can actually be extracted – one of the ways DWARF encodes the function address is as the base address of the function's first range. I'm doing this in #116379 (merged) and #116620 (open).
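As a minimal sketch of why the sorting gets in the way (a hypothetical helper, not the actual DWARF parser code), assume the entry address defaults to the base of the first DW_AT_ranges entry, in the order the ranges appear in the debug info:

```cpp
// Sketch (hypothetical helper, not the real parser): if the entry address
// defaults to the base of the first DW_AT_ranges entry, sorting the ranges
// before extracting it loses that information.
#include <algorithm>
#include <cstdint>
#include <cstdio>
#include <vector>

struct Range {
  uint64_t base;
  uint64_t size;
};

uint64_t EntryFromDwarfRanges(const std::vector<Range> &ranges_in_dwarf_order) {
  // Assumption for this sketch: no explicit entry/low pc attribute, so the
  // entry defaults to the base of the first range as written in the DWARF.
  return ranges_in_dwarf_order.front().base;
}

int main() {
  // Cold entry block placed high (0x2000), hot blocks placed low (0x1000).
  std::vector<Range> ranges = {{0x2000, 0x10}, {0x1000, 0x40}};
  std::printf("entry: %#llx\n", // 0x2000 -- correct
              (unsigned long long)EntryFromDwarfRanges(ranges));

  std::sort(ranges.begin(), ranges.end(),
            [](const Range &a, const Range &b) { return a.base < b.base; });
  std::printf("entry after sorting: %#llx\n", // 0x1000 -- entry point lost
              (unsigned long long)EntryFromDwarfRanges(ranges));
}
```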
For the third issue, I basically see two choices for the base address: use the lowest address (i.e. the status quo) or use the entry point address (as defined in the previous paragraph). The problem with the first option is that it essentially requires defining another key address in a function ("function address used for offset calculation" vs. "function address for calls"), which creates a lot of space for confusion. In particular, we'd need to be careful not to use the offset-calculation address for user-facing output – we don't want to symbolize 0x1003 as foo+3 just because 0x1000 is the lowest address of foo, when the entry point is 0x2000. The problem with the second option is that some of the offsets can come out negative – in the example above, 0x1003 would be represented as foo-4093, not just in the user-facing output, but in all the internal calculations as well. This means auditing the offset calculations and changing (some of) them to use signed types. Despite this, I think the second option is the better one, because it avoids creating two similar-but-distinct concepts, and because the first option (due to the user-facing symbolization) doesn't completely avoid the need for an audit anyway.(*)
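To make the two options concrete with the numbers from the example above (this is just a sketch of the arithmetic, not lldb code):

```cpp
// Sketch of the arithmetic only, using the numbers from the example.
#include <cstdint>
#include <cstdio>

int main() {
  const uint64_t lowest_addr = 0x1000; // lowest address belonging to foo
  const uint64_t entry_point = 0x2000; // foo's entry block
  const uint64_t addr = 0x1003;        // address being symbolized

  // Option 1: offsets relative to the lowest address stay non-negative.
  std::printf("foo+%llu\n", (unsigned long long)(addr - lowest_addr)); // foo+3

  // Option 2: offsets relative to the entry point can be negative, so the
  // calculation (and everything storing the result) must use signed types.
  const int64_t offset = (int64_t)addr - (int64_t)entry_point;
  std::printf("foo%+lld\n", (long long)offset); // foo-4093
}
```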
The first issue is the trickiest. In #116777 I resolve it by switching to file address (virtual address, for ELF people) arithmetic. This works for my use case, but it breaks an assumption held by (some parts of?) lldb – that file sections can be loaded into memory independently (and at arbitrary locations). While we do have this affordance in some places, I am not sure whether it's upheld universally, and I'm not aware of it being used anywhere. AFAIK, all three major object file formats (ELF, Mach-O, (PE)COFF) only support relocating the file as a whole (basically, adding an offset/bias to all the addresses in the file). The DWARF format doesn't support independent section movement either, as it uses a single address field to identify parts of the file. (†)
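For illustration, here is a small sketch (hypothetical types) of the two loading models in question – per-section load addresses versus a single whole-file bias:

```cpp
// Hypothetical types illustrating the two loading models.
#include <cstdint>
#include <map>
#include <string>

// The flexibility lldb nominally allows: each section gets its own load
// address, so arithmetic on file addresses that crosses a section boundary
// is not meaningful.
struct PerSectionLoadModel {
  std::map<std::string, uint64_t> section_load_base; // section name -> load address
  uint64_t ToLoad(const std::string &section, uint64_t offset_in_section) const {
    return section_load_base.at(section) + offset_in_section;
  }
};

// What the major formats support for linked images: the whole file is shifted
// by a single bias, so file-address arithmetic stays valid after loading.
struct WholeFileLoadModel {
  uint64_t bias; // load address = file ("virtual") address + bias
  uint64_t ToLoad(uint64_t file_addr) const { return file_addr + bias; }
};
```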
For the overarching problem of “representing multiple address ranges of a function”, I think the answer is obvious – introduce a new function returning a list of ranges (done in #115730). The hard part is migrating the callers to the new API. The callers are spread out over the code base, so I don’t have an inventory of all of them. I’m sure that converting some of them should be easy, but I expect that some will be fairly tricky. For example, teaching the instruction disassembly unwinder about this will probably be nontrivial, and is probably not something I want to do as a part of this project. The nice part about this is that once we expose the information about the function discontinuity, we should be able to make this component fail gracefully. I think that the decision on what to do with individual callers would be best done on their respective PRs.
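As an example of the kind of migration involved, here is a hedged sketch of a typical caller (the names are illustrative and simplified – the real lldb classes are spelled differently): a containment check written against a single range has to start consulting every range, or detect the discontinuity and degrade gracefully.

```cpp
// Illustrative, simplified stand-ins -- not lldb's actual classes or spellings.
#include <cstdint>
#include <vector>

struct Range {
  uint64_t base;
  uint64_t size;
  bool Contains(uint64_t addr) const { return addr >= base && addr < base + size; }
};

struct Function {
  std::vector<Range> ranges;
  // Legacy single-range view: describes only one piece of the function (or a
  // bounding box around all of them).
  Range GetAddressRange() const { return ranges.front(); }
  // Multi-range view in the spirit of #115730.
  const std::vector<Range> &GetAddressRanges() const { return ranges; }
};

// A typical caller migration: a containment check has to consult every range
// instead of assuming a single contiguous one.
bool FunctionContainsAddress(const Function &f, uint64_t addr) {
  for (const Range &r : f.GetAddressRanges())
    if (r.Contains(addr))
      return true;
  return false;
}
```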
Thank you for making it to the end of this email, and let me know what you think,
Pavel
(*) I suppose a higher-order question here might be whether we actually want to symbolize this as foo-4093. I don't think this output is particularly nice (the offset can end up being many megabytes), but I don't see a better alternative.
(†) This ignores segments (unsupported by our DWARF parser) and unlinked object files (supported by llvm, but not by lldb's DWARF parser). Segments might be used to implement something like this, but they're mainly targeting the i386 segmented address space. Object files support this by storing the offsets outside of the DWARF section (in a relocation entry), but the expectation is that these will be resolved during the linking process.