[RFCish] Support for discontinuous functions

[Disclaimer: This is work I started already – at least #115730 is already committed – but it is turning out to be trickier than I expected, so I’m writing to raise awareness as much as to get feedback.]

Hello all,

I’ve recently learned that lldb has trouble dealing with functions produced by the Propeller post-link optimizer. The biggest issue is incorrect/missing backtraces, but the nature of the problem means it can manifest itself in different ways in almost any part of lldb.

Propeller optimizes functions by splitting them into smaller pieces (potentially putting every basic block into its own chunk) and then arranging these pieces according to the profile data – e.g. grouping the “hot” pieces of all functions into a single location. This means that parts of a function may end up very far from each other – many megabytes apart.

Now, debug info (DWARF) is perfectly capable of representing functions like this (via the DW_AT_ranges attribute on the function DIE), which is probably why this wasn’t thought to be an issue. However, it turns out lldb isn’t – it has a fairly deeply ingrained assumption that functions are continuous. The core problem is that the internal function representation only uses a single address range (not entirely true after #115730, but the new representation introduced there isn’t used anywhere yet). Then there are a couple of other assumptions, which are sort of a natural consequence of the first one, but I’m listing them here because they all need to be revisited to make this work:

  • all parts of a function are in a single section (i.e., doing arithmetic on section offsets of parts of the function is well defined). This is not true because the hot parts of the function will typically end up in a different section (and maybe a different ELF segment) than the cold parts. I’ve found several places which subtract section offsets of various parts of the function; #116777 tries to fix some of them.
  • the function entry point (its “address”) is its lowest-numbered address. This isn’t always true, as it depends on how the linker chooses to lay out the function. Since the function’s blocks can be moved around independently, the linker could very well place the entry block after some other part of the function (e.g. because the entry block is cold and the linker places cold parts higher in the address space).
  • addresses of function-scoped entities (blocks, variables, etc.) can be defined as offsets from the start of the function. This one can still hold, but we need to carefully specify what they are relative to.
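
The assumptions above can be illustrated with a minimal sketch. These are hypothetical types for illustration only – not lldb’s actual representation (which, post #115730, lives in the real `Function`/`AddressRanges` classes):

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical sketch: a function record that stores its entry point
// separately from its (possibly discontinuous) address ranges.
struct Range {
  uint64_t base; // file ("virtual") address where the range starts
  uint64_t size;
  uint64_t end() const { return base + size; }
};

struct FunctionInfo {
  uint64_t entry;            // the call target (the function's "address")
  std::vector<Range> ranges; // chunks may be megabytes apart, in any order

  // The lowest address need not be the entry point: the linker may have
  // placed a cold chunk below the entry block.
  uint64_t LowestAddress() const {
    uint64_t lo = ranges.front().base;
    for (const Range &r : ranges)
      lo = std::min(lo, r.base);
    return lo;
  }

  bool ContainsAddress(uint64_t addr) const {
    for (const Range &r : ranges)
      if (addr >= r.base && addr < r.end())
        return true;
    return false;
  }
};
```

With, say, the entry chunk at 0x2000 and a cold chunk at 0x1000, `LowestAddress()` and `entry` disagree – which is exactly the situation the second bullet describes.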

I believe the solution to the second issue is obvious – we need to decouple the notion of “the (entry) address of the function” from “the range of addresses belonging to the function”. This is (among other things) what I’m trying to do in #115836. The PR is currently in draft status because I’ve learned that I need to stop sorting the address ranges so I can actually extract the function address – (one of the ways) DWARF stores the function address is via the base address of the first range of the function. I’m doing this in #116379 (merged) and #116620 (open).

For the third issue, I basically see two choices for the base address: use the lowest address (i.e., the status quo) or use the entry point address (as defined in the previous paragraph). The problem with the first option is that it essentially requires defining another key address in a function (“function address used for offset calculation” vs. “function address for calls”), which creates a lot of space for confusion. In particular, we’d need to be careful not to use the second address for user-facing output – we don’t want to symbolize 0x1003 as foo+3 just because 0x1000 is the lowest address of foo, when the entry point is 0x2000. The problem with the second option is that some of the offsets can come out negative – in the example above, 0x1003 would be represented as foo-4093, not just in the user-facing output, but in all the internal calculations as well. This means auditing the offset calculations and changing (some of) them to use signed types. Despite this, I think the second option is better because it avoids creating two similar-but-distinct concepts, and because the first option (due to the user-facing symbolization) doesn’t completely avoid the need for an audit.(*)
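
To make the signed-offset option concrete with the numbers above, here is a tiny hypothetical helper (not lldb code) showing the arithmetic – the key point being that the subtraction must be interpreted as signed:

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical sketch: symbolize an address as name+offset relative to
// the function's *entry point*. Chunks placed below the entry yield
// negative offsets, so the difference must be treated as signed.
std::string SymbolizeRelativeToEntry(const std::string &name,
                                     uint64_t entry, uint64_t addr) {
  int64_t offset = static_cast<int64_t>(addr - entry); // signed!
  if (offset == 0)
    return name;
  if (offset > 0)
    return name + "+" + std::to_string(offset);
  return name + "-" + std::to_string(-offset);
}
```

With entry = 0x2000, the address 0x1003 comes out as foo-4093, and 0x2003 as foo+3 – matching the example in the paragraph above.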

The first issue is the trickiest. In #116777 I resolve it by switching to file address (virtual address for ELF people) arithmetic. This works for my use case, but it breaks an assumption held by (some parts of?) lldb – that file sections can be loaded into memory independently (and at arbitrary locations). While we do have this affordance in some places, I am not sure if it’s upheld universally, and I’m not aware of it being used anywhere. AFAIK, all three major object file formats (ELF, Mach-O, (PE)COFF) only support relocating the file as a whole (basically, adding an offset/bias to all the addresses in the file). The DWARF format also doesn’t support it, as it uses a single address field to identify parts of the file. (†)
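
The point about file-address arithmetic can be sketched as follows (hypothetical types, not lldb’s `Address`/`Section` API): subtracting section offsets is only meaningful within one section, but once both addresses are resolved to file addresses, a signed distance is well defined even across sections.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical sketch: an address as a (section base, offset) pair.
// Arithmetic across sections is only well defined after resolving both
// operands to file ("virtual") addresses.
struct SectionAddress {
  uint64_t section_file_base; // file address where the section starts
  uint64_t offset;            // offset within the section
  uint64_t FileAddress() const { return section_file_base + offset; }
};

// Signed distance between two addresses that may live in different
// sections (e.g. the hot and cold parts of the same function).
int64_t Distance(SectionAddress from, SectionAddress to) {
  return static_cast<int64_t>(to.FileAddress() - from.FileAddress());
}
```

This is exactly the operation that stops being valid if sections can be re-based independently: the two `section_file_base` values would no longer bear any fixed relation to each other.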

For the overarching problem of “representing multiple address ranges of a function”, I think the answer is obvious – introduce a new function returning a list of ranges (done in #115730). The hard part is migrating the callers to the new API. The callers are spread out over the code base, so I don’t have an inventory of all of them. I’m sure that converting some of them should be easy, but I expect that some will be fairly tricky. For example, teaching the instruction disassembly unwinder about this will probably be nontrivial, and is probably not something I want to do as a part of this project. The nice part about this is that once we expose the information about the function discontinuity, we should be able to make this component fail gracefully. I think that the decision on what to do with individual callers would be best done on their respective PRs.

Thank you for making it to the end of this email, and let me know what you think,
Pavel

(*) I suppose a higher-order question here might be whether we actually want to symbolize this as foo-4093. I don’t think this output is particularly nice (the offset can end up being many megabytes), but I don’t see a better alternative.

(†) This ignores segments (unsupported by our DWARF parser) and unlinked object files (supported by llvm but not the lldb DWARF parser). Segments might be used to implement something like this, but they’re mainly targeting the i386 segmented address space. Object files support this by storing the offsets outside of the DWARF section (in a relocation entry), but the expectation is that these will be resolved during the linking process.


At first glance I’d think that foo-4093 was outside foo, but this is only because of the lowest address == function entry assumption that I’m also used to.

So we have an annotation that’s too accurate in some sense – a user might not believe it. We could disable the annotations completely, but I do like them for the “this pointer points to somewhere inside this function” hint at least.

Or, when we annotate a discontinuous function, we could add something extra to say it’s discontinuous, or take away the offset: (in foo). That might annoy folks looking at disassembly though – b 0x1234 ; (in foo) is hardly helpful.

Maybe you could make it clear that the base is the entry point of foo, e.g. (foo[entry]-4093)? Only doing so when we know the function entry != lowest function address.

If they do, how do llvm-objdump/objdump deal with this?

The radical option is to say that function start == lowest address has never been an actual rule, and if folks get strange results with these binaries, educate them on why that might be. I see rising interest in BOLT on my side, so these sorts of binaries will become more common.

Thinking about places on the command line that have function arguments, this one will be interesting:

(lldb) help dis
Disassemble specified instructions in the current target.  Defaults to the
current function for the current thread and stack frame.

Syntax: dis <cmd-options>
<...>
       --force
            Force disassembly of large functions.
<...>
       -f ( --frame )
            Disassemble from the start of the current frame's function.

I wonder if this can handle a discontinuous function already?

Edit: and there is also this option:

       -n <function-name> ( --name <function-name> )
            Disassemble entire contents of the given function name.

I’m not sure if the [entry] would help here. If we wanted to be more explicit about this, I’d consider attaching something to the end of the expression, maybe foo-4093 (outline) (“outline” isn’t exactly correct, but it’s shorter than “discontinuous”).

These tools don’t read the debug info, so they would just give you the symbol name of the chunk (if there is one). So, something like:

  201127: eb f9                        	jmp	0x201122 <foo.__part.1+0x1>

if the file has symbol information, or

  201127: eb f9                        	jmp	0x201122 <.text+0x2>

if it does not.

I suppose we could do something similar, but I’m not sure how awkward that would get, as we normally prefer debug information over data from the symtab. We also wouldn’t be able to completely rely on it being present, so we’d need some fallback.

I’m pretty sure it does not handle that yet (as in, it would try to dump everything “between” the function chunks as well), since we don’t even have a way to get the discontinuous ranges yet. But I think the way it should behave is fairly obvious – dump the individual parts of the function and skip everything in between.
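
The behavior I have in mind could be sketched like this (hypothetical helper, not the actual `disassemble` implementation): iterate over the function’s (sorted) ranges and emit a gap marker instead of dumping the unrelated bytes between chunks.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>
#include <vector>

struct Range {
  uint64_t base, size;
};

// Hypothetical sketch: given a function's ranges (sorted for display),
// produce the spans a disassembler should dump, with a marker where the
// unrelated code between the chunks is skipped.
std::vector<std::string> PlanDisassembly(const std::vector<Range> &ranges) {
  std::vector<std::string> plan;
  for (size_t i = 0; i < ranges.size(); ++i) {
    if (i != 0)
      plan.push_back("<skipping gap>"); // don't dump other functions' code
    char buf[64];
    std::snprintf(buf, sizeof(buf), "dump 0x%llx-0x%llx",
                  (unsigned long long)ranges[i].base,
                  (unsigned long long)(ranges[i].base + ranges[i].size));
    plan.push_back(buf);
  }
  return plan;
}
```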

Here’s an interesting subproblem. Using file addresses for these types of operations introduces a new failure mode – namely, that the new file address does not resolve back to any section (this should only happen with corrupted debug info). For example, Block::GetRangeAtIndex could fail due to this. The function has a bool return value, so it is possible to return some indication of an error, but currently that function can only fail for very mundane, and I think preventable, reasons (the block having no parent function – aside from funny blocks created in the pdb plugin, our blocks are always contained within a function).

So, there are basically two ways this can go: either we allow the creation of these kind-of-invalid blocks – and then fail in GetRangeAtIndex (and elsewhere) – or we disallow their creation (check/prune their ranges during construction). Then, if we’re able to guarantee that Blocks are always parented (I’m like 95% certain that’s doable), we can have a bunch of functions which don’t fail.
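
The two options could be contrasted roughly like this (hypothetical types – lldb’s real Block API differs; `r.size != 0` below stands in for “the range resolves back to a section”):

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <vector>

struct Range {
  uint64_t base, size;
};

// Option 1 (hypothetical): allow kind-of-invalid blocks to exist, and
// make every accessor fallible.
struct FallibleBlock {
  std::vector<Range> ranges; // may contain unresolvable ranges
  std::optional<Range> GetRangeAtIndex(size_t i) const {
    if (i >= ranges.size())
      return std::nullopt; // in the real case: also resolution failure
    return ranges[i];
  }
};

// Option 2 (hypothetical): validate/prune ranges at construction time,
// so that accessors cannot fail afterwards.
class ValidatedBlock {
public:
  explicit ValidatedBlock(const std::vector<Range> &candidates) {
    for (const Range &r : candidates)
      if (r.size != 0) // stand-in for "resolves back to a section"
        ranges.push_back(r);
  }
  size_t GetNumRanges() const { return ranges.size(); }
  const Range &GetRangeAtIndex(size_t i) const { return ranges[i]; }

private:
  std::vector<Range> ranges;
};
```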

I’m inclined to try the latter, because I like functions that don’t fail, and because we’re already doing some amount of validation when creating Blocks. :)