Design: gathering locations of instructions to emit into a section

I’m currently working on implementing MSVC’s /d2ImportCallOptimization in LLVM, which involves two steps:

  1. Find all call (branch) instructions where the target is an imported function symbol.
  2. Emit a new section in the object file that lists out the addresses of these instructions and the symbol id of the called function.

I’m not sure how to design this: it would be easy in a MachineFunctionPass to find calls to imported functions, but then I’m not sure how to get that information all the way to MC so that I can create the new section.

Once I’m in MC, identifying calls to imported functions is much more difficult: this needs to support AArch64, where loading the address of the called symbol is split into multiple instructions that are separate from the branch itself.

Any suggestions on how one would implement this? Or other examples of similar things?

I believe there are few places to access both MachineFunction info and MCStreamer (for emitting custom sections) other than AsmPrinter, so you might want to implement your feature there.

A feature similar to yours is the basic block address map (-fbasic-block-address-map), which emits a special section listing all the BBs and their addresses. You can check out how it’s done in AsmPrinter::emitBBAddrMapSection.

Yeah, I realized yesterday that AsmPrinter was going to be the correct place to put this but thank you for the pointer to emitBBAddrMapSection!

Skimming the proposal, at first glance it looks like you can do everything you need inside the AsmPrinter with existing assembler directives.

If you do need to do some sort of late processing with the offsets in MC, though (if they’re compressed or something like that), you can implement a new assembler directive. An assembler directive is a textual command in assembly like “.section”, but they’re also used as the in-memory API for communicating between CodeGen and MC.

In your MachineFunctionPass, you can create a series of MCSymbol labels and attach them to the call instructions with MachineInstr::setPostInstrSymbol. This is a somewhat dangerous API, since there isn’t a strong guarantee that subsequent machine passes won’t dead-code-eliminate or merge arbitrary MachineInstrs, but it’s pretty safe for calls. The main risk would probably be tail merging. This API powers other call-site label tracking features, like S_HEAPALLOCSITE debug information and tracking longjmp call sites for CFGuard, so you’re in reasonably good company. You can search the codebase for examples of usage. Hope that helps.

Thanks, this is exactly what I need!

Question: it seems like each instruction can only have one post symbol attached to it, so is it possible that I’m going to replace an existing one? Should the API be “get or set” instead?

See D50833, “[x86/MIR] Implement support for pre- and post-instruction symbols, as well as MIR parsing support for `MCSymbol` `MachineOperand`s”, for why the current API looks the way it does. Not sure anyone considered the possibility of needing multiple post-instruction symbols.

Ugh, this turned out to be much more annoying than I originally thought.

The big complication is that there’s no guarantee that the branch instruction is immediately preceded by the adrp+ldr pair: if the same function is called twice in a row, then LLVM will reuse the address it just loaded (this gets even more complicated if that value gets spilled to the stack).

So, the last place that I have all the information that I need is AArch64ISelLowering, but at that point I don’t have MIs to attach a post-symbol to.

Seems like I have two choices at this point:

  1. Add a new pseudo-instruction and emit it just before the branch (and then hope it stays in the correct location).
  2. Expand NodeExtraInfo to add a way to note the GlobalValue that a given branch will be calling (if it is calling one).

I’d prefer option 2, since NodeExtraInfo is already being used for things like call-site info and I don’t have to worry about a pseudo-instruction drifting away from the real instruction, but I’m not sure if it’s considered ok to expand NodeExtraInfo, or if it’s something that folks were hoping to reduce/kill.