Address of instruction in codegen

Hi everyone,

I have a need to generate the address of an instruction during codegen.
Is there any way to do this? I was hoping that I could split the basic
block very late and then use an MO_BlockAddress MachineOperand, but
apparently generating such a thing requires a BlockAddress object, which
in turn would force splitting the basic block very early, at the IR
level.

I don't actually need the address until just before regalloc. In fact,
generating a BlockAddress early would probably result in a dead
Instruction as I wouldn't have anything to actually connect it to as a
user. I suppose I could use a dummy target intrinsic as a user but that
seems very hacky.

I also thought about doing this extremely late and encoding the address
generation during binary streaming in the MC layer but that also seems
very hacky and difficult to maintain, requiring various
pseudo-instructions to live until asm or object writing.

Any brilliant ideas out there?

                        -David

Check if MachineInstr::[sg]etPreInstrSymbol do what you need.

I have generated enough awareness of this API that other people mention it now. Success. :)

A word of caution: depending on where you are in codegen, there are late passes that can delete or duplicate MachineInstrs (tail duplication comes to mind). You need to be reasonably confident that the annotated instruction won’t be affected by such passes, or you will get errors from the assembler about an undefined label or duplicate label definitions.
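
To make that failure mode concrete, here is a minimal, self-contained C++ sketch. It does not use LLVM at all: `ToyInstr`, `LabelReport`, `checkLabels`, and `exampleLabelHazards` are hypothetical names made up for illustration, standing in for annotated MachineInstrs and for the assembler's label checks. The assumption is simply that a pre-instruction symbol becomes a label definition in front of the instruction, so copying the instruction copies the definition and deleting it removes the definition.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy stand-in for a machine instruction (hypothetical, not an LLVM type).
struct ToyInstr {
  std::string Opcode;
  std::string PreLabel;  // label emitted just before the instruction, or ""
  std::string UsedLabel; // label this instruction refers to, or ""
};

struct LabelReport {
  std::set<std::string> Duplicated; // labels defined more than once
  std::set<std::string> Undefined;  // labels referenced but never defined
};

// Mimics the two assembler complaints: duplicate and undefined labels.
LabelReport checkLabels(const std::vector<ToyInstr> &Stream) {
  std::map<std::string, int> Defs;
  for (const ToyInstr &I : Stream)
    if (!I.PreLabel.empty())
      ++Defs[I.PreLabel];

  LabelReport R;
  for (const auto &KV : Defs)
    if (KV.second > 1)
      R.Duplicated.insert(KV.first);
  for (const ToyInstr &I : Stream)
    if (!I.UsedLabel.empty() && Defs.count(I.UsedLabel) == 0)
      R.Undefined.insert(I.UsedLabel);
  return R;
}

// Usage example: the same annotated call, kept, duplicated, and deleted.
void exampleLabelHazards() {
  ToyInstr Lea{"lea", "", ".Ltmp0"};   // takes the address of the call
  ToyInstr Call{"call", ".Ltmp0", ""}; // annotated instruction

  LabelReport Ok = checkLabels({Lea, Call});
  assert(Ok.Duplicated.empty() && Ok.Undefined.empty());

  LabelReport Dup = checkLabels({Lea, Call, Call}); // instruction duplicated
  assert(Dup.Duplicated.count(".Ltmp0") == 1);

  LabelReport Gone = checkLabels({Lea}); // instruction deleted
  assert(Gone.Undefined.count(".Ltmp0") == 1);
}
```

Any pass that copies or erases the annotated instruction without knowing about the symbol lands in one of the last two cases.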

Krzysztof Parzyszek via llvm-dev <llvm-dev@lists.llvm.org> writes:

> Check if MachineInstr::[sg]etPreInstrSymbol do what you need.

It does! As it happens I found the API just a little bit later. :)

             -David

Reid Kleckner <rnk@google.com> writes:

> A word of caution: depending on where you are in codegen, there are late
> passes that can delete or duplicate MachineInstrs (tail duplication comes
> to mind). You need to be reasonably confident that the annotated
> instruction won't be affected by such passes, or you will get errors from
> the assembler about an undefined label or duplicate label definitions.

I actually ran into that. :) Would a patch to fix the tail merging
issue be of interest? As written, it disallows merging where a pre- or
post-instr symbol differs, but I can also imagine enhancements to update
uses of the symbol.

                        -David

You know, maybe it would be OK. Initially I was going to say, one of the existing applications is debug info to mark heap allocation call sites, and we don’t want debug info to affect codegen. However, if such a call site were to be duplicated, we’d end up with assembler errors from MC, so it sort of already doesn’t work. Maybe there is a better way to handle this application.

The other applications are marking setjmp calls for control flow guard and something in speculative load hardening, both security features where powering down control flow optimizations would be fine.

> Would a patch to fix the tail merging
> issue be of interest? As written, it disallows merging where a pre- or
> post-instr symbol differs, but I can also imagine enhancements to update
> uses of the symbol.

Indeed, having the symbol uses updated would be appreciated. Downstream, I also prevented tail duplication, but that had a very negative effect on multiple scores.

Diogo Sampaio <dsampaio@kalray.eu> writes:

>> Would a patch to fix the tail merging
>> issue be of interest? As written, it disallows merging where a pre- or
>> post-instr symbol differs, but I can also imagine enhancements to update
>> uses of the symbol.

> Indeed, having the symbol uses updated would be appreciated. Downstream,
> I also prevented tail duplication, but that had a very negative effect
> on multiple scores.

We'd have to teach each pass how to do it, which obviously will take
some time. Of course we'll have to teach each pass not to remove
post-instr-symbol instructions and that takes time too. :)

Right now I just have a patch to BranchFolding that prevents tail
merging because that's all we've run into so far.

                         -David
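
As a rough illustration of the "don't merge when the symbols differ" approach, here is a minimal, self-contained C++ sketch. It is not the BranchFolding patch and does not use LLVM: `SymInstr`, `identicalForMerge`, and `commonTailLength` are hypothetical names, and instructions are modelled as plain strings. The idea is only that the common-tail scan stops at the first pair of instructions whose pre- or post-instruction symbols disagree, so such instructions never end up sharing one merged copy.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-in for a machine instruction (hypothetical, not an LLVM type).
struct SymInstr {
  std::string Text;       // opcode and operands as printed
  std::string PreSymbol;  // label emitted before the instruction, or ""
  std::string PostSymbol; // label emitted after the instruction, or ""
};

// Two instructions may share one merged copy only if they print the same
// and carry the same pre- and post-instruction symbols.
bool identicalForMerge(const SymInstr &A, const SymInstr &B) {
  return A.Text == B.Text && A.PreSymbol == B.PreSymbol &&
         A.PostSymbol == B.PostSymbol;
}

// Counts how many trailing instructions of the two blocks could be merged.
std::size_t commonTailLength(const std::vector<SymInstr> &A,
                             const std::vector<SymInstr> &B) {
  std::size_t N = 0;
  while (N < A.size() && N < B.size() &&
         identicalForMerge(A[A.size() - 1 - N], B[B.size() - 1 - N]))
    ++N;
  return N;
}

// Usage example: identical tails, with and without differing symbols.
void exampleTailMerge() {
  std::vector<SymInstr> A = {
      {"mov r0, 1", "", ""}, {"call f", ".Ltmp0", ""}, {"ret", "", ""}};
  std::vector<SymInstr> B = {
      {"mov r0, 2", "", ""}, {"call f", ".Ltmp1", ""}, {"ret", "", ""}};
  // Only "ret" is mergeable; the calls carry different pre-symbols.
  assert(commonTailLength(A, B) == 1);

  A[1].PreSymbol.clear();
  B[1].PreSymbol.clear();
  // Without symbols, "call f; ret" is a common tail of length 2.
  assert(commonTailLength(A, B) == 2);
}
```

The trade-off mentioned in the thread shows up directly: the check is simple, but every differing symbol shortens the mergeable tail, which is where the lost optimization comes from.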