Tracking all prologue and epilogue insertions through codegen/lowering

Hi all,

I’m doing some program analysis research within LLVM. One thing that I’d like to be able to
do is track LLVM’s generation of function prologues and epilogues, particularly as
Functions are lowered to MachineFunctions and eventually through target backends.
I only need to do this for x86, so I’ve been focusing my attention on that target.

So far, this is what I’ve done:

* Created two new pseudo instructions in Target.td, named PROLOGUE_ANCHOR and
EPILOGUE_ANCHOR. I’ve made sure that these instructions inherit PseudoInstruction,
and are marked as not having side effects.

* Modified X86FrameLowering::emitPrologue and X86FrameLowering::emitEpilogue
to unconditionally emit new PROLOGUE_ANCHOR and EPILOGUE_ANCHOR MachineInstrs.

* Modified AsmPrinter::EmitFunctionBody to include a case statement for
PROLOGUE_ANCHOR and EPILOGUE_ANCHOR, converting each into an MCSymbol that I emit
using the output streamer.

All of this “works,” in the sense that my output assembly (via llc) contains labels
everywhere that I expect. However, the binaries themselves are broken -- anything
compiled above -O0 immediately segfaults very early on in process initialization.
My best guess is that the higher optimization levels enable frame pointer elision
and that my pseudos interfere with it, but I’m at a bit of a loss as to how best to
debug this (or whether I should replace my approach with something else).

Could anybody offer some advice/pointers for this approach?

Best,
William Woodruff

> I’m doing some program analysis research within LLVM. One thing that I’d like to be able to
> do is track LLVM’s generation of function prologues and epilogues

Answering my own question with this: I managed to completely overlook the fact
that DWARF (since at least v3) has supported fields within the line program state
machine that explicitly record whether a particular entry indicates a prologue end
(DW_LNS_set_prologue_end) and epilogue begin (DW_LNS_set_epilogue_begin).
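
For anyone unfamiliar with how those flags surface, here is a minimal sketch of a line-number state machine that handles only a handful of standard opcodes. It is not LLVM code and not a full DWARF parser. The names `Row`, `runLineProgram`, and `sampleProgram` are made up for illustration, and the byte sequence is hand-written rather than taken from a real object file. Only the opcode values come from the DWARF specification. The sketch assumes a minimum instruction length of 1 and skips the line program header, special opcodes, and extended opcodes.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Standard opcode numbers as assigned by the DWARF specification.
constexpr std::uint8_t DW_LNS_copy = 0x01;
constexpr std::uint8_t DW_LNS_advance_pc = 0x02;
constexpr std::uint8_t DW_LNS_advance_line = 0x03;
constexpr std::uint8_t DW_LNS_set_prologue_end = 0x0a;
constexpr std::uint8_t DW_LNS_set_epilogue_begin = 0x0b;

// One emitted line-table row (hypothetical struct; only the registers
// this sketch tracks).
struct Row {
  std::uint64_t address;
  std::int64_t line;
  bool prologue_end;
  bool epilogue_begin;
};

// Decode an unsigned LEB128 value starting at p[i], advancing i.
std::uint64_t readULEB(const std::vector<std::uint8_t> &p, std::size_t &i) {
  std::uint64_t result = 0;
  unsigned shift = 0;
  while (i < p.size()) {
    std::uint8_t byte = p[i++];
    if (shift < 64)
      result |= static_cast<std::uint64_t>(byte & 0x7f) << shift;
    shift += 7;
    if ((byte & 0x80) == 0)
      break;
  }
  return result;
}

// Decode a signed LEB128 value starting at p[i], advancing i.
std::int64_t readSLEB(const std::vector<std::uint8_t> &p, std::size_t &i) {
  std::uint64_t result = 0;
  unsigned shift = 0;
  std::uint8_t byte = 0;
  while (i < p.size()) {
    byte = p[i++];
    if (shift < 64)
      result |= static_cast<std::uint64_t>(byte & 0x7f) << shift;
    shift += 7;
    if ((byte & 0x80) == 0)
      break;
  }
  // Sign-extend when the sign bit of the final byte is set.
  if (shift < 64 && (byte & 0x40))
    result |= ~static_cast<std::uint64_t>(0) << shift;
  return static_cast<std::int64_t>(result);
}

// Run the (header-less) opcode stream and collect the rows it emits.
// Assumption: minimum_instruction_length == 1.
std::vector<Row> runLineProgram(const std::vector<std::uint8_t> &prog) {
  std::vector<Row> rows;
  std::uint64_t address = 0;
  std::int64_t line = 1; // the line register starts at 1
  bool prologue_end = false;
  bool epilogue_begin = false;
  std::size_t i = 0;
  while (i < prog.size()) {
    switch (prog[i++]) {
    case DW_LNS_copy:
      // Emit a row, then clear the per-row flags.
      rows.push_back(Row{address, line, prologue_end, epilogue_begin});
      prologue_end = false;
      epilogue_begin = false;
      break;
    case DW_LNS_advance_pc:
      address += readULEB(prog, i);
      break;
    case DW_LNS_advance_line:
      line += readSLEB(prog, i);
      break;
    case DW_LNS_set_prologue_end:
      prologue_end = true;
      break;
    case DW_LNS_set_epilogue_begin:
      epilogue_begin = true;
      break;
    default:
      return rows; // opcode outside this sketch's subset: stop decoding
    }
  }
  return rows;
}

// Hand-written example stream: function entry, prologue end, epilogue begin.
std::vector<std::uint8_t> sampleProgram() {
  return {
      DW_LNS_copy,                                  // row at function entry
      DW_LNS_advance_pc, 0x04,                      // address += 4
      DW_LNS_advance_line, 0x01,                    // line += 1
      DW_LNS_set_prologue_end, DW_LNS_copy,         // row flagged prologue_end
      DW_LNS_advance_pc, 0x10,                      // address += 16
      DW_LNS_advance_line, 0x03,                    // line += 3
      DW_LNS_set_epilogue_begin, DW_LNS_copy,       // row flagged epilogue_begin
  };
}
```

Calling `runLineProgram(sampleProgram())` yields three rows: a plain one at address 0, one at address 4 with `prologue_end` set, and one at address 20 with `epilogue_begin` set. Because each flag is cleared again after every emitted row, a consumer has to check it row by row.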

- William

> > I’m doing some program analysis research within LLVM. One thing that I’d like to be able to
> > do is track LLVM’s generation of function prologues and epilogues
>
> Answering my own question with this: I managed to completely overlook the fact
> that DWARF (since at least v3) has supported fields within the line program state
> machine that explicitly record whether a particular entry indicates a prologue end
> (DW_LNS_set_prologue_end) and epilogue begin (DW_LNS_set_epilogue_begin).

Well, yes, DWARF has that. LLVM does not do a spectacular job of
identifying those points, although it has been getting better at it
over the years.
--paulr

> Well, yes, DWARF has that. LLVM does not do a spectacular job of
> identifying those points, although it has been getting better at it
> over the years.

Thanks for the response! I went down this route, but I’ve had some trouble getting clang
to actually emit DW_LNS_set_epilogue_begin states in the line program table.
The prologue end states do get emitted, so it’s only epilogue beginnings that are missing.

Does anybody know why this is? I found an older thread [1] with the same question,
but no obvious resolution. If it’s just a matter of the responsible code being missing,
then this is something I could take a stab at myself (with some pointers in the right
direction).

- William