Questions on MachineFunctionPass and relaxation of pcrel calls (ARM/thumb2)

While implementing a MachineFunctionPass that runs as part of the ARMTargetMachine::addPreEmitPass(), I’ve run into a problem.
This particular MFP can drastically increase the size (in MachineInstr count) of the MachineFunction that it processes, so much so that
there is a real danger that pcrel calls and branches with immediate offsets will no longer be able to reach their targets.

A naive test confirmed that under extreme duress, LLVM does not produce correct BL/B sequences, as it defaults to using pcrel immediates which do not have enough bits to encode the offset.

How do I convince LLVM to fall back to materializing the offset in a register and using register indirect BL/B sequences? Is there some magic sequence of getAnalysisUsage() which will tell the outer loop to replace the immediate form of branches with a register indirect BL/B?

Or is the answer to use some other framework (a ModulePass, perhaps)? Or is this a case where I need to carry out ‘manual’ (un)relaxation?

In either case, pointers and advice would be greatly appreciated.

Thanks!
-Jason

Hi Jason,

Branch relaxation for out of range immediates is handled by the constant island pass. That pass currently doesn't consider those instructions (they're assumed to always be in range), but there's no reason it couldn't.
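
To make the "in range" condition concrete, here is a minimal sketch of the kind of check such a pass would need. It is not code from LLVM: fitsSignedImm and thumb2BranchInRange are hypothetical helper names, and the sketch assumes the 32-bit Thumb2 BL / B.W encodings, whose 24-bit signed immediate is scaled by 2 (roughly +/-16MB) and is relative to the instruction address plus 4.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not an LLVM API): does a byte displacement fit in a
// signed immediate field of `bits` bits whose unit is `scale` bytes?
inline bool fitsSignedImm(std::int64_t disp, unsigned bits, std::int64_t scale) {
  assert(bits >= 1 && bits < 63 && scale > 0);
  if (disp % scale != 0)
    return false; // target is not aligned to the encoding's unit
  const std::int64_t units = disp / scale;
  const std::int64_t lo = -(std::int64_t{1} << (bits - 1));
  const std::int64_t hi = (std::int64_t{1} << (bits - 1)) - 1;
  return units >= lo && units <= hi;
}

// Assumption: 32-bit Thumb2 BL / B.W with a 24-bit signed immediate scaled
// by 2. In Thumb state the PC reads as the instruction address plus 4.
inline bool thumb2BranchInRange(std::int64_t branchAddr, std::int64_t targetAddr) {
  return fitsSignedImm(targetAddr - (branchAddr + 4), 24, 2);
}
```

A pass that grows the function would have to redo a check like this for every branch and call after each change, since inserting code anywhere between a branch and its target changes the displacement.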

The natural solution is, as you indicate, relaxing them to use indirect branches. That could be tricky, however, as you'd need a scratch register to compute the branch target into and there may not be one available. The register scavenger could spill one for you, but that would further increase the code size, and I worry that it could keep the constant island pass from always converging.

An alternative to using indirect branches is to instead insert intermediate branch islands along the way. These would be new basic blocks that would be placed similarly to how the constant islands are (i.e., try to have minimal impact on other control flow). I'm still a bit concerned about convergence with that approach, but it does effectively avoid the register scavenging problem.
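
As a rough illustration of the island idea, the sketch below picks intermediate hop addresses so that each individual branch stays within reach. It is a toy model, not how the constant island pass works: planIslands and maxReach are hypothetical names, addresses are plain integers, and it deliberately ignores the hard parts (the islands themselves take up space and shift later code, and real placement has to pick spots that don't disturb fall-through control flow).

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical planner (not an LLVM API): return the addresses of the
// intermediate islands needed so that every hop from `from` to `to` covers
// at most `maxReach` bytes. An empty result means a direct branch reaches.
inline std::vector<std::int64_t> planIslands(std::int64_t from, std::int64_t to,
                                             std::int64_t maxReach) {
  assert(maxReach > 0);
  std::vector<std::int64_t> islands;
  std::int64_t cur = from;
  // Forward branch: step toward the target one maximal hop at a time.
  while (to - cur > maxReach) {
    cur += maxReach;
    islands.push_back(cur);
  }
  // Backward branch: same idea in the other direction.
  while (cur - to > maxReach) {
    cur -= maxReach;
    islands.push_back(cur);
  }
  return islands;
}
```

For example, planIslands(0, 100, 40) yields islands at 40 and 80, giving hops of 40, 40 and 20 bytes. Because each island is just an unconditional branch, no scratch register is needed; the convergence concern comes from the fact that every inserted island can push some other branch out of range.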

A completely different approach would be to detect very early (before isel) that the function is very large and force all unconditional branches to be indirect. That, however, is almost certainly huge overkill and will have pretty nasty performance characteristics as a result. I'd suggest leaving that to a last resort.

Regards,
-Jim