Use of callbr for non-asm goto

The IR documentation says this about the callbr instruction:

This instruction should only be used to implement the “goto” feature of gcc style inline assembly. Any other usage is an error in the IR verifier.

We have some instructions that can act like branches but aren’t branches. They essentially perform some normal arithmetic operation but if certain conditions are met they will cause a transfer of control to some non-fallthrough target.

callbr seems to be the only way to express this behavior in LLVM IR. I don’t think we can mark them as branches in TableGen because they don’t always act like branches; whether they branch depends on runtime state. 95% of the time we will emit these as ordinary instructions. We do know enough to emit them as a callbr when we need to. That is, we know a priori which uses will act like branches.
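For illustration only, here is a minimal sketch of this kind of instruction expressed with asm goto, the construct that Clang lowers to callbr. The actual target instructions are not named in this thread, so x86's decl/jz pair stands in for "arithmetic that may also branch"; the function name dec_and_test and the label hit_zero are hypothetical.

```c
/* Hypothetical stand-in: decrement *counter and transfer control to a
 * non-fallthrough label when the result reaches zero. */
static int dec_and_test(int *counter) {
#if defined(__x86_64__) || defined(__i386__)
    /* Operand %0 is the memory location; %l1 is the first (and only)
     * label, numbered after the single input operand. The "memory"
     * clobber tells the compiler that *counter is modified by the asm. */
    __asm__ goto("decl %0\n\t"
                 "jz %l1"
                 : /* no outputs */
                 : "m"(*counter)
                 : "memory", "cc"
                 : hit_zero);
    return 0;   /* fallthrough: the result was non-zero */
hit_zero:
    return 1;   /* branch taken: the result was zero */
#else
    /* Assumption: plain C fallback so the sketch still builds on other
     * targets; no asm goto (and therefore no callbr) is involved here. */
    return --*counter == 0;
#endif
}
```

Starting from a counter of 2, the first call falls through and returns 0, and the second call takes the branch and returns 1. The fallback branch is the same operation written as arithmetic plus a conventional branch.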

Can a “branch” as understood by LLVM do more than simply redirect control flow, like perform arithmetic, do a load/store, etc.? I don’t think I’ve come across such a thing in other targets but maybe someone else has run into this need. I considered the possibility of creating a separate class of instruction (with a unique internal LLVM opcode) and marking them as branches in TableGen but I don’t know if that’s allowed.

The above restriction on callbr makes me uneasy about our current solution.

David

If you needed to use callbr for an intrinsic, it’s probably not hard to add, but I haven’t seen any compelling use-case for that on any existing target. (Usually it’s enough to do pattern-matching on IR that contains conventional branches.)

Once you’re talking about MachineFunction/MachineInstruction, the definition of a “branch” is pretty loose; if you want to make your branch do arithmetic, or load from memory, or conditionally fall through, that’s fine. (For example, load-and-branch is a native instruction on x86; grep for JMP64m.)

Usually branches are terminator instructions ending a basic block. Of course there are cases where we don’t need to track things in the CFG, like a “call” that jumps away and returns later, or the sort of instruction that immediately aborts your program. But anything that continues execution in a different basic block needs to be marked as a terminator.

As far as I know, regalloc and other passes in CodeGen are not well prepared for terminator instructions that write to registers (because, for example, you cannot just place a spill instruction immediately after a terminator). People did add enough code to suppress or work around these problems to make asm goto work with INLINEASM_BR. But I’d be wary of arbitrary terminator opcodes writing to registers; I don’t think it works out-of-the-box…

Currently we’re using callbr with inline asm because that is what callbr requires. Using it with an intrinsic would present some nice benefits though.

Ok, that makes sense. But what to do at the IR level? LLVM needs to understand the control flow and we can’t easily express the branch condition statically because constructing the instruction sequence to check for it would involve multiple instructions. I’m not confident we could match all of that in isel given what the optimizer might do with it. I can do the experiment though. Since the instructions to compute the condition should just be deleted eventually, maybe we can devise some kind of dummy condition that doesn’t actually look at the runtime state so it could be a single instruction or at least greatly simplified.

Thanks for the confidence booster on Machine IR. That gives me some hope.

David