building a jump table in LLVM IR

Hi,

I'm currently writing an opt module for fast indirect call checks
using a table of allowed indirect call targets. The code replaces
function pointers with offsets into the table, then, before each call,
masks the offset to the table size and restores the function pointer.
I have some ways of dealing with some kinds of external code that are
sufficient for my use case but not for more general use.
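
For readers following along, the scheme described above can be sketched
in plain C roughly as follows. This is only an illustration under
assumptions, not the actual pass: the names allowed_targets, fn_offset
and checked_call, the filler function and the table size of 4 are all
hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical table size. It must be a power of two so that masking
   with (TABLE_SIZE - 1) keeps every offset inside the table. */
#define TABLE_SIZE 4u
static_assert((TABLE_SIZE & (TABLE_SIZE - 1u)) == 0,
              "table size must be a power of two");

typedef int (*target_fn)(int);

static int add_one(int v) { return v + 1; }
static int twice(int v)   { return v * 2; }
static int reject(int v)  { (void)v; return -1; }  /* filler for unused slots */

/* Table of allowed indirect-call targets. */
static const target_fn allowed_targets[TABLE_SIZE] = {
    add_one, twice, reject, reject
};

/* What would be stored in place of a function pointer: a table offset. */
typedef uint32_t fn_offset;

/* Checked indirect call: mask the offset, restore the real pointer, call. */
int checked_call(fn_offset off, int arg) {
    target_fn fn = allowed_targets[off & (TABLE_SIZE - 1u)];
    return fn(arg);
}
```

With this layout an out-of-range offset such as 5 is masked down to 1,
so the call can only ever land on an entry of the table.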

I'd like to generalize the table to be a table of jump instructions (a
classic jump table) rather than just a table of function pointers,
since this would solve some of my problems for the more general cases.
I know how to do this using inline assembly in LLVM, but this ties the
module to a particular architecture or set of architectures.

Is there a way to build a table of jump instructions like this in pure
LLVM IR? My limited understanding of LLVM says no, and I haven't been
able to find a way to do this, but I would appreciate any pointers you
can give me.

Thanks,

Tom

You can build a jump table using indirectbr, although note that this instruction requires you to list all the possible target blocks, and the targets can only be blocks in the same function.
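
As a rough illustration of that restriction, here is a small C sketch using the GNU "labels as values" extension (supported by GCC and Clang, not standard C), which Clang lowers to indirectbr. The function name dispatch and its labels are made up for the example. Every target in the table has to be a label inside the same function.

```c
#include <assert.h>
#include <stddef.h>

/* Dispatch through a table of block addresses. `&&label` and `goto *ptr`
   are GNU C extensions, so this needs GCC or Clang. */
int dispatch(size_t op, int x) {
    /* Every possible target must be a label in this same function. */
    static void *const targets[] = { &&op_inc, &&op_dec, &&op_neg };

    /* Jumping to an address outside the table would be undefined. */
    assert(op < sizeof targets / sizeof targets[0]);

    goto *targets[op];

op_inc: return x + 1;
op_dec: return x - 1;
op_neg: return -x;
}
```

There is no way to put a label from another function into the targets array, which is the same limitation the instruction has at the IR level.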

Thanks for the followup.

If I understand the suggestion correctly, this doesn't solve the
problem of building a jump table to call into other functions, since,
as you note, indirectbr can only branch to blocks in the same
function. Is the conclusion then that there is no way to do this in
LLVM IR? It looks like these kinds of restrictions (no branching
between functions and no instructions outside of functions) are
designed into the structure of LLVM IR. Is this correct?

Thanks,

Tom

How about creating a new basic block in the current function and placing a “call” instruction to the desired function inside it?

AFAIK, this won't work: the way I want to use a jump table requires me
to get a pointer into the table that I can use as a function pointer
to call the original function in a normal call instruction. If I just
add a new basic block in some containing function with a call
instruction and somehow get a pointer to that instruction, then this
does satisfy the goal of putting the new instructions somewhere, but a
call through that pointer will not do the right thing, since it will
be calling into the middle of the containing function, then calling
the original function (whereas I want it to just jump to the original
function directly so it doesn't mess up the call stack). The
fundamental stumbling block is that it seems to be impossible (by
design) to jump in this manner in LLVM IR.

Thanks,

Tom

Note that LLVM does do tail-call optimization, even when the call is an indirect one. E.g.:

extern void (**jmptbl)();

void foo(int x) {
   jmptbl[x]();
}

will produce the LLVM IR:
@jmptbl = external global void (...)**

define void @foo(i32 %x) nounwind uwtable {
   %1 = sext i32 %x to i64
   %2 = load void (...)*** @jmptbl, align 8
   %3 = getelementptr inbounds void (...)** %2, i64 %1
   %4 = load void (...)** %3, align 8
   tail call void (...)* %4() nounwind
   ret void
}

and the x86-64 assembly:
     .type foo,@function
foo: # @foo
     .cfi_startproc
# BB#0:
     movslq %edi, %rcx
     movq jmptbl(%rip), %rdx
     xorb %al, %al
     jmpq *(%rdx,%rcx,8) # TAILCALL
.Ltmp0:
     .size foo, .Ltmp0-foo
     .cfi_endproc

This does require optimization to be enabled (-O1 does it, -O0 does not), and it does not guarantee that no stack frame will be set up for a glue function.
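
To make the glue-function idea concrete, here is a minimal C sketch. All names (triple, triple_glue, call_via) are hypothetical. The glue function's only action is a call in tail position, so an optimizing build may turn it into a plain jump, but nothing in standard C forces that, and the optimizer may equally well inline the target into the glue.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*target_fn)(int);

/* The original function (hypothetical). */
int triple(int v) { return v * 3; }

/* Hypothetical glue function standing in for one jump-table slot.
   The call is the last action and its result is returned unchanged,
   which makes it a candidate for tail-call optimization. */
int triple_glue(int v) {
    return triple(v);
}

/* A caller that only ever sees the glue address, never &triple. */
int call_via(target_fn fn, int v) {
    assert(fn != NULL);  /* sketch only; a real check would validate fn */
    return fn(v);
}
```

The address of triple_glue can then be handed out wherever the address of triple would have been used, and a call through it behaves like a call to triple.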