Indirect branch instruction

Hi,

I was looking for an indirect branch instruction in llvm, which would not take
a BasicBlock as argument, but a value. Reid told me on IRC that there is no such
instruction in llvm.

Is this deliberate? Or did you never face the need of this instruction yet?

Cheers,
Nicolas

The switch instruction serves the same functional role. In llvm-gcc, the GCC "Address of label" and "indirect goto" extensions are compiled into a switch instruction.

-Chris

I was looking for an indirect branch instruction in llvm, which would
not take a BasicBlock as argument, but a value. Reid told me on IRC that
there is no such instruction in llvm.

Is this deliberate? Or did you never face the need of this instruction
yet?

The switch instruction serves the same functional role. In llvm-gcc, the
GCC "Address of label" and "indirect goto" extensions are compiled into a
switch instruction.

I think the question is how to branch to a location of which you do not have a label.
Presumably to avoid a stack overflow when you know that you only want to return the result of an indirect call.

- Eric

Hi Chris, Hi Eric,

Eric van Riet Paap wrote:

I was looking for an indirect branch instruction in llvm, which would
not take a BasicBlock as argument, but a value. Reid told me on IRC that
there is no such instruction in llvm.

Is this deliberate? Or did you never face the need of this instruction
yet?
      

The switch instruction serves the same functional role. In llvm- gcc, the
GCC "Address of label" and "indirect goto" extensions are compiled into a
switch instruction.
    
I think the question is how to branch to a location of which you do not have a label.
  

Yes, that was the original question. But actually, for my purpose, Chris' solution
is functional enough. However, I looked into llvm-gcc code and the switch
instruction switches for all possible labels defined in the method. Does
LLVM is smart enough to realize that it switches on the address of a label, therefore
only has to generate a jmp in x86 or a {mtctr, bctr} couple in powerpc?

Nicolas

I'm not sure what you mean. We do reasonable optimization of switch statements, but I'm sure there are cases we miss. If so, please let us know, so we can add them to lib/Target/README.txt

-Chris

The switch instruction serves the same functional role. In llvm- gcc, the GCC "Address of label" and "indirect goto" extensions are compiled into a switch instruction.

I think the question is how to branch to a location of which you do not have a label.

I'm not sure what you mean. The compiler needs to have an accurate CFG to be successful with dataflow analysis (i.e. not miscompile your code).

Presumably to avoid a stack overflow when you know that you only want to return the result of an indirect call.

It sounds like you just want tail call elimination? Jumping to the address of an indirect call is a great way to get a crash, except in the most trivial cases.

-Chris

Chris Lattner wrote:

The switch instruction serves the same functional role. In llvm- gcc, the GCC "Address of label" and "indirect goto" extensions are compiled into a switch instruction.
      

I think the question is how to branch to a location of which you do not have a label.
    
I'm not sure what you mean. The compiler needs to have an accurate CFG to be successful with dataflow analysis (i.e. not miscompile your code).

It would be to the programer responsability that there are no "phis"
with an indirect branch. So there is no dataflow analysis needed in
such case.

Presumably to avoid a stack overflow when you know that you only want to return the result of an indirect call.
    
It sounds like you just want tail call elimination? Jumping to the address of an indirect call is a great way to get a crash, except in the most trivial cases.

Again, it would be the programer responsabilty to not jump to an
invalid address. But maybe this is undesirable for llvm (and I would
understand, it's better to have a compiler that can certify most things
won't go wrong).

-Chris

Nicolas

Chris Lattner wrote:

Yes, that was the original question. But actually, for my purpose, Chris' solution is functional enough. However, I looked into llvm-gcc code and the switch instruction switches for all possible labels defined in the method. Does LLVM is smart enough to realize that it switches on the address of a label, therefore only has to generate a jmp in x86 or a {mtctr, bctr} couple in powerpc?
    
I'm not sure what you mean. We do reasonable optimization of switch statements, but I'm sure there are cases we miss. If so, please let us know, so we can add them to lib/Target/README.txt

Two jumps are executed when using a switch statement for indirect branch:
the first jump goes to the right switch case, the second jumps to the corresponding
label. However, indirect branches need only one jump: jumping to the value contained
in a register for example (which contains the address of a label).

Nicolas

Can you give a compilable C function as an example?

-Chris

Chris Lattner wrote:

Two jumps are executed when using a switch statement for indirect branch:
the first jump goes to the right switch case, the second jumps to the
corresponding
label. However, indirect branches need only one jump: jumping to the
value contained
in a register for example (which contains the address of a label).
    
Can you give a compilable C function as an example?

Well I'm not sure on how to do this in C, but in x86 assembly a simple jmp %eax
does it.

I don't understand. You're making a claim that the C compiler isn't producing optimal code for some case. Can you give an example of a C function that llvm compiles to something suboptimal? LLVM certainly does generate stuff like "jmp %eax", but presumably not in the way you want. Without an example to see what you mean, we won't be able to fix the deficiency that you've encountered.

Thanks,

-Chris

Chris Lattner wrote:

I don't understand. You're making a claim that the C compiler isn't producing optimal code for some case. Can you give an example of a C function that llvm compiles to something suboptimal? LLVM certainly does generate stuff like "jmp %eax", but presumably not in the way you want. Without an example to see what you mean, we won't be able to fix the deficiency that you've encountered.

Sorry, I've been misunderstood. I'm claiming nothing. I'm just
assuming that when using a SwitchInst, the just in time compiler
generates two jumps, one to jump to the right case, and then one
to jump to the correct label.

Typically, I understood that, to make indirect branches, the
(pseudo-) code generated is:

label1:
    instructions
label2:
    instructions
label3:
    instructions

my_value = @ of a label
switch(my_value)
{
    case label1: jump label1
    case label2: jump label2
    case label3: jump label3
    default: abort("invalid label")
}

What I was asking is : what is the assembly code generated by llvm for this
kind of code? My guess was 2 jumps. But it might be 1 jump, based on
the fact that llvm knows it switches on labels.

I hope it's clearer now.

Thanks Chris,

Nicolas

Ah, I see what you're saying. I believe LLVM will currently make two jumps for that. If that seriously bugs you, it can always be fixed :).

-Chris