Optimizing assembly generated for tail call

Hello,

I recently found that LLVM generates suboptimal assembly in one tail-call optimization case. Here is an example (https://godbolt.org/z/ao15xE):

void g1();
void g2();
void f(bool v) {
    if (v) {
        g1();
    } else {
        g2();
    }
}

The generated assembly is as follows:

f(bool):                                # @f(bool)
        testb   %dil, %dil
        je      .LBB0_2
        jmp     g1()                    # TAILCALL
.LBB0_2:
        jmp     g2()                    # TAILCALL

However, in this specific case (where no function epilogue is needed), one can actually change ‘je .LBB0_2’ to ‘je g2()’ directly, thus saving a jump.

Is there any way I could instruct LLVM to do this? For my use case, it is acceptable to do this at any level (C++, IR, or MachineInstr level are all fine).

Thanks!

Best,
Haoran

That translation doesn't normally work as conditional jumps have a much
more restricted range.

Joerg

I checked the Intel instruction reference, and it seems that conditional jumps can take a 32-bit rip-relative displacement, the same as unconditional jumps. Did I miss something?
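
To make the comparison concrete, here is a minimal sketch of the two near-jump encodings as I read them from the Intel manual: `je rel32` is `0F 84` followed by a 4-byte displacement, and `jmp rel32` is `E9` followed by a 4-byte displacement, both measured from the end of the instruction. The helper names (`byte_of`, `encode_je_rel32`, `encode_jmp_rel32`) are hypothetical and exist only for illustration; this says nothing about what LLVM or the linker is willing to emit.

```cpp
#include <array>
#include <cstdint>

// Extract byte i (0 = least significant) of a signed 32-bit displacement.
// x86 stores the displacement in little-endian order.
constexpr std::uint8_t byte_of(std::int32_t disp, int i) {
    return static_cast<std::uint8_t>(
        (static_cast<std::uint32_t>(disp) >> (8 * i)) & 0xFFu);
}

// Hypothetical helper: encode "je rel32" (opcode 0F 84, then a 32-bit displacement).
constexpr std::array<std::uint8_t, 6> encode_je_rel32(std::int32_t disp) {
    return {{0x0F, 0x84,
             byte_of(disp, 0), byte_of(disp, 1),
             byte_of(disp, 2), byte_of(disp, 3)}};
}

// Hypothetical helper: encode "jmp rel32" (opcode E9, then a 32-bit displacement).
constexpr std::array<std::uint8_t, 5> encode_jmp_rel32(std::int32_t disp) {
    return {{0xE9,
             byte_of(disp, 0), byte_of(disp, 1),
             byte_of(disp, 2), byte_of(disp, 3)}};
}
```

Both forms carry the same signed 32-bit displacement, so they should be able to reach the same range of targets; the conditional form is simply one byte longer because of its two-byte opcode.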

Joerg Sonnenberger via llvm-dev <llvm-dev@lists.llvm.org> wrote on Tue, Oct 6, 2020 at 4:14 PM: