Optimizing assembly generated for tail call


I recently found that LLVM generates suboptimal assembly in one tail-call optimization case. Here is an example (https://godbolt.org/z/ao15xE):

void g1();
void g2();
void f(bool v) {
  if (v) {
    g1();
  } else {
    g2();
  }
}

The generated assembly is as follows:

f(bool):                # @f(bool)
        testb %dil, %dil
        je .LBB0_2
        jmp g1()        # TAILCALL
.LBB0_2:
        jmp g2()        # TAILCALL

However, in this specific case (where no function epilogue is needed), one can actually change ‘je .LBB0_2’ to ‘je g2()’ directly, thus saving a jump.

Is there any way I could instruct LLVM to do this? For my use case, it is acceptable to do this at any level (the C++, IR, or MachineInst level would all be fine).



That translation doesn't normally work as conditional jumps have a much
more restricted range.


I checked the Intel instruction guide, and it seems that conditional jumps can take rip-relative 32-bit operands, the same as unconditional jumps. Did I miss something?
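
To make the range question concrete, here is a minimal C++ sketch of the two near-branch encodings on x86-64: `jmp rel32` is `E9` followed by a 32-bit displacement, and `je rel32` is `0F 84` followed by a 32-bit displacement. In both cases the displacement is relative to the address of the next instruction. The helper names (`rel32`, `put_le32`, `encode_jmp_rel32`, `encode_je_rel32`) are hypothetical and exist only for this illustration; they are not part of LLVM or of any assembler API. The sketch covers only the near forms and says nothing about the short `rel8` form of either instruction.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>

// Displacement for a rel32 branch: target minus the address of the *next*
// instruction (insn_addr + insn_len). Asserts that it fits in 32 bits.
inline std::int32_t rel32(std::uint64_t insn_addr, std::uint64_t insn_len,
                          std::uint64_t target) {
    const std::int64_t d = static_cast<std::int64_t>(target) -
                           static_cast<std::int64_t>(insn_addr + insn_len);
    assert(d >= std::numeric_limits<std::int32_t>::min() &&
           d <= std::numeric_limits<std::int32_t>::max());
    return static_cast<std::int32_t>(d);
}

// Store a 32-bit value in little-endian byte order at out[pos..pos+3].
template <std::size_t N>
void put_le32(std::array<std::uint8_t, N>& out, std::size_t pos,
              std::int32_t v) {
    const auto u = static_cast<std::uint32_t>(v);
    for (int i = 0; i < 4; ++i) {
        out[pos + i] = static_cast<std::uint8_t>(u >> (8 * i));
    }
}

// Unconditional near jump: opcode E9, then rel32 (5 bytes total).
inline std::array<std::uint8_t, 5> encode_jmp_rel32(std::uint64_t addr,
                                                    std::uint64_t target) {
    std::array<std::uint8_t, 5> b{0xE9};
    put_le32(b, 1, rel32(addr, b.size(), target));
    return b;
}

// Conditional near jump (je/jz): opcode 0F 84, then rel32 (6 bytes total).
inline std::array<std::uint8_t, 6> encode_je_rel32(std::uint64_t addr,
                                                   std::uint64_t target) {
    std::array<std::uint8_t, 6> b{0x0F, 0x84};
    put_le32(b, 2, rel32(addr, b.size(), target));
    return b;
}
```

For example, `encode_je_rel32(0x1000, 0x1000)` (a `je` that targets itself) yields the bytes `0F 84 FA FF FF FF`, since the displacement is -6. Both encoders take the same signed 32-bit displacement, so under these encodings the near conditional form reaches essentially the same range as the near unconditional one; the only difference is the extra opcode byte.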

