armv7 pc-rel bx thumb instruction

Hi everyone,

I’m working on some custom transformation passes that have the side-effect of
significantly increasing the code size. While testing it on some larger,
real-world code bases, I run into a linker error for armv7 thumb code. The
particular error I get from ld64 is that “armv7 has no pc-rel bx thumb
instruction.” I’ve been able to reproduce the problem by taking a random
thumbv7 bitcode file and cloning functions until the linker fails.

From looking at the ld64 source code it seems that the problem is caused by
the relocation for a thumb 22-bit pc-rel branch. I’m guessing that the linker
is unable to perform the relocation because the new address doesn’t fit in the
instruction’s 22 bits.
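
To make the "doesn't fit" guess concrete, here is a minimal sketch of the range check a linker would have to do for a Thumb pc-relative branch. It assumes the usual Thumb rules: the PC reads as the instruction address plus 4, and the immediate counts halfwords, so an N-bit immediate gives a signed byte displacement of N+1 bits. The original Thumb BL pair carries 22 immediate bits (about +/-4 MiB), while the Thumb-2 BL/B.W encodings on ARMv7 carry 24 (about +/-16 MiB). Which limit ld64 actually enforces for this relocation is not established in this thread, so `imm_bits` is left as a parameter; the function name is hypothetical.

```python
# Sketch only: range check for a Thumb PC-relative branch.
# Not taken from ld64; names and defaults are illustrative assumptions.

THUMB_PC_BIAS = 4  # in Thumb state, PC reads as the instruction address + 4


def thumb_branch_in_range(branch_addr, target_addr, imm_bits=22):
    """Return True if target_addr is reachable with an imm_bits-wide immediate.

    The immediate counts halfwords, so the byte displacement is a signed
    (imm_bits + 1)-bit value and must be even.
    """
    disp = target_addr - (branch_addr + THUMB_PC_BIAS)
    if disp % 2 != 0:
        return False  # Thumb instructions are halfword aligned
    limit = 1 << imm_bits  # 1 << 22 bytes = 4 MiB, 1 << 24 bytes = 16 MiB
    return -limit <= disp < limit
```

Cloning functions until the text section grows past the limit would push some caller/callee pairs outside this window, which matches the way the failure was reproduced.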

I know very little about the arm backend, but I’m wondering if there’s anything
I can do to prevent this from happening during compilation, before the linker
is involved?

Jonas

Hi Jonas,

I'm working on some custom transformation passes that have the side-effect of
significantly increasing the code size. While testing it on some larger,
real-world code bases, I run into a linker error for armv7 thumb code. The
particular error I get from ld64 is that "armv7 has no pc-rel bx thumb
instruction." I've been able to reproduce the problem by taking a random
thumbv7 bitcode file and cloning functions until the linker fails.

Interesting. It looks like you've got a tail call from Thumb code to
ARM code. The linker would normally turn a BL into a BLX to make this
work, but it's (rightly) reporting that there's no "BX some_func"
instruction (you have to load the destination into a register and jump
there).
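
The distinction between calls and tail calls can be summarised with a small decision table. This is only a sketch of the reasoning above, not ld64's logic; the function name and the return strings are made up for illustration.

```python
# Sketch only: which fixup is possible for a direct branch, depending on
# whether it is a call (bl) or a tail-call jump (b) and on the instruction
# set of each side. Names and return values are illustrative assumptions.

def resolve_branch(kind, src_mode, dst_mode):
    """kind is 'call' or 'jump'; modes are 'arm' or 'thumb'."""
    if kind not in ("call", "jump"):
        raise ValueError("kind must be 'call' or 'jump'")
    if src_mode not in ("arm", "thumb") or dst_mode not in ("arm", "thumb"):
        raise ValueError("modes must be 'arm' or 'thumb'")

    if src_mode == dst_mode:
        # No mode switch needed: keep the original instruction.
        return "bl" if kind == "call" else "b"
    if kind == "call":
        # BL can be rewritten to BLX, which switches mode on the way.
        return "blx"
    # There is no immediate-operand branch-without-link that switches mode,
    # so a plain jump needs extra code (load the address, then BX).
    return "shim"
```

For the failing case, `resolve_branch("jump", "thumb", "arm")` gives `"shim"`, which is the situation the linker is complaining about.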

If you have control over both functions you probably just want to
compile the destination in Thumb mode (there's hardly ever a reason to
use ARM mode these days). But given your circumstances there's a
pretty good chance the ARM code is actually a branch island ld64 is
trying to insert.

Other than that Clang has a "-fno-optimize-sibling-calls" which should
disable tail calls and make things work. I'd suggest reporting a bug
against ld64 too, it should be able to handle this case really.

Cheers.

Tim.

Hi Tim,

Thank you for clarifying what the error actually means! I did read something about the BLX instruction, but since I’m compiling strictly for thumb, it didn’t make much sense to me. Adding -mdisable-tail-calls as a cc1 option indeed allowed me to link the generated binary.

After looking some more at the ld64 source code, I came across the following comment:

// The tail-call optimization may result in a function ending in a jump (b)
// to another functions. At compile time the compiler does not know
// if the target of the jump will be in the same mode (arm vs thumb).
// The arm/thumb instruction set has a way to change modes in a bl(x)
// instruction, but no instruction to change mode in a jump (b) instruction.
// In those rare cases, the linker needs to insert a shim of code to
// make the mode switch.

So it seems that a branch island is glue code added by the linker to do the actual mode switch if necessary. But why would we need a mode switch for a jump to a function that is also in thumb mode? And why is the branch island arm code and not thumb? Would you mind helping me understand how these branch islands work? I’d love to comprehend what’s actually going on here.

Thanks again for your help!

Jonas

Hi Jonas,

So it seems that a branch island is glue code added by the linker to do the
actual mode switch if necessary. But why would we need a mode switch for a
jump to a function that is also in thumb mode?

We wouldn't, unless the shim is in ARM mode; that's what the code actually
has to jump to. But it's just speculation, I haven't read the ld64
code nearly enough to pinpoint the error there.

And why is the branch island arm code and not thumb?

If that really is the issue, it'll just be an oversight.

Would you mind helping me understand how these branch islands work?

The basic idea is that if a call destination is too far away for the
instruction to make it there in one step the linker inserts a code
sequence roughly like this:

    ldr ip, Laddr
    bx ip
Laddr:
    .word real_function_dest

that is in range and converts the original call to jump there instead.
This allows the jump to reach anywhere in the 32-bit address space since the
pointer at Laddr can be anything it wants.
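
As a rough model of that decision (a sketch under stated assumptions, not how ld64 implements it): the linker checks whether the target is within reach, and if not, redirects the branch to an island whose literal word holds the real destination. The names `in_reach` and `plan_branch`, the dictionary layout, and the default reach of 4 MiB are all hypothetical. One detail worth modelling is that `bx` takes the instruction set from bit 0 of the address, so the literal for a Thumb destination needs bit 0 set.

```python
# Sketch only: deciding whether a branch needs an island. Illustrative
# names and a fixed reach; a real linker also has to place the islands.

def in_reach(src, dst, reach):
    """True if dst is within +/-reach bytes of the Thumb PC at src."""
    disp = dst - (src + 4)  # Thumb PC reads as instruction address + 4
    return -reach <= disp < reach


def plan_branch(branch_addr, target_addr, island_addr,
                target_is_thumb=False, reach=1 << 22):
    """Return where the branch should point and what the island holds."""
    if in_reach(branch_addr, target_addr, reach):
        return {"branch_to": target_addr, "island": None}
    if not in_reach(branch_addr, island_addr, reach):
        raise ValueError("island is out of range as well")
    # The island is 'ldr ip, Laddr; bx ip' followed by the literal word.
    # bx selects Thumb state when bit 0 of the address is set.
    literal = target_addr | 1 if target_is_thumb else target_addr
    return {"branch_to": island_addr,
            "island": {"addr": island_addr, "literal": literal}}
```

If the island itself is emitted as ARM code while the branch reaching it is a Thumb `b`, the jump to the island would need the very mode switch that a plain branch cannot do, which is the speculation above.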

There are bells and whistles for PIC code, and obviously linker
internal details get involved, but for those you're probably better
off just looking at the code.

Tim.

Thanks a lot for the explanation!

I've done some more testing and while -mdisable-tail-calls does solve
the problem for some samples, there are others where the error
remains. Any chance you or anyone else has another idea what might
cause this? Some samples show a different error, "unknown ARM scattered
relocation type 11" which also seems to be related to jump islands
(being out of range?).

Thank you,
Jonas