Can't step over __sync_bool_compare_and_swap on ARM


We are running into a problem with the __sync_bool_compare_and_swap intrinsic on ARM. Trying to step over the call to __sync_bool_compare_and_swap puts LLDB into what appears to be an infinite loop.

This is reproducible in the latest Xcode (6.1.1 (6A2008a)). Create an iOS project with the following main function:

#include <stdio.h>
int main(int argc, char * argv[]) {
    int c = 0;
    while(1) {
        if(__sync_bool_compare_and_swap(&c, 0, 1)) {
            printf("%d\n", c);
        }
    }
}

The intrinsic compiles down to:

0xb7010: dmb ish
0xb7014: movs r0, #0x1
0xb7016: movs r1, #0x0
0xb7018: add r2, sp, #0x14
0xb701a: str r0, [sp, #0x10]
0xb701c: str r1, [sp, #0xc]
0xb701e: str r2, [sp, #0x8]
→ 0xb7020: ldr r0, [sp, #0x8]
0xb7022: ldrex r1, [r0]
0xb7026: ldr r2, [sp, #0xc]
0xb7028: cmp r1, r2
0xb702a: str r1, [sp, #0x4]
0xb702c: bne 0xb703a ; main + 62 at main.m:15
0xb702e: ldr r1, [sp, #0x10]
0xb7030: ldr r2, [sp, #0x8]
0xb7032: strex r0, r1, [r2]
0xb7036: cmp r0, #0x0
0xb7038: bne 0xb7020 ; main + 36 at main.m:15
0xb703a: dmb ish

When stepping over, LLDB first sets a breakpoint on the branch at 0xb702c (bne 0xb703a). Next it executes a single step, moving the PC to 0xb702e (ldr r1, [sp, #0x10]) because the branch condition is not met.

LLDB then sets a breakpoint on the next branch instruction at 0xb7038 (bne 0xb7020). It single-steps that instruction, the condition is met, and we end up at 0xb7020 again.

The code never breaks out of this loop; LLDB continues to set the breakpoints indefinitely.

Any idea how to fix this?


Using the OS atomic compare-and-swap works (stepping over or into it does not enter an infinite loop):

#include <stdio.h>
#include <libkern/OSAtomic.h>

int main(int argc, char * argv[]) {
    int c = 0;
    while(1) {
        if(OSAtomicCompareAndSwap32(0, 1, &c)) {
            printf("%d\n", c);
        }
    }
}

This means we have a fix for our use case, but I assume that others may use the intrinsic and be surprised at LLDB’s behaviour.
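For anyone who needs to keep the intrinsic, one possible workaround is to move it behind a real function call, so that a step-over at the call site runs to the return address instead of stepping through the ldrex/strex loop. This is only a sketch and has not been verified against the Xcode version above; the helper name cas_int is hypothetical, and it assumes the noinline attribute keeps the loop out of the caller.

```c
#include <stdbool.h>

/* Hypothetical helper (the name is illustrative, not from any library).
 * noinline asks the compiler to keep the ldrex/strex retry loop inside
 * this function rather than expanding it at the call site. */
__attribute__((noinline))
static bool cas_int(int *ptr, int expected, int desired) {
    /* Same GCC/Clang intrinsic as in the original report. */
    return __sync_bool_compare_and_swap(ptr, expected, desired);
}
```

In the reproduction case the condition would then read if(cas_int(&c, 0, 1)), with the step-over performed in main rather than inside cas_int.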

Does this work if you first type:

(lldb) setting set target.use-fast-stepping false

Then step?

Using “slow-stepping” doesn’t fix the issue. It appears that the debugger sets some flags when hitting the breakpoint and stepping, and these prevent the intrinsic from evaluating its conditionals properly.
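One way to picture that suspicion is a toy model of the retry loop in the disassembly. It assumes, without confirmation in this thread, that any debugger stop between ldrex and strex discards the exclusive reservation, so strex reports failure and the final bne jumps back to 0xb7020. All names below (monitor_t, model_ldrex, model_strex, model_debugger_stop, model_cas) are hypothetical and model neither real hardware nor LLDB internals.

```c
#include <stdbool.h>

/* Toy exclusive monitor for a single core. */
typedef struct {
    bool reserved; /* set by the modelled ldrex, consumed by strex */
} monitor_t;

/* Models ldrex: load the value and take a reservation. */
static int model_ldrex(monitor_t *m, const int *addr) {
    m->reserved = true;
    return *addr;
}

/* Models strex: store only while the reservation is held.
 * Returns 0 on success and 1 on failure, like the status register. */
static int model_strex(monitor_t *m, int *addr, int value) {
    if (!m->reserved) {
        return 1;
    }
    *addr = value;
    m->reserved = false;
    return 0;
}

/* ASSUMPTION: a debugger stop drops the reservation. */
static void model_debugger_stop(monitor_t *m) {
    m->reserved = false;
}

/* Mirrors the loop at 0xb7020..0xb7038. Returns the number of passes
 * needed to leave the loop, or -1 if max_iters passes were not enough. */
static int model_cas(int *addr, int expected, int desired,
                     bool stop_in_loop, int max_iters) {
    monitor_t m = { false };
    for (int i = 1; i <= max_iters; i++) {
        int seen = model_ldrex(&m, addr);      /* ldrex r1, [r0] */
        if (seen != expected) {
            return i;                          /* bne 0xb703a: leave loop */
        }
        if (stop_in_loop) {
            model_debugger_stop(&m);           /* stop between ldrex and strex */
        }
        if (model_strex(&m, addr, desired) == 0) {
            return i;                          /* strex succeeded */
        }
        /* strex failed: bne 0xb7020, try again */
    }
    return -1;
}
```

Without a stop inside the loop the model finishes on the first pass; with a stop on every pass it never leaves the loop, which matches the behaviour described in the report.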

If we are single-stepping, then we set the BVR/BCR regs to say "stop when the PC is not equal to its current value". If we are setting breakpoints, we just write a trap into memory and continue. Jim pointed out that we are setting a breakpoint after the atomic instruction and not on it, so I don't know how this would affect things...