Incorrect placement of an instruction after PostRAScheduler pass

Hi,

I’m facing a crash (--target=arm-linux-gnueabi -march=armv8-a+crc
-mfloat-abi=hard). While debugging the problem, I found that an
intended branch was not taken due to bad code generation after the
Post RA Scheduler pass. A CMPri instruction that follows an INLINEASM
block (which in turn contains a cmp and a bne instruction) is
incorrectly being moved before the INLINEASM block, resulting in two
consecutive bne instructions.

I do not have a small, convenient test case and there are several
inline functions involved, but I hope to explain the problem with the
relevant code snippets. Any suggestions on what or where to look
within LLVM would be very useful.

In the generated code, there are two consecutive bne instructions,
which appear to be the problem; the cmp instruction at offset 198 is
actually associated with the bne at offset 1bc. The instructions from
1ac to 1b8 come from an inline asm. After the “Post RA top-down list
latency scheduler” pass, the cmp instruction has been moved before the
inline asm, causing the issue.

198: e35c0001 cmp ip, #1 <<<<<<<<<<<<<<<<<<
19c: e023700e eor r7, r3, lr
1a0: e0022003 and r2, r2, r3
1a4: e0087007 and r7, r8, r7
1a8: e1820007 orr r0, r2, r7
1ac: e1b42e9f ldaexd r2, r3, [r4]
1b0: e1a46f90 strexd r6, r0, [r4]
1b4: e3560000 cmp r6, #0
1b8: 1afffffb bne 1ac <xxxxx+0x1ac>
1bc: 1a000002 bne 1cc <xxxx+0x1cc> <<<<<<<<<<<<<<<<<<<
1c0: e1a00005 mov r0, r5
1c4: e3a01001 mov r1, #1

This is the relevant C code:

        // First comparison using lock_flag, which is a function argument
        // and an enum with values 0, 1, 2.
        if (__builtin_expect((lock_flag == LOCK), 1)) {
            lock = foo_lock(vaddr);
        }
        old = func(flags, vaddr, data);
        // Second comparison using lock_flag.
        if (__builtin_expect((lock_flag == LOCK), 1)) {
            foo_unlock(lock);
        }

The inline asm comes from the function below, which is called within func().
static inline void write64 (volatile void *p, uint64_t val)
{
    uint64_t tmp=0,prev=0;
    asm volatile(
        "3: ldaexd %[prev], %H[prev], [%[p]];"
        " strexd %[tmp], %[val], %H[val], [%[p]];"
        " cmp %[tmp],#0;"
        " bne 3b;"
        : [prev] "=&r" (prev), [tmp] "=&r" (tmp)
        : [p] "r" (p), [val] "r" (val) : "memory");
}
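
For reference, here is the same helper with the condition flags
declared as clobbered. The cmp inside the asm overwrites CPSR, while
the clobber list above only names "memory"; whether that missing "cc"
clobber is what allows the scheduler to move the CMPri across the
block is only a guess on my part, and write64_cc is just a name I made
up for this sketch:

static inline void write64_cc (volatile void *p, uint64_t val)
{
    uint64_t tmp = 0, prev = 0;
    asm volatile(
        "3: ldaexd %[prev], %H[prev], [%[p]];"
        "   strexd %[tmp], %[val], %H[val], [%[p]];"
        "   cmp %[tmp], #0;"
        "   bne 3b;"
        : [prev] "=&r" (prev), [tmp] "=&r" (tmp)
        : [p] "r" (p), [val] "r" (val)
        : "memory", "cc");  /* "cc" marks the condition flags as clobbered */
}

With "cc" in the clobber list, the compiler should know that flags set
before the asm do not survive across it, so it presumably could not
hoist the flag-setting CMPri above the block.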


Thanks Tim, that worked.

Bharathi