A bug in LLVM-GCC 4.2 with inlining __exchange_and_add

Hi,

I have encountered an issue which seems to be a serious reproducible bug in LLVM-GCC 4.2.
It can be reproduced by compiling the following C++ file that uses boost:

#include "boost/statechart/event.hpp"

using namespace std;

class EvActivate : public boost::statechart::event< EvActivate >
{
public:
    EvActivate(){}
private:
};

extern "C" const void* activate()
{
    return (EvActivate()).intrusive_from_this().get();
}

The problem is that the generated assembly looks like this:

_activate:
00000000 b5f0 push {r4, r5, r6, r7, lr}
00000002 af03 add r7, sp, #12
00000004 e92d0d00 stmdb sp!, {r8, sl, fp}
00000008 ed2d8b10 vstmdb sp!, {d8-d15}
0000000c b094 sub sp, #80
0000000e f2405088 movw r0, :lower16:__ZN5boost10statechart6detail9id_holderI10EvActivateE11idProvider_E-0x24+0xfffffffc
00000012 2300 movs r3, #0
00000014 f2c00000 movt r0, :upper16:__ZN5boost10statechart6detail9id_holderI10EvActivateE11idProvider_E-0x24+0xfffffffc
00000018 f2407140 movw r1, :lower16:0x770-0x2c+0xfffffffc
0000001c f2c00100 movt r1, :upper16:0x770-0x2c+0xfffffffc
00000020 f24052c8 movw r2, :lower16:__ZTV10EvActivate-0x34+0xfffffffc
00000024 4478 add r0, pc
00000026 f2c00200 movt r2, :upper16:__ZTV10EvActivate-0x34+0xfffffffc
0000002a 9304 str r3, [sp, #16]
0000002c 4479 add r1, pc
0000002e 9005 str r0, [sp, #20]
00000030 a803 add r0, sp, #12
00000032 9006 str r0, [sp, #24]
00000034 447a add r2, pc
00000036 f8ddc018 ldr.w ip, [sp, #24]
0000003a 3004 adds r0, #4
0000003c 9001 str r0, [sp, #4]
0000003e 6808 ldr r0, [r1, #0]
00000040 f1020108 add.w r1, r2, #8 @ 0x8
00000044 f8cc1000 str.w r1, [ip]
00000048 f3bf8f5a dmb ishst
0000004c 9901 ldr r1, [sp, #4]
0000004e e8512f00 ldrex r2, [r1]
00000052 9200 str r2, [sp, #0]
00000054 441a add r2, r3
00000056 e8412c00 strex ip, r2, [r1]
0000005a f1bc0f00 cmp.w ip, #0 @ 0x0
0000005e d1f6 bne.n 0x4e

What happens in the code between 0x4e and 0x5e is an atomic access to a variable by the inlined __exchange_and_add. The problem is that the inlining optimization spills the value read by ldrex to the stack (the str at 0x52) for later use. Because the atomically accessed variable is also on the stack, very close to this compiler-generated spill slot, the write lands inside the same Exclusives Reservation Granule (ERG). On Apple's A6X devices this reproduced consistently: the code entered an infinite loop, because the str instruction at 0x52 caused the strex at 0x56 to always fail and return 1, and the following branch restarted the whole sequence.
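For readers unfamiliar with the primitive, here is a minimal portable sketch of what an exchange-and-add is expected to do. It is not the libstdc++ implementation; the name exchange_and_add_sketch and the use of std::atomic are assumptions made for illustration only. On ARMv7, compilers typically lower a retry loop like this one to an ldrex/strex pair similar to the listing above.

```cpp
#include <atomic>

// Hypothetical stand-in for the inlined __exchange_and_add (illustration only).
// Atomically adds `delta` to `counter` and returns the value the counter held
// before the addition.
inline int exchange_and_add_sketch(std::atomic<int>& counter, int delta) {
    int observed = counter.load(std::memory_order_relaxed);
    // compare_exchange_weak may fail (also spuriously), so it sits in a loop.
    // On failure it refreshes `observed` with the current value of `counter`.
    while (!counter.compare_exchange_weak(observed, observed + delta,
                                          std::memory_order_acq_rel,
                                          std::memory_order_relaxed)) {
        // Retry until the store succeeds.
    }
    return observed;
}
```

The loop only terminates if the store can eventually succeed. In the listing above that never happens, because the spill between the load and the store keeps invalidating the reservation.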

Generating such code violates the ARM recommendation:
"For these reasons ARM recommends that:

  • the Load-Exclusive and Store-Exclusive are no more than 128 bytes apart

  • no explicit cache maintenance operations or data accesses are performed between the Load-Exclusive and the Store-Exclusive."
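To make the proximity argument concrete, here is a rough sketch of a granule check. The helper name same_granule, the 64-byte granule size and the stack address 0x1000 are all assumptions for illustration; the real ERG size is implementation defined and I have not measured it on the A6X. In the listing, the spill slot is at [sp, #0] and the counter is at [sp, #16].

```cpp
#include <cstdint>

// Hypothetical helper (illustration only): returns true when two byte
// addresses fall into the same aligned granule of `granule_bytes` bytes.
// `granule_bytes` is assumed to be a non-zero power of two.
inline bool same_granule(std::uintptr_t a, std::uintptr_t b,
                         std::uintptr_t granule_bytes) {
    return (a / granule_bytes) == (b / granule_bytes);
}
```

With an assumed sp of 0x1000 and an assumed 64-byte granule, same_granule(0x1000, 0x1010, 64) is true, so the spill at [sp, #0] would touch the granule that holds the counter at [sp, #16].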

I've encountered this issue in real code and would be glad to get feedback on it. Please let me know if I need to submit a bug somewhere to get it resolved. I've found out that clang does not have this problem.

Moshe

>> I've encountered this issue in a real code and would be glad to get the
>> feedback on it. Please let me know if I need to submit a bug somewhere to get it resolved. I've found out that clang does not have this problem.
> Then use clang. llvm-gcc is deprecated.

I'd go farther: llvm-gcc is quite dead now.

-Chris

Chris,
  At the risk of hijacking the thread, are there any plans to make -traditional-cpp
in clang completely emulate the behavior in llvm-gcc so that imake can be built
with it (45509 – imake is failing with clang)? This will be an issue
on darwin, since llvm-gcc disappears after Xcode 4.6. Hopefully this can be resolved
before that happens.
            Jack

No plans that I'm aware of, but cfe-dev would be a better place to ask about that. -traditional-cpp emulates pre-ANSI C preprocessor behavior, which is *over* 20 years old by now. It would be much better to work with the imake folks to fix their code.

-Chris

I don't think there is any 'upstream' to speak of anymore in the case of imake.
Wikipedia even says "imake was a build automation system used by the X
Window System from X11R1 (1987) to X11R6.9 (2005)." :)

That said, there is still quite a lot of old software requiring imake to
build, and most of that software falls over when it attempts to use
clang as a traditional preprocessor. Mostly, because clang turns tabs
into spaces, confusing GNU make. The FreeBSD ports guys now use either
gcc to preprocess such software, or use ucpp, a minimal traditional
preprocessor.

In any case, it is a bit strange clang actually accepts the -traditional
flag, instead of producing a fatal error. PR 4557 talks about such an
error being implemented in r75690, but it seems to have been reverted
later on? I think it would be better to just outright refuse the flag.

-Dimitry

Again, the right mailing list to discuss this is cfe-dev. Please start a thread there, thanks!

-Chris