sys::MemoryFence() using __sync_synchronize() with GCC on ARM does not generate a memory fence

Andrew Haley brought up this interesting issue on the GCC mailing list.
It directly affects the stability of the ARM LLVM target when using
multi-threading.
http://gcc.gnu.org/ml/gcc-patches/2009-08/msg00600.html

Basically, using __sync_synchronize() with GCC on ARM does not generate
any code for the fence.

As far as I know:
The only working fix for this issue on Linux would be to call a Linux
kernel helper named __kernel_dmb, located at the fixed high address
0xffff0fa0, which performs the memory fence correctly for whatever kind
of ARM CPU the Linux kernel was built for.
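
To make the proposal concrete, here is a minimal sketch of what such a
call could look like. It is not a tested patch: the names memory_fence,
kernel_dmb_fn, KERNEL_DMB_ADDR and publish_and_read are made up for
illustration, and only the helper address 0xffff0fa0 comes from the
description above. On anything that is not 32-bit ARM Linux the sketch
falls back to __sync_synchronize().

```c
#include <stdint.h>

/* The kernel helper takes no arguments and returns nothing. */
typedef void (*kernel_dmb_fn)(void);

/* Fixed address of __kernel_dmb on ARM Linux, as stated above. */
#define KERNEL_DMB_ADDR ((uintptr_t)0xffff0fa0u)

/* Hypothetical stand-in for the body of sys::MemoryFence(). */
static void memory_fence(void) {
#if defined(__arm__) && defined(__linux__)
    /* Jump to the kernel-provided helper, which issues the barrier
       appropriate for the CPU the kernel was built for. */
    ((kernel_dmb_fn)KERNEL_DMB_ADDR)();
#else
    /* Assumption: on other targets the GCC builtin emits a real fence. */
    __sync_synchronize();
#endif
}

/* Illustrative use: publish data, then a flag, with fences in between. */
static int shared_data;
static volatile int ready;

static int publish_and_read(int value) {
    shared_data = value;  /* write the payload first */
    memory_fence();       /* payload must be visible before the flag */
    ready = 1;            /* then signal that the payload is valid */
    memory_fence();
    return ready ? shared_data : -1;
}
```

The #if guard keeps the call to the fixed address from being compiled
on platforms where nothing is mapped there. An actual fix would
probably also want to handle kernels that do not provide the helper.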

I believe ARM Darwin might have a similar issue, but I don't know how to
fix it on that platform. ARM Darwin gurus, please enlighten me on how
memory barriers are performed for ARM on Darwin.

The kernel helper is implemented in
http://kernel.ubuntu.com/git-repos/rtg/linux-2.6/arch/arm/kernel/entry-armv.S
of the Linux source tree.

/*
* Reference prototype: