Creating sanity in LLVM/Clang's atomics impl

I’ve uploaded a few changes to review for implementing Step “A1” in the atomic cleanup plan:

http://reviews.llvm.org/D18200
Add _atomic* lowering to AtomicExpandPass.

http://reviews.llvm.org/D18201

Switch over targets to use AtomicExpandPass, and clean up target atomics code.

There’s also this, semi-related cleanup:

http://reviews.llvm.org/D17933
Clang: Set MaxAtomicInlineWidth properly for i386, i486, and x86-64 cpus without cmpxchg16b. Also, remove all the manual definition of
GCC_HAVE_SYNC_COMPARE_AND_SWAP_* macros.

There’s some kinda annoying issues in LLVM I’ve run into so far:

  1. The required duplication of logic for which platforms have what-sized atomics between LLVM and Clang is unfortunate. This falls into the general problem of clang’s lib/Basic/Targets.cpp file being mostly an awful repetition of information available in the LLVM backends.

We really ought to have Clang start depending on the LLVM backends.

  1. The inability to access C ABI information in LLVM, whilst creating the libcalls is somewhat of a problem.

The atomic libcalls need the following information:

  • What “size_t” and “int” mean? (DL.getIntPtrType() seems okayish for the former (malloc already does that), and can use getInt32() for the latter…except that MSP430 seems to use 16-bit “int”)
  • What sizes of integers actually exist? You need to know which functions of the form “T __atomic_load_N(T*, int)”, with some integral type T, can actually exist in the atomics library. (e.g. if the platform has no __int128, there won’t be an “__int128 __atomic_load_16(__int128*, int)” you can actually call.

All that information is available in Clang, but not in LLVM. I think such low-level C ABI details would be a good idea to have in LLVM.

I also ran into a couple of other issues whilst on this quest:

  1. The C11 _Atomic ABI is incompatible between GCC and Clang . As is libc++'s std::atomic. (But not libstdc++: that uses the same ABI in both compilers).

Also, in Clang, _Atomic and libc++'s std::atomic use unfortunate choices of “maximum size to promote” that depend on the platform, and don’t allow for easy future extension of the set of supported atomic sizes.

https://llvm.org/bugs/show_bug.cgi?id=26462

  1. LLVM may not have a strong enough memory model to support GCC’s _sync* builtins properly (this may affect only ARMv8/AArch64 at the moment?). GCC has (recently) decided that __sync_fetch_and_add(&val, 4); needs stronger memory barriers than “__atomic_fetch_add(&val, 4, __ATOMIC_SEQ_CST);” . In particular, the former should not allow the processor to reorder a subsequent store before the store-release instruction at the end of the LL/SC sequence generated by atomic_fetch_add.

The C11, and LLVM, definition of a seq_cst memory operation only makes it ordered with other seq_cst operations, NOT with other non-seq_cst atomics. But, apparently GCC feels that the definition of its __sync_fetch_and_add builtin requires that it act like it act like a seq_cst FENCE, preventing non-seq_cst memory operations from moving across it. And thus, it now emits an explicit barrier instruction after the store-release.

https://gcc.gnu.org/PR65697

Should we do similar to GCC, and add a 6th kind of ordering, “seq_cst_fence” to model this?