atomic (memory ordered) operations


What's the current status of the memory-ordered operations described in
i.e. the ones for "load acquire", "store release" etc. for C++0x atomics,
not the older ones for the __sync intrinsics? The specification looks good -
is it just waiting to be implemented?



I just reread the memory model proposal myself. Initially I wasn't excited by an implementation that represents memory operations as intrinsics, but I now see this footnote: "atomic load and store should be done with some SubclassData bits on the existing instructions"

Representing atomic loads and stores as load/store instructions makes the whole thing worthwhile to me. That said, I will not personally be able to work on implementing this proposal any time in the near future.

Generally, the spec looks great. There is only one thing I would like to change: the definition of sequential consistency is not quite right. The LLVM proposal claims that sequentially consistent operations have acquire+release semantics. Here is an excerpt from C++0x n3242, section 29.3:

  The enumeration memory_order specifies the detailed regular
  (non-atomic) memory synchronization order as defined in 1.10 and may
  provide for operation ordering. Its enumerated values and their
  meanings are as follows:

  - memory_order_relaxed: no operation orders memory.
  - memory_order_release, memory_order_acq_rel, and memory_order_seq_cst:
    a store operation performs a release operation on the affected
    memory location.
  - memory_order_consume: a load operation performs a consume
    operation on the affected memory location.
  - memory_order_acquire, memory_order_acq_rel, and memory_order_seq_cst:
    a load operation performs an acquire operation on the affected
    memory location.

The upshot is that sequentially consistent stores have release semantics, and sequentially consistent loads have acquire semantics. Additionally, sequentially consistent operations have a total order among themselves. By reading between the lines, you can see that sequentially consistent stores do not need acquire semantics.
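To make the store-is-release / load-is-acquire pairing concrete, here is a minimal C++ sketch of message passing through a seq_cst flag. The function name run_message_passing and the payload value are hypothetical, chosen only for illustration. The point is that the handoff relies only on the store's release half and the load's acquire half; nothing here asks the store to act as an acquire.

```cpp
#include <atomic>
#include <thread>

// Hypothetical helper, for illustration only: hands one int from a
// producer thread to a consumer thread through a seq_cst flag.
int run_message_passing() {
    int payload = 0;                  // ordinary (non-atomic) data
    std::atomic<bool> ready(false);   // the flag that publishes it
    int observed = -1;

    std::thread producer([&] {
        payload = 42;
        // seq_cst store: performs a release operation on 'ready',
        // so the write to 'payload' above is published with it.
        ready.store(true, std::memory_order_seq_cst);
    });

    std::thread consumer([&] {
        // seq_cst load: performs an acquire operation on 'ready'.
        while (!ready.load(std::memory_order_seq_cst)) {
            std::this_thread::yield();
        }
        // The acquire load read the value written by the release store,
        // so the write to 'payload' happens-before this read.
        observed = payload;
    });

    producer.join();
    consumer.join();
    return observed;
}
```

Calling run_message_passing() should always return 42. The same guarantee would hold with memory_order_release on the store and memory_order_acquire on the load, which is what section 29.3 is saying about the two halves.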

I'm not just being pedantic. The only memory_order that matters much to me is memory_order_seq_cst, which I believe is the default mode for atomic types and the only sane model for programmers to use. Giving stores acquire semantics effectively forces full fences following every atomic store. Even if the target architecture were to provide a "store-acquire", implementing a store with acquire semantics is hideously expensive on modern microarchitectures. Yes, the implementation still needs a way to implement the total order among sequentially consistent operations. But that can potentially be optimized.
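The total order is the part that release/acquire alone does not give you. A rough sketch of the classic store-buffering test shows where it matters; the function names and iteration count below are hypothetical, and this is only an illustration, not a proposal for how LLVM should lower anything. With seq_cst on all four operations, the outcome where both threads read 0 is forbidden. With only release stores and acquire loads it would be allowed.

```cpp
#include <atomic>
#include <thread>

// Hypothetical litmus test, for illustration only. Returns true if both
// threads read 0, which a single total order over the four seq_cst
// operations rules out.
bool both_threads_read_zero() {
    std::atomic<int> x(0);
    std::atomic<int> y(0);
    int r1 = -1;
    int r2 = -1;

    std::thread a([&] {
        x.store(1, std::memory_order_seq_cst);
        r1 = y.load(std::memory_order_seq_cst);
    });
    std::thread b([&] {
        y.store(1, std::memory_order_seq_cst);
        r2 = x.load(std::memory_order_seq_cst);
    });

    a.join();
    b.join();
    return r1 == 0 && r2 == 0;
}

// Runs the litmus test repeatedly and counts forbidden outcomes.
int count_forbidden_outcomes(int iterations) {
    int forbidden = 0;
    for (int i = 0; i < iterations; ++i) {
        if (both_threads_read_zero()) {
            ++forbidden;
        }
    }
    return forbidden;
}
```

On a conforming implementation count_forbidden_outcomes(n) should be 0 for any n. Whatever mechanism enforces that (a fence after the store, or something cheaper) exists to keep a seq_cst store ordered before later seq_cst loads in the total order. That is a narrower requirement than giving every store acquire semantics.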