To start the discussion of how libcxx should go about implementing this feature, I’ve prepared a cross-platform implementation of atomic_wait/atomic_notify_* for Mac/Linux/Windows/CUDA/unidentified-platform. I don’t claim that it’s fully tuned for your platform, nor do I claim that it’s perfect for every possible use, but it should not be terribly bad for any use either. It has various knobs to turn paths On/Off, so you can choose a different path on each platform, so long as it’s supported at all on that platform.
You can find the implementation here: https://github.com/ogiroux/atomic_wait/.
It has these strategies implemented:
Contention table. Used to optimize futex notify, or to hold CVs. Disable with __NO_TABLE.
Futex. Supported on Linux and Windows. On Linux it needs a table to perform well. Disable with __NO_FUTEX.
Condition variables. Supported on Linux and Mac. Requires table to function. Disable with __NO_CONDVAR.
Timed back-off. Supported on everything. Disable with __NO_SLEEP.
Spinlock. Supported on everything. Force with __NO_IDENT. Note: performance is too terrible to use.
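To make the table-plus-CV combination concrete, here is a minimal sketch of how a contention table can hold condition variables. It is an illustration under my own assumptions, not the code in the repository: the names contended_entry, entry_for, sketch_wait and sketch_notify_all, the table size of 64, and the address hash are all hypothetical.

```cpp
#include <atomic>
#include <condition_variable>
#include <cstddef>
#include <cstdint>
#include <mutex>

// Hypothetical table entry: one mutex/CV pair shared by every atomic
// whose address hashes to this slot.
struct contended_entry {
    std::mutex mtx;
    std::condition_variable cv;
};

constexpr std::size_t table_size = 64; // assumed size, power of two

// Map an atomic's address to its table slot (assumed hash: drop the low
// alignment bits, then mask).
inline contended_entry& entry_for(const void* addr) {
    static contended_entry table[table_size];
    auto bits = reinterpret_cast<std::uintptr_t>(addr);
    return table[(bits >> 4) & (table_size - 1)];
}

// Block until the atomic is observed to hold something other than `old`.
inline void sketch_wait(const std::atomic<int>& a, int old) {
    contended_entry& e = entry_for(&a);
    std::unique_lock<std::mutex> lock(e.mtx);
    e.cv.wait(lock, [&] { return a.load(std::memory_order_acquire) != old; });
}

// Wake every waiter in the slot; call this after storing the new value.
inline void sketch_notify_all(const std::atomic<int>& a) {
    contended_entry& e = entry_for(&a);
    // Taking and releasing the lock orders this notify after any waiter
    // that has checked the value but has not yet gone to sleep.
    { std::lock_guard<std::mutex> lock(e.mtx); }
    e.cv.notify_all();
}
```

Because unrelated atomics can share a slot, a notify may wake waiters on a different atomic; they re-check their own value and go back to sleep, which is why the wait uses a predicate.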
The strategy is chosen this way, by platform:
Linux: default to futex (with table), fallback to futex (no table) → CVs → timed backoff → spin.
Mac: default to CVs (table), fallback to timed backoff → spin.
Windows: default to futex (no table), fallback to timed backoff → spin.
CUDA: default to timed backoff, fallback to spin. (Not all of this is checked in to this tree.)
Unidentified platform: default to spin.
The unidentified platform support could be better. For instance, we should probably assume that <thread> is implemented and use the sleeping/yielding facilities there. It should not fall all the way back to __NO_IDENT; it should instead fall back to about where CUDA is expected to be.
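For reference, a timed back-off built only on the standard sleeping/yielding facilities (std::this_thread::yield and std::this_thread::sleep_for) could look roughly like the sketch below. The function name backoff_wait and all the tuning constants are assumptions for illustration, not values taken from the repository.

```cpp
#include <algorithm>
#include <atomic>
#include <chrono>
#include <thread>

// Hypothetical polling wait: busy-poll briefly, then yield, then sleep
// with exponentially growing delays until the value changes.
inline void backoff_wait(const std::atomic<int>& a, int old) {
    using std::chrono::microseconds;
    constexpr int spin_limit = 16;           // assumed tuning values
    constexpr int yield_limit = 64;
    constexpr microseconds max_delay{1024};  // cap on a single sleep

    microseconds delay{1};
    int attempt = 0;
    while (a.load(std::memory_order_acquire) == old) {
        if (attempt < spin_limit) {
            ++attempt;                        // phase 1: pure spinning
        } else if (attempt < yield_limit) {
            ++attempt;                        // phase 2: give up the time slice
            std::this_thread::yield();
        } else {
            std::this_thread::sleep_for(delay);      // phase 3: timed sleep
            delay = std::min(delay * 2, max_delay);  // double, up to the cap
        }
    }
}
```

With this approach notify becomes a no-op, since waiters discover the change by polling; the cost is wake-up latency bounded by the maximum sleep.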
One of the main discussion points I’d like to drive with this is the design of the contention management table, to go along with the sharded lock table that backs the __atomic_* and __c11_atomic_* built-ins. Ideally this would be handled the same way, meaning that it should live in libatomic.a or your substitute, and be shared with other C++ standard libraries on your platform.