After discussing this with Arm and Qualcomm engineers and with maintainers of concurrency libraries, I’d like to get your thoughts on my proposal to strengthen the implementation of relaxed atomic loads behind a command line flag. The full proposal is here:
In short, I propose we implement an idea Hans Boehm suggested some time ago: placing a dummy branch after relaxed atomic load instructions on Armv7, Armv8, IBM PowerPC, and RISC-V. This would restore the ability to reason about concurrent code by restricting load-load/store reordering on these architectures. No work is needed for the x86 or MIPS backends.
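To make the motivation concrete, here is a minimal sketch of the classic load-buffering litmus test in C++. The function name run_load_buffering is hypothetical and only for illustration. With today's relaxed atomics the language permits the outcome r1 == 1 and r2 == 1, because each thread's store may be reordered before its own load. A dummy branch that depends on the loaded value is intended to rule that outcome out. The AArch64 sequence in the comment is my assumption of roughly what the lowering could look like, not the exact code in the proposal.

```cpp
#include <atomic>
#include <thread>
#include <utility>

// Load-buffering (LB) litmus test using relaxed atomics.
// Returns the values observed by the two loads as (r1, r2).
//
// Assumed, illustrative AArch64 lowering of a strengthened relaxed load:
//     ldr  w8, [x0]    // relaxed load
//     cbnz w8, 1f      // dummy branch that depends on the loaded value
//   1:
//     str  w9, [x1]    // later store, now control-dependent on the load
std::pair<int, int> run_load_buffering() {
    std::atomic<int> x{0};
    std::atomic<int> y{0};
    int r1 = 0;
    int r2 = 0;

    std::thread t1([&] {
        r1 = x.load(std::memory_order_relaxed);  // load x ...
        y.store(1, std::memory_order_relaxed);   // ... then store y
    });
    std::thread t2([&] {
        r2 = y.load(std::memory_order_relaxed);  // load y ...
        x.store(1, std::memory_order_relaxed);   // ... then store x
    });

    t1.join();
    t2.join();
    return {r1, r2};
}
```

A single run on any one machine will rarely, if ever, show r1 == 1 and r2 == 1, which is part of why a dedicated testing tool is needed to check how compilers treat programs like this.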
Of course, I don’t want to outlaw existing uses of relaxed atomics, so I propose guarding the implementation behind a command line flag, such as -mstrict-rlx-atomics. This gives users the choice between performance (at the cost of ease of reasoning) and ease of reasoning (at the cost of some performance). The strengthened behaviour is a subset of the behaviours allowed by C23, so few, if any, changes would be needed to the standard's model.
This proposal builds on a decade of work on understanding the cost of relaxed atomics, and on work that proves this implementation is sound under compilation schemes, has promising performance characteristics (relative to other proposals), and restores the ability to reason about atomics should the user need it.
I propose to test this using Telechat, a compiler testing tool I have developed. Telechat can test the compilation of concurrent programs that use C/C++ atomics for all of the architectures mentioned. It is described in our paper here.
Please let me know what you think.