Dear All,
I am trying to generate atomic PTX instructions with memory-synchronizing semantics, such as:
atom.acquire.sys.add.u64
where “.acquire” and “.sys” are optional qualifiers of the PTX atom instruction:
atom{.sem}{.scope}{.space}.op.type d, [a], b, c;
However, when I use the NVPTX backend to generate PTX from LLVM IR, it seems that both the SyncScope and the AtomicOrdering are discarded.
For example:
define i64 @atom1(i64* %addr, i64 %val) {
%ret = atomicrmw add i64* %addr, i64 %val acquire
ret i64 %ret
}
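For reference, the IR above corresponds roughly to the following C++ source. This is only a sketch: the function name atom1 mirrors the IR, and I am assuming that clang lowers fetch_add with std::memory_order_acquire to "atomicrmw add ... acquire" with the default (system-wide) sync scope.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical source-level equivalent of @atom1 in the IR above.
// Atomically adds val to *addr with acquire ordering and returns
// the value that was stored at *addr before the addition.
std::uint64_t atom1(std::atomic<std::uint64_t>* addr, std::uint64_t val) {
    return addr->fetch_add(val, std::memory_order_acquire);
}
```

The acquire ordering requested here is what I would expect to show up as the “.acquire” qualifier on the generated atom instruction.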
The PTX output generated from this IR is:
// %bb.0:
ld.param.u32 %r1, [atom1_param_0];
ld.param.u64 %rd1, [atom1_param_1];
atom.add.u64 %rd2, [%r1], %rd1;
st.param.b64 [func_retval0+0], %rd2;
ret;
// -- End function
Does the NVPTX backend support SyncScope/AtomicOrdering for atomicrmw?