PHI nodes for atomic variables

Hi everyone,

I want to track the def-use chain for atomic variables.
However, it seems that LLVM will not generate PHI nodes for atomic variables.

Below is the LLVM IR generated for the following code snippet.
It contains a PHI node (%8 = phi i32 [ %4, %3 ], [ %6, %5 ]) for the non-atomic variable 'data2', but none for the atomic variable x. Why?
With the PHI node "%8 = phi i32 [ %4, %3 ], [ %6, %5 ]", we can easily see that data3 depends on data4.

However, if no such PHI node is generated, how can I determine that data1 depends on data4?

Thank you all in advance.

int data1, data2, data3, data4;
std::atomic<int> x;

void f1()
{
    if (data1 > 0) {
        x = data4;
        data2 = data4;
    }

    data3 = data2;
    data1 = x;
}

; Function Attrs: uwtable
define void @_Z2f1v() #3 personality i32 (...)* @__gxx_personality_v0 {
tail call void @checker_thread_begin(i8* getelementptr inbounds ([3 x i8], [3 x i8]* @.str, i64 0, i64 0))
%1 = load i32, i32* @data1, align 4, !tbaa !1
%2 = icmp sgt i32 %1, 0
br i1 %2, label %5, label %3

; <label>:3: ; preds = %0
%4 = load i32, i32* @data2, align 4, !tbaa !1
br label %7

; <label>:5: ; preds = %0
%6 = load i32, i32* @data4, align 4, !tbaa !1
store atomic i32 %6, i32* getelementptr inbounds (%"struct.std::atomic", %"struct.std::atomic"* @x, i64 0, i32 0, i32 0) seq_cst, align 4
store i32 %6, i32* @data2, align 4, !tbaa !1
br label %7

; <label>:7: ; preds = %3, %5
%8 = phi i32 [ %4, %3 ], [ %6, %5 ]
store i32 %8, i32* @data3, align 4, !tbaa !1
%9 = load atomic i32, i32* getelementptr inbounds (%"struct.std::atomic", %"struct.std::atomic"* @x, i64 0, i32 0, i32 0) seq_cst, align 4
store i32 %9, i32* @data1, align 4, !tbaa !1
tail call void @checker_thread_end()
ret void
}

LLVM IR does not contain variables, it contains (immutable) registers. Some of these registers refer to memory locations and are usable with loads and stores. The notion of ‘atomic’ doesn’t make sense in the context of a register, because registers are implicitly immutable (i.e. not updated, so atomic updates don’t make sense) and non-shared (so there’s nothing for accesses to them to be atomic with respect to). As such, the atomic qualifier makes sense only with memory locations. The memory address itself may be stored in a register, but updates to the location must be performed by loads and stores (or atomicrmw instructions).

In the absence of any ordering constraints imposed by atomic memory operations (including fences), it is often safe to promote a sequence of loads and stores of a memory location to a single load / store pair with a sequence of registers storing temporary values. This is allowed because, in the absence of something that establishes a happens-before relationship between threads, a thread is allowed to make writes to memory visible to other threads in any order.
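
As a rough illustration of the point that an atomic object is only ever touched through memory operations, here is a minimal C++ sketch. The function name touch_x is made up for this example, and the IR spellings in the comments are what clang typically emits for seq_cst operations on common targets; treat them as an assumption, not a guarantee.

```cpp
#include <atomic>

int data4 = 7;          // plain global: accessed with ordinary load/store
std::atomic<int> x{0};  // atomic global: in IR this is still just a memory location

// Hypothetical helper: performs one atomic store, one read-modify-write and
// one atomic load on x, then returns the loaded value.
int touch_x() {
    x.store(data4);    // typically: store atomic i32 ..., seq_cst
    x.fetch_add(1);    // typically: atomicrmw add ..., seq_cst
    return x.load();   // typically: load atomic i32 ..., seq_cst
}
```

None of these operations produces an SSA value that "is" x; the only SSA values are the results of the load and of the read-modify-write.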

David

Thanks for your explanation.

Do you mean that LLVM does not maintain a def-use chain for atomic variables? So it is impossible to directly recover the fact that the load of x in the statement 'data1 = x;' depends on data4 (because of the statement 'x = data4;')?

If I want this information, maybe the only solution is to traverse all the predecessors of the statement 'data1 = x;'.
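
To make the "traverse all the predecessors" idea concrete, here is a toy backward walk over a hand-built control-flow graph. Everything in it (Inst, Block, reaching_stores, make_f1_cfg) is hypothetical and stands in for real IR; it matches addresses by name only, whereas a real pass would need alias analysis and would also have to treat calls as possible writers.

```cpp
#include <set>
#include <string>
#include <vector>

// Toy stand-ins for IR; these are NOT LLVM classes.
struct Inst {
    enum Kind { Load, Store } kind;
    std::string addr;   // memory location, e.g. "x"
    std::string value;  // for stores: name of the stored value
};

struct Block {
    std::vector<Inst> insts;
    std::vector<int> preds;  // indices of predecessor blocks
};

// Scan block b backwards from position pos; on each path, stop at the first
// store to addr. If no store is found, continue into the predecessors.
static void walk(const std::vector<Block>& cfg, int b, int pos,
                 const std::string& addr, std::set<int>& visited,
                 std::set<std::string>& out) {
    for (int i = pos - 1; i >= 0; --i) {
        const Inst& in = cfg[b].insts[i];
        if (in.kind == Inst::Store && in.addr == addr) {
            out.insert(in.value);
            return;
        }
    }
    if (cfg[b].preds.empty()) {  // reached function entry without a store
        out.insert("live-on-entry");
        return;
    }
    for (int p : cfg[b].preds)
        if (visited.insert(p).second)  // scan each predecessor only once
            walk(cfg, p, static_cast<int>(cfg[p].insts.size()), addr, visited, out);
}

// Values that may be read by a load of addr located at cfg[b].insts[pos].
std::set<std::string> reaching_stores(const std::vector<Block>& cfg, int b,
                                      int pos, const std::string& addr) {
    std::set<int> visited;
    std::set<std::string> out;
    walk(cfg, b, pos, addr, visited, out);
    return out;
}

// Hand-written model of f1: entry, the else edge, the if body, and the join.
std::vector<Block> make_f1_cfg() {
    return {
        Block{{Inst{Inst::Load, "data1", ""}}, {}},
        Block{{Inst{Inst::Load, "data2", ""}}, {0}},
        Block{{Inst{Inst::Load, "data4", ""},
               Inst{Inst::Store, "x", "data4"},
               Inst{Inst::Store, "data2", "data4"}}, {0}},
        Block{{Inst{Inst::Store, "data3", "data2"},
               Inst{Inst::Load, "x", ""},
               Inst{Inst::Store, "data1", "x"}}, {1, 2}},
    };
}
```

Calling reaching_stores(make_f1_cfg(), 3, 1, "x") asks which stores may feed the load of x in the join block. In this model the answer contains both "data4" (from the if body) and "live-on-entry" (the path that skips the if).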

I used -O3. I will try MemorySSA analysis. Thanks!

Best regards,

Qiuping Yi
Institute Of Software
Chinese Academy of Sciences

Let me try to help.

Do you mean that LLVM will not maintain the def-use chain for atomic
variables?

It is not a variable at the LLVM level.
At the source level, it is a variable.
At the LLVM IR level, it is lowered into memory operations.
All of your variables were; none of them start out in LLVM registers.
Some were then promoted or partially promoted into registers.
(It looks like your output is from -O1 or similar, and uses libstdc++ instead
of libc++.)

The loads are loads from a memory location. It does not have any more data

Dear Daniel Berlin,

I just tried the MemorySSA analysis and got the following IR.
However, I am confused by the result.

Specifically, why is instruction %3 associated with a MemoryDef? As I understand it,
%3 should be associated with a MemoryUse, right?

; Function Attrs: uwtable
define void @_Z2f1v() #3 personality i32 (...)* @__gxx_personality_v0 {
entry:
; 1 = MemoryDef(liveOnEntry)
tail call void @checker_thread_begin(i8* getelementptr inbounds ([3 x i8], [3 x i8]* @.str, i64 0, i64 0))
; MemoryUse(1)
%0 = load i32, i32* @data1, align 4, !tbaa !1
%cmp = icmp sgt i32 %0, 0
br i1 %cmp, label %if.then, label %entry.if.end_crit_edge

entry.if.end_crit_edge: ; preds = %entry
; MemoryUse(1)
%.pre = load i32, i32* @data2, align 4, !tbaa !1
br label %if.end

if.then: ; preds = %entry
; MemoryUse(1)
%1 = load i32, i32* @data4, align 4, !tbaa !1
; 2 = MemoryDef(1)
store atomic i32 %1, i32* getelementptr inbounds (%"struct.std::atomic", %"struct.std::atomic"* @x, i64 0, i32 0, i32 0) seq_cst, align 4
; 3 = MemoryDef(2)
store i32 %1, i32* @data2, align 4, !tbaa !1
br label %if.end

if.end: ; preds = %entry.if.end_crit_edge, %if.then
; 8 = MemoryPhi({if.then,3},{entry.if.end_crit_edge,1})
%2 = phi i32 [ %.pre, %entry.if.end_crit_edge ], [ %1, %if.then ]
; 4 = MemoryDef(8)
store i32 %2, i32* @data3, align 4, !tbaa !1
; 5 = MemoryDef(4)
%3 = load atomic i32, i32* getelementptr inbounds (%"struct.std::atomic", %"struct.std::atomic"* @x, i64 0, i32 0, i32 0) seq_cst, align 4
; 6 = MemoryDef(5)
store i32 %3, i32* @data1, align 4, !tbaa !1
; 7 = MemoryDef(6)
tail call void @checker_thread_end()
ret void
}

We call getModRefInfo on each instruction, and see what it says.
Here, atomic loads are said to modify memory.
This is because of ordering and other constraints, which are currently modeled by treating such loads as defs. This is done to be conservatively correct in LLVM and to generally prevent passes from reordering them.
Otherwise, the use/def chains would say it was fine to move them in ways that may not be safe.
If we wanted to be super-precise, we would split the aliasing def-use chain and the ordering def-use chain.

If we did that, it would be a MemoryUse and an OrderingDef (or whatever we called them).
So far, that has not been worth doing.

You will also see that MemorySSA, by default, does not disambiguate def->def chains (i.e. 5->6 in the above case). Doing so requires introducing a multiple-phi/multiple-variable form of MemorySSA, and experience showed us this was not worth it; GCC now uses a single-variable/single-phi form like we do. Use getClobberingMemoryAccess to disambiguate def-def chains if you need to.
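
To illustrate what "disambiguating" the single def chain means, here is a toy model of the dump above. It is not the LLVM implementation: Access, walk_up, nearest_clobber and make_f1_accesses are hypothetical names, locations are compared by name, and the calls and the seq_cst atomic load are assumed to possibly write anything ("*"). In a real pass, the query would go through the walker returned by MemorySSA::getWalker() and its getClobberingMemoryAccess method, which also uses alias analysis and can look through phis.

```cpp
#include <map>
#include <string>
#include <vector>

// Toy model of the single-chain MemorySSA form; NOT the LLVM classes.
struct Access {
    enum Kind { LiveOnEntry, Def, Phi } kind;
    std::string loc;            // location a Def may write; "*" means anything
    std::vector<int> operands;  // Def: its defining access; Phi: incoming accesses
};

// Starting at access id start, skip defs that cannot write loc and return
// the first access that may. Stops at phis and at live-on-entry (a toy
// simplification; the real walker can do better).
int walk_up(const std::map<int, Access>& acc, int start, const std::string& loc) {
    int cur = start;
    while (true) {
        const Access& a = acc.at(cur);
        if (a.kind != Access::Def) return cur;
        if (a.loc == "*" || a.loc == loc) return cur;
        cur = a.operands.at(0);
    }
}

// Nearest access above id that may clobber loc.
int nearest_clobber(const std::map<int, Access>& acc, int id, const std::string& loc) {
    return walk_up(acc, acc.at(id).operands.at(0), loc);
}

// Access ids follow the numbering in the dump above; 0 is liveOnEntry.
std::map<int, Access> make_f1_accesses() {
    return {
        {0, Access{Access::LiveOnEntry, "", {}}},
        {1, Access{Access::Def, "*", {0}}},      // call checker_thread_begin
        {2, Access{Access::Def, "x", {1}}},      // store atomic to x
        {3, Access{Access::Def, "data2", {2}}},  // store to data2
        {8, Access{Access::Phi, "", {3, 1}}},    // MemoryPhi at if.end
        {4, Access{Access::Def, "data3", {8}}},  // store to data3
        {5, Access{Access::Def, "*", {4}}},      // load atomic of x (modeled as a def)
        {6, Access{Access::Def, "data1", {5}}},  // store to data1
    };
}
```

In this model, nearest_clobber(acc, 5, "x") skips the store to data3 and stops at the MemoryPhi 8; continuing from the phi's if.then operand with walk_up(acc, 3, "x") skips the store to data2 and lands on access 2, the atomic store of data4 into x.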