Why does Clang compile short-circuit evaluation to phi nodes instead of stores before mem2reg?

materia · January 24, 2025, 11:59pm

I was looking at something like this:

void example(bool a) {
    bool b = (a || a) || (a || a);
}

Compiling this with -S -emit-llvm -Xclang -disable-llvm-optzns gives:

define dso_local void @example(bool)(i1 noundef zeroext %0) #0 !dbg !10 {
  %2 = alloca i8, align 1
  %3 = alloca i8, align 1
  %4 = zext i1 %0 to i8
  store i8 %4, ptr %2, align 1
    #dbg_declare(ptr %2, !16, !DIExpression(), !17)
    #dbg_declare(ptr %3, !18, !DIExpression(), !19)
  %5 = load i8, ptr %2, align 1, !dbg !20
  %6 = trunc i8 %5 to i1, !dbg !20
  br i1 %6, label %18, label %7, !dbg !21

7:
  %8 = load i8, ptr %2, align 1, !dbg !22
  %9 = trunc i8 %8 to i1, !dbg !22
  br i1 %9, label %18, label %10, !dbg !23

10:
  %11 = load i8, ptr %2, align 1, !dbg !24
  %12 = trunc i8 %11 to i1, !dbg !24
  br i1 %12, label %16, label %13, !dbg !25

13:
  %14 = load i8, ptr %2, align 1, !dbg !26
  %15 = trunc i8 %14 to i1, !dbg !26
  br label %16, !dbg !25

16:
  %17 = phi i1 [ true, %10 ], [ %15, %13 ]
  br label %18, !dbg !23

18:
  %19 = phi i1 [ true, %7 ], [ true, %1 ], [ %17, %16 ]
  %20 = zext i1 %19 to i8, !dbg !19
  store i8 %20, ptr %3, align 1, !dbg !19
  ret void, !dbg !27
}

attributes #0 = { mustprogress noinline nounwind optnone uwtable "frame-pointer"="all" "min-legal-vector-width"="0" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="x86-64" "target-features"="+cmov,+cx8,+fxsr,+mmx,+sse,+sse2,+x87" "tune-cpu"="generic" }

What’s the advantage of doing this instead of having something like:

  br i1 %in_var, label %is_true, label %is_false

is_true:
  store i8 1, ptr %res_var
  br label %end

is_false:
  store i8 0, ptr %res_var
  br label %end

end:

I’m guessing the constants in the phi nodes help some optimization pass, but I don’t see how.

ayokunle321 · February 4, 2025, 4:02pm

While the control flow in your example looks clearer, phi nodes make it easier for the compiler to reason about the code. I believe LLVM tries to keep values in SSA registers as much as possible, because a register value has a single definition that every use can be traced back to. That makes optimizations such as constant propagation easier to apply. With a variable in memory, the compiler first has to work out what was last stored to that location (it can infer this as well, but it is definitely easier with registers). Introducing additional variables with explicit stores would also increase memory traffic, making the code less optimal. Reading up on data-flow analysis and optimizations should give you a better idea of these things.
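
To make the difference concrete, here is a minimal C++ sketch of the two ways a single a || b could be lowered. It is only an illustration, not what Clang literally emits, and the helper names or_as_value and or_as_slot are made up. The first mirrors the phi form, where each path yields a value directly. The second mirrors the store form, where the front end would have to invent a temporary slot, write it on every path, and read it back at the merge point.

```cpp
// Phi-style: every path out of the short-circuit yields a value directly,
// and the merge point simply takes whichever value arrived.
constexpr bool or_as_value(bool a, bool b) {
    if (a) {
        return true;  // plays the role of a [ true, %pred ] phi entry
    }
    return b;         // plays the role of a [ %rhs, %rhs_block ] phi entry
}

// Store-style: a hypothetical compiler-invented temporary is written on
// each path and read back afterwards.
constexpr bool or_as_slot(bool a, bool b) {
    bool tmp = false; // stands in for an extra alloca
    if (a) {
        tmp = true;   // store on the short-circuit path
    } else {
        tmp = b;      // store on the fall-through path
    }
    return tmp;       // extra load at the merge point
}

// Both lowerings agree with the built-in operator on every input.
static_assert(or_as_value(false, false) == (false || false), "value form, 0 0");
static_assert(or_as_value(false, true) == (false || true), "value form, 0 1");
static_assert(or_as_value(true, false) == (true || false), "value form, 1 0");
static_assert(or_as_value(true, true) == (true || true), "value form, 1 1");
static_assert(or_as_slot(false, false) == (false || false), "slot form, 0 0");
static_assert(or_as_slot(false, true) == (false || true), "slot form, 0 1");
static_assert(or_as_slot(true, false) == (true || false), "slot form, 1 0");
static_assert(or_as_slot(true, true) == (true || true), "slot form, 1 1");
```

Both versions compute the same result. The only difference is that the slot version carries an extra variable, which a later pass such as mem2reg would have to turn back into a phi anyway.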
