Basic question on using phi operation for if-then statements

I’m trying to understand if-then statements using the tutorial at 5. Kaleidoscope: Extending the Language: Control Flow — LLVM 17.0.0git documentation but not understanding the Phi operation.

Firstly I think I may have got the SSA part wrong because I’m assigning the global variable “i” multiple times (that’s not correct right)? Please see my example below.

As for the Phi operation, the problem I’m having is that in the tutorial it’s using call instructions for example %calltmp1 = call double @bar() which have a named value like %calltmp1 which is passed to the phi operator like %iftmp = phi double [ %calltmp, %then ], [ %calltmp1, %else ].

In my example below I didn’t use the phi operator yet but how does that work with the store instructions I’m using? Are you supposed to wrap those in basic blocks or something so they have a name that can be referenced? This also begs the question, what happens if you have more than one instruction, how do you group them and use them with “phi”?

@i = global i32 0

define i32 @main() {
  store i32 2, ptr @i, align 4
  %i = load i32, ptr @i, align 4
  %sgttmp = icmp sgt i32 %i, 1
  br i1 %sgttmp, label %then, label %else

then:                                             ; preds = %entry
  store i32 1, ptr @i, align 4
  br label %ifcont

else:                                             ; preds = %entry
  store i32 0, ptr @i, align 4
  br label %ifcont

ifcont:                                           ; preds = %else, %then
  %i1 = load i32, ptr @i, align 4
  ret i32 %i1

Nope, your code looks fine (and even more importantly is accepted by opt --passes=verify which won’t tolerate any SSA wrongness). The important fact is that memory is not required to be in SSA form, you can store/load to a location as many times as you like and it won’t cause any problems. SSA rules only apply to values on the left hand side of an instruction: %val = ....

To convert your code to use phi, the optimizer would detect which values got stored to the variable along each path and use those in the phi. So you’d get something like %i1 = phi i32 [ 1, %then ], [ 0, %else ]. Running opt --passes=instcombine on your code will do this (and some following tidy-ups like merging the stores into a single one in ifcont).

I don’t think I understand this question, each instruction in a single block would need its own separate phi instruction. If that doesn’t help, do you have an example of what you mean?


Wait, according to the tutorial you’re expected to create the phi’s yourself, right? You make it sound like you’re supposed to run an optimizer pass and they will be generated for you.

Maybe I misunderstand now but here’s a better/more complex example with multiple instructions.

then:                                             ; preds = %entry
  %x = load i32, ptr @x, align 4
  store i32 %x, ptr @i, align 4
  %x1 = load i32, ptr @x, align 4
  store i32 %x1, ptr @i, align 4
  br label %ifcont

How do you create the phi based on that? I understand your snippet where “1” and “0” were passed to the phi but now it’s not so clear. I suspect those instructions need to be part of a named value/block so they can be passed via name to the phi.

The tutorial is probably trying to teach about how phis work, which is definitely useful knowledge to have so that you can understand real world LLVM IR if nothing else.

Most real front-ends do just get them from the optimizer though; it’s a lot easier to just store a new value into a local variable and let the mem2reg pass work out what control flows are actually possible and insert the needed phis.

If ifcont contains %v = load i32, ptr @i and you’re trying to replace that load with a phi then either %x1 or %x would be valid choices (assuming no other threads are messing things up at the same time).

The most obvious one would be %x1 since that’s always valid: the last value stored to a variable in a block is the one that later blocks will see. And that’s the key point, the phi has to replicate the value the original load would have obtained.

But a more detailed analysis might notice that %x1 is always going to have the same value as %x so that would work too. And maybe it’d prefer that so more dead instructions can be removed, but both are correct.

1 Like

These are certainly choices I don’t want to be making myself as I know basically nothing. :slight_smile: I think you’re right that you’re supposed to run an optimizer pass and then it figure this out for you, especially as a beginner who’s just learning. Thanks for explaining, I appreciate it.