Generating IR for a bitfield-typed-conditional operator

I am trying to rundown a bug that is essentially: Compiler Explorer

When a conditional is a ‘normal’ LValue type, we generate this using a ‘phi’ node here:

  br i1 %tobool, label %cond.true, label %cond.false, !dbg !27

cond.true:                                        ; preds = %entry
  %bf = getelementptr inbounds %struct.S, %struct.S* %s, i32 0, i32 0, !dbg !28
  store i32 5, i32* %bf, align 4, !dbg !29
  br label %cond.end, !dbg !27

cond.false:                                       ; preds = %entry
  %bf2 = getelementptr inbounds %struct.S, %struct.S* %s, i32 0, i32 1, !dbg !30
  store i32 10, i32* %bf2, align 4, !dbg !31
  br label %cond.end, !dbg !27

cond.end:                                         ; preds = %cond.false, %cond.true
  %cond-lvalue = phi i32* [ %bf, %cond.true ], [ %bf2, %cond.false ], !dbg !27
  store i32 5, i32* %cond-lvalue, align 4, !dbg !32
  %3 = load i32, i32* %retval, align 4, !dbg !33
  ret i32 %3, !dbg !33

HOWEVER, since at least one side is a bitfield, we give up with the error in the godbolt link.

I am looking at the implementation, however the fact that the bitfields CAN be mixed types (OR mixed with normal ints?) means that doing something similar with the ‘phi’ there isn’t really possible. I somehow need to come up with a way to generate an LValue (so it can be assigned to/etc), from a mixed type, where they may/may not be different bitfields.

I’m just not sure how to capture this in our LValue/Address mechanism, nor in IR. Does anyone else have a suggestion on what we can do here? GCC seems to switch this to an if/else branch based on cond (based on looking at their RTL on godbolt), but I’m not sure how we can do that with our insistence of having things return an ‘LValue’ type along the way, as the RHS-use is not available at that time. GCC basically does:

//Turns this:
(cond ? s.bf = 1 : s.bf2= 2) = 5;
// into this:
if (cond) {
    s.bf = 1; s.bf = 5;
   }
else {
  s.bf2 = 2; s.bf2= 5;
}

SO, does anyone have any suggestions on this? Or is this just a situation that isn’t going to be well supportable by our codegen implementation?

In general, you could represent the “lvalue” as a pair of lvalues, plus an LLVM IR value indicating which one to assign to. Then you can use a branch to dynamically choose the right codepath to use to read/write.

But it seems simpler to just avoid the lvalue codepath: just use a dedicated codepath for assignment operators where the LHS is a conditional operator and one of the operands is a bitfield. (This does mean you need to do a separate walk of the LHS to detect this, but there’s a limited set of expressions that can actually produce a bitfield, so probably not a big deal in practice.)

In general, you could represent the “lvalue” as a pair of lvalues, plus an LLVM IR value indicating which one to assign to. Then you can use a branch to dynamically choose the right codepath to use to read/write

I considered something like that, but the differing type potential seemed like it would cause further problems.

But it seems simpler to just avoid the lvalue codepath: just use a dedicated codepath for assignment operators where the LHS is a conditional operator and one of the operands is a bitfield. (This does mean you need to do a separate walk of the LHS to detect this, but there’s a limited set of expressions that can actually produce a bitfield, so probably not a big deal in practice.)

Can you clarify that bit? There are a TONS of places we call EmitLValue (roughly 170?) on an expression, so making them ‘detect’ whether they are working on a bitfield seems like we’d have to get into a LARGE number of different places, right?

I THINK you’re suggesting we do what GCC does here, and detect the presence of a ‘bitfield’ ConditionalOperator, and instead double-emit the RHS of the assignment?

The number isn’t as big as it looks; most of those places immediately call getAddress(), which asserts that it’s dealing with a “simple” lvalue. The relevant code should just be handling assignment and compound assignment operators.

You don’t really double-emit the RHS of the assignment, just the assignment itself. But yes, similar code to what gcc emits.

Ok, thanks! I realized after the fact (went into weekend + a snow day, so took a while to respond!), that I actually don’t CARE about most cases. My customer only really cares about the ‘root’ lvalue (that is, no assignment op at all), so I can at least start there to clear up the immediate problem.

Thanks again for your help!