Missing aggregate bitcast - or lowering existing SSA to LLVM IR

Rust supports data types known as enums, which are represented at runtime and in LLVM as tagged unions. As a very simple example, consider the following code:

pub enum E {
    A(u8, u8),
    B(u32, u32),
}

pub fn foo(e: E) -> bool {
    let f = match e {
        E::A(x, y) => E::B(x as u32 + 1, y as u32),
        E::B(x, y) => E::A(x as u8 + 1, y as u8),
    };

    match f {
        E::A(x, y) => x == y,
        E::B(x, y) => x == y,
    }
}

We currently generate code for this by emitting an alloca for f and then using the following type definitions together with suitable pointer punning, GEPs, etc.:

%E = type { i8, [11 x i8] }
%"E::A" = type { [1 x i8], i8, i8 }
%"E::B" = type { [1 x i32], i32, i32 }
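To make the punning concrete, here is a rough Rust analogue of those three types, written by hand as a `#[repr(C)]` union. This is only an illustrative sketch: the names `Slot`, `PayloadA`, `PayloadB`, `make_a`, `make_b` and `fields_equal` are made up for this example and are not what rustc emits. The point is that the same 12 bytes of storage are viewed through whichever variant layout the tag selects.

```rust
// Hypothetical mirror of %"E::A" = { [1 x i8], i8, i8 }.
#[repr(C)]
#[derive(Clone, Copy)]
struct PayloadA {
    _tag: [u8; 1], // overlaps the tag byte
    x: u8,
    y: u8,
}

// Hypothetical mirror of %"E::B" = { [1 x i32], i32, i32 }.
#[repr(C)]
#[derive(Clone, Copy)]
struct PayloadB {
    _tag: [u32; 1], // overlaps the tag byte plus three bytes of padding
    x: u32,
    y: u32,
}

// Stand-in for the alloca'd storage of type %E: one slot, three views.
#[repr(C)]
union Slot {
    tag: u8,
    a: PayloadA,
    b: PayloadB,
}

fn make_a(x: u8, y: u8) -> Slot {
    // Initialise all 12 bytes first, then write through the A view (tag byte = 0).
    let mut s = Slot { b: PayloadB { _tag: [0], x: 0, y: 0 } };
    s.a = PayloadA { _tag: [0], x, y };
    s
}

fn make_b(x: u32, y: u32) -> Slot {
    // Write the payload through the B view, then set the tag byte to 1.
    let mut s = Slot { b: PayloadB { _tag: [0], x, y } };
    s.tag = 1;
    s
}

// Analogue of the second `match` in `foo`: read the tag, then pun the
// storage as the matching variant layout.
fn fields_equal(s: &Slot) -> bool {
    // SAFETY: make_a/make_b initialise every byte, and all fields are plain integers.
    unsafe {
        match s.tag {
            0 => s.a.x == s.a.y,
            _ => s.b.x == s.b.y,
        }
    }
}
```

For instance, `fields_equal(&make_b(7, 7))` is true and `fields_equal(&make_a(3, 4))` is false. Every access goes through memory, which is exactly what makes this approach awkward once f is an SSA value.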

rustc internally uses an IR known as MIR, and I am investigating the costs, benefits, and challenges of adding SSA support to it. The code below is one option for what this might look like. Please do not take it as the only one: if another representation of these ideas makes the question below easier to answer, that is fine too.

// This is greatly simplified and exclusively there to get the point across
f: {i8, [11 x i8] } = phi[...] // previous branches omitted
tag: i8 = f.0
switch tag [0 => bbA, 1 => bbB]

bbA:
f_A: { [1 x i8], i8, i8 } = downcast f to E::A
retA: bool = f_A.1 == f_A.2
goto out

bbB:
f_B: { [1 x i32], i32, i32 } = downcast f to E::B
retB: bool = f_B.1 == f_B.2
goto out

out:
ret: bool = phi[bbA => retA, bbB => retB]

This works nicely for MIR, but the problem is that there is no good strategy for codegening this to LLVM IR. Specifically the downcast statements have no analogue - they would be most naturally represented by a bitcast with support for aggregates, but LLVM does not support this.
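To spell out what such a downcast has to compute when f stays a value rather than living in memory, here is a small Rust sketch. It assumes the byte offsets implied by the type definitions above (A's fields at offsets 1 and 2, B's fields at offsets 4 and 8). The names `RawE`, `upcast_a`, `upcast_b`, `downcast_a`, `downcast_b` and `second_match` are hypothetical. Reassembling each field from individual bytes is roughly what a lowering without aggregate bitcast would have to express with per-element extracts, and I am not claiming LLVM would optimize that well.

```rust
/// Value-level stand-in for `{ i8, [11 x i8] }` (hypothetical name).
#[derive(Clone, Copy)]
struct RawE {
    tag: u8,
    bytes: [u8; 11], // bytes[i] sits at byte offset i + 1 of the enum
}

fn upcast_a(x: u8, y: u8) -> RawE {
    let mut bytes = [0u8; 11];
    bytes[0] = x; // offset 1
    bytes[1] = y; // offset 2
    RawE { tag: 0, bytes }
}

fn upcast_b(x: u32, y: u32) -> RawE {
    let mut bytes = [0u8; 11];
    bytes[3..7].copy_from_slice(&x.to_ne_bytes()); // offsets 4..8
    bytes[7..11].copy_from_slice(&y.to_ne_bytes()); // offsets 8..12
    RawE { tag: 1, bytes }
}

/// "downcast f to E::A": pick out the two payload bytes.
fn downcast_a(e: RawE) -> (u8, u8) {
    (e.bytes[0], e.bytes[1])
}

/// "downcast f to E::B": rebuild each u32 from four payload bytes.
fn downcast_b(e: RawE) -> (u32, u32) {
    let x = u32::from_ne_bytes([e.bytes[3], e.bytes[4], e.bytes[5], e.bytes[6]]);
    let y = u32::from_ne_bytes([e.bytes[7], e.bytes[8], e.bytes[9], e.bytes[10]]);
    (x, y)
}

/// Analogue of the switch on the tag followed by the two downcasts.
fn second_match(f: RawE) -> bool {
    match f.tag {
        0 => {
            let (x, y) = downcast_a(f);
            x == y
        }
        _ => {
            let (x, y) = downcast_b(f);
            x == y
        }
    }
}
```

No pointer or stack slot appears anywhere here, which is the property the SSA form wants to keep; the cost is that one conceptual cast becomes a handful of byte-level operations per field.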

Question: Is there a good strategy for codegen-ing the above pseudo-IR to LLVM IR? And if not, what is the chance of LLVM acquiring the necessary support at some point in the future?

Emitting an alloca for f in LLVM after we have converted to SSA is an alternative I have considered, but it is somewhat painful because in general it seems to be as hard as stack coloring. Beyond that, I don’t really have any great ideas.

MIR could go to MLIR with a Rust-specific dialect and then to LLVM :)