I heard that the context for this has to do with XLA: the types are needed not so much in LLVM itself as in MLIR’s LLVM dialect, to distinguish between the different 8-bit float formats when lowering to target-specific matrix-multiply ops.
I’d suggest that, instead, you define MLIR ops that wrap the lower-level float8 operations and lower first to nvgpu.our_matmul_thing : ... e5m2, and then to nvvm.our_matmul_intrinsic_e5m2 : i8, by defining a type conversion from e5m2 and e4m3 to i8 during the *ToLLVM process.
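A rough sketch of what that two-step lowering could look like in MLIR IR. The op names here are the placeholder names from above, not real nvgpu/nvvm ops, and the builtin fp8 type spellings (f8E5M2, f8E4M3FN) and shapes are assumptions that may differ by MLIR version:

```mlir
// Hypothetical higher-level op: the fp8 format is carried in the type.
%d0 = nvgpu.our_matmul_thing %a0, %b0 :
    (vector<4xf8E5M2>, vector<4xf8E5M2>) -> vector<2xf32>

// After the *ToLLVM type conversion maps f8E5M2 (and f8E4M3FN) to i8,
// the format is encoded in the op name instead of the element type,
// so nothing below this level needs to know about fp8 at all.
%d1 = nvvm.our_matmul_intrinsic_e5m2 %a1, %b1 :
    (vector<4xi8>, vector<4xi8>) -> vector<2xf32>
```

The point of the design is that the fp8 distinction only has to survive down to the op-selection step; once the right intrinsic has been chosen, the bits can travel as plain i8 through the LLVM dialect.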