Motivation
The MLIR ecosystem contains the NVVM dialect, which extends the LLVM dialect with operations for programming NVIDIA GPUs. Similarly, the ROCDL dialect extends the LLVM dialect with operations for programming AMD GPUs.
To provide similar functionality for programming Intel GPUs, we propose adding a new LLVM target dialect (GEN) as a counterpart to the NVVM and ROCDL dialects. The GEN dialect will expose selected Xe ISA (codename GEN) assembly instructions to the MLIR ecosystem as operations. Hierarchically, the GEN dialect sits below the XeGPU dialect, and we intend to support lowering from the latter to the former where it makes sense.
Initially, the GEN dialect will contain operations to:
- query GPU properties such as thread ID, block ID, and block and grid dimensions
- emit barrier and group shuffle operations
- emit instructions that access the systolic array hardware for matrix operations
Example
The GEN dialect has been implemented in a branch in the Intel LLVM monorepo.
Below is an example of what an operation in the GEN dialect looks like:
def GENX_BarrierOp : GENX_Op<"barrier"> {
  let summary = "Workgroup barrier";
  string baseDescription = [{
    The `genx.barrier` operation performs a workgroup barrier and ensures all outstanding memory
    transactions using local or global memory are complete.
  }];
  string llvmBuilder = [{
    // The barrier lowers to a call returning void and taking one i32 flag argument.
    llvm::Type *retType = builder.getVoidTy();
    llvm::Type *argType = builder.getInt32Ty();
    llvm::Value *arg = llvm::ConstantInt::get(argType, 3 /*memfence*/);
    // "_Z7barrierj" is the Itanium-mangled name of barrier(unsigned int).
    createDeviceFunctionCall(builder, "_Z7barrierj", retType, {argType}, {arg});
  }];
  let assemblyFormat = "attr-dict";
}
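The two magic values in the builder above can be unpacked with a small standalone sketch. It assumes that the constant 3 follows the OpenCL C memory fence flags (CLK_LOCAL_MEM_FENCE = 0x01, CLK_GLOBAL_MEM_FENCE = 0x02), which matches the "local or global memory" wording of the description but is not stated in the snippet itself. The names kLocalMemFence, kGlobalMemFence, kBarrierFlags and mangleUnaryUnsignedFn are hypothetical and are not part of the GEN dialect or of LLVM.

```cpp
#include <cassert>
#include <string>

// Assumed OpenCL C fence flag values; hypothetical names for illustration only.
constexpr unsigned kLocalMemFence = 0x01;   // CLK_LOCAL_MEM_FENCE
constexpr unsigned kGlobalMemFence = 0x02;  // CLK_GLOBAL_MEM_FENCE

// Fencing both address spaces gives the value passed by the builder.
constexpr unsigned kBarrierFlags = kLocalMemFence | kGlobalMemFence;
static_assert(kBarrierFlags == 3, "local | global fence flags should equal 3");

// Hypothetical helper: Itanium mangling of a free function that takes a single
// unsigned int, i.e. "_Z" + <name length> + <name> + "j" (j encodes unsigned int).
std::string mangleUnaryUnsignedFn(const std::string &name) {
  assert(!name.empty() && "function name must not be empty");
  return "_Z" + std::to_string(name.size()) + name + "j";
}
```

Under these assumptions, mangleUnaryUnsignedFn("barrier") yields the callee name used in the builder, and kBarrierFlags is the argument it passes.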