[RFC] Add NV-GPU dialect (HW specific extension of GPU dialect for Nvidia GPUs)

Can you explain this aspect a little more? As proposed, I see that we only have an operation to load a matrix, but no operation to compute on the loaded values. The new formulation is also incompatible with the existing gpu dialect ops, as it uses a different type.
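To make the type mismatch concrete, here is an illustrative sketch (the exact op names and attributes are my assumption of what the RFC proposes, not verbatim from it):

```mlir
// The existing gpu dialect ops operate on the opaque !gpu.mma_matrix type:
%a = gpu.subgroup_mma_load_matrix %src[%c0, %c0] {leadDimension = 16 : index}
    : memref<16x16xf16> -> !gpu.mma_matrix<16x16xf16, "AOp">

// The proposed load instead yields a plain vector, which the existing
// gpu.subgroup_mma_compute op cannot consume:
%b = nvgpu.ldmatrix %src[%c0, %c0] {numTiles = 4 : i32, transpose = false}
    : memref<16x16xf16, 3> -> vector<4x2xf16>
```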

So how would you model the computation itself? Exposing the result of the load as a vector creates the impression in the IR that the result can be accessed like any regular vector. A similar approach is taken in the AMX dialect with its tiles, so maybe this should not be treated as related to the gpu dialect, whose aim is to abstract over hardware, and should instead be made part of the vector dialect family.
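For example, nothing in the type system would stop IR like the following, even though the per-lane element layout of the loaded fragment is hardware-defined (again an illustrative sketch, assuming the proposed result type):

```mlir
%m = nvgpu.ldmatrix %src[%c0, %c0] {numTiles = 4 : i32, transpose = false}
    : memref<16x16xf16, 3> -> vector<4x2xf16>
// Looks like an ordinary vector element access, but which matrix element
// this actually touches depends on the hardware fragment layout:
%e = vector.extract %m[0, 1] : vector<4x2xf16>
```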

I do not see how this is exposed in the IR. How is this conceptually different from optimizing the memory layout for a cache hierarchy during tiling, where the ops also do not expose the specifics of the hardware being targeted?
Is the goal that the choice of operation makes it visible which target the IR is being compiled for?