[RFC] Extending MLIR GPU device codegen pipeline

Hi @csigg, thank you!

  1. There are a couple of ways to specify a different path for the CUDA toolkit at runtime. For example, the NVVM target queries the following environment variables:
CUDA_ROOT
CUDA_HOME
CUDA_PATH

If any of those is non-empty, the compilation mechanism uses that path to search for the tools. You can also use --gpu-module-to-binary=toolkit=/path/to/toolkit to specify the path; that option always takes precedence.

  2. OK, there are multiple answers to this question. If you use --gpu-module-to-binary=format=bin, the target and the GPU must be an exact match. If you use --gpu-module-to-binary=format=fatbin, the NVVM target produces a fatbin with the PTX embedded (this option is not available in the nvptx-compiler lib), so the driver should be able to JIT the code if there’s an arch mismatch. That’s how I got the tests working regardless of the platform. The default behavior is to try to produce the fatbin.
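
To make the difference between the two formats concrete, here is a minimal Python sketch of the behavior described above. It is only a simplified model, not MLIR or CUDA driver code: the function name `load_mode` and its return labels are hypothetical, and it ignores any limits the driver places on which PTX it can actually JIT for a given GPU.

```python
def load_mode(fmt: str, compiled_arch: str, device_arch: str) -> str:
    """Describe how a compiled GPU module would be loaded (simplified model).

    fmt is the value passed as format=..., compiled_arch is the chip the
    module was compiled for, and device_arch is the GPU present at runtime.
    """
    if fmt not in ("bin", "fatbin"):
        raise ValueError(f"format not covered by this sketch: {fmt}")
    if compiled_arch == device_arch:
        # Exact match: the embedded binary can be used directly.
        return "native"
    if fmt == "fatbin":
        # Arch mismatch: the driver is expected to JIT the embedded PTX.
        return "jit-from-ptx"
    # format=bin carries no PTX, so there is nothing to fall back to.
    return "unsupported"
```

For example, `load_mode("bin", "sm_70", "sm_80")` reports `"unsupported"` in this model, while the same pair with `"fatbin"` falls back to the PTX path.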

Please let me know if 1 works for you and if it solves the needs of your setup.
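
For reference, the toolkit lookup described above can be sketched in Python as follows. This is an illustration rather than the actual NVVM target implementation: the helper name `resolve_toolkit_path` is hypothetical, and the order in which the three environment variables are tried is an assumption (the order in which they are listed above). The only precedence stated explicitly is that the toolkit pass option wins.

```python
import os
from typing import Mapping, Optional

# Assumption: the variables are tried in the order they are listed above.
ENV_VARS = ("CUDA_ROOT", "CUDA_HOME", "CUDA_PATH")


def resolve_toolkit_path(
    toolkit_option: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> Optional[str]:
    """Pick the CUDA toolkit path to search for the tools.

    toolkit_option models the value given via toolkit=/path/to/toolkit.
    Returns None when nothing is set; any built-in default is not modeled.
    """
    if env is None:
        env = os.environ
    # The explicit pass option always takes precedence.
    if toolkit_option:
        return toolkit_option
    # Otherwise use the first non-empty environment variable.
    for name in ENV_VARS:
        value = env.get(name, "")
        if value:
            return value
    return None
```

Passing a plain dictionary as `env` makes it easy to check a given setup without touching the real environment, e.g. `resolve_toolkit_path(env={"CUDA_HOME": "/opt/cuda"})`.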

Yes, I’m working on a fully JIT version; it should be in trunk in the next few days. Also, I was planning to push the deprecation of the old passes today. Do you feel that you need more time?

Hi Fabian, thanks for the quick answer. I will investigate whether we can have the CUDA toolkit binaries at a predetermined path. That would allow using the fatbin approach (which I overlooked).

From our side, you don’t need to hold back landing the deprecation.

JIT compilation for NVIDIA targets is now in trunk (https://github.com/llvm/llvm-project/pull/66220). To enable it for testing, specify the CMake flag -DMLIR_GPU_COMPILATION_TEST_FORMAT=isa at build time. Note that this flag should not trigger any recompiles, as it only affects the testing infra.

Hi,
I’m wondering whether adding support for --gpu-module-to-binary=format=spirv makes sense.
I was looking for an existing pass that performs the SPIR-V serialization but couldn’t find one, and I wonder whether this could be the right place.

There’s currently a set of PRs implementing SPIR-V support,

Oh, I can see. Great to know, thanks!