Hi @csigg, thank you!
- There are a couple of ways to specify a different CUDA toolkit path at runtime. For example, the NVVM target queries the following environment variables:
  - `CUDA_ROOT`
  - `CUDA_HOME`
  - `CUDA_PATH`

  If any of those is non-empty, the compilation mechanism uses that path to search for the tools. You can also use `--gpu-module-to-binary=toolkit=/path/to/toolkit` to specify the path; that option always takes precedence.
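In case it clarifies the precedence, here is a small Python sketch of the lookup order described above (the real logic is the NVVM target's C++ implementation; the function name and the exact ordering of the environment variables here are my assumptions):

```python
import os

def resolve_cuda_toolkit(flag_toolkit=None):
    """Sketch of the toolkit path lookup.

    Precedence: the explicit ``toolkit=`` pass option wins, then the
    environment variables are tried in order, first non-empty wins.
    """
    if flag_toolkit:
        # --gpu-module-to-binary=toolkit=... always takes precedence.
        return flag_toolkit
    for var in ("CUDA_ROOT", "CUDA_HOME", "CUDA_PATH"):
        path = os.environ.get(var)
        if path:  # non-empty variable found
            return path
    return None  # fall back to the target's default search
```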
- Ok, there are multiple answers to this question. If you use `--gpu-module-to-binary=format=bin`, then the target and the GPU must be an exact match. If you use `--gpu-module-to-binary=format=fatbin`, then the NVVM target produces a fatbin with the PTX embedded. This option is not available in the nvptx-compiler lib, so the driver should be able to JIT the code if there's an arch mismatch; that's how I got the tests working regardless of the platform. The default behavior is to try to produce the fatbin.
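For reference, the two modes above might look like this on the command line (a sketch; the input file name is a placeholder and your pipeline invocation may differ):

```shell
# Exact-match mode: the embedded cubin only runs on a matching GPU.
mlir-opt input.mlir --gpu-module-to-binary=format=bin

# Default/fatbin mode: the fatbin embeds PTX, so the driver can JIT
# the code when there is an architecture mismatch.
mlir-opt input.mlir --gpu-module-to-binary=format=fatbin
```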
Please let me know if option 1 works for you and whether it solves the needs of your setup.
Yes, I'm working on a fully JIT version; it should be in trunk in the next few days. Also, I was planning to push the deprecation of the old passes today. Do you feel you need more time?