[MLIR][GPU] Linking precompiled code (e.g., fatbin)

I’m trying to find a proper way of linking precompiled device functions to gpu.module in MLIR.
More specifically, assume I have compiled devicefuncs.cu with nvcc to produce a fatbin at somepath/file.fatbin.
Inside my gpu.func, I want to add calls to the compiled device functions (implemented in devicefuncs.cu).
What should be the right approach?
After I have generated the code that creates the device function calls, I suppose I have to add device function declarations of the form llvm.func @deviceFuncName(!llvm.ptr) -> !llvm.ptr to the relevant gpu.module. At which stage in the pipeline do I have to add the declarations: right after the first pass that produces the function calls, or after -gpu-to-llvm?
And then I suppose I should supply somepath/file.fatbin to -gpu-module-to-binary as the linkFiles option.
Is it that easy or am I missing some additional steps?
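
For concreteness, the setup described in the question might look roughly like the sketch below. The module name @kernels and the kernel name @kernel are hypothetical, and the snippet assumes the device function is exported under the unmangled name deviceFuncName (for example via extern "C" in devicefuncs.cu). It only shows the shape of the declaration and the call, not the stage at which they should be introduced.

```mlir
// Hypothetical names: @kernels and @kernel are placeholders for illustration.
gpu.module @kernels {
  // Declaration only (no body): the definition is expected to be supplied
  // by the precompiled device code at link time.
  // Assumption: the symbol is not C++-mangled, e.g. it was declared
  // extern "C" in devicefuncs.cu.
  llvm.func @deviceFuncName(!llvm.ptr) -> !llvm.ptr

  gpu.func @kernel(%arg0: !llvm.ptr) kernel {
    // Call into the externally defined device function.
    %0 = llvm.call @deviceFuncName(%arg0) : (!llvm.ptr) -> !llvm.ptr
    gpu.return
  }
}
```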

Short answer:

Linking ELF/fatbin libraries is not yet supported. The linking field in gpu-module-to-binary is only for LLVM bitcode libraries.

It’s on my TODO list to add that capability, but it will take a while before I get time to actually do it.

Patches are always welcome.
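
As an illustration of the bitcode linking that is supported, here is a minimal sketch assuming an NVIDIA target. The file somepath/devicefuncs.bc is a hypothetical LLVM bitcode library containing the device functions (not the fatbin from the question), and chip = "sm_80" is just an example value. The link list on the target attribute is assumed here to play the same role as the link files passed to gpu-module-to-binary.

```mlir
// Assumptions: somepath/devicefuncs.bc is a hypothetical LLVM bitcode file,
// and sm_80 is an arbitrary example chip.
gpu.module @kernels [#nvvm.target<chip = "sm_80",
                                  link = ["somepath/devicefuncs.bc"]>] {
  // Declaration resolved against the bitcode library when the module
  // is serialized by gpu-module-to-binary.
  llvm.func @deviceFuncName(!llvm.ptr) -> !llvm.ptr
}
```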

Longer answer:

You could hack your way around it:
First option:

  1. Compile to a binary instead of a fatbin, and pass the opts=-c option to gpu-module-to-binary.
  2. Create a pass that goes through each binary, invoking the linker and the fatbin packager to obtain the full binary.

Second option:
Override the implementation of gpu::TargetAttrInterface: you just need to add your own implementation and register it instead of the default one.