I have been playing with the clang driver to see how it compiles an OpenMP program with target offloading (for an NVIDIA GPU).
I noticed the use of the tool ‘clang-offload-bundler’ which seems to bundle together the object files for the host and for the device.
In particular, if I compile with -c, my foo.o is a bundle of foo-x86_64.o and foo-cuda.o, and at link time the driver unbundles foo.o to get the two object files back.
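To make the scenario concrete, here is a minimal sketch of what such a foo.c could look like. The function name vec_add is hypothetical and only for illustration. The commands in the comment are assumptions about a clang release that uses the bundler for OpenMP offloading; the bundler's flag spellings and the target triples differ between releases, so check clang-offload-bundler --help for the version at hand.

```c
/* foo.c: hypothetical example of a file with one OpenMP target region.
 *
 * Assumed commands (spellings may differ between clang releases):
 *
 *   clang -fopenmp -fopenmp-targets=nvptx64-nvidia-cuda -c foo.c -o foo.o
 *
 *   clang-offload-bundler -type=o \
 *     -targets=host-x86_64-unknown-linux-gnu,openmp-nvptx64-nvidia-cuda \
 *     -inputs=foo.o -outputs=foo-host.o,foo-device.o -unbundle
 *
 * Without -fopenmp the pragma is ignored and the loop runs on the host.
 */

/* Element-wise c = a + b, offloaded to the device when one is available. */
void vec_add(const float *a, const float *b, float *c, int n) {
#pragma omp target teams distribute parallel for \
    map(to: a[0:n], b[0:n]) map(from: c[0:n])
  for (int i = 0; i < n; ++i)
    c[i] = a[i] + b[i];
}
```

The file deliberately has no main, since it stands in for an object that would end up inside a library.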
Now, if I put my foo.o in a static library, for example, clang does not work, because it does not know how to unbundle the object files from the library.
Other compilers, such as IBM XL, create a specific section in the ELF file to store the device code. That way there is always a single object file and no need to bundle/unbundle.
Also, if an object file or library was compiled with XL, it cannot be linked with clang, and vice versa.
So, am I doing something wrong, or is this the status of the clang driver?
Is the clang-offload-bundler the official choice to manage device code?
If my analysis is correct, what’s the workaround?