Build issues and limitations with OpenMP on Arch Linux

Dear OpenMP-developers,

I maintain a custom LLVM toolchain for Arch Linux. My build configuration is specified in the PKGBUILD file in this repo: archpkgbuilds/toolchain-experimental/llvm-git at main · ms178/archpkgbuilds · GitHub

Arch Linux traditionally splits up the various LLVM components and builds OpenMP as a separate package, but I find it more convenient to have a unified build. I use the custom toolchain primarily to optimize several performance-sensitive packages from the AUR and want to use Polly for further optimizations (but none of the fancy GPU offloading support for now). OpenMP is needed for certain Polly features. I’d like to do this for both 64-bit and 32-bit packages.

Along the way, I ran into two different issues:

  1. Distributions used to build OpenMP with the LLVM_ENABLE_PROJECTS configuration option (e.g. Clear Linux, see: llvm/llvm.spec at c2bbd6f68351b005eef7801034da6b2427a42bc2 · clearlinux-pkgs/llvm · GitHub). However, this no longer works for me as of Clang 15, and I have gotten several different build errors since May 2022. Here is one example: Build error: '__nvvm_bar_warp_sync' needs target feature ptx60|ptx61|ptx63|ptx64|ptx65|ptx70|ptx71|ptx72|ptx73|ptx74|ptx75 · Issue #55446 · llvm/llvm-project · GitHub

The discussion there led me to believe that this build configuration was never supported in the first place and that OpenMP should always be built via the LLVM_ENABLE_RUNTIMES route. However, this complicates and limits the build process quite a bit, as the OpenMP runtimes build currently hardcodes CMake to use the just-built LLVM/Clang and LLD. This limits the compiler and linker choices for the whole LLVM build: on Arch, the compiler and linker, along with the CFLAGS, are usually set globally in /etc/makepkg.conf, and the build only finishes successfully with OpenMP as a runtime when LLVM/Clang and lld are set there. If I want to build LLVM with GCC/ld or Clang/mold and set the flags in /etc/makepkg.conf or within the PKGBUILD accordingly, the OpenMP runtime always errors out during its CMake check. Is that expected or a bug?

Is it possible (or even advisable) to force the OpenMP runtime to build with a different compiler and linker than the preferred one? Would it be possible to limit the use of the just-built Clang/LLD to the OpenMP runtime only, without impacting the other parts and without having to split the components into different packages again? Another route could be to broaden the set of compilers/linkers that are detected/allowed to build the OpenMP runtime successfully. Would that be technically feasible? At least from my perspective and use case, the current CMake scheme does not seem flexible enough (or I might lack the expertise or imagination to deal with this problem properly).
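To make the question concrete, here is a minimal dry-run sketch of what a standalone OpenMP configure step with a user-chosen compiler might look like. It only prints the command and runs nothing. The paths in SRC and BUILD are hypothetical, the gcc/g++ fallbacks are assumptions, and whether a standalone configure like this fits into a unified package is not something this thread confirms.

```shell
#!/bin/sh
# Dry-run sketch: print a standalone OpenMP configure command that takes the
# compiler from the environment instead of the just-built Clang.
# SRC and BUILD are hypothetical paths; adjust them to your PKGBUILD layout.
SRC="${SRC:-llvm-project/openmp}"
BUILD="${BUILD:-build-openmp}"

openmp_standalone_cmd() {
    # CC/CXX would normally come from /etc/makepkg.conf; gcc/g++ are
    # assumed fallbacks for illustration only.
    printf 'cmake -S %s -B %s -G Ninja' "$SRC" "$BUILD"
    printf ' -DCMAKE_C_COMPILER=%s' "${CC:-gcc}"
    printf ' -DCMAKE_CXX_COMPILER=%s' "${CXX:-g++}"
    printf ' -DCMAKE_BUILD_TYPE=Release\n'
}

# Print the command so it can be reviewed before use.
openmp_standalone_cmd
```

Printing instead of executing keeps the sketch safe to paste into a shell; the printed line can be copied into a PKGBUILD once the paths are right.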

  2. I was trying to build a 32-bit x86 LLVM toolchain with Polly and OpenMP (see: archpkgbuilds/toolchain-experimental/lib32-llvm-git at main · ms178/archpkgbuilds · GitHub) but could not get a 32-bit OpenMP runtime to work that way. It seems that a pre-existing 32-bit OpenMP package is needed (which is not offered via the Arch repos or the AUR) for the OpenMP runtime to compile successfully, as the CMake check of the OpenMP runtime only finds the installed 64-bit OpenMP runtime and complains about the wrong ELF class. This is a bit of a chicken-and-egg problem: how am I supposed to get a working 32-bit OpenMP package if there is no pre-existing 32-bit OpenMP package available?
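The ELF class mismatch described above can be checked by hand. The following is a small sketch that reads the EI_CLASS byte of an ELF header (offset 4, where 1 means 32-bit and 2 means 64-bit). The function name elf_class and the library path are illustrative assumptions, not part of any LLVM tooling.

```shell
#!/bin/sh
# Report the ELF class of a file by inspecting its header.
# Bytes 0-3 hold the magic 0x7f 'E' 'L' 'F'; byte 4 (EI_CLASS) is
# 1 for ELFCLASS32 and 2 for ELFCLASS64.
elf_class() {
    magic="$(od -An -tx1 -N4 "$1" | tr -d ' \n')"
    if [ "$magic" != "7f454c46" ]; then
        echo "not-elf"
        return 1
    fi
    case "$(od -An -tu1 -j4 -N1 "$1" | tr -d ' \n')" in
        1) echo "ELF32" ;;
        2) echo "ELF64" ;;
        *) echo "unknown" ;;
    esac
}

# Example path only; the location of libomp.so on your system may differ.
LIB="${LIB:-/usr/lib/libomp.so}"
if [ -e "$LIB" ]; then
    elf_class "$LIB"
fi
```

If this reports ELF64 for the only libomp.so that CMake can find, a 32-bit configure step will reject it, which matches the error described.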

Thanks in advance for some help to overcome these obstacles!

  1. The only part that requires the weird compiler business is the GPU runtime. This is generated as LLVM-IR that needs to be built with the same compiler as the user applications for GPU offloading. You can disable this directly with LIBOMPTARGET_BUILD_DEVICERTL_BCLIB but you won’t be able to do GPU offloading. We made some recent changes to how the device runtime is built using the LLVM_ENABLE_PROJECTS route. It should not try to find the user’s clang binary and use that. Does this still cause problems for you?

  2. I haven’t done much testing with 32-bit OpenMP builds. It should work as far as I know, but you will probably need to do some cross-compilation with CMake. I don’t have much experience with that either unfortunately.
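As a starting point for such an experiment, here is a dry-run sketch that prints candidate flags for a 32-bit x86 configure step. LLVM_BUILD_32_BITS is an LLVM CMake option and LIBOMP_ARCH is described in the OpenMP build README, but this combination is an untested assumption and is not confirmed anywhere in this thread to resolve the 32-bit problem.

```shell
#!/bin/sh
# Dry-run sketch: print candidate CMake flags for a 32-bit x86 build.
# Nothing is executed; the combination is an untested assumption.
flags_32bit() {
    printf '%s\n' \
        '-DCMAKE_C_FLAGS=-m32' \
        '-DCMAKE_CXX_FLAGS=-m32' \
        '-DLLVM_BUILD_32_BITS=ON' \
        '-DLIBOMP_ARCH=i386'
}

# One flag per line, ready to review or paste into a configure call.
flags_32bit
```

The -m32 flags and LLVM_BUILD_32_BITS overlap on purpose: the former would matter for a standalone OpenMP configure, the latter for an in-tree LLVM build.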


Unfortunately, if I specify

    -D LLVM_TARGETS_TO_BUILD="AMDGPU;X86" \
    -D LLVM_ENABLE_PROJECTS="polly;lld;compiler-rt;clang;openmp" \
    -D LIBOMPTARGET_BUILD_DEVICERTL_BCLIB=OFF \

in my PKGBUILD, I get the following two CMake errors:

    CMake Error at /home/marcus/Downloads/llvm-git/src/llvm-project/openmp/libomptarget/plugins/amdgpu/CMakeLists.txt:95 (add_dependencies):
      The dependency target "omptarget.devicertl.amdgpu" of target
      "omptarget.rtl.amdgpu" does not exist.

    CMake Error at /home/marcus/Downloads/llvm-git/src/llvm-project/openmp/libomptarget/plugins/cuda/CMakeLists.txt:89 (add_dependencies):
      The dependency target "omptarget.devicertl.nvptx" of target
      "omptarget.rtl.cuda" does not exist.

Note that I need the AMDGPU target for Mesa as I have an AMD GPU, but I don’t need CUDA as I don’t own an Nvidia GPU, nor do I want to use CUDA or NVPTX. Something still seems to be missing: even if I add NVPTX to my targets, I get the same error output.

As @jhuber6 noted:

You can disable this directly with LIBOMPTARGET_BUILD_DEVICERTL_BCLIB but you won’t be able to do GPU offloading.

That said, CMake should not fail but simply not build those GPU runtimes.

Long story short, you need to build the OpenMP GPU runtimes with “a compatible clang”, preferably the one just built. In theory you can build the rest of OpenMP with your own compiler and linker, but I am doubtful we already have the appropriate CMake magic. It should be possible to add new flags to this end, though.

I’ve just added -D LIBOMPTARGET_BUILD_DEVICERTL_BCLIB=OFF to my PKGBUILD and want to report back that the build fails with the output shown in my post above. Via the LLVM_ENABLE_PROJECTS route it errors out at the beginning (as noted in the post above). I also reproduced the same error when building OpenMP via the runtimes route with the just-built LLVM/Clang/lld, but there it appears at the end of the build process, where the OpenMP runtime gets built.

Am I missing something? Analyzing the errors and the given CMake files, it seems to me that OpenMP either still mandates omptarget.devicertl.amdgpu and omptarget.devicertl.nvptx to be present, or the check does not take the other configuration option into account, as I thought that this should disable GPU offloading completely. By the way, it would be great if there was documentation on how to build a CPU-only OpenMP offload capable compiler. The OpenMP FAQ (Support, Getting Involved, and FAQ — LLVM/OpenMP 16.0.0git documentation) doesn’t tell me anything about this use case. Are all of the build configuration options documented elsewhere?
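One possibly relevant knob is the OPENMP_ENABLE_LIBOMPTARGET option in the OpenMP CMake files, which controls whether libomptarget is built at all. The dry-run sketch below prints a flag set that swaps it in for LIBOMPTARGET_BUILD_DEVICERTL_BCLIB. That this avoids the add_dependencies errors on the revision discussed here is an assumption and has not been tested in this thread. Turning it off is also expected to remove offloading support entirely rather than limit it to the CPU.

```shell
#!/bin/sh
# Dry-run sketch: print a flag set for a host-only OpenMP build.
# The target and project lists mirror the configuration quoted earlier in
# the thread; OPENMP_ENABLE_LIBOMPTARGET=OFF is the assumed workaround.
host_only_openmp_flags() {
    printf '%s\n' \
        '-DLLVM_TARGETS_TO_BUILD=AMDGPU;X86' \
        '-DLLVM_ENABLE_PROJECTS=polly;lld;compiler-rt;clang;openmp' \
        '-DOPENMP_ENABLE_LIBOMPTARGET=OFF'
}

# Print one flag per line for review; nothing is configured or built.
host_only_openmp_flags
```

When pasting these into a shell command, quote each value that contains a semicolon so the shell does not split it.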

I’m not sure why it seems to explicitly depend on those. Those libraries are only used by clang when compiling / linking for offloading. It will fail at compile / runtime if you don’t have them, but it shouldn’t prevent us from building the runtimes. @shiltian any opinions on removing that dependency?

I think we should make disabling the bitcode library build work properly, then we can update the documentation to state that.

Thanks for taking care of these issues.

I also have good news to report: building OpenMP via the LLVM_ENABLE_PROJECTS route compiles fine again with LLVM/Clang. As that was my old way of compiling the LLVM/Clang projects, this should give me fewer headaches in the meantime, since building OpenMP via the runtimes route was giving me the most issues lately. This should also unblock the 32-bit OpenMP / Polly build via that route, though I still need to test that to be sure.

Update: No luck with the 32-bit build via the projects route. It errors out at the beginning with the same missing dependency errors as above.