Cross-compile any LLVM component using Clang only (no GCC requirement)?

It is quite possible to build a Linux toolchain with LLVM components, entirely from scratch, without involving GCC in any form.

It’s not entirely trivial, for a couple of reasons:

  • Most of the LLVM components by default act as a single drop-in replacement for the corresponding component in an existing toolchain. E.g. Clang defaults to linking against GCC’s libstdc++ and libgcc, on Linux platforms. This is possible to override though - either at runtime via wrapper scripts, or with config files, or by hardcoding new defaults into the clang binary via cmake options.
  • The default all-at-once setup for building both Clang and the other tools, and the cross built runtimes, assumes that you already have a premade sysroot for your target - and that’s what we’re trying to build here. So for such a target, it’s at least easiest to orchestrate the build manually by building the various components one at a time.
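
As a rough illustration of the config file route, the sketch below generates the contents of a Clang config file that replaces the GCC-oriented defaults. Only `--target`, `--sysroot` and `-rtlib=compiler-rt` come from the setup described in this post; the other flags, the target name and the sysroot path are assumptions about what a GCC-free setup would typically want, not the contents of any file in llvm-mingw.

```python
def config_file_text(target, sysroot):
    """Return the contents of a Clang config file (one option per line)."""
    flags = [
        f"--target={target}",
        f"--sysroot={sysroot}",
        "-rtlib=compiler-rt",    # compiler-rt builtins instead of libgcc
        "-unwindlib=libunwind",  # assumption: LLVM libunwind as the unwinder
        "-stdlib=libc++",        # assumption: libc++ instead of libstdc++
        "-fuse-ld=lld",          # assumption: lld instead of a GNU linker
    ]
    return "\n".join(flags) + "\n"


if __name__ == "__main__":
    # Hypothetical target and sysroot path, for illustration only.
    print(config_file_text("x86_64-linux-musl",
                           "/opt/llvm-musl/x86_64-linux-musl"), end="")
```

Saved as, say, `x86_64-linux-musl.cfg`, such a file can be passed explicitly with `clang --config x86_64-linux-musl.cfg`. The rules for picking up a config file automatically from the executable name have changed between Clang releases, so check the documentation for the version in use.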

I do this kind of bootstrapping from scratch all the time within llvm-mingw, but that’s for a mingw cross toolchain. For Linux targets, I’ve actually also played with setting up something similar, with Musl - see Comparing master...musl · mstorsjo/llvm-mingw · GitHub and llvm-mingw/build-all.sh at musl · mstorsjo/llvm-mingw · GitHub. (The branch is a conversion of my mingw toolchain to test building for a musl target; the conversion isn’t entirely complete, there are a bunch of leftovers all around, including the readme.)

The basic build sequence is this:

  1. Build LLVM/Clang tools
  2. Add wrappers, like x86_64-linux-musl-clang, which invoke clang with the right --target and --sysroot arguments and such. (It should be possible to avoid this by just symlinking x86_64-linux-musl-clang to the regular clang binary and using config files, but my current setup/install layout kinda requires this.)
  3. Configure building Musl, but only install the headers. (Building it won’t succeed yet at this point.)
  4. Install Linux kernel headers
  5. Build compiler-rt builtins. This is used instead of libgcc, in this setup, as the wrappers pass -rtlib=compiler-rt.
  6. Build Musl
  7. Build libunwind/libcxxabi/libcxx
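
The sequence above can be sketched as an ordered plan of commands. This is only an outline under stated assumptions, not a copy of `build-all.sh`: the directory layout (`llvm-project`, `musl`, `linux`, `build/...`), the install prefix, the wrapper script name and the exact set of CMake options are all hypothetical, and each CMake configure step still needs a build and install step after it. A real build will most likely need more options than shown.

```python
import shlex


def build_plan(target, kernel_arch, prefix):
    """Return the seven bootstrap steps as (description, argv) pairs."""
    sysroot = f"{prefix}/{target}"          # assumed per-target sysroot
    cc = f"{prefix}/bin/{target}-clang"     # wrapper created in step 2
    llvm = "llvm-project"                   # assumed source checkout
    return [
        ("1. LLVM/Clang tools",
         ["cmake", "-G", "Ninja", "-S", f"{llvm}/llvm", "-B", "build/llvm",
          "-DCMAKE_BUILD_TYPE=Release", "-DLLVM_ENABLE_PROJECTS=clang;lld",
          f"-DCMAKE_INSTALL_PREFIX={prefix}"]),
        ("2. Target wrapper (hypothetical script name)",
         ["install", "-m", "755", "clang-wrapper.sh", cc]),
        ("3. Musl headers only",
         ["env", f"CC={cc}", "sh", "-c",
          f"cd musl && ./configure --target={target} --prefix={sysroot}/usr"
          " && make install-headers"]),
        ("4. Linux kernel headers",
         ["make", "-C", "linux", "headers_install", f"ARCH={kernel_arch}",
          f"INSTALL_HDR_PATH={sysroot}/usr"]),
        ("5. compiler-rt builtins",
         ["cmake", "-G", "Ninja", "-S", f"{llvm}/compiler-rt/lib/builtins",
          "-B", "build/builtins", f"-DCMAKE_C_COMPILER={cc}",
          f"-DCMAKE_C_COMPILER_TARGET={target}",
          "-DCOMPILER_RT_DEFAULT_TARGET_ONLY=ON",
          # No libc exists yet, so CMake's link test would fail.
          "-DCMAKE_C_COMPILER_WORKS=1"]),
        ("6. Musl itself",
         ["make", "-C", "musl", "install"]),
        ("7. libunwind/libcxxabi/libcxx",
         ["cmake", "-G", "Ninja", "-S", f"{llvm}/runtimes",
          "-B", "build/runtimes", f"-DCMAKE_C_COMPILER={cc}",
          f"-DCMAKE_CXX_COMPILER={cc}++",
          "-DLLVM_ENABLE_RUNTIMES=libunwind;libcxxabi;libcxx"]),
    ]


if __name__ == "__main__":
    for description, argv in build_plan("x86_64-linux-musl", "x86",
                                        "/opt/llvm-musl"):
        print(f"# {description}")
        print(shlex.join(argv))
```

The ordering is the important part: the Musl and kernel headers have to exist before the builtins can be compiled, and the builtins have to exist before Musl itself can be linked.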

With the branch I linked above, you get one single toolchain that can target a large number of architectures (I’ve tested building it targeting i386, x86_64, arm, aarch64, powerpc64le and riscv64) all with one compiler binary, one set of headers, and only separate libraries for each architecture. (Note, I’ve just recently rebased the branch past 6 months of other changes, so it may take a couple of iterations before it runs successfully on GitHub Actions right now, but it runs fine locally.)
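
To show how one compiler binary can serve several targets, here is a minimal sketch of a wrapper that derives the target from the name it was invoked under, so the same script can be installed as `x86_64-linux-musl-clang`, `aarch64-linux-musl-clang++` and so on. The per-target sysroot directory and the toolchain root are assumptions made for the example (a shared header directory could be symlinked into each sysroot); the actual wrappers in the branch may work differently.

```python
import os
import sys


def wrapper_command(argv0, args, toolchain_root):
    """Build the real clang command line from a <target>-<tool> wrapper name."""
    name = os.path.basename(argv0)
    # Split "x86_64-linux-musl-clang" into target and tool at the last dash.
    target, _, tool = name.rpartition("-")
    if not target:
        raise ValueError(f"wrapper name {name!r} has no target prefix")
    sysroot = os.path.join(toolchain_root, target)  # assumed layout
    return [
        os.path.join(toolchain_root, "bin", tool),  # the one shared binary
        f"--target={target}",
        f"--sysroot={sysroot}",
        "-rtlib=compiler-rt",
        *args,
    ]


if __name__ == "__main__":
    # Assume the wrapper lives in <root>/bin, next to the real clang.
    root = os.path.dirname(os.path.dirname(os.path.abspath(sys.argv[0])))
    try:
        command = wrapper_command(sys.argv[0], sys.argv[1:], root)
    except ValueError as error:
        sys.exit(str(error))
    os.execv(command[0], command)
```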

Note that with a toolchain like this, you can cross-build binaries for the Musl C library, which isn’t compatible with glibc (but if you have Musl installed next to glibc, or if you link statically, it works fine). Built binaries obviously also link against libc++, so they won’t be binary compatible with libstdc++ either - if one wants a toolchain that targets libstdc++, it quite obviously requires involving GCC to build that at least.

It should, in theory, be possible to build a similar toolchain that uses Glibc instead of Musl, but building Glibc with LLVM tools is much trickier: it requires a long branch of patches that aren’t upstream, and I haven’t sorted out how to successfully build/bootstrap that.

But the general concept of setting up a cross toolchain from scratch with only LLVM tools, plus a third party libc (mingw-w64 or Musl) is quite doable.
