Clarifying the supported ways to build libc++, libc++abi and libunwind

[Cross-post to llvm-dev to make sure everybody relevant sees this]

Hi,

I’m currently trying to simplify the libc++/libc++abi/libunwind build systems and testing setup. In doing so, I am encountering issues related to “unusual” ways of building them. By unusual, I just mean “not the usual monorepo build with LLVM_ENABLE_PROJECTS”. I would like to pin down the set of supported use cases for building the runtime libraries. In particular, the world I would like to live in is one where the only way to build libc++/libc++abi/libunwind is:

$ mkdir build
$ cd build
$ cmake <monorepo-root>/llvm -DLLVM_ENABLE_PROJECTS="libcxx;libcxxabi;libunwind" <options>
$ ninja install-{cxx,cxxabi,unwind}

The “runtimes” build would be built on top of this – it would be just a driver for building these libraries, using documented options, against the just-built Clang. I think it already does this in essence; however, if I’m not mistaken, it uses the “Standalone build” and it definitely sets some magic and undocumented CMake variables (like HAVE_LIBCXXABI) that we have to be really careful not to break.

So, to better understand what people use today, I have some questions. I know the answer to some of those, but I want to see what others have to say:

  1. What is a “Standalone build”? What does it enable that a normal monorepo build can’t?
  2. What is the “Runtimes” build? How does it work, what is it used for, and what does it expect from libc++/libc++abi/libunwind?
  3. Are there other “hidden” ways to build the runtime libraries?

Cheers,
Louis

1. What is a "Standalone build"? What does it enable that a normal monorepo build can't?

We use standalone builds in Fedora. Our build essentially looks like this:

$ mkdir build
$ cd build
$ cmake <monorepo-root>/libcxx <options>
$ ninja
$ ninja install

The main advantages of building this way are that you don't need the full
source tarball, just the libcxx source, and that the default targets only
build the parts that we need.

For reference, the spec files we use for building these projects can be found here:
https://src.fedoraproject.org/rpms/libcxx/blob/master/f/libcxx.spec
https://src.fedoraproject.org/rpms/libcxxabi/blob/master/f/libcxxabi.spec

-Tom

Hello Louis,

Earlier this week I converted our toolchain scripts from a monorepo
build for libc++ to a "standalone build" where I build libunwind,
libc++abi and libc++ in succession instead of in one single build. The
only reason I converted is that it's the only sane way to
build libc++ for a system without an existing C++ runtime. The problem
is actually not that libc++ depends on an existing C++ runtime, but
rather that the CMake checks do.

When running a monorepo build on a system without a C++ runtime, a lot
of the checks from HandleLLVMOptions fail, while the standalone builds
for libunwind, libc++abi and libc++ are much simpler and have fewer
checks I need to force values for. This way I don't need to first add a
C++ runtime.
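For illustration only, the succession looks roughly like this (paths and options are placeholders, not the exact script):

$ mkdir build-unwind && cd build-unwind
$ cmake <monorepo-root>/libunwind <options>
$ ninja install

and then the same pattern is repeated for <monorepo-root>/libcxxabi and <monorepo-root>/libcxx, pointing each build at the libraries installed by the previous one.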

My scripts are an adaptation of mstorsjo's llvm-mingw (his script to
build libc++ for MinGW can be seen here:
https://github.com/mstorsjo/llvm-mingw/blob/master/build-libcxx.sh).

On the other hand, I am all for simplifying this build: knowing which
of the three libraries should be static or dynamic, and when to use the
USE_STATIC_UNWINDER option etc., is very complex. It required a lot of
iterations to get working, and I am still not sure if I did the best
thing (I ended up with a static libc++abi but a shared unwinder).

Thanks,
Tobias

1. What is a "Standalone build"? What does it enable that a normal

monorepo build can't?

A build where only a specific project is compiled, i.e. not as part of LLVM.

I build an LLVM toolchain for development at the company I work for. I only need to cross-compile compiler-rt, libcxx, libcxxabi and libunwind for the AArch64 target machine. Standalone builds save a lot of build time and simplify the setup, because I don't have to cross-compile the whole of LLVM, which is useless to me.

Standalone builds also simplify packaging, because I can install each project into its own directory.
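For illustration, a cross-compiling standalone configure looks roughly like this (the toolchain file and install prefix are placeholders):

$ cmake <monorepo-root>/libcxx \
    -DCMAKE_TOOLCHAIN_FILE=<aarch64-toolchain.cmake> \
    -DCMAKE_INSTALL_PREFIX=<install-prefix>/libcxx \
    <options>
$ ninja install

Installing under a per-project prefix like <install-prefix>/libcxx is what makes the per-project packaging straightforward.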

I would very much like this to stay possible and will gladly fix issues found with this setup.

> 2. What is the "Runtimes" build? How does it work, what is it used for, and what does it expect from libc++/libc++abi/libunwind?

The above process of building the runtime libraries and packaging them.
I also statically link compiler-rt, libcxxabi and libunwind into libcxx.
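For what it's worth, the libc++ side of that is controlled by options along these lines (a rough sketch; the unwinder side is handled by the corresponding LIBCXXABI_* options):

$ cmake <monorepo-root>/libcxx \
    -DLIBCXX_ENABLE_STATIC_ABI_LIBRARY=ON \
    -DLIBCXX_USE_COMPILER_RT=ON \
    <options>

LIBCXX_ENABLE_STATIC_ABI_LIBRARY folds the static ABI library into libc++ when linking, and LIBCXX_USE_COMPILER_RT makes libc++ link against compiler-rt instead of libgcc.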

(CCing Chris and Petr, who’ve done the most work on the runtimes build)

At least for me on Linux, using LLVM_ENABLE_PROJECTS is actually the unusual way of building libc++; I use LLVM_ENABLE_RUNTIMES. The reason is, my host compiler is often gcc, but I want to build, test, and ship libc++ with the clang I just built.

The runtimes build is when you use LLVM_ENABLE_RUNTIMES. It sets up the build of all runtimes (compiler-rt, libc++, libc++abi, libunwind, etc.) as a CMake ExternalProject which depends on the build of clang and other toolchain tools. In other words, if I run the following:

cmake -DLLVM_ENABLE_PROJECTS=clang -DLLVM_ENABLE_RUNTIMES="libcxx;libcxxabi" path/to/my/llvm-project/llvm
ninja cxx

The build system will automatically build clang and other toolchain tools (e.g. llvm-ar), run the ExternalProject configuration with e.g. CMAKE_C_COMPILER and CMAKE_CXX_COMPILER set to the just-built clang, and then build libc++ with that configuration (so with the just-built clang). It’s a pretty convenient workflow for my setup. It also takes care of e.g. automatically rebuilding libc++ if you make changes to clang and then run ninja cxx again.

As for why the runtimes build uses the “standalone build” setup, it’s because there’s a separate CMake configuration happening for the runtimes in this setup (which is necessary in order to be able to configure them to use the just-built toolchain), so e.g. clang isn’t available as an in-tree target. See https://reviews.llvm.org/D62410 for more details. Your top-level CMakeLists.txt in the runtimes build is llvm/runtimes/CMakeLists.txt and not libcxx/CMakeLists.txt (as it would be in a fully standalone build), but it’s also not llvm/CMakeLists.txt (as it would be with LLVM_ENABLE_PROJECTS).
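Conceptually, the sub-configure that the ExternalProject performs is roughly equivalent to something like the following, where <build-dir> stands in for the top-level build directory (a sketch, not the literal invocation):

cmake path/to/my/llvm-project/llvm/runtimes \
  -DCMAKE_C_COMPILER=<build-dir>/bin/clang \
  -DCMAKE_CXX_COMPILER=<build-dir>/bin/clang++ \
  -DCMAKE_AR=<build-dir>/bin/llvm-ar \
  <other forwarded options>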

At the CMake round table at the dev meeting last October, we’d discussed the runtimes builds, and Chris had advanced that there should be two supported ways to build the runtimes:

Thanks Shoaib for a great summary. To summarize this as an answer to Louis’ questions:

  1. What is a “Standalone build”? What does it enable that a normal monorepo build can’t?

This means building any of the runtimes separately, where the runtime’s CMakeLists.txt (e.g. path/to/my/llvm-project/libcxx/CMakeLists.txt) is the top-level one. The reason for using this variant is the ability to build individual runtimes without the rest of LLVM (e.g. building only libc++), which is not something that the monorepo build can do.

  2. What is the “Runtimes” build? How does it work, what is it used for, and what does it expect from libc++/libc++abi/libunwind?

This means building runtimes with the just-built Clang. This is complicated because at the point when you run CMake for the runtimes (i.e. where path/to/my/llvm-project/llvm/runtimes/CMakeLists.txt is the top-level one), you may not have a fully working toolchain yet.

For example, in the case of Fuchsia, our sysroot contains only libc, so when we’re doing the runtimes build and one of the runtimes tries to use check_cxx_compiler_flag, which behind the scenes tries to compile and link a small C++ binary, that check is going to fail not because the flag isn’t supported, but because we don’t have libc++ yet (that’s what we’re trying to build right now). So we need to be really careful and avoid introducing cycles, e.g. where the libc++ CMake build depends on a C++ standard library (even if that dependency is not explicit).

Note that this is going to become significantly easier after we upgrade CMake, because 3.6 introduced https://cmake.org/cmake/help/v3.6/variable/CMAKE_TRY_COMPILE_TARGET_TYPE.html, which allows building static archives instead of executables when running the check_* functions, and that helps break these cycles.
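For example (a sketch, where <runtime-source-dir> is a placeholder), passing the variable at configure time makes checks like check_cxx_compiler_flag compile a static archive rather than link a full executable, so the missing C++ runtime no longer makes them fail:

cmake <runtime-source-dir> -DCMAKE_TRY_COMPILE_TARGET_TYPE=STATIC_LIBRARY <options>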

  3. Are there other “hidden” ways to build the runtime libraries?

Not “hidden”, but there’s the default way, which most developers use, where you use the same host compiler to build both Clang and your runtimes. I think this mode should go away because it’s too fragile: it silently relies on your host compiler and the Clang you just built using the same ABI, which isn’t guaranteed (unless you’re doing a multi-stage build), and I’m surprised we haven’t seen issues due to this yet (or maybe we have and people just aren’t aware of the problem).

(Why do we need HAVE_* flags?)

It’s because in the runtimes build there’s no guarantee about the order in which the runtimes are built, so you cannot use e.g. if (TARGET cxxabi_shared) from within the libc++ build, because you don’t have any guarantee that libc++abi has already been processed. So instead, the runtimes build sets these flags for each runtime being built (that is, each runtime specified in -DLLVM_ENABLE_RUNTIMES=) and you can at least check whether that runtime is being built at all (e.g. you can check from within libc++ whether libc++abi is also being built as part of the runtimes build).

There’s more discussion related to this in https://reviews.llvm.org/D68833; once we update CMake to >= 3.11, we’ll eliminate all HAVE_* variables and replace them with generator expressions, which is a much better solution.

We use standalone builds in Chrome OS to build libc++ for different ISA targets. One reason to do so is to cross-compile libc++ only for the target Chromebooks.
We do not want to cross-compile all of clang/llvm as they are not going to be shipped on device.

The cmake invocation roughly looks like:
cmake <monorepo-root>/libcxx -DLLVM_ENABLE_PROJECTS=libcxx -DCMAKE_INSTALL_PREFIX=/path/to/target_root/usr

Thanks,
Manoj

This is one of the use cases for the runtimes build. You can invoke CMake as:

cmake \
  -DLLVM_ENABLE_PROJECTS=clang \
  -DLLVM_ENABLE_RUNTIMES=libcxx \
  -DLLVM_RUNTIME_TARGETS="aarch64-unknown-linux-gnu;armv7-unknown-linux-gnueabihf;i386-unknown-linux-gnu;x86_64-unknown-linux-gnu" \
  path/to/llvm-project/llvm

This is going to compile Clang, and then use the just-built Clang to cross-compile libc++ for four different targets. There are a lot of other customization options. You can also pass through additional flags to individual targets, but this gets complicated quickly, so we usually put this logic into a cache file (see for example https://github.com/llvm/llvm-project/blob/master/clang/cmake/caches/Fuchsia-stage2.cmake) and then invoke CMake as:

cmake \
  -C path/to/llvm-project/clang/cmake/caches/Fuchsia-stage2.cmake \
  path/to/llvm-project/llvm

Aside from the fact that you can do everything in a single CMake build without any additional wrapper scripts, another advantage is that anyone can easily reproduce your build without needing anything other than the LLVM monorepo.

I also forgot to mention: another reason to do separate standalone builds is bootstrapping. When building clang for the first time, glibc and Linux kernel headers for some of the target architectures we support are not available.

We therefore have to do multiple builds, in order:

  1. Build clang for host ISA and install it in SDK.

  2. Build the rest of the SDK with the clang we just built.
    (much much later)

  3. Build and install glibc and linux headers for each supported target ISA.

  4. Build and install libc++ for each supported target using the clang that we built in step 1.

So at this point we are done building clang and the other libraries that we need in the SDK, which supports multiple architectures.
This SDK is then packaged and used in many projects inside Google, not just Chrome OS.

Now, for a target Chromebook, we do not copy over the generic libc++ we have in the SDK but do a rebuild.
This also lets us use the exact ISA flags for that Chromebook, e.g. -march=skylake or -mcpu=cortex… etc.

The steps for a Chromebook build look like:

  1. Build and install glibc in target root

  2. Build and install libc++ in target root (we do not want to build llvm/clang etc.; see the sketch after this list)

  3. Build the rest of the target system linking with the libc++ we built for target.
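A rough sketch of what step 2 looks like for a particular Chromebook (the SDK path and the -march value are illustrative placeholders, not our exact invocation):

$ cmake <monorepo-root>/libcxx \
    -DCMAKE_C_COMPILER=<sdk>/bin/clang \
    -DCMAKE_CXX_COMPILER=<sdk>/bin/clang++ \
    -DCMAKE_C_FLAGS="-march=skylake" \
    -DCMAKE_CXX_FLAGS="-march=skylake" \
    -DCMAKE_INSTALL_PREFIX=/path/to/target_root/usr \
    <options>
$ ninja install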

As I mentioned above, we do not want to be cross-compiling llvm/clang here, just libc++. We’ll use the system clang that we already built and installed in the SDK.

Thanks,
Manoj

Folks,

I just wanted to reply to this thread since I never had the chance to do so before. I did take all of the input in the thread into consideration and learned a lot about all of your respective use cases. This has been very useful, thanks for taking the time to share your use cases with me.

The basic conclusion I draw from this is that it doesn’t make sense to build the runtimes as part of the rest of LLVM, and I agree with it. I think the runtimes should always be built “standalone”; however, I also believe they should be built against each other, i.e. we should be able to build (at least) libc++/libc++abi/libunwind with a single CMake invocation, outside of the rest of LLVM. That would basically be a standalone build, except that it would cover all the runtimes you decide to build at once. That reduces the build system complexity and removes the possibility of subtle mismatches between runtime projects.
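Purely for illustration (the exact entry point and options are something the RFC needs to settle, so treat <runtimes-entry-point> as a placeholder), the kind of invocation I have in mind is along these lines:

$ cmake <runtimes-entry-point> -DLLVM_ENABLE_RUNTIMES="libcxx;libcxxabi;libunwind" <options>
$ ninja install-{cxx,cxxabi,unwind}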

I’ve created an RFC explaining this and the reasons for it, and sent it to llvm-dev: http://lists.llvm.org/pipermail/llvm-dev/2020-June/142384.html. Please chime in if you’d like to add something to the discussion.

Thanks again,
Louis