An opinionated way to outsource, simplify and unify repetitive CMake code in standalone build mode

I’ve proposed an opinionated change that affects how we build projects in standalone mode (e.g. for Linux distributions). Let me put some of the information here so we have a place to discuss the overall idea and motivation.

Rationale

The rationale behind this change is to unify the CMake parts of standalone
builds that historically had to be kept in each and every project. With
the advent of the top-level cmake directory inside the LLVM source
tree, it is now possible to bundle the CMake instructions into a single
file, namely cmake/Modules/StandaloneBuildHelpers.cmake, and include that
file in each project that wants to build in standalone mode.

Historically, the standalone build mode has been used mostly by Linux
distributions, and certainly not every LLVM contributor cares about Linux
distributions. To reduce friction, it makes even more sense to have one
unified place to keep the particulars of building in standalone
mode.

Affected projects (so far)

This change brings the unified standalone build mode to the clang and
lld projects.

Assumptions

One radical assumption for this change is that in order to build clang
or lld in standalone mode, you first have to build the llvm subproject
and install it into some location. You can help the build process
find LLVM via find_package(LLVM) by passing
-DCMAKE_PREFIX_PATH=${LLVM_INSTALL_DIR}/lib/cmake/llvm -DLLVM_CMAKE_DIR=${LLVM_INSTALL_DIR}/lib/cmake/llvm to cmake at
configure time.
You have to build the llvm subproject with utilities included and (optionally) installed (-DLLVM_INCLUDE_UTILS:BOOL=ON and -DLLVM_INSTALL_UTILS:BOOL=ON). But
I’m sure that this is done most of the time anyway, no?
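
For illustration, the consuming side of this assumption might look like the minimal sketch below. The project name is made up, and the real clang and lld CMake files do considerably more. The find_package(LLVM) call and the LLVM_PACKAGE_VERSION, LLVM_DIR and LLVM_CMAKE_DIR variables come from LLVM's installed CMake package.

```cmake
cmake_minimum_required(VERSION 3.20)
# "standalone-example" is a placeholder project name for this sketch.
project(standalone-example LANGUAGES C CXX)

# Locate an already built and installed LLVM. The search is steered from the
# command line, e.g. with -DCMAKE_PREFIX_PATH=<llvm-install>/lib/cmake/llvm.
find_package(LLVM REQUIRED CONFIG)
message(STATUS "Found LLVM ${LLVM_PACKAGE_VERSION}")
message(STATUS "Using LLVMConfig.cmake in: ${LLVM_DIR}")

# Make the CMake modules shipped with LLVM (such as AddLLVM) available.
list(APPEND CMAKE_MODULE_PATH "${LLVM_CMAKE_DIR}")
include(AddLLVM)
```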

Don’t build as you go: No more cross-project dependencies on LLVM utilities

Another assumption is that in standalone build mode it makes no sense to
build clang and try to build an LLVM utility binary like FileCheck if
that is missing. This only adds noise to the cmake files and creates an
indirect dependency on the LLVM utilities directory which doesn’t exist
in the clang source tarball.

Don’t silently turn off tests

Before this change, we would silently turn off tests if a binary like
FileCheck, count or not was missing. This is not only dangerous
but IMHO not helpful. If someone asks for tests by passing
-DLLVM_INCLUDE_TESTS=On we should error out at configure time because
we cannot fulfil this request when a binary is missing. This is exactly
what this change does. If you want to check if an LLVM utility binary
exists and what the path to it is, you can call
require_llvm_utility_binary_path("FileCheck") and it will
make sure the import location for the target exists, aka the path to the binary that
was found when we did find_package(LLVM).
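
To make the idea concrete, here is a rough sketch of how such a helper could be written. This is a hypothetical implementation for discussion, not the code from the patch; it assumes find_package(LLVM) has already run and exported the utilities as imported targets.

```cmake
# Hypothetical sketch; the helper in StandaloneBuildHelpers.cmake may differ.
function(require_llvm_utility_binary_path utility)
  # find_package(LLVM) is expected to have created an imported target.
  if(NOT TARGET "${utility}")
    message(FATAL_ERROR
      "LLVM utility '${utility}' was not found. Build LLVM with "
      "-DLLVM_INCLUDE_UTILS=ON (and -DLLVM_INSTALL_UTILS=ON when installing).")
  endif()

  # Imported targets usually record their location per configuration, so
  # fall back to IMPORTED_LOCATION_<CONFIG> if the plain property is unset.
  get_target_property(path "${utility}" IMPORTED_LOCATION)
  if(NOT path)
    get_target_property(configs "${utility}" IMPORTED_CONFIGURATIONS)
    if(configs)
      foreach(config IN LISTS configs)
        get_target_property(path "${utility}" IMPORTED_LOCATION_${config})
        if(path)
          break()
        endif()
      endforeach()
    endif()
  endif()

  # Fail at configure time instead of silently disabling tests later.
  if(NOT path OR NOT EXISTS "${path}")
    message(FATAL_ERROR
      "LLVM utility '${utility}' has no usable import location.")
  endif()
  message(STATUS "Using ${utility}: ${path}")
endfunction()

# Possible usage in a project that was asked to include tests.
if(LLVM_INCLUDE_TESTS)
  foreach(utility FileCheck count not)
    require_llvm_utility_binary_path("${utility}")
  endforeach()
endif()
```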

NOTE: You can take a look at this small example project which shows you
how importing of an installed project works:

Require external LIT in standalone mode

We also think that in standalone mode you always want to use an external
lit and not build it as you go. That’s why the find_external_lit macro
checks if LLVM_EXTERNAL_LIT is set and the path exists. If one of
these conditions doesn’t hold true, we error out.
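
The two checks described above could be sketched roughly as follows. Again, this is a hypothetical version for illustration and not the macro from the patch; only LLVM_EXTERNAL_LIT is an existing variable.

```cmake
# Hypothetical sketch; the macro in StandaloneBuildHelpers.cmake may differ.
macro(find_external_lit)
  # Condition 1: the user must tell us which lit to use.
  if(NOT LLVM_EXTERNAL_LIT)
    message(FATAL_ERROR
      "LLVM_EXTERNAL_LIT is not set. In standalone mode, pass "
      "-DLLVM_EXTERNAL_LIT=<absolute path to lit>.")
  endif()

  # Condition 2: the given path must exist (EXISTS expects a full path).
  if(NOT EXISTS "${LLVM_EXTERNAL_LIT}")
    message(FATAL_ERROR
      "LLVM_EXTERNAL_LIT points to '${LLVM_EXTERNAL_LIT}', which does not exist.")
  endif()

  message(STATUS "Using external lit: ${LLVM_EXTERNAL_LIT}")
endmacro()
```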

Some TODOs:

( ) make sure the correct FileCheck, count and not binaries
get substituted in lit test files.
( ) get feedback on this change or just opinions
( ) extend usage to other projects like mlir, libomp and so on.
( ) more encapsulation in cmake/Modules/StandaloneBuildHelpers.cmake

I’d love to hear your feedback on this.

I think I generally like the sound of this. Though I question how much the opinionated part actually helps: as long as non-standalone builds need to exist we need to support both, so some of the semi-standalone stuff can be gotten “for free”? Either way, using /cmake to deduplicate stuff, and not having so much (public, hard to change) utility stuff in /llvm, definitely strikes me as a good idea.

LLDB is an extensive user of the standalone build. We were one of the first projects verifying this configuration in CI. Please take us into consideration when determining the impact of these changes.

I can relate to the fact that the standalone build requires a bunch of downstream CMake code. I’m very supportive of making the situation better.

A few things that come to mind when reading through your proposal:

  • You’re proposing to drop support for standalone builds against an LLVM build tree. Can you elaborate on the need for this and what it buys us? We’ve been explicitly supporting both because of the Swift build system that uses the standalone build. Having to install LLVM will slow down and complicate the build and make it harder for developers to iterate.

  • With the need for an external lit, how can downstream projects implement something like check-lldb for the standalone build? LLDB has to start building its own lit because in that configuration it doesn’t know about the downstream map_config directives.


Is there something specific in lldb that requires this? I don’t think clang and lld need this for standalone builds.

This was a while ago so I don’t remember all the details. Do check-lld and check-clang work in the standalone build? If so, do you know how the path remapping happens? Maybe there’s another way to do this that doesn’t involve map_config, which is a private function from llvm-lit.

@JDevlieghere Yes, check-lld and check-clang work fine. I don’t really know much about map_config and what it does, so I’m not sure what those projects are doing.

Thanks for confirming. I’ll take another look at this and hopefully there’s a way for us to achieve this without having to rely on our own lit.

On that specific topic, using an external lit vs recreating a local one (where the latter is needed for properly remapped paths), I recently posted D137774, “[openmp] Create a local llvm-lit script when building openmp standalone”, which switches OpenMP to a local lit (for the sake of path remapping). If there are other insights in this area, or better ways of doing it, that patch would benefit from them too.

Absolutely. Thank you for your feedback so far.

@JDevlieghere thank you for bringing this to my attention. It seems I can learn so much every day.

I took my time to do my homework and was glad that I could distil an example repository on GitHub that proves to me what I would need in order to support both the installation method and the build-tree method that you spoke about.
I know that in LLVM we have lots of wrappers around cmake functions, but take a look at this example: kwk/cmake-export-binary-example on GitHub. Is this in essence what you would need? In my example, both subproject-b and subproject-c consume subproject-a, but from different locations: the former uses the installation location, the latter the build tree. If I’m not mistaken, we already have something similar in LLVM, so it shouldn’t be a problem to continue to support both methods.
I guess my initial confusion came from thinking of the build tree as a full monorepo build using LLVM_ENABLE_PROJECTS, but I hope this isn’t what you meant.
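
As a rough sketch of the mechanism (written from scratch here, not copied from the example repository), a producer like subproject-a can export the same binary target for both methods. The target name tool-a and the export set name are made up for this illustration.

```cmake
cmake_minimum_required(VERSION 3.20)
project(subproject-a LANGUAGES C)

# Generate a trivial source file so that the sketch is self-contained.
file(WRITE "${CMAKE_CURRENT_BINARY_DIR}/tool-a.c"
  "int main(void) { return 0; }\n")
add_executable(tool-a "${CMAKE_CURRENT_BINARY_DIR}/tool-a.c")

# Installation method: a consumer points CMAKE_PREFIX_PATH at the install
# prefix and calls find_package(subproject-a CONFIG).
install(TARGETS tool-a EXPORT subproject-a-targets RUNTIME DESTINATION bin)
install(EXPORT subproject-a-targets
  FILE subproject-a-config.cmake
  DESTINATION lib/cmake/subproject-a)

# Build-tree method: a consumer sets subproject-a_DIR to this build
# directory and calls the very same find_package(subproject-a CONFIG).
export(EXPORT subproject-a-targets
  FILE "${CMAKE_CURRENT_BINARY_DIR}/subproject-a-config.cmake")
```

In both cases the consumer ends up with an imported tool-a target, so the check for the binary's location does not need to know which of the two methods was used.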

Having said that, there is one more thing I would still like to do:

When in standalone build mode, I don’t want any dependency on projects other than the one I’m building, with the exception of pulling in stuff with find_package. Things like add_subdirectory(../some-sibling/project/dir) should go. As a consequence, if you don’t have e.g. FileCheck, you should make sure to build (and optionally install) it beforehand. Do you agree that this is okay? If you want to make sure that the FileCheck target exists (either from the build tree or the install site), you should call my require_llvm_utility_binary_path function.

I look forward to your feedback.

I’ll postpone the check-lldb topic for a bit.