[RFC] [CMake] Removing support for LLVM_TOOL_<PROJECT> CMake cache variables

Hi,

In our CMake build system there are currently two ways of specifying
which LLVM subprojects to build by setting CMake cache variables:

* Setting `LLVM_ENABLE_PROJECTS` to the list of projects to enable
(e.g. `-DLLVM_ENABLE_PROJECTS="clang;compiler-rt"`)
* Setting `LLVM_TOOL_<PROJECT>_BUILD` boolean CMake cache variables
(e.g. `-DLLVM_TOOL_CLANG_BUILD=ON -DLLVM_TOOL_COMPILER_RT_BUILD=ON`)
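For scripted builds, the same two settings can also be written as an
initial-cache file and passed to CMake with `-C`. This is only a
sketch: the file name `projects.cmake` is made up for illustration, and
you would normally use one of the two ways, not both.

```cmake
# projects.cmake (hypothetical file name)
# Usage: cmake -C projects.cmake <path-to-llvm-source>

# Way 1: a single list-valued cache variable.
set(LLVM_ENABLE_PROJECTS "clang;compiler-rt" CACHE STRING "")

# Way 2: one boolean cache variable per project (commented out here so
# that the two mechanisms are not mixed in the same configuration).
# set(LLVM_TOOL_CLANG_BUILD ON CACHE BOOL "")
# set(LLVM_TOOL_COMPILER_RT_BUILD ON CACHE BOOL "")
```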

Having two different ways of specifying the same thing is problematic
because, from CMake's perspective, we can't detect which way the user
actually wants to use.

Since r353148, if `LLVM_ENABLE_PROJECTS` is set by the user, it
determines which projects are built, and any user-specified values for
the `LLVM_TOOL_<PROJECT>_BUILD` variables are overridden.
`LLVM_ENABLE_PROJECTS` currently only works with the new monorepo
layout (projects outside of the LLVM source tree), which basically
means that:

* `LLVM_ENABLE_PROJECTS` is used for the monorepo project layout
* `LLVM_TOOL_<PROJECT>_BUILD` is used for the traditional in-tree
project layout (e.g. projects located at `tools/clang`, `tools/lldb`,
`projects/compiler-rt`).
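To illustrate the override behaviour, here is a minimal sketch of the
kind of logic involved. It is not the code from r353148: the list
`SKETCH_ALL_PROJECTS` and the project name `EnableProjectsSketch` are
assumptions made up for this example.

```cmake
cmake_minimum_required(VERSION 3.13)
project(EnableProjectsSketch NONE)

# Hypothetical list of known subprojects, for illustration only.
set(SKETCH_ALL_PROJECTS clang lldb compiler-rt)

# Empty by default; a value passed with -D on the command line is kept.
set(LLVM_ENABLE_PROJECTS "" CACHE STRING
    "Semicolon-separated list of projects to build")

if(LLVM_ENABLE_PROJECTS)
  foreach(proj IN LISTS SKETCH_ALL_PROJECTS)
    # Derive the variable suffix: clang -> CLANG, compiler-rt -> COMPILER_RT.
    string(TOUPPER "${proj}" upper_proj)
    string(REPLACE "-" "_" upper_proj "${upper_proj}")

    if(proj IN_LIST LLVM_ENABLE_PROJECTS)
      set(enabled ON)
    else()
      set(enabled OFF)
    endif()

    # FORCE replaces any value the user supplied for this variable,
    # which is why the per-project setting is silently overridden.
    set(LLVM_TOOL_${upper_proj}_BUILD ${enabled} CACHE BOOL "" FORCE)
    message(STATUS "LLVM_TOOL_${upper_proj}_BUILD = ${enabled}")
  endforeach()
endif()
```

With logic like this, configuring with
`-DLLVM_ENABLE_PROJECTS=clang -DLLVM_TOOL_LLDB_BUILD=ON` would still
leave `LLVM_TOOL_LLDB_BUILD` set to `OFF`, and CMake has no way of
knowing which of the two settings the user meant.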

This is a bit of a mess and I'd like to propose we switch to only
using `LLVM_ENABLE_PROJECTS` and remove support for
`LLVM_TOOL_<PROJECT>_BUILD` variables.

I see two ways of doing this:

* Graceful. We'll eventually move everyone over to the monorepo
layout anyway so just drop support for `LLVM_TOOL_<PROJECT>_BUILD`
variables as part of the process of removing support for the in-tree
subprojects inside the LLVM source tree. We just need to document this
change clearly.

* Aggressive. Remove support for setting `LLVM_TOOL_<PROJECT>_BUILD`
variables and only use `LLVM_ENABLE_PROJECTS`. The logic for
`LLVM_ENABLE_PROJECTS` would need to be changed to work with in-tree
subprojects.

I'd prefer Graceful because it's less work and I actually have old
scripts that rely on setting `LLVM_TOOL_<PROJECT>_BUILD` variables.
Others might too.

Thoughts?

Thanks,
Dan.

+1, we also have old scripts that we're currently migrating to the
monorepo layout, but it's nobody's top priority and we'd appreciate it
if we could take our time.

For the LLVM_ENABLE_PROJECTS (and LLVM_EXTERNAL_PROJECTS) case, dropping the LLVM_TOOL_*_BUILD variables makes sense. We could just change our build code for enabling projects to ignore those variables entirely (which is essentially the case after r353148 anyway).

For in-tree builds, the LLVM_TOOL_*_BUILD variables are the only way to control including/excluding projects, and I'd like to keep them around for as long as we support in-tree builds. It's useful to have the same source tree and build different configurations from it and only enable certain projects for certain configurations.

Dropping support for in-tree builds after the monorepo migration is an interesting question, because in theory people could still nest the read-only single project mirrors (assuming those end up coming to fruition) in the same style. I think it'd be good to reduce the number of supported configurations and clean up our build; I'm adding Chris to see what he thinks.

When we complete the migration to the monorepo, my suggestion would be
to drop the special support we have today for in-tree builds. Instead,
expose CMake cache variables that tell the CMake build where to find
the projects' source trees. By default they would be initialised to
their location in the monorepo, but a user could pick any location. I
think this approach is better because it's more general (there is a
single source of truth for the location of a project's source tree),
and it is general enough to allow the user to place projects in-tree
if they really want to.
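A rough sketch of what I have in mind is below. The variable name
`SKETCH_<PROJECT>_SOURCE_DIR` is made up for illustration, and the
default path assumes the top-level `CMakeLists.txt` lives in `llvm/`
with the other projects as sibling directories.

```cmake
cmake_minimum_required(VERSION 3.13)
project(SourceDirSketch NONE)

set(LLVM_ENABLE_PROJECTS "" CACHE STRING
    "Semicolon-separated list of projects to build")

foreach(proj IN LISTS LLVM_ENABLE_PROJECTS)
  # Derive the variable infix: clang -> CLANG, compiler-rt -> COMPILER_RT.
  string(TOUPPER "${proj}" upper_proj)
  string(REPLACE "-" "_" upper_proj "${upper_proj}")

  # Hypothetical cache variable: defaults to the assumed monorepo
  # location, but a value passed with -D takes precedence.
  set(SKETCH_${upper_proj}_SOURCE_DIR
      "${CMAKE_CURRENT_SOURCE_DIR}/../${proj}"
      CACHE PATH "Location of the ${proj} source tree")

  if(NOT EXISTS "${SKETCH_${upper_proj}_SOURCE_DIR}/CMakeLists.txt")
    message(FATAL_ERROR
      "No CMakeLists.txt for ${proj} in ${SKETCH_${upper_proj}_SOURCE_DIR}")
  endif()

  # An explicit binary directory is required because the source
  # directory may lie outside the current source tree.
  add_subdirectory("${SKETCH_${upper_proj}_SOURCE_DIR}"
                   "${CMAKE_CURRENT_BINARY_DIR}/projects/${proj}")
endforeach()
```

Someone who still wants the old layout could then pass something like
`-DSKETCH_CLANG_SOURCE_DIR=<llvm-src>/tools/clang`, and nothing else in
the build would need to know about in-tree projects.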

This generality would also allow a user to build LLVM and other
projects with each project potentially at different revisions. Git
worktrees (i.e. create a worktree per project with a unique branch
name to avoid doing a git clone per project) could potentially make
this a bit easier too. It's a bit clunky but most people probably
shouldn't be mixing revisions anyway.

Thanks,
Dan.