CMake Question: Do we need to support stand-alone builds?

Hello folks,

TL;DR: When working on better/simpler CMake support for Clang/LLVM I am often thwarted by one of the most complex parts of the Clang setup: the ability to do a “standalone” build of Clang. I would like to get rid of this feature in order to simplify and make more progress. Any objections?

Most people don’t even know what the “standalone” build is, so in summary it allows you to check out just clang and to build it against a separate checkout and build tree of LLVM. It doesn’t re-use the build output tree of LLVM, or the source tree of LLVM.
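For readers unfamiliar with the pattern, here is a minimal sketch of how a CMake project can tell whether it is being configured on its own or as a subdirectory of a larger tree. It is not Clang's actual build code: the project name `MyTool` and the cache variable `MYTOOL_PATH_TO_LLVM_BUILD` are hypothetical, chosen only to illustrate the kind of extra branch a standalone mode introduces.

```cmake
cmake_minimum_required(VERSION 3.13)

# When this directory is the top of the source tree, nobody above us has
# called project() or set up shared macros, so we must do it ourselves.
if(CMAKE_SOURCE_DIR STREQUAL CMAKE_CURRENT_SOURCE_DIR)
  project(MyTool LANGUAGES CXX)  # hypothetical project name

  # Hypothetical cache variable: where a separately built LLVM lives.
  set(MYTOOL_PATH_TO_LLVM_BUILD "" CACHE PATH
      "Directory where LLVM was built or installed")

  if(NOT IS_DIRECTORY "${MYTOOL_PATH_TO_LLVM_BUILD}")
    message(FATAL_ERROR
      "Standalone build: set MYTOOL_PATH_TO_LLVM_BUILD to an LLVM build tree")
  endif()

  set(MYTOOL_STANDALONE TRUE)
else()
  # In-tree build: the parent project already provides everything.
  set(MYTOOL_STANDALONE FALSE)
endif()

message(STATUS "Standalone build: ${MYTOOL_STANDALONE}")
```

Every macro or setting the parent tree would normally provide has to be recreated inside the standalone branch, which is where the duplication discussed in this thread comes from.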

Why does this exist? I’ll try to summarize the points from the last time I talked to Doug about it, but honestly, I don’t use this so I may mess it up. I’ve CC-ed Doug who is (I suspect) one of the few using it to clarify anything I miss:

  1. It allows updating the LLVM source checkout and build tree less frequently than the Clang checkout and build tree.
  2. It allows sharing a single LLVM checkout and build tree amongst many Clang checkouts and build trees.
  3. ???

Certainly #1 and #2 can (in some cases) combine to improve incremental rebuild speed, but in practice I rarely benefit from them as the time is now heavily dominated by running the test suite. Thus, I see this benefit as diminishing these days, and also as offset by improved performance of CMake, especially when coupled with the bleeding edge ninja system.

Now, why do I want to get rid of this? What is it preventing or getting in the way of?

  1. Remove duplication! Massive amounts of the CMake infrastructure of LLVM are copied into Clang’s CMake build in order for the latter to not depend on the former. What’s worse, these copies have evolved independently and now often diverge, duplicate bugs, and sometimes give birth to their own bugs.

  2. Remove duplication! The ‘lit’ based test running is needlessly duplicated in Clang’s CMake build. It is also incorrect, missing dependencies, and behaving differently from LLVM’s. Yep. ‘make check && make clang-test’ does not in fact test the same thing as ‘make check-all’. Scary, eh?

  3. Integrate with CompilerRT: This comes out of the LLVM projects subtree, and so is inherently missing in a stand-alone build.

  4. Integrate with libc++: Same story as CompilerRT.

  5. Integrate support for automatic bootstrapping: This will be among the most complex things to add to our CMake builds, but also one of the highest value. I’d really like to not spend time thinking about how this interacts with Clang’s standalone cmake bits, but I have to as long as it’s there.

So, thoughts?
-Chandler

My 2 cents: +1 for removing standalone builds. The complexity it
forces on the CMakeLists.txt has come to bite me in the past.

Hello folks,

TL;DR: When working on better/simpler CMake support for Clang/LLVM I am often thwarted by one of the most complex parts of the Clang setup: the ability to do a “standalone” build of Clang. I would like to get rid of this feature in order to simplify and make more progress. Any objections?

Most people don’t even know what the “standalone” build is, so in summary it allows you to check out just clang and to build it against a separate checkout and build tree of LLVM. It doesn’t re-use the build output tree of LLVM, or the source tree of LLVM.

Why does this exist? I’ll try to summarize the points from the last time I talked to Doug about it, but honestly, I don’t use this so I may mess it up. I’ve CC-ed Doug who is (I suspect) one of the few using it to clarify anything I miss:

  1. It allows updating the LLVM source checkout and build tree less frequently than the Clang checkout and build tree.
  2. It allows sharing a single LLVM checkout and build tree amongst many Clang checkouts and build trees.
  3. ???

Certainly #1 and #2 can (in some cases) combine to improve incremental rebuild speed, but in practice I rarely benefit from them as the time is now heavily dominated by running the test suite. Thus, I see this benefit as diminishing these days, and also as offset by improved performance of CMake, especially when coupled with the bleeding edge ninja system.

  3. CMake generates gigantic project files for IDEs like Visual Studio and Xcode, which causes those IDEs to behave very poorly, with long project load times and sluggish overall performance. It’s a significant productivity problem.

Now, why do I want to get rid of this? What is it preventing or getting in the way of?

  1. Remove duplication! Massive amounts of the CMake infrastructure of LLVM are copied into Clang’s CMake build in order for the latter to not depend on the former. What’s worse, these copies have evolved independently and now often diverge, duplicate bugs, and sometimes give birth to their own bugs.

  2. Remove duplication! The ‘lit’ based test running is needlessly duplicated in Clang’s CMake build. It is also incorrect, missing dependencies, and behaving differently from LLVM’s. Yep. ‘make check && make clang-test’ does not in fact test the same thing as ‘make check-all’. Scary, eh?

The duplication could be solved by installing some of LLVM’s CMake macros in a place where Clang could find and re-use them. Think of it this way: we’d like to make it easy to use LLVM, or LLVM+Clang, as a library, so that other projects that use CMake can easily import the “FindLLVM” module and make use of LLVM. Clang should be able to simply do this, so that Clang is just an external tool built on the LLVM core. LLDB is another external tool that builds on the LLVM core and on Clang, and so on.
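As a rough illustration of what "LLVM as a library" looks like from the consumer's side, here is a sketch of an external project's CMakeLists.txt. It assumes an LLVM installation that ships a CMake package configuration file (as later LLVM releases do), which is a config-package mechanism rather than the “FindLLVM” module named above. The project name, the `my-tool` target, and `main.cpp` are hypothetical.

```cmake
cmake_minimum_required(VERSION 3.13)
project(MyLLVMTool LANGUAGES CXX)  # hypothetical external project

# Locate an installed LLVM through its exported package configuration.
# If it is not in a default location, pass -DLLVM_DIR=<prefix>/lib/cmake/llvm.
find_package(LLVM REQUIRED CONFIG)
message(STATUS "Found LLVM ${LLVM_PACKAGE_VERSION} in ${LLVM_DIR}")

# Reuse the include paths and compile definitions LLVM was built with.
include_directories(${LLVM_INCLUDE_DIRS})
separate_arguments(LLVM_DEFINITIONS_LIST NATIVE_COMMAND ${LLVM_DEFINITIONS})
add_definitions(${LLVM_DEFINITIONS_LIST})

# main.cpp is a placeholder for the tool's own sources.
add_executable(my-tool main.cpp)

# Translate LLVM component names into the library names to link against.
llvm_map_components_to_libnames(llvm_libs support core)
target_link_libraries(my-tool ${llvm_libs})
```

The point of the design is that the consumer carries no copy of LLVM's build logic: everything it needs comes from files installed alongside LLVM.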

  3. Integrate with CompilerRT: This comes out of the LLVM projects subtree, and so is inherently missing in a stand-alone build.

  4. Integrate with libc++: Same story as CompilerRT.

These might be reasonable to pull into the Clang build, since the Clang installation is not whole without them. Luckily, they’re fairly small.

  5. Integrate support for automatic bootstrapping: This will be among the most complex things to add to our CMake builds, but also one of the highest value. I’d really like to not spend time thinking about how this interacts with Clang’s standalone cmake bits, but I have to as long as it’s there.

I don’t see how a stand-alone build of Clang gets in the way of bootstrapping. Bootstrapping is going to be very odd in CMake regardless.

Overall, I look at “standalone Clang builds” as simply “Clang using LLVM as the library like it’s intended to be.” That’s an important use case in and of itself, and Clang is simply the LLVM-based tool that’s most near and dear to us. That doesn’t mean its build should be intertwined with LLVM’s build.

- Doug

Douglas Gregor <dgregor@apple.com> writes:

Overall, I look at "standalone Clang builds" as simply "Clang using LLVM as
the library like it's intended to be." That's an important use case in and
of itself, and Clang is simply the LLVM-based tool that's most near and dear
to us. That doesn't mean its build should be intertwined with LLVM's build.

That's a very good point, although I for one have never built Clang without
also building LLVM.

Hello folks,

TL;DR: When working on better/simpler CMake support for Clang/LLVM I am often
thwarted by one of the most complex parts of the Clang setup: the ability to
do a "standalone" build of Clang. I would like to get rid of this feature in
order to simplify and make more progress. Any objections?

Most people don't even know what the "standalone" build is, so in summary it
allows you to check out *just* clang and to build it against a separate
checkout and build tree of LLVM. It doesn't re-use the build output tree of
LLVM, or the source tree of LLVM.

Why does this exist? I'll try to summarize the points from the last time I
talked to Doug about it, but honestly, I don't use this so I may mess it up.
I've CC-ed Doug who is (I suspect) one of the few using it to clarify
anything I miss:
1) It allows updating the LLVM source checkout and build tree less
frequently than the Clang checkout and build tree.
2) It allows sharing a single LLVM checkout and build tree amongst many
Clang checkouts and build trees.
3) ???

Certainly #1 and #2 can (in some cases) combine to improve incremental
rebuild speed, but in practice I rarely benefit from them as the time is now
heavily dominated by running the test suite. Thus, I see this benefit as
diminishing these days, and also as offset by improved performance of CMake,
especially when coupled with the bleeding edge ninja system.

3) CMake generates gigantic project files for IDEs like Visual Studio and
Xcode, which causes those IDEs to behave very poorly, with long project
load times and sluggish overall performance. It's a significant productivity
problem.

This is the only reason I use it. A project file with llvm and clang
almost kills VS even on my crazy work machine.

- Michael Spencer

Now, why do I want to get rid of this? What is it preventing or getting in
the way of?
1) Remove duplication! Massive amounts of the CMake infrastructure of LLVM
are copied into Clang's CMake build in order for the latter to not depend on
the former. What's worse, these copies have evolved independently and now
often diverge, duplicate bugs, and sometimes give birth to their own bugs.

2) Remove duplication! The 'lit' based test running is needlessly duplicated
in Clang's CMake build. It is also incorrect, missing dependencies, and
behaving differently from LLVM's. Yep. 'make check && make clang-test' does
not in fact test the same thing as 'make check-all'. Scary, eh?

The duplication could be solved by installing some of LLVM's CMake macros in
a place where Clang could find and re-use them. Think of it this way: we'd
like to make it easy to use LLVM, or LLVM+Clang, as a library, so that other
projects that use CMake can easily import the "FindLLVM" module and make use
of LLVM. Clang should be able to simply do this, so that Clang is just an
external tool built on the LLVM core. LLDB is another external tool that
builds on the LLVM core and on Clang, and so on.

I agree that this is the proper solution.

Hi folks,

may I bring up this topic again ... I second Douglas here. Having been through
the build process for clang multiple times (configure & cmake), it would really
save my day if we could separate the process like:

a) LLVM build + install

and one of:

b) clang build + install
c) compiler-rt build+install

or

b) compiler-rt build+install
c) clang build + install

or even

b) (clang & compiler-rt) build+install

So what is the problem with integrated (non-stand-alone) builds picking up the
settings/config from the llvm installation we're pointed to (or that's
installed in --prefix)? Is it worse than the links back and forth across 3 svn
trees (llvm + llvm/tools/clang -> llvm/projects/compiler-rt)?

Best,
Jan-Simon
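
The staged sequence Jan-Simon describes (a, then b and c) could be sketched as a small driver project using CMake's ExternalProject module. This is only an illustration under assumptions: the `*_SRC` and `STAGE_PREFIX` cache variables are hypothetical, and whether Clang and compiler-rt can actually configure against an installed LLVM this way is exactly the open question in this thread.

```cmake
cmake_minimum_required(VERSION 3.13)
project(StagedToolchain LANGUAGES NONE)  # hypothetical driver project

include(ExternalProject)

# Hypothetical cache variables: one shared install prefix and three
# separate source checkouts. Each source directory must already exist.
set(STAGE_PREFIX "${CMAKE_BINARY_DIR}/install" CACHE PATH "Shared install prefix")
set(LLVM_SRC "" CACHE PATH "Path to the LLVM checkout")
set(CLANG_SRC "" CACHE PATH "Path to the Clang checkout")
set(COMPILER_RT_SRC "" CACHE PATH "Path to the compiler-rt checkout")

# a) LLVM build + install
ExternalProject_Add(llvm
  SOURCE_DIR "${LLVM_SRC}"
  CMAKE_ARGS -DCMAKE_INSTALL_PREFIX=${STAGE_PREFIX}
             -DCMAKE_BUILD_TYPE=Release)

# b) clang build + install, configured against the installed LLVM
ExternalProject_Add(clang
  SOURCE_DIR "${CLANG_SRC}"
  DEPENDS llvm
  CMAKE_ARGS -DCMAKE_INSTALL_PREFIX=${STAGE_PREFIX}
             -DCMAKE_PREFIX_PATH=${STAGE_PREFIX}
             -DCMAKE_BUILD_TYPE=Release)

# c) compiler-rt build + install, after both of the above
ExternalProject_Add(compiler-rt
  SOURCE_DIR "${COMPILER_RT_SRC}"
  DEPENDS llvm clang
  CMAKE_ARGS -DCMAKE_INSTALL_PREFIX=${STAGE_PREFIX}
             -DCMAKE_PREFIX_PATH=${STAGE_PREFIX}
             -DCMAKE_BUILD_TYPE=Release)
```

Swapping the `DEPENDS` lists gives the alternative ordering (compiler-rt before clang); the driver itself contains no LLVM-specific build logic.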