[RFC] LLVM Directory Structure Changes (was Re: [PATCH] D20992: [CMake] Add LLVM runtimes directory)

Moving to llvm-dev (I think this has gone a bit further than a patch review discussion)

In hindsight I probably should have explained more of my thinking on this with the patch, or done an RFC on llvm-dev to start with. I’ll do that now, and answer the questions along the way. I sent a separate email discussing Justin’s patch review feedback.

In the build system today there is no strong distinction between ‘projects’ and ‘tools’. There are a few subtle differences, but I’m not sure any of them really matter. The differences are:

(1) The projects directory is always configured, tools can be disabled using LLVM_INCLUDE_TOOLS=Off (projects and tools can both be individually disabled too)
(2) Projects are configured before tools, so tools can rely on targets being created for projects (we don’t really use this, and anywhere we are is probably a bug)
(3) Some projects have special handling. For example test-suite isn’t actually treated as a project, it has special handling in LLVM/CMakeLists.txt:727, and Compiler-RT is handled by clang if you set LLVM_BUILD_EXTERNAL_COMPILER_RT=On.

With this in mind I was thinking about the general usability of our build system. The distinction between a project and a tool is not very clear. At a high level I see three different use cases that are covered by our current projects & tools directories.

(1) Projects that are configured with LLVM
(2) Runtime projects that should be configured using the just-built tools
(3) The LLVM test-suite, which is really just external tests that should be configured and run with the just-built tools

My proposal is that we make the tools subdirectory the only place for projects that fall into category 1. I don’t think there is any technical reason to drop an in-tree project into projects over tools today, and I think migrating people who are doing that away from it should be easy.

Second I want to add a “runtimes” directory to LLVM to cover case 2 (see D20992). The idea behind this is to use common code in LLVM to support building runtimes. This will allow the full LLVM toolchain to be visible during configuration. I will abstract this functionality into an installed CMake module so that Clang can use it for out-of-tree clang builds.

Lastly we need to give the test-suite a new home. I’m not super concerned with where we do that. It could be under tests, it could just be at the root of the LLVM directory. I don’t think it matters too much because it is a one-off. Thoughts welcome.

My proposed patch makes the runtimes directory work for Compiler-RT, but it doesn’t yet handle libcxxabi, libcxx and libunwind. There is some special case handling between libcxxabi and libcxx that will need to be handled to make the dependencies work between the two, and I still need to work that out.

If we want to go with this proposal I envision the transition being multi-staged:

(1) Adding the new functionality, getting it up and fully working for all runtime projects - this will involve changes to runtime projects
(2) Work with bot maintainers to migrate bots, and fix any issues that come up
(3) Add support for a new secondary location for the test-suite
(4) Set a date for removing the projects directory, post patches including updated documentation
(5) Remove the projects directory entirely

Thoughts?
-Chris

From: "Chris Bieneman via llvm-dev" <llvm-dev@lists.llvm.org>
To: "Chandler Carruth" <chandlerc@gmail.com>
Cc: "llvm-dev" <llvm-dev@lists.llvm.org>
Sent: Thursday, June 9, 2016 12:20:36 PM
Subject: [llvm-dev] [RFC] LLVM Directory Structure Changes (was Re: [PATCH] D20992: [CMake] Add LLVM runtimes
directory)

This all makes sense to me. It matches my mental model that tools get compiled with the 'host' compiler and the runtimes need to get cross-compiled. Putting the test-suite at the top level could work; we already have 'test' and 'unittests' directories. Maybe, however, unit tests should live under tests?

-Hal

Chris Bieneman <beanz@apple.com> writes:

Moving to llvm-dev (I think this has gone a bit further than a patch
review discussion)

In hindsight I probably should have explained more of my thinking on
this with the patch, or done an RFC on llvm-dev to start with. I’ll do
that now, and answer the questions along the way. I sent a separate
email discussing Justin’s patch review feedback.

In the build system today there is no strong distinction between
‘projects’ and ‘tools’. There are a few subtle differences, but I’m
not sure any of them really matter. The differences are:

(1) The projects directory is always configured, tools can be disabled
using LLVM_INCLUDE_TOOLS=Off (projects and tools can both be
individually disabled too)
(2) Projects are configured before tools, so tools can rely on targets
being created for projects (we don’t really use this, and anywhere we
are is probably a bug)
(3) Some projects have special handling. For example test-suite isn’t
actually treated as a project, it has special handling in
LLVM/CMakeLists.txt:727, and Compiler-RT is handled by clang if you
set LLVM_BUILD_EXTERNAL_COMPILER_RT=On.

With this in mind I was thinking about the general usability of our
build system. The distinction between a project and a tool is not very
clear. At a high level I see three different use cases that are
covered by our current projects & tools directories.

(1) Projects that are configured with LLVM
(2) Runtime projects that should be configured using the just-built tools
(3) The LLVM test-suite, which is really just external tests that
should be configured and run with the just-built tools

My proposal is that we make the tools subdirectory the *only* place
for projects that fall into category 1. I don’t think there is any
technical reason to drop an in-tree project into projects over tools
today, and I think migrating people who are doing that away from it
should be easy.

Second I want to add a “runtimes” directory to LLVM to cover case 2
(see D20992). The idea behind this is to use common code in LLVM to
support building runtimes. This will allow the full LLVM toolchain to
be visible during configuration. I will abstract this functionality
into an installed CMake module so that Clang can use it for
out-of-tree clang builds.

Lastly we need to give the test-suite a new home. I’m not super
concerned with where we do that. It could be under tests, it could
just be at the root of the LLVM directory. I don’t think it matters
too much because it is a one-off. Thoughts welcome.

This all seems pretty sensible. Should we also use the opportunity to
split compiler-rt's builtins and profiling/sanitizer/etc runtimes, since
we'll be moving things around anyway?

Some place like test/external or test/integration would probably make
sense. It could potentially also be used for other optional tests like
debuginfo-tests, which are currently somewhat awkwardly checked out into
clang's tests.

My proposed patch makes the runtimes directory work for Compiler-RT,
but it doesn’t yet handle libcxxabi, libcxx and libunwind. There is
some special case handling between libcxxabi and libcxx that will need
to be handled to make the dependencies work between the two, and I
still need to work that out.

If we want to go with this proposal I envision the transition being
multi-staged:

(1) Adding the new functionality, getting it up and fully working for
all runtime projects - this will involve changes to runtime projects
(2) Work with bot maintainers to migrate bots, and fix any issues that come up
(3) Add support for a new secondary location for the test-suite
(4) Set a date for removing the projects directory, post patches
including updated documentation

Sure, but we might as well update the documentation earlier (in step 1)
- as soon as compiler-rt can live in runtimes it makes sense to tell
people to put it there, even if we still have legacy logic to make it
continue to work out of projects as well.

Also be good to make Compiler-RT and libc++ cross-compile for multiple
targets... :confused:

While that may be another battle entirely, it'd be good to keep that
in mind if we do split them up to pieces.

cheers,
--renato

My proposal is that we make the tools subdirectory the *only* place for projects that fall into category 1.

+1

Second I want to add a “runtimes” directory to LLVM to cover case 2 (see D20992).

+1

If we want to go with this proposal I envision the transition being multi-staged:

(1) Adding the new functionality, getting it up and fully working for all runtime projects - this will involve changes to runtime projects
(2) Work with bot maintainers to migrate bots, and fix any issues that come up
(3) Add support for a new secondary location for the test-suite
(4) Set a date for removing the projects directory, post patches including updated documentation
(5) Remove the projects directory entirely

This is pretty intrusive … We (internally) probably have a lot of individual people and scripts doing things in various ways that will need adjustment. I’d vote for a pretty liberal window on the timing of the removal step.

However, I think it’s clearly worthwhile.

Jim Rowan
jmr@codeaurora.org
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by the Linux Foundation

+1

Jim Rowan
jmr@codeaurora.org
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by the Linux Foundation

This all seems pretty sensible. Should we also use the opportunity to
split compiler-rt's builtins and profiling/sanitizer/etc runtimes, since
we'll be moving things around anyway?

Also be good to make Compiler-RT and libc++ cross-compile for multiple
targets... :confused:

Yes, that is part of the eventual goal for this. I *really* want to be able to build a full cross-compiler and runtime stack from a single CMake invocation. Ideally supporting /n/ targets.

I have lots of ideas in this area, but it is being kept in mind.

-Chris

My proposal is that we make the tools subdirectory the *only* place for projects that fall into category 1.

+1

Second I want to add a “runtimes” directory to LLVM to cover case 2 (see D20992).

+1

If we want to go with this proposal I envision the transition being multi-staged:

(1) Adding the new functionality, getting it up and fully working for all runtime projects - this will involve changes to runtime projects
(2) Work with bot maintainers to migrate bots, and fix any issues that come up
(3) Add support for a new secondary location for the test-suite
(4) Set a date for removing the projects directory, post patches including updated documentation
(5) Remove the projects directory entirely

This is pretty intrusive … We (internally) probably have a lot of individual people and scripts doing things in various ways that will need adjustment. I’d vote for a pretty liberal window on the timing of the removal step.

Agreed completely. I think removing the projects directory is not something we can do on a whim because people rely on it extensively.

-Chris

This all seems pretty sensible. Should we also use the opportunity to
split compiler-rt’s builtins and profiling/sanitizer/etc runtimes, since
we’ll be moving things around anyway?

About that… So this is a complicated issue, but we should discuss it. Building compiler-rt as a monolithic chunk under the runtimes model is actually problematic because it would have circular dependencies.

For example:

libclang_rt.asan.aarch64 depends on libcxx.aarch64, but libcxx.aarch64 depends on libclang_rt.builtins.aarch64.

That means to satisfy the proper build dependencies you need to build clang, then builtins, then libcxx, then the sanitizer libraries.
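
To make the ordering argument concrete, here is a minimal sketch (not part of D20992, and not how the CMake build computes anything) that feeds the dependencies from the example above into a topological sort. The component names and the `build_order` helper are purely illustrative assumptions; the only library call is Python's standard `graphlib.TopologicalSorter`.

```python
from graphlib import CycleError, TopologicalSorter  # Python 3.9+


def build_order(deps):
    """Return a build order for a {component: set(prerequisites)} mapping.

    Raises graphlib.CycleError if the dependencies are circular.
    """
    return list(TopologicalSorter(deps).static_order())


# compiler-rt treated as one unit: the sanitizers pull in libcxx, while
# libcxx needs the builtins that live inside that same compiler-rt unit.
monolithic = {
    "clang": set(),
    "compiler-rt": {"clang", "libcxx"},
    "libcxx": {"clang", "compiler-rt"},
}

# builtins and sanitizers configured as separate units.
split = {
    "clang": set(),
    "builtins": {"clang"},
    "libcxx": {"builtins"},
    "sanitizers": {"libcxx"},
}

try:
    build_order(monolithic)
    monolithic_has_cycle = False
except CycleError:
    monolithic_has_cycle = True

print("monolithic compiler-rt is circular:", monolithic_has_cycle)
print("split build order:", build_order(split))
```

With the monolithic graph the sort reports a cycle; with builtins and sanitizers as separate nodes the only valid order is clang, builtins, libcxx, sanitizers.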

Since Compiler-RT’s build does support configuring the builtins directory separately from the sanitizers, we could support this with my runtimes proposal without changing anything else in Compiler-RT through the application of some project-specific hacks. Not ideal, but it means this doesn’t need to block my proposal.

We should consider other alternatives, because it would be nice to not have hacks. As an added benefit, if we separated the builtins and sanitizers into separate libraries we would be able to have the sanitizer libraries licensed the same way as LLVM (with the attribution clause), which would limit the amount of code that has its wonky modified license.

-Chris

I’m great with moving the runtimes into their own directory and making cmake modules to standardize an interface between the LLVM build process and the runtime build process. I would like to ask for more though.

I find working with the runtimes a bit frustrating at times, and I think a lot of that frustration stems from the second-class nature of the runtime build process. The “best” way to build just the runtimes today is to clone them in the llvm tree, pass a very long cmake line, then run a very specific make / ninja line (i.e. ninja cxx, ninja check-libcxxabi). This is even the case if I don’t want to use the freshly generated compiler. Here are some specific nuisances that running in-tree has caused me:

  • I routinely need to pull a new LLVM for new cmake files, even though I am not building LLVM.

  • I don’t get to use “standard” commands to build and test the components I work with. I want to just run make / ninja with no arguments, but instead, I need to build specific targets.

  • Choices made for LLVM’s build end up affecting mine, and I don’t have a good way to change them. For example, I wanted to perform some libcxx builds without -Wl,-z,defs. I had to do some wacky string filtering to remove it from the compile line after the fact, instead of preventing it from getting added in the first place. There wasn’t a flag available on the LLVM side to disable it at configure time.

  • To get cmake to work, I have to set HAVE_CXX_ATOMICS_WITHOUT_LIB, even though I have no intention of building LLVM. I then get to set LIBCXX_HAVE_CXX_ATOMICS_WITHOUT_LIB too, because reasons.

  • Multi-libs require multiple independent build directories, with all the associated cmake overhead.

So why not run out of tree instead you may ask?

  • No lit tests or lit utilities (FileCheck, not, etc…)

  • Even more difficult to manage dependencies between libcxxabi and libcxx

So some things I would like to see…

  • Standalone runtime builds should use the “normal” build interfaces (bare make, make all, make check, make install. s/make/ninja as desired).

  • For in-tree builds, LLVM would use the new cmake ExternalProject feature. This way LLVM’s in-tree support could be implemented in terms of the runtime’s “normal” build interfaces. LLVM may also define a refinement of that interface to provide extra LLVM information on top.

  • For example, maybe all llvm runtime projects get passed something like LLVM_RUNTIME_LIBCXXABI_PRESENT and LLVM_RUNTIME_LIBCXXABI_PATH, and those projects can act on that or not.

  • Developers using standalone builds can use the same “LLVM build” interface as the in-tree builds use.

  • Break out testing infrastructure to a common repo, so that the runtimes can have access to the testing “banana” without dragging along the LLVM “gorilla”.
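
As a rough illustration of the "refinement of that interface" idea in the list above, the sketch below turns a set of runtime checkouts into `-D` cache arguments using the `LLVM_RUNTIME_<NAME>_PRESENT` / `LLVM_RUNTIME_<NAME>_PATH` naming floated there. Those variables are only a suggestion in this thread, not an existing LLVM interface, and the `runtime_interface_args` helper and the example paths are hypothetical.

```python
def runtime_interface_args(runtimes):
    """Build -D cache arguments from a {name: source path or None} mapping.

    A path of None means the runtime is not checked out. The variable names
    follow the naming suggested in this thread and are an assumption, not an
    interface LLVM is known to provide.
    """
    args = []
    for name, path in sorted(runtimes.items()):
        prefix = f"LLVM_RUNTIME_{name.upper()}"
        if path is None:
            args.append(f"-D{prefix}_PRESENT=OFF")
        else:
            args.append(f"-D{prefix}_PRESENT=ON")
            args.append(f"-D{prefix}_PATH={path}")
    return args


# Hypothetical checkout layout: libcxxabi is available, libunwind is not.
for arg in runtime_interface_args(
    {"libcxxabi": "/src/libcxxabi", "libunwind": None}
):
    print(arg)
```

A standalone developer and the in-tree build could both pass arguments of this shape, and each runtime would be free to act on them or ignore them.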

From: "Chris Bieneman via llvm-dev" <llvm-dev@lists.llvm.org>
To: "Justin Bogner" <mail@justinbogner.com>
Cc: "llvm-dev" <llvm-dev@lists.llvm.org>
Sent: Thursday, June 9, 2016 1:23:18 PM
Subject: Re: [llvm-dev] [RFC] LLVM Directory Structure Changes (was
Re: [PATCH] D20992: [CMake] Add LLVM runtimes directory)

> This all seems pretty sensible. Should we also use the opportunity to
> split compiler-rt's builtins and profiling/sanitizer/etc runtimes, since
> we'll be moving things around anyway?

About that… So this is a complicated issue, but we should discuss it.
Building compiler-rt as a monolithic chunk under the runtimes model
is actually problematic because it would have circular dependencies.

For example:

libclang_rt.asan.aarch64 depends on libcxx.aarch64, but
libcxx.aarch64 depends on libclang_rt.builtins.aarch64.

That means to satisfy the proper build dependencies you need to build
clang, then builtins, then libcxx, then the sanitizer libraries.

I think we might take this opportunity to break apart the various essentially-disjoint parts of compiler-rt. That should make this dependency problem much easier to handle.

-Hal

Thanks Chris,

A big odd user of RT is Android, and they're having a lot of trouble
cross-compiling, so they ended up creating their own build system.

I'm copying Steve that has had more than his share of problems, maybe
a sync on how we both want to compile RT wouldn't hurt.

My goal is to get Android compiling RT together with Clang in the
exact same way we do, so we only need to maintain one toolchain.

cheers,
--renato

>>> This all seems pretty sensible. Should we also use the opportunity to
>>> split compiler-rt's builtins and profiling/sanitizer/etc runtimes, since
>>> we'll be moving things around anyway?
>>
>> Also be good to make Compiler-RT and libc++ cross-compile for multiple
>> targets... :confused:
>
> Yes, that is part of the eventual goal for this. I *really* want to be
> able to build a full cross-compiler and runtime stack from a single CMake
> invocation. Ideally supporting /n/ targets.
>
> I have lots of ideas in this area, but it is being kept in mind.

Thanks Chris,

A big odd user of RT is Android, and they're having a lot of trouble
cross-compiling, so they ended up creating their own build system.

Eh, the build system for Android existed before people were using cmake in
LLVM (i.e. it was based on the old autotools version for LLVM builds). At
the time, nearly all the LLVM components that Android was using were
trivially cross-compiled (well, except for TableGen, but even that isn't
hard to write make rules for). Now that LLVM is a more important chunk of
Android (and growing increasingly more complex), I agree that it isn't the
best to have to maintain a parallel build system that isn't even in
upstream.

I'm copying Steve that has had more than his share of problems, maybe
a sync on how we both want to compile RT wouldn't hurt.

I actually am not sure how we want to compile RT. I am going to have to
hack around the existing state of things for quite some time before
converging on what upstream is doing because Android can't really afford to
not make progress (and we have more than our share of other
problems/deadlines to make this lower priority). I also still don't know
enough about cmake to really understand why cross-compiling is so
difficult, but I can see where the builtins combined with the sanitizers
cause problems (due to different true dependencies on libcxx, etc.). I
think splitting them apart is a great first step, as I already have to
essentially do that for my new builds today (i.e. configure/build runtime,
then configure/build sanitizers later on). Other than that, I am not sure I
have any idea what to do after that.

My goal is to get Android compiling RT together with Clang in the
exact same way we do, so we only need to maintain one toolchain.

Do you have a pointer to how you build these things today? I have searched
for other users of compiler-rt for non-x86 platforms and nearly every
configuration I saw was relying on terrible hacks (i.e. Chromium doesn't
bother to use their new cross-compiler, they just rely on a pre-existing
Android NDK to cross-compile compiler-rt because it is too hard to
configure otherwise). I concluded that because everyone else is mostly
cross-compiling Android targets incorrectly, that I don't particularly feel
compelled to conform to a truly clean upstream build for compiler-rt right
now.

Thanks,
Steve

Eh, the build system for Android existed before people were using cmake in
LLVM (i.e. it was based on the old autotools version for LLVM builds).

Sorry, I meant "ended up using their own build system". Though, that's
also not the same thing, since you guys use your build system for
pretty much anything, right?

It may work for most other Android dependencies, but LLVM has a lot of
internal cross-dependencies and an idiosyncratic build system, making
back-porting very hard if we disagree on how to build things.

I concluded that because everyone else is mostly cross-compiling
Android targets incorrectly, that I don't particularly feel compelled to
conform to a truly clean upstream build for compiler-rt right now.

AFAIK, no one has a good solution, and we definitely don't have a
universal one. But I believe whatever Chris is planning has a high
potential of being that "one true way" (tm).

I don't like that the sanitizers are in RT in the same way I didn't
like the unwinder was there. The sanitizers cross-depend on Clang,
which the builtins don't have to. Unwind depends on libc++abi, while
neither of the other two do.

But the builtins are weird on their own, and how they pick optimised
versions inside targets directories is the kind of idea that probably
sounded brilliant before implementation, and everyone sighed
afterwards. :slight_smile:

Even if Android doesn't build LLVM in the upstream way now, knowing
how the plans are progressing will allow you guys to plan for the
future and share concerns, so that both sides know all issues before
agreeing on a long term plan.

cheers,
--renato

Hey Ben,

Thank you for providing this feedback. I’m going to lay out some ideas that I have inline below.

I’m great with moving the runtimes into their own directory and making cmake modules to standardize an interface between the LLVM build process and the runtime build process. I would like to ask for more though.

I find working with the runtimes a bit frustrating at times, and I think a lot of that frustration stems from the second-class nature of the runtime build process. The “best” way to build just the runtimes today is to clone them in the llvm tree, pass a very long cmake line, then run a very specific make / ninja line (i.e. ninja cxx, ninja check-libcxxabi). This is even the case if I don’t want to use the freshly generated compiler. Here are some specific nuisances that running in-tree has caused me:

Agree on all points. It might be useful to provide an alternate top-level CMake file for runtime projects. There is some complication because the runtimes do use CMake modules from LLVM, so you’ll need either an LLVM checkout or a built & installed LLVM. My general hope in formalizing a runtimes directory is to start treating runtimes as first-class members of the LLVM project. Admittedly I’m coming from a different perspective than you, so I’m tackling the “LLVM & Clang developer who also wants runtimes” side of the problem. I am, however, sympathetic to your side too.

  • I routinely need to pull a new LLVM for new cmake files, even though I am not building LLVM.

Some of this may be avoidable through restructuring our CMake modules. I think we should discuss this separately from the changes I’m asking for, but my general idea is it might be reasonable to create a separate LLVM repository that stores common CMake modules. If we did that it would require that everyone building LLVM would need to have those modules. We may be able to hide that complexity using CMake’s ExternalProject stuff. I’ll take it as a line item to look into that.

  • I don’t get to use “standard” commands to build and test the components I work with. I want to just run make / ninja with no arguments, but instead, I need to build specific targets.

This is the kind of thing I see as potentially solvable with a separate top-level CMake file for runtimes. That said, in general I’ve been moving toward adding more and more explicit targets into CMake. As a result of that my natural workflow is involving less and less running general commands and more and more running specific “ninja ” and “ninja check-llvm-”.

  • Choices made for LLVM’s build end up affecting mine, and I don’t have a good way to change them. For example, I wanted to perform some libcxx builds without -Wl,-z,defs. I had to do some wacky string filtering to remove it from the compile line after the fact, instead of preventing it from getting added in the first place. There wasn’t a flag available on the LLVM side to disable it at configure time.

LLVM’s flags impacting libcxx is fixed by my runtimes proposal. In fact, that’s part of the point. Bleeding options from LLVM & Clang builds into runtime libraries is not cool. It causes lots of problems on Darwin, so we’re sensitive to this.
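
For readers following along, the after-the-fact filtering described in the bullet above might look roughly like the sketch below. This is a guess at the shape of the workaround, not the actual code that was used; which variables end up carrying the flag is an assumption.

```cmake
# Sketch of the "string filtering" workaround: strip -Wl,-z,defs after
# LLVM's CMake code has already appended it.
# Assumption: the flag lands in these two linker flag variables.
foreach(flags_var CMAKE_SHARED_LINKER_FLAGS CMAKE_MODULE_LINKER_FLAGS)
  # string(REPLACE) rewrites every occurrence; the replacement here is empty.
  string(REPLACE "-Wl,-z,defs" "" ${flags_var} "${${flags_var}}")
endforeach()
```

Configuring the runtimes as their own CMake project is meant to make this kind of filtering unnecessary, since LLVM's flag variables would never reach the runtime configure step in the first place.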

  • To get cmake to work, I have to set HAVE_CXX_ATOMICS_WITHOUT_LIB, even though I have no intention of building LLVM. I then get to set LIBCXX_HAVE_CXX_ATOMICS_WITHOUT_LIB too, because reasons.

This is bad. I’m curious why you need to set those ever. Have you diagnosed this? For you to need to set that it means the host toolchain isn’t properly passing the CMake checks.

  • Multi-libs require multiple independent build directories, with all the associated cmake overhead.

I strongly believe that for building multi-lib or more generally when building for multiple targets you want multiple build directories, and in particular the multiple-cmake invocations. I believe this because you want the checks to be relevant for the target, and the only way to do that is to run the checks once per target.

That said, I also strongly believe that for any user of our projects we should find a way to have a single simple CMake invocation that gets the end result that you want. I don’t believe these are mutually exclusive goals.

One of the development steps of the new runtime directory will be supporting specifying multiple targets to build the runtimes for, and having CMake construct the appropriate number of build directories and manage building them all through a single top-level configuration and build directory. If you’re skeptical about how doable this is I’d encourage you to look at this bot → http://lab.llvm.org:8011/builders/clang-3stage-ubuntu.

That bot does a full 3-stage clang build from a single CMake invocation:

cmake -C …/llvm.src/tools/clang/cmake/caches/3-stage.cmake -GNinja -DLLVM_TARGETS_TO_BUILD=all -DLLVM_BINUTILS_INCDIR=/opt/binutils/include …/llvm.src
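
To make the multi-target idea a bit more concrete, here is a minimal sketch of a top-level CMake file that creates one ExternalProject (and therefore one build directory and one full set of configure checks) per target. It is not the proposed implementation. `RUNTIME_TARGETS`, `RUNTIME_SOURCE_DIR`, and the `runtimes-<triple>` naming are hypothetical names made up for illustration.

```cmake
cmake_minimum_required(VERSION 3.4.3)
project(RuntimesDriver NONE)
include(ExternalProject)

# Hypothetical: the list of target triples to build the runtimes for.
set(RUNTIME_TARGETS "x86_64-unknown-linux-gnu;armv7-unknown-linux-gnueabihf"
    CACHE STRING "Target triples to build runtimes for")
# Hypothetical: where the runtime sources live.
set(RUNTIME_SOURCE_DIR "${CMAKE_CURRENT_SOURCE_DIR}/runtimes"
    CACHE PATH "Path to the runtime sources")

foreach(triple IN LISTS RUNTIME_TARGETS)
  # Each sub-build gets its own binary directory, so the configure checks
  # run once per target and the results never mix.
  ExternalProject_Add(runtimes-${triple}
    SOURCE_DIR ${RUNTIME_SOURCE_DIR}
    BINARY_DIR ${CMAKE_CURRENT_BINARY_DIR}/runtimes-${triple}-bins
    CMAKE_ARGS -DCMAKE_C_COMPILER_TARGET=${triple}
               -DCMAKE_CXX_COMPILER_TARGET=${triple}
               -DCMAKE_INSTALL_PREFIX=${CMAKE_CURRENT_BINARY_DIR}/install/${triple}
    )
endforeach()
```

With something along these lines, a single `cmake` invocation followed by a bare `ninja` would drive every per-target build, while each target still gets checks that are relevant to it.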

So why not run out of tree instead you may ask?

  • No lit tests or lit utilities (FileCheck, not, etc…)

I don’t know if the runtimes you’re building are setup for this or not, but you can get out-of-tree tests working if you have an LLVM installation on the system or a build directory that you can point at. Compiler-RT does this. It isn’t ideal but it is workable.

We should have a better solution.

  • Even more difficult to manage dependencies between libcxxabi and libcxx

Yep. That sucks.

So some things I would like to see…

  • Standalone runtime builds should use the “normal” build interfaces (bare make, make all, make check, make install. s/make/ninja as desired).

I think this is doable, but I’m hesitant to rope it in with what I’m trying to do here. Nothing I want to do would prevent this or make it any harder than it already is.

  • For in-tree builds, LLVM would use the new cmake ExternalProject feature. This way LLVM’s in-tree support could be implemented in terms of the runtime’s “normal” build interfaces. LLVM may also define a refinement of that interface to provide extra LLVM information on top.
  • For example, maybe all llvm runtime projects get passed something like LLVM_RUNTIME_LIBCXXABI_PRESENT and LLVM_RUNTIME_LIBCXXABI_PATH, and those projects can act on that or not.

Yes, there will need to be a mechanism for communicating project dependencies between runtimes.
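
As a rough illustration of how a runtime could act on such variables, here is a sketch of the consuming side. The two `LLVM_RUNTIME_LIBCXXABI_*` names are the ones floated in the bullet above and nothing defines them today; `EXAMPLE_CXX_ABI_INCLUDE_DIR` is a purely hypothetical variable.

```cmake
# Consumer-side sketch: react to dependency information passed in by the
# LLVM build, and fall back to standalone behavior when it is absent.
if(LLVM_RUNTIME_LIBCXXABI_PRESENT)
  if(NOT IS_DIRECTORY "${LLVM_RUNTIME_LIBCXXABI_PATH}")
    message(FATAL_ERROR
      "LLVM_RUNTIME_LIBCXXABI_PRESENT is set, but "
      "LLVM_RUNTIME_LIBCXXABI_PATH is not a directory")
  endif()
  # Hypothetical variable the rest of this project would consume.
  set(EXAMPLE_CXX_ABI_INCLUDE_DIR "${LLVM_RUNTIME_LIBCXXABI_PATH}/include")
  message(STATUS "Using libcxxabi from ${LLVM_RUNTIME_LIBCXXABI_PATH}")
else()
  message(STATUS "No libcxxabi provided by the LLVM build; using standalone defaults")
endif()
```

A project that does not care about libcxxabi can simply ignore both variables, which matches the "act on that or not" part of the suggestion.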

  • Developers using standalone builds can use the same “LLVM build” interface as the in-tree builds use.

Yes. I’ve actually been having some hallway conversations about how to standardize the build interface a bit more cleanly. Some of this will depend on a per-project basis, but I think compiler-rt does some of this right today.

Specifically if an LLVM build tree is available it builds as if it were in-tree and being distributed with that build. If the LLVM build tree isn’t available, it builds in a more standard *nix format that would be compatible with non-LLVM toolchains. I may be wrong, but I think that is probably the right behavior for all our runtimes.
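
A minimal sketch of that two-mode behavior, assuming the runtime locates LLVM through its exported CMake package. This is not a copy of compiler-rt's actual logic; `ExampleRuntime` and `EXAMPLE_RUNTIME_STANDALONE` are hypothetical names.

```cmake
cmake_minimum_required(VERSION 3.4.3)
project(ExampleRuntime C CXX)

# Look for an LLVMConfig.cmake without failing when none is found.
# Set LLVM_DIR to point at a particular build tree or install.
find_package(LLVM QUIET CONFIG)

if(LLVM_FOUND)
  # LLVM is available: reuse its CMake modules and build as if in-tree.
  list(APPEND CMAKE_MODULE_PATH "${LLVM_CMAKE_DIR}")
  include(AddLLVM)
  set(EXAMPLE_RUNTIME_STANDALONE OFF)
else()
  # No LLVM: fall back to a conventional *nix layout.
  include(GNUInstallDirs)
  set(EXAMPLE_RUNTIME_STANDALONE ON)
endif()

message(STATUS "Standalone build: ${EXAMPLE_RUNTIME_STANDALONE}")
```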

  • Break out testing infrastructure to a common repo, so that the runtimes can have access to the testing “banana” without dragging along the LLVM “gorilla”.

I’m hesitant to suggest more and more repos because I think there are some challenges and additional burdens with that. I do understand the benefit of what you’re asking for here, and I think it is worth considering. I think there is an argument for splitting out the LLVM testing infrastructure, as well as an argument for splitting out the LLVM build infrastructure.

In both cases I think those changes are larger than what I’m proposing, but worth considering.

-Chris

Obligatory troll: Maybe we should move to github and change the whole repo structure in the process?

  • To get cmake to work, I have to set HAVE_CXX_ATOMICS_WITHOUT_LIB, even though I have no intention of building LLVM. I then get to set LIBCXX_HAVE_CXX_ATOMICS_WITHOUT_LIB too, because reasons.

This is bad. I’m curious why you need to set those ever. Have you diagnosed this? For you to need to set that it means the host toolchain isn’t properly passing the CMake checks.

It looks like I don’t need to set these anymore, but the comment that I left myself at the time was that HAVE_CXX_ATOMICS_WITHOUT_LIB looked at the state of my local machine’s toolchain, as opposed to the toolchain that I was about to use. My local machine’s toolchain is gcc 4.6.3, and it doesn’t have an header.

  • Multi-libs require multiple independent build directories, with all the associated cmake overhead.

I strongly believe that for building multi-lib or more generally when building for multiple targets you want multiple build directories, and in particular the multiple-cmake invocations. I believe this because you want the checks to be relevant for the target, and the only way to do that is to run the checks once per target.

That said, I also strongly believe that for any user of our projects we should find a way to have a single simple CMake invocation that gets the end result that you want. I don’t believe these are mutually exclusive goals.

One of the development steps of the new runtime directory will be supporting specifying multiple targets to build the runtimes for, and having CMake construct the appropriate number of build directories and manage building them all through a single top-level configuration and build directory. If you’re skeptical about how doable this is I’d encourage you to look at this bot → http://lab.llvm.org:8011/builders/clang-3stage-ubuntu.

That bot does a full 3-stage clang build from a single CMake invocation:

cmake -C …/llvm.src/tools/clang/cmake/caches/3-stage.cmake -GNinja -DLLVM_TARGETS_TO_BUILD=all -DLLVM_BINUTILS_INCDIR=/opt/binutils/include …/llvm.src

I’m fine if I can invoke cmake once and get multiple library variants out. If that means that behind-the-scenes, cmake has multiple build sub-directories, then that’s fine by me. On Windows, this already happens to some degree, as the Visual Studio project is allowed to switch between Release, Debug, RelWithDebInfo, etc without re-running cmake.

So why not run out of tree instead you may ask?

  • No lit tests or lit utilities (FileCheck, not, etc…)

I don’t know if the runtimes you’re building are setup for this or not, but you can get out-of-tree tests working if you have an LLVM installation on the system or a build directory that you can point at. Compiler-RT does this. It isn’t ideal but it is workable.

We should have a better solution.

The nightly builds that my organization produces are generally “customer builds”, not “developer builds”. Stated otherwise, they don’t include llvm-config, libclang.a, or any of the other things that people building against llvm would want. That means that if I wanted to use this particular out-of-tree solution, I would still need to clone down LLVM and rebuild and reinstall it on occasion.

So some things I would like to see…

  • Standalone runtime builds should use the “normal” build interfaces (bare make, make all, make check, make install. s/make/ninja as desired).

I think this is doable, but I’m hesitant to rope it in with what I’m trying to do here. Nothing I want to do would prevent this or make it any harder than it already is.

Completely fair. I figured I’d get my grievances and wishlist out first, just so that some agreement on a direction can be figured out. It does extend beyond the directory re-org patch.

  • Break out testing infrastructure to a common repo, so that the runtimes can have access to the testing “banana” without dragging along the LLVM “gorilla”.

I’m hesitant to suggest more and more repos because I think there are some challenges and additional burdens with that. I do understand the benefit of what you’re asking for here, and I think it is worth considering. I think there is an argument for splitting out the LLVM testing infrastructure, as well as an argument for splitting out the LLVM build infrastructure.

In both cases I think those changes are larger than what I’m proposing, but worth considering.

-Chris

Obligatory troll: Maybe we should move to github and change the whole repo structure in the process?

I figured the github thread was crazy enough without me sabotaging it with this kind of suggestion :)

It seems to me that the feedback here has been generally positive, but a lot of different ideas have been added to the mix.

To focus conversation and move things along I’m going to provide a summary of changes with proposals for rollout.

Splitting Compiler-RT

If we want to split compiler-rt, which I think makes a lot of sense, I think the best path forward is to copy the trunk (via svn cp). Copying the branch is the best way to preserve the history and workflows.

For naming purposes I would suggest retaining the compiler-rt name for the builtin libraries, and having a repository named sanitizer-rt for the sanitizer libraries (this is of course just a suggestion, feel free to bike shed).

After duplicating the repository we could setup an auto-merge from compiler-rt to sanitizer-rt. We could setup the LLVM build system so that if both projects were present it would force only building builtins from compiler-rt and sanitizers from sanitizer-rt. This would allow a transition time where bots could be updated to include both repositories, and engineer workflows would not be impacted.
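
One possible shape for that transitional hack is sketched below. It assumes both checkouts still carry the full compiler-rt tree and expose `COMPILER_RT_BUILD_BUILTINS` and `COMPILER_RT_BUILD_SANITIZERS` options; the checkout paths and the `builtins` / `sanitizers` target names are hypothetical.

```cmake
include(ExternalProject)

# Hypothetical checkout locations for the two repositories.
set(COMPILER_RT_SRC "${CMAKE_CURRENT_SOURCE_DIR}/projects/compiler-rt")
set(SANITIZER_RT_SRC "${CMAKE_CURRENT_SOURCE_DIR}/projects/sanitizer-rt")

# Only force the split when both repositories are present; with a single
# checkout, existing workflows keep working unchanged.
if(EXISTS "${COMPILER_RT_SRC}/CMakeLists.txt" AND
   EXISTS "${SANITIZER_RT_SRC}/CMakeLists.txt")
  ExternalProject_Add(builtins
    SOURCE_DIR ${COMPILER_RT_SRC}
    CMAKE_ARGS -DCOMPILER_RT_BUILD_BUILTINS=ON
               -DCOMPILER_RT_BUILD_SANITIZERS=OFF
    INSTALL_COMMAND ""
    )
  ExternalProject_Add(sanitizers
    SOURCE_DIR ${SANITIZER_RT_SRC}
    CMAKE_ARGS -DCOMPILER_RT_BUILD_BUILTINS=OFF
               -DCOMPILER_RT_BUILD_SANITIZERS=ON
    INSTALL_COMMAND ""
    )
endif()
```

Because each ExternalProject has its own CMake cache, the same option can be set differently for the two checkouts without the values colliding.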

After a brief time for bots to be updated with the new repository we could modify the repositories separately to build only the parts they are supposed to build, remove the hack from LLVM to force that, and begin removing code from the separate repositories.

LLVM Restructuring

The first step here is adding the new functionality, iterating on the CMake interface for the runtime projects and getting all the runtime projects hooked up.

Once all the runtime project support is ready we can begin migrating bots and evangelizing the new runtime build process.

At some point before or after the runtime work we can modify CMake to support the test-suite living under tests (or somewhere else, bikeshed away).

Once runtime support is ready, and the test-suite is supported outside projects we can set a date for removal of the projects directory. This planning should take into account updating bots as well as updating scripts and tooling.

Breaking out testing tools

As I started looking into breaking out the testing tools I realized it is much more complicated than I had first thought. I do think that it is a good idea to do this, but it is going to be a bigger change than I had originally thought.

The big wrench in breaking out the testing tools is that you need more than just lit. In particular you need FileCheck, not, count and a few other random things under llvm/utils. This also means you need to break out libSupport and ADT.

While I think that breaking this stuff all out is a good idea, it is a much larger change than what I was trying to propose. If we go down this route I would recommend creating a new llvm-infrastructure repository. We could then stub it out and update projects and workflows to include it. After the workflows are updated we can start moving libraries and tools into it.

An alternative approach we could take would be to migrate the testing tools off libSupport to make them standalone. Then the testing tools and lit could more easily be lifted out of the LLVM repository. This approach has some benefits, but also has added complication because some of the libSupport functionality in use is non-trivial.

Thoughts?
-Chris

I like this plan a lot.

The testing tools side of things is what makes compiler-rt cross compiling so much fun :). You go through all sorts of trouble to make sure that the host compiler is ignored and that the cross compiler is the compiler that gets invoked, then the test infrastructure ends up needing the host compiler in addition to the cross compiler. I think it's still a valuable problem to solve, but I also agree that it isn't trivial.

Chris Bieneman via llvm-dev <llvm-dev@lists.llvm.org> writes:

It seems to me that the feedback here has been generally positive, but
a lot of different ideas have been added to the mix.

To focus conversation and move things along I'm going to provide a
summary of changes with proposals for rollout.

Splitting Compiler-RT

If we want to split compiler-rt, which I think makes a lot of sense, I
think the best path forward is to copy the trunk (via svn cp). Copying
the branch is the best way to preserve the history and workflows.

For naming purposes I would suggest retaining the compiler-rt name for
the builtin libraries, and having a repository named sanitizer-rt for
the sanitizer libraries (this is of course just a suggestion, feel
free to bike shed).

What about the profiling and coverage runtime support, the blocks
runtime, safestack, etc? I suspect sanitizer-rt is the wrong name and
we're better off splitting it the other way with "compiler-builtins".

After duplicating the repository we could setup an auto-merge from
compiler-rt to sanitizer-rt. We could setup the LLVM build system so
that if both projects were present it would force only building
builtins from compiler-rt and sanitizers from sanitizer-rt. This would
allow a transition time where bots could be updated to include both
repositories, and engineer workflows would not be impacted.

After a brief time for bots to be updated with the new repository we
could modify the repositories separately to build only the parts they
are supposed to build, remove the hack from LLVM to force that, and
begin removing code from the separate repositories.

This sounds like we'd plan to update checkouts twice - once for the
compiler-rt split and once for the restructuring. Do note that I only
mentioned splitting compiler-rt because it seemed like it'd be
convenient to only disrupt things once.

If we're going to phase things such that we need two updates anyway then
there's no reason for this to block the LLVM Restructuring step you're
trying to do. It can happen whenever.

LLVM Restructuring

The first step here is adding the new functionality, iterating on the
CMake interface for the runtime projects and getting all the runtime
projects hooked up.

Once all the runtime project support is ready we can begin migrating
bots and evangelizing the new runtime build process.

At some point before or after the runtime work we can modify CMake to
support the test-suite living under tests (or somewhere else, bikeshed
away).

Once runtime support is ready, and the test-suite is supported outside
projects we can set a date for removal of the projects directory. This
planning should take into account updating bots as well as updating
scripts and tooling.

+1. From the responses so far this part seems like it isn't contentious
and it will unblock future improvements.

Breaking out testing tools

As I started looking into breaking out the testing tools I realized it
is *much* more complicated than I had first thought. I do think that
it is a good idea to do this, but it is going to be a bigger change
than I had originally thought.

The big wrench in breaking out the testing tools is that you need more
than just lit. In particular you need FileCheck, not, count and a few
other random things under llvm/utils. This also means you need to
break out libSupport and ADT.

While I think that breaking this stuff all out is a good idea, it is a
much larger change than what I was trying to propose. If we go down
this route I would recommend creating a new llvm-infrastructure
repository. We could then stub it out and update projects and
workflows to include it. After the workflows are updated we can start
moving libraries and tools into it.

An alternative approach we could take would be to migrate the testing
tools off libSupport to make them standalone. Then the testing tools
and lit could more easily be lifted out of the LLVM repository. This
approach has some benefits, but also has added complication because
some of the libSupport functionality in use is non-trivial.

This would be a huge change. While I don't really have a problem with
the idea I can't see it being urgent and I don't think it makes sense to
lump it in with the runtimes/ thing. This would deserve its own
separate discussion anyway.

It seems to me that the feedback here has been generally positive, but a lot of different ideas have been added to the mix.

To focus conversation and move things along I’m going to provide a summary of changes with proposals for rollout.

Splitting Compiler-RT

Note that none of the main sanitizer developers have really chimed in here… It’d be good to actually talk to them first. =]

If we want to split compiler-rt, which I think makes a lot of sense, I think the best path forward is to copy the trunk (via svn cp). Copying the branch is the best way to preserve the history and workflows.

For naming purposes I would suggest retaining the compiler-rt name for the builtin libraries, and having a repository named sanitizer-rt for the sanitizer libraries (this is of course just a suggestion, feel free to bike shed).

I would very much like a more specific name than ‘compiler-rt’. The genericness of that name is what led to some of the confusion today I suspect.

I would also suggest not having a hyphen in the name which makes python and other systems sad (I don’t understand why, and I’ve given up fighting this battle).

I think you already used the word that would best describe this: “builtin libraries”.

However, I’m not sure if splitting (at this point) makes sense. Maybe it does, but it seems fuzzy to me. The “builtins” will still be a collection of multiple runtime libraries, all tied to builtin compiler features. Some will be C/C++ features (the libgcc alternative for EH, type info, and math stuff). Some will be profiling features and some will be sanitizer features. I think having a separate repository for the profiling runtimes would probably be overkill. Maybe sanitizers are big enough to split out, but it seems iffy to me. I think the big thing that would help would just be better organization within the tree to clearly name the profile, language builtins, and sanitizer components.

Either way, I’d call the thing with profiling and language builtin runtimes “builtins” before “compiler”-anything.

Breaking out testing tools

As I started looking into breaking out the testing tools I realized it is much more complicated than I had first thought. I do think that it is a good idea to do this, but it is going to be a bigger change than I had originally thought.

FWIW, I’m not at all convinced this is a good idea yet. It has some appeal, but we’ve tried this before and it created confusion bordering on chaos. I would definitely decouple these things.