RFC: Default path for cross-compiled runtimes

Today, there are two different locations for runtime files within Clang’s installation:

compiler-rt:
headers: $prefix/lib/clang/$version/include(/sanitizer)
libraries: $prefix/lib/clang/$version/lib/$os/libclang_rt.$name-$arch.$ext

libc++, libc++abi, libunwind:
headers: $prefix/include/c++/v1
libraries: $prefix/lib/$name.$ext

The scheme used by libc++, libc++abi, and libunwind doesn’t support targets other than the host, which is a problem when cross-compiling. In our toolchain, we would like to build runtimes for all host and target platforms we support, e.g. a single toolchain would have runtimes for x86_64 and aarch64 Linux, Fuchsia, and Windows. All a user should need to provide is the target triple and the sysroot. While this is possible with builtins, sanitizers, and other compiler-rt runtimes, it’s not possible with libc++, libc++abi, and libunwind.

Our proposal is to move both compiler-rt and libc++, libc++abi, libunwind into a new location that would support cross-compilation and unify the layout:

headers: $prefix/lib/clang/$version/include(/$triple)(/c++/v1)
libraries: $prefix/lib/clang/$version/$triple/lib/$name.$ext

This means that for compiler-rt, the main difference would be moving the runtime libraries to a target-specific subdirectory rather than including the architecture in the library name; for libc++, libc++abi, and libunwind, both headers and libraries would be moved to a new, target-specific location.
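To make the difference concrete, here is a minimal Python sketch that spells out the current compiler-rt library path and the proposed per-target paths. The function names are hypothetical and exist only for illustration, and reading "include(/$triple)(/c++/v1)" as one common and one target-specific C++ header directory is my interpretation of the layout above, not something the driver does today.

```python
from pathlib import PurePosixPath


def resource_dir(prefix, version):
    """$prefix/lib/clang/$version, the directory both layouts hang off."""
    return PurePosixPath(prefix) / "lib" / "clang" / version


def current_compiler_rt_library(prefix, version, os_name, name, arch, ext):
    """Current layout: the architecture is encoded in the file name."""
    return (resource_dir(prefix, version) / "lib" / os_name
            / f"libclang_rt.{name}-{arch}.{ext}")


def proposed_paths(prefix, version, triple, name, ext):
    """Proposed layout (assumed reading): per-triple header and library directories."""
    base = resource_dir(prefix, version)
    common_headers = base / "include" / "c++" / "v1"
    target_headers = base / "include" / triple / "c++" / "v1"
    library = base / triple / "lib" / f"{name}.{ext}"
    return common_headers, target_headers, library
```

For example, proposed_paths("$prefix", "6.0.0", "x86_64-linux-gnu", "libc++", "so") yields a library path of $prefix/lib/clang/6.0.0/x86_64-linux-gnu/lib/libc++.so.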

In terms of implementation, we’ll need to modify the Clang driver to use this location when looking for runtimes and C++ libraries and headers. This should be a non-intrusive change: we’ll modify the driver to look in the new location in addition to the existing ones, so if the new path doesn’t exist, the driver will simply fall back to the existing behavior. Once this is done, we need to modify the CMake build to install files into the new location.
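The fallback behavior could look roughly like the following Python sketch. It is only an illustration of the search order described above, not the actual driver code (which is C++); the function name, the os_name parameter, and the injectable dir_exists hook are assumptions made for the example.

```python
import os


def library_search_dirs(prefix, version, triple, os_name, dir_exists=os.path.isdir):
    """Return library directories in the order a driver might search them.

    The new per-target directory is only added when it exists, so
    installations using the existing layout keep the existing behavior.
    """
    new_dir = f"{prefix}/lib/clang/{version}/{triple}/lib"
    existing_dirs = [
        f"{prefix}/lib/clang/{version}/lib/{os_name}",  # compiler-rt today
        f"{prefix}/lib",                                # libc++ and friends today
    ]
    dirs = []
    if dir_exists(new_dir):
        dirs.append(new_dir)
    dirs.extend(existing_dirs)
    return dirs
```

Passing dir_exists explicitly makes the logic easy to exercise without touching the filesystem.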

This layout would only be used when runtimes are built as part of the llvm/runtimes tree or using LLVM_ENABLE_RUNTIMES in the monorepo layout (because this setup supports cross-compiling runtimes). When built as part of LLVM or standalone, libc++, libc++abi, and libunwind would still install their files to $prefix/include and $prefix/lib as they do today.

Once the overall scheme is agreed upon, we could also consider de-duplicating C++ headers across targets by moving shared headers into a common directory, with only the varying subset in include/$triple.

To give an example, for x86_64 and aarch64 Linux, this would look like:

$prefix/lib/clang/6.0.0/include/sanitizer
$prefix/lib/clang/6.0.0/include/c++/v1
$prefix/lib/clang/6.0.0/include/x86_64-linux-gnu/c++/v1/__config

$prefix/lib/clang/6.0.0/x86_64-linux-gnu/lib/libclang_rt.asan.so
$prefix/lib/clang/6.0.0/x86_64-linux-gnu/lib/libc++.so

$prefix/lib/clang/6.0.0/aarch64-linux-gnu/lib/libclang_rt.asan.so
$prefix/lib/clang/6.0.0/aarch64-linux-gnu/lib/libc++.so

I don't understand what you mean. Normally, $prefix is part of sysroot
and there is no problem here.

Joerg

Today, there are two different locations for runtime files within Clang's installation:

compiler-rt:
headers: $prefix/lib/clang/$version/include(/sanitizer)
libraries: $prefix/lib/clang/$version/lib/$os/libclang_rt.$name-$arch.$ext

libc++, libc++abi, libunwind:
headers: $prefix/include/c++/v1
libraries: $prefix/lib/$name.$ext

The scheme used by libc++, libc++abi, libunwind doesn't support targets other than the host which is a problem when cross-compiling.

Yes, it does: --sysroot=

What's currently missing is a standardized naming for a) where the sysroots live, and b) what they're named. Host libraries should continue to live in $prefix/{include, lib, lib32, lib64} as appropriate, whereas target libraries should live in something like: $prefix/clang-runtimes/$triple/$multilib/{usr/include, usr/lib, etc.}

In our toolchain, we would like to build runtimes for all host and target platforms we support, e.g. a single toolchain will have runtimes for x86_64 and aarch64 Linux, Fuchsia and Windows. All you need to provide is the target triple and the sysroot. While this is possible with builtins, sanitizers and other compiler-rt runtimes, it's not possible with libc++, libc++abi, libunwind.

Our proposal is to move both compiler-rt and libc++, libc++abi, libunwind into a new location that would support cross-compilation and unify the layout:

headers: $prefix/lib/clang/$version/include(/$triple)(/c++/v1)
libraries: $prefix/lib/clang/$version/$triple/lib/$name.$ext

I don't think it's a good idea to tie all the runtimes to the compiler like that. It makes sense for the builtins to live there since they are heavily coupled with the specific compiler version, but I'm not convinced libc++/libc++abi/libunwind should.

This means that for compiler-rt, the main difference would be moving the runtime libraries to a target-specific subdirectory rather than including the architecture in the library name; for libc++, libc++abi, libunwind, both headers and libraries will be moved to a new, target-specific location.

In terms of implementation, we'll need to modify the Clang driver to use this location when looking for runtimes and C++ libraries and headers. This should be a non-intrusive change: we'll modify the driver to look into the new location in addition to the existing ones, so if the new path doesn't exist, the driver will simply fall back to the existing behavior. When this is done, we need to modify the CMake build to install files into the new location.

This layout would be only used when runtimes are built as part of the llvm/runtimes tree or using LLVM_ENABLE_RUNTIMES in the monorepo layout (because this setup supports cross-compiling runtimes). When built as part of LLVM or standalone, libc++, libc++abi, libunwind would still install their files to $prefix/include and $prefix/lib as today.

Once the overall scheme is agreed upon, we could also consider de-duplicating C++ headers across targets, by moving shared headers into a common directory, with only varying subset in include/$triple.

At the very least the __config_site headers cannot be de-duplicated.

Beyond that, I strongly disagree with de-duplicating them in general because it will break --sysroot= unless you add a bunch of symlinks... which don't exist on every host platform.

Jon

Today, there are two different locations for runtime files within
Clang’s installation:

compiler-rt:
headers: $prefix/lib/clang/$version/include(/sanitizer)
libraries:
$prefix/lib/clang/$version/lib/$os/libclang_rt.$name-$arch.$ext

libc++, libc++abi, libunwind:
headers: $prefix/include/c++/v1
libraries: $prefix/lib/$name.$ext

The scheme used by libc++, libc++abi, libunwind doesn’t support targets
other than the host which is a problem when cross-compiling.

Yes, it does: --sysroot=

What if my sysroot doesn’t contain a C++ library? Even if it does, I may still want to use the libc++ shipped with the toolchain, e.g. because the one that’s part of the sysroot doesn’t support C++17.

I don’t like the “build libc++ separately and then put it inside your sysroot” solution for several reasons, most importantly because the sysroot is typically considered read-only, e.g. I cannot modify Xcode’s sysroot to replace whatever libc++ is already there, but I can use libSystem.dylib from Xcode’s sysroot with my own libc++ version. In our case, we also ship the toolchain separately from the sysroot and at a different frequency, because they come from different sources and system libraries change far less often than libc++.

What this means in practice is that when cross-compiling you have to do something like:

clang++ --target=-- --sysroot=/path/to/sysroot -stdlib=libc++ -nostdinc++ -I<libc++-install-prefix>/include/c++/v1 -L<libc++-install-prefix>/lib

This is required even when the C++ library for your target is part of your Clang toolchain, so the driver arguably should know how to find it without you having to duplicate the driver logic.
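As a rough illustration of what every build system wrapper has to replicate today, here is a small Python sketch that assembles the argument list from the command above. The helper name and its parameters are hypothetical; the flags are the ones from the command line itself.

```python
def cross_compile_args(triple, sysroot, libcxx_prefix):
    """Arguments a user passes today to pick up a toolchain-provided libc++.

    triple, sysroot and libcxx_prefix are supplied by the caller; nothing
    here is discovered automatically, which is the duplication being
    complained about.
    """
    return [
        "clang++",
        f"--target={triple}",
        f"--sysroot={sysroot}",
        "-stdlib=libc++",
        "-nostdinc++",                          # drop the default C++ header paths
        f"-I{libcxx_prefix}/include/c++/v1",    # re-add the libc++ headers by hand
        f"-L{libcxx_prefix}/lib",               # and the libc++ library directory
    ]
```

With the proposed layout, the last three arguments would ideally become unnecessary, because the driver could derive them from the target triple.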

What’s currently missing is a standardized naming for a) where the
sysroots live, and b) what they’re named. Host libraries should continue
to live in $prefix/{include, lib, lib32, lib64} as appropriate, whereas
target libraries should live in something like:
$prefix/clang-runtimes/$triple/$multilib/{usr/include, usr/lib, etc.}

I don’t care too much about what the path is going to be. However, it’d be nice if we could unify the paths between compiler-rt runtimes and libc++. I always assumed that $prefix/lib/clang/$version was the path for runtimes, which is why I suggested it.

In our toolchain, we would like to build runtimes for all host and target
platforms we support, e.g. a single toolchain will have runtimes for
x86_64 and aarch64 Linux, Fuchsia and Windows. All you need to provide
is the target triple and the sysroot. While this is possible with
builtins, sanitizers and other compiler-rt runtimes, it’s not possible
with libc++, libc++abi, libunwind.

Our proposal is to move both compiler-rt and libc++, libc++abi,
libunwind into a new location that would support cross-compilation and
unify the layout:

headers: $prefix/lib/clang/$version/include(/$triple)(/c++/v1)
libraries: $prefix/lib/clang/$version/$triple/lib/$name.$ext

I don’t think it’s a good idea to tie all the runtimes to the compiler
like that. It makes sense for the builtins to live there since they are
heavily coupled with the specific compiler version, but I’m not
convinced libc++/libc++abi/libunwind should.

They may be less tied than sanitizers because they don’t rely on compiler instrumentation, but there’s still some dependency, e.g. in order to use coroutines in libc++ I need a version of Clang that has the appropriate intrinsics.

I could build libc++ against an older version of Clang, which will disable features not supported by that Clang version, but the resulting build is again tied to that version of Clang.

This means that for compiler-rt, the main difference would be moving the
runtime libraries to a target-specific subdirectory rather than
including the architecture in the library name; for libc++, libc++abi,
libunwind, both headers and libraries will be moved to a new,
target-specific location.

In terms of implementation, we’ll need to modify the Clang driver to use
this location when looking for runtimes and C++ libraries and headers.
This should be a non-intrusive change: we’ll modify the driver to look
into the new location in addition to the existing ones, so if the new
path doesn’t exist, the driver will simply fall back to the existing
behavior. When this is done, we need to modify the CMake build to
install files into the new location.

This layout would be only used when runtimes are built as part of the
llvm/runtimes tree or using LLVM_ENABLE_RUNTIMES in the monorepo layout
(because this setup supports cross-compiling runtimes). When built as
part of LLVM or standalone, libc++, libc++abi, libunwind would still
install their files to $prefix/include and $prefix/lib as today.

Once the overall scheme is agreed upon, we could also consider
de-duplicating C++ headers across targets, by moving shared headers into
a common directory, with only varying subset in include/$triple.

At the very least the __config_site headers cannot be de-duplicated.

I think it should be sufficient to have separate __config headers for separate targets; that’s really the only thing that differs.

Beyond that, I strongly disagree with de-duplicating them in general
because it will break --sysroot= unless you add a bunch of symlinks…
which don’t exist on every host platform.

I don’t think you need symlinks. All you need is to put both include/c++/v1 and include/$triple/c++/v1 on your include path, where the former contains all the common headers while the latter contains the target-specific __config. This is exactly what the GNU “multi-arch” layout does.
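Here is a minimal Python sketch of that lookup, assuming a plain "first directory that contains the file wins" rule. The helper names and the demo tree are made up for illustration, and putting the target-specific directory first is my assumption; since __config only lives in one of the two directories, the order does not matter for this example.

```python
import os
import tempfile


def resolve_header(name, include_dirs):
    """Return the first existing include_dirs[i]/name, or None if not found."""
    for directory in include_dirs:
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate):
            return candidate
    return None


def make_demo_tree(root, triple):
    """Create a common and a target-specific header directory under root."""
    common = os.path.join(root, "include", "c++", "v1")
    target = os.path.join(root, "include", triple, "c++", "v1")
    os.makedirs(common)
    os.makedirs(target)
    open(os.path.join(common, "vector"), "w").close()    # shared header
    open(os.path.join(target, "__config"), "w").close()  # per-target header
    return [target, common]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        dirs = make_demo_tree(root, "x86_64-linux-gnu")
        print(resolve_header("__config", dirs))
        print(resolve_header("vector", dirs))
```

No symlinks are involved: both headers resolve purely through the ordered include path.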

Copy on write.

From that perspective, there’s merit to splitting out what goes in a sysroot, vs what goes in an “SDK”. A sysroot should be a copy of what’s actually on a user’s base system, whereas an SDK would contain whatever gets layered on top of that.

Or a version of GCC that supports them. Clang isn’t the only client of these runtimes.

I guess, but then you’re pulling pieces out of the sysroot, and still breaking what --sysroot= is for.

Today, there are two different locations for runtime files within
Clang’s installation:

compiler-rt:
headers: $prefix/lib/clang/$version/include(/sanitizer)
libraries:
$prefix/lib/clang/$version/lib/$os/libclang_rt.$name-$arch.$ext

libc++, libc++abi, libunwind:
headers: $prefix/include/c++/v1
libraries: $prefix/lib/$name.$ext

The scheme used by libc++, libc++abi, libunwind doesn’t support targets
other than the host which is a problem when cross-compiling.

Yes, it does: --sysroot=

What if my sysroot doesn’t contain a C++ library? Even if it does, I may still want to use the libc++ shipped with the toolchain, e.g. because the one that’s part of the sysroot doesn’t support C++17.

I don’t like the “build libc++ separately and then put it inside your sysroot” solution for several reasons, most importantly because the sysroot is typically considered read-only, e.g.

Copy on write.

I’m not sure what you mean, can you go into more detail?

What I’m proposing is basically an overlay, but rather than relying on filesystem support, which is not portable, I’d like to simply rely on the compiler driver and include/library paths, i.e. look here first, and if you don’t find the headers/libraries there, keep looking in the other include/library paths.

When you use -stdlib=libc++ today, the Clang driver will always first look in …/include/c++/v1 (see https://github.com/llvm-mirror/clang/blob/master/lib/Driver/ToolChains/Linux.cpp#L736) and only then check the sysroot. I don’t want to replace or alter --sysroot=; what I’m proposing is generalizing the existing logic to support multiarch, akin to libstdc++ (see https://github.com/llvm-mirror/clang/blob/master/lib/Driver/ToolChains/Linux.cpp#L774).

I cannot modify Xcode’s sysroot replacing whatever libc++ is already there, but I can use libSystem.dylib from Xcode’s sysroot with my own libc++ version. In our case also we ship the toolchain separately from the sysroot at different frequencies because they come from different source and system libraries change far less often than libc++.

What this means in practice is that when cross-compiling you have to do something like:

clang++ --target=-- --sysroot=/path/to/sysroot -stdlib=libc++ -nostdinc++ -I<libc++-install-prefix>/include/c++/v1 -L<libc++-install-prefix>/lib

From that perspective, there’s merit to splitting out what goes in a sysroot, vs what goes in an “SDK”. A sysroot should be a copy of what’s actually on a user’s base system, whereas an SDK would contain whatever gets layered on top of that.

Yes, I agree. In our case, we’d like the runtimes to be a part of the “LLVM SDK”. The reason is that they are built and tested as part of the LLVM build, separately from the sysroot.

This is what we already discussed in https://reviews.llvm.org/D32816 and the solution implemented there and in https://reviews.llvm.org/D32613 is already being used for Fuchsia. However, we would like to use the same setup on other platforms as well (e.g. in our case Linux and Darwin), so this proposal is an attempt at generalizing that solution.

This is even when the C++ library for your target is part of your Clang toolchain so the driver arguably should know how to find it without you having to duplicate the driver logic.

What’s currently missing is a standardized naming for a) where the
sysroots live, and b) what they’re named. Host libraries should continue
to live in $prefix/{include, lib, lib32, lib64} as appropriate, whereas
target libraries should live in something like:
$prefix/clang-runtimes/$triple/$multilib/{usr/include, usr/lib, etc.}

I don’t care too much about what the path is going to be. However, it’d be nice if we could unify the paths between compiler-rt runtimes and libc++. I always assumed that $prefix/lib/clang/$version was the path for runtimes hence suggesting it.

In our toolchain, we would like to build runtimes for all host and target
platforms we support, e.g. a single toolchain will have runtimes for
x86_64 and aarch64 Linux, Fuchsia and Windows. All you need to provide
is the target triple and the sysroot. While this is possible with
builtins, sanitizers and other compiler-rt runtimes, it’s not possible
with libc++, libc++abi, libunwind.

Our proposal is to move both compiler-rt and libc++, libc++abi,
libunwind into a new location that would support cross-compilation and
unify the layout:

headers: $prefix/lib/clang/$version/include(/$triple)(/c++/v1)
libraries: $prefix/lib/clang/$version/$triple/lib/$name.$ext

I don’t think it’s a good idea to tie all the runtimes to the compiler
like that. It makes sense for the builtins to live there since they are
heavily coupled with the specific compiler version, but I’m not
convinced libc++/libc++abi/libunwind should.

They may be less tied than sanitizers because they don’t rely on compiler instrumentation, but there’s still some dependency, e.g. in order to use coroutines in libc++ I need a version of Clang that has the appropriate intrinsics.

Or a version of GCC that supports them. Clang isn’t the only client of these runtimes.

Yes, but in that case you’re likely going to build them separately from LLVM as a standalone project, in which case they’re going to use the standard layout, same as today. What I’m proposing is intended for the llvm/runtimes build (I’m not sure what better to call it), i.e. when cross-compiling runtimes as part of the LLVM toolchain build.

I could build libc++ against an older version of Clang which will disable features not supported by that Clang version, but that version is again tied to that version of Clang.

This means that for compiler-rt, the main difference would be moving the
runtime libraries to a target-specific subdirectory rather than
including the architecture in the library name; for libc++, libc++abi,
libunwind, both headers and libraries will be moved to a new,
target-specific location.

In terms of implementation, we’ll need to modify the Clang driver to use
this location when looking for runtimes and C++ libraries and headers.
This should be a non-intrusive change: we’ll modify the driver to look
into the new location in addition to the existing ones, so if the new
path doesn’t exist, the driver will simply fall back to the existing
behavior. When this is done, we need to modify the CMake build to
install files into the new location.

This layout would be only used when runtimes are built as part of the
llvm/runtimes tree or using LLVM_ENABLE_RUNTIMES in the monorepo layout
(because this setup supports cross-compiling runtimes). When built as
part of LLVM or standalone, libc++, libc++abi, libunwind would still
install their files to $prefix/include and $prefix/lib as today.

Once the overall scheme is agreed upon, we could also consider
de-duplicating C++ headers across targets, by moving shared headers into
a common directory, with only varying subset in include/$triple.

At the very least the __config_site headers cannot be de-duplicated.

I think it should be sufficient to have separate __config headers for separate targets, that’s really the only thing that differs.

Beyond that, I strongly disagree with de-duplicating them in general
because it will break --sysroot= unless you add a bunch of symlinks…
which don’t exist on every host platform.

I don’t think you need symlinks, all you need is to put both include/c++/v1 and include/$triple/c++/v1 to your include paths, where the former contains all the common headers while the latter contains the target specific __config. This is exactly what GNU “multi-arch” layout does.

I guess, but then you’re pulling pieces out of the sysroot, and still breaking what --sysroot= is for.

If you have a sysroot that contains everything you need and all you want from Clang is the tooling then yes, but different users have different use cases.

I don't see any reason for wanting to do that. libc++'s headers are
target independent. There is no magic configuration file involved.

Joerg