Single instance 'clang', multi-target support for headers and libraries

This question is, I believe, mostly a CLang issue, though there are some interactions with LLVM, LibC++, Compiler-RT, LibCLC, and so on.

For the most part the compilers I work on are cross-compilers for embedded systems with varying degrees of OS support (from none at all, to pretty much everything), and things like the local host’s suite of header files are entirely inappropriate.

If I configure and build GCC for a particular target triple with a specific C library (e.g.: ‘glibc’, ‘newlib’, ‘uclibc’, ‘musl’), then the corresponding suites of headers and libraries are particular to that configuration.

The built cross-compiler has a sub-directory named after the triple (e.g.: ‘foo-unknown-elf’) which contains the ‘include’, ‘bin’ and ‘lib’ directories specific to this particular configuration, and this allows me to also configure and build GCC for multiple targets with different choices of supporting C libraries, and they can co-exist in the same base directory without cross contaminating each other.
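As a minimal sketch of that layout (the base directory '/opt/toolchain' and the triple 'bar-unknown-elf' are made-up names for illustration), the per-triple directories can be computed like this:

```python
from pathlib import PurePosixPath


def target_dirs(prefix: str, triple: str) -> dict[str, PurePosixPath]:
    """Return the per-triple 'bin', 'include' and 'lib' directories under prefix."""
    root = PurePosixPath(prefix) / triple
    return {name: root / name for name in ("bin", "include", "lib")}


# Two targets share one base directory without overlapping.
foo = target_dirs("/opt/toolchain", "foo-unknown-elf")
bar = target_dirs("/opt/toolchain", "bar-unknown-elf")
print(foo["include"])  # /opt/toolchain/foo-unknown-elf/include
print(bar["lib"])      # /opt/toolchain/bar-unknown-elf/lib
```

Because every path is rooted at the triple, no header or library of one configuration can shadow another's.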

I would like to do something similar with CLang/LLVM but there does not appear to be a convention for doing this, even though CLang (unlike GCC) can have a single compiler instance that can handle multiple targets.

A related aspect is that if I want to also build the supporting libraries for the selected target (e.g.: ‘libc++’, ‘compiler-rt’), then I really need the CMake configuration tools for the LLVM projects to be able to incorporate building the selected C library in advance of cross-compiling these other libraries; that is, the ‘include’ files from the selected C library would need to be staged as part of the build process, and the libraries (e.g.: ‘libc’, ‘libm’) would also need to be built for this cross-compiled target, before ‘libc++’ or the ‘compiler-rt’ libraries are built.
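The ordering constraint described above can be written down as a small dependency graph. This is only a sketch: the step names are illustrative labels, not real build targets in any LLVM CMake file, and the edges simply encode the order stated in the paragraph.

```python
from graphlib import TopologicalSorter

# Each step maps to the set of steps that must finish before it starts.
# Step names are illustrative labels, not real build targets.
DEPENDS_ON = {
    "clang": set(),
    "stage-libc-headers": {"clang"},
    "libc": {"stage-libc-headers"},
    "libm": {"stage-libc-headers"},
    "compiler-rt": {"libc", "libm"},
    "libc++": {"libc", "libm"},
}


def build_order(deps):
    """Return one valid order in which the steps can be run."""
    return list(TopologicalSorter(deps).static_order())


for step in build_order(DEPENDS_ON):
    print(step)
```

Several orders are valid (for example 'libc' and 'libm' can swap places), so the sketch only guarantees that each step comes after everything it depends on.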

Typically I would like to build CLang and libraries to support one or more cross-targets plus the local host target, and then choose which target I want at runtime using the triple. For example, let’s say I want to support two targets ‘foo’ and ‘bar’, plus the local target; then I would like to configure my build (one possible configuration choice) so that:

· target ‘foo’ prepares and builds the C library based on Newlib

· target ‘bar’ prepares and builds the C library based on uClibc

· local target uses the existing installed C library support on the host

Are there any plans within the LLVM community to standardise how to do this kind of build? The CLang Driver would need to adopt conventions that would allow this, and I think that perhaps a new CMake module could be added that would optionally prep & stage a selected and supported C library for a cross-compiler configuration, and that this could be made an optional dependency of the libraries which are to be cross-compiled (‘libc++’ and ‘compiler-rt’).

Related to this is how to build and manage the equivalent of GCC’s MULTILIBs for cross-compiler targets in a regularised way.

Does anybody have any experience of doing this kind of thing with CLang, and advice on how I should approach this? So far I have done this in an ad-hoc way for each of my intended targets, and use intermediate non-integrated build processes to ensure that the right headers and C libraries are prepared in advance of being used by the subsequent dependent libraries from the LLVM project.

Thanks,

MartinO

There is no need to do anything like that with clang. It supports
sysroot out of the box, i.e. just point it to a system image with the
headers and libraries, and the right thing will happen. Depending on
your path, you might also need -B<where-the-target-tools-are>.

Joerg

I don't think this helps, perhaps my explanation is not really clear enough.

When I configure CLang with options such as:

  -DLLVM_TARGETS_TO_BUILD="FOO;BAR;X86" -DLLVM_DEFAULT_TARGET_TRIPLE=foo

I need to also build LibC++, Compiler-RT and other LLVM libraries for each of the targets 'FOO', 'BAR' and 'X86'. And since the building of these libraries also requires the presence of the C headers in particular, the 'sysroot' is going to be different for each of the targets depending on which C library implementation the configuration needs.

So what I would like is to also seamlessly integrate the configuration and building of the required C headers and libraries for each target using the C library implementation required by that target before also building the LLVM libraries which include the C headers.
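A rough sketch of what "one runtimes build per target" could look like when driven from outside the LLVM build. The CMake variables used (CMAKE_C_COMPILER_TARGET, CMAKE_CXX_COMPILER_TARGET, CMAKE_SYSROOT, CMAKE_INSTALL_PREFIX) are standard CMake ones; the sysroot paths, the source directory and the choice to install into the sysroot are assumptions made for illustration, and whether a particular runtime configures cleanly this way for a given target is not claimed here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Target:
    triple: str
    sysroot: str  # where that target's C headers and libraries were staged


def runtimes_configure_args(target, clang, source_dir):
    """Build a CMake configure command for one target's runtime libraries."""
    return [
        "cmake", "-S", source_dir, "-B", f"build-{target.triple}",
        f"-DCMAKE_C_COMPILER={clang}",
        f"-DCMAKE_CXX_COMPILER={clang}++",
        f"-DCMAKE_C_COMPILER_TARGET={target.triple}",
        f"-DCMAKE_CXX_COMPILER_TARGET={target.triple}",
        f"-DCMAKE_SYSROOT={target.sysroot}",
        # Assumption: install the runtimes next to the C library they use.
        f"-DCMAKE_INSTALL_PREFIX={target.sysroot}",
    ]


# Hypothetical targets and paths.
TARGETS = [
    Target("foo-unknown-elf", "/opt/sysroots/foo"),
    Target("bar-unknown-elf", "/opt/sysroots/bar"),
]

for t in TARGETS:
    print(" ".join(runtimes_configure_args(t, "/opt/llvm/bin/clang", "path/to/runtimes-source")))
```

Each target gets its own build directory, so the configurations stay independent even though they share the one just-built 'clang'.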

Later when using the resulting clang executable, I would like to be able to say something like:

  clang --target=bar helloworld.c -o helloworld.bar.out
  clang --target=x86_64 helloworld.c -o helloworld.x86.out
  clang helloworld.c -o helloworld.foo.out

and have each invocation use different sets of C headers and target libraries corresponding to the target selected.
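One way to approximate this behaviour outside the driver is a thin wrapper that looks at the requested target and adds the matching '--sysroot'. The sketch below is an assumption-laden illustration, not a proposal for the driver itself: the sysroot paths are hypothetical, and the host target is assumed to need no sysroot at all.

```python
import shlex

# Hypothetical per-target sysroots; None means "use the host's installed C library".
SYSROOTS = {
    "foo": "/opt/sysroots/foo",
    "bar": "/opt/sysroots/bar",
    "x86_64": None,
}
DEFAULT_TARGET = "foo"  # mirrors -DLLVM_DEFAULT_TARGET_TRIPLE=foo


def selected_target(args):
    """Return the target named by --target=<t> or -target <t>, else the default."""
    target = DEFAULT_TARGET
    for i, arg in enumerate(args):
        if arg.startswith("--target="):
            target = arg[len("--target="):]
        elif arg == "-target" and i + 1 < len(args):
            target = args[i + 1]
    return target


def expand(args):
    """Prepend --sysroot for the selected target unless the caller gave one."""
    if any(a.startswith("--sysroot") for a in args):
        return ["clang", *args]
    sysroot = SYSROOTS.get(selected_target(args))
    extra = [f"--sysroot={sysroot}"] if sysroot else []
    return ["clang", *extra, *args]


print(shlex.join(expand(["--target=bar", "helloworld.c", "-o", "helloworld.bar.out"])))
# clang --sysroot=/opt/sysroots/bar --target=bar helloworld.c -o helloworld.bar.out
```

A real wrapper would pass the expanded list to the actual 'clang' binary; the sketch only prints it so the mapping is easy to inspect.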

Currently I build the clang executable for the target set, but not the libraries. Next, using an external non-integrated custom build system, I pre-configure the header set for the target (Newlib's '<stdio.h>' is not the same as musl's, for instance), then build the libraries and object files for the target, and install them in separate target-specific subdirectories. I also have to adapt/hack the CLang Driver to select the corresponding header directories and libraries in the appropriate way.

What I would like is for the normal CLang build to have a set of conventions that allowed such configurations to be able to automatically configure the required C library headers, and build the libraries for each of the targets using the just-built 'clang'. Building LibC++ and Compiler-RT libraries for cross compilers has improved a bit over the past two years, but it still cannot handle the multiple-target scenario - or if it can, it is just not obvious to me how to do so.

Thanks,

  MartinO

Hello Martin,

Sadly I'm not sure I can offer you much advice, I've put some comments
and questions inline.

This question is, I believe, mostly a CLang issue, though there are some
interactions with LLVM, LibC++, Compiler-RT, LibCLC, and so on.

I suspect that this is more of a Toolchain creation issue rather than
just clang. From what I can surmise there are two parts to this:
- How do I build an embedded toolchain based on clang that has
per-target C libraries, compiler-rt, libcxx etc.
- Given such a toolchain how do we do as much of the library/include
selection based on the target.

For the most part the compilers I work on are cross-compilers for embedded
systems with varying degrees of OS support (from none at all, to pretty much
everything), and things like the local host’s suite of header files are
entirely inappropriate.

Agreed.

If I configure and build GCC for a particular target triple with a specific
C library (e.g.: ‘glibc’, ‘newlib’, ‘uclibc’, ‘musl’), then the
corresponding suites of headers and libraries are particular to that
configuration.

I'm unfortunately not that familiar with building GCC. Just to
confirm: do you mean that these are the libraries/includes that the
GCC cross compiler will use, and not the libraries/includes that the
GCC cross compiler will be built with?

The built cross-compiler has a sub-directory named after the triple (e.g.:
‘foo-unknown-elf’) which contains the ‘include’, ‘bin’ and ‘lib’ directories
specific to this particular configuration, and this allows me to also
configure and build GCC for multiple targets with different choices of
supporting C libraries, and they can co-exist in the same base directory
without cross contaminating each other.

By cross-compiler do you mean the cross toolchain? For example my gcc
embedded toolchain has a (simplified) dir structure:
toolchain/arm-none-eabi/bin
toolchain/arm-none-eabi/lib
toolchain/arm-none-eabi/include
There is also
toolchain/lib/gcc/arm-none-eabi/7.2.1/...
With the main compiler binary in
toolchain/bin/arm-none-eabi-gcc (alongside the other binutils)

Do you mean the toolchain/(triple) directory here? If so I'm guessing
that this is more along the lines of toolchain construction than
building clang.

I would like to do something similar with CLang/LLVM but there does not
appear to be a convention for doing this, even though CLang (unlike GCC) can
have a single compiler instance that can handle multiple targets.

I'm not aware of such a convention either (at least outside the resource dir).

A related aspect is that if I want to also build the supporting libraries
for the selected target (e.g.: ‘libc++’, ‘compiler-rt’), then I really need
the CMake configuration tools for the LLVM projects to be able to
incorporate building the selected C library in advance of cross-compiling
these other libraries; that is, the ‘include’ files from the selected C
library would need to be staged as part of the build process, and the
libraries (e.g.: ‘libc’, ‘libm’) would also need to be built for this
cross-compiled target, before ‘libc++’ or the ‘compiler-rt’ libraries are
built.

If I understand you correctly, you want to integrate the build system
of third party C-libraries into CMake? Or is it limited to just
selecting the header files from each of the C-libraries?

AFAIK there is a cmake recipe to build compiler-rt (when placed in
the runtimes directory) for multiple arm M class configurations. I
think adding more clang recipes might be one way to go.

Typically I would like to build CLang and libraries to support one or more
cross-targets plus the local host target, and then choose which target I
want at runtime using the triple. For example, let’s say I want to support
two targets ‘foo’ and ‘bar’, plus the local target; then I would like to
configure my build (one possible configuration choice) so that:

· target ‘foo’ prepares and builds the C library based on Newlib

· target ‘bar’ prepares and builds the C library based on uClibc

· local target uses the existing installed C library support on the
host

Are there any plans within the LLVM community to standardise how to do this
kind of build? The CLang Driver would need to adopt conventions that would
allow this, and I think that perhaps a new CMake module could be added that
would optionally prep & stage a selected and supported C library for a
cross-compiler configuration, and that this could be made an optional
dependency of the libraries which are to be cross-compiled (‘libc++’ and
‘compiler-rt’).

I think that this would probably be classed as a toolchain build
outside of the scope of the LLVM project. I guess the main problem is
where to stop: there are a huge number of possible combinations of
libraries and configurations. Which ones would be supported, and who
would test them? I think that there is scope for collaboration and
improvement in this area, but it might be best to start outside of the
existing build system.

Related to this is how to build and manage the equivalent of GCC’s MULTILIBs
for cross-compiler targets in a regularised way.

I certainly think that improvements to how clang could work with
MULTILIB would be welcome. My understanding is that there is some
driver specific multilib support but nothing in the existing BareMetal
driver. I think that this would be a great place to start in improving
clang support for embedded toolchains.
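To make the multilib idea concrete, here is a deliberately simplified sketch of selecting a library subdirectory from the command-line flags. It is not GCC's actual matching algorithm and not anything the clang driver implements; the table and the directory names 'thumb/hard' and 'thumb' are hypothetical, while '-mthumb' and '-mfloat-abi=hard' are ordinary ARM options used as examples.

```python
# Hypothetical multilib table: each variant lists the flags it was built with
# and the subdirectory holding its libraries. Most specific variants come first.
MULTILIBS = [
    ({"-mthumb", "-mfloat-abi=hard"}, "thumb/hard"),
    ({"-mthumb"}, "thumb"),
    (set(), "."),  # default variant, matches any command line
]


def select_multilib(flags):
    """Return the subdirectory of the first variant whose flags were all given."""
    given = set(flags)
    for required, subdir in MULTILIBS:
        if required <= given:
            return subdir
    return "."  # unreachable with the table above, kept as a safe fallback


print(select_multilib(["-mthumb", "-mfloat-abi=hard", "-O2"]))  # thumb/hard
print(select_multilib(["-O2"]))                                 # .
```

The returned subdirectory would then be appended to the target's library path, which is the part a driver would need a convention for.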

Does anybody have any experience of doing this kind of thing with CLang, and
advice on how I should approach this? So far I have done this in an ad-hoc
way for each of my intended targets, and use intermediate non-integrated
build processes to ensure that the right headers and C libraries are
prepared in advance of being used by the subsequent dependent libraries from
the LLVM project.

I think that the general problem is much harder and messier than any
one ad-hoc case. As far as I know, most people have made an ad-hoc
solution for the use case that they want to support, but there isn't
an easy way to scale it to take into account everyone's use case. I
think it would be great to start with improving the clang bare metal
driver so that we can drop a clang cross compiler into an existing gcc
embedded toolchain (with existing multilibs), or at least something
built with the same structure. This is the "Given such a toolchain how
do we do as much of the library/include selection based on the
target" part, which is, I think, a more tractable problem than
building the toolchain. I sadly don't have enough CMake or toolchain
build experience to say much about the toolchain build process. The
problem seems a bit too wide in scope to be practical; is there any
way of narrowing it down?

Peter

----- Original message -----

I don't think this helps, perhaps my explanation is not really clear enough.
(...)

generally you can crosscompile for any target (enabled in llvm) with -target/--sysroot switches:
https://clang.llvm.org/docs/CrossCompilation.html#target-triple

afaics there's a minor bug in clang driver, it searches target compiler-rt files inside clang installation tree instead of --sysroot:
https://bugs.llvm.org/show_bug.cgi?id=36053

i'm using a clang/llvm toolchain with lld linker and libcxxabi/libcxx/compiler-rt runtime for 32/64-bit linux crosscompilation and it works pretty fine.

Thanks,

Yes, this is as I expected it to be currently. This means that I cannot do:

  clang -target bar helloworld.c -o helloworld.bar.out
  clang -target x86_64 helloworld.c -o helloworld.x86.out
  clang helloworld.c -o helloworld.foo.out

but instead have to use:

  clang -target bar --sysroot=<path-for-bar> -L <libs-for-bar> -I <includes-for-bar> helloworld.c -o helloworld.bar.out
  clang -target x86_64 --sysroot=<path-for-x86_64> -L <libs-for-x86_64> -I <includes-for-x86_64> helloworld.c -o helloworld.x86.out
  clang helloworld.c -o helloworld.foo.out

It also means that the current CMake descriptions for building LibC++ and Compiler-RT cannot build these libraries for each of the targets configured for LLVM (using '-DLLVM_TARGETS_TO_BUILD=FOO;BAR;X86').

I had hoped that there was a less clunky way of doing this which I didn’t know about, or possibly in development by somebody else.

All the best,

  MartinO

Thanks Peter, responses inline ...