Dogfood part 1: The problem

In 2017 at CppCon, Chandler Carruth introduced Titus Winters and his talk “C++ as a ‘Live at Head’ Language”. That same year at Pacific++, Chandler Carruth himself gave a talk, “LLVM: A Modern, Open C++ Toolchain”, in which he worked at git HEAD.

My goal is to develop my application against the latest source code or against the latest stable code depending on the maturity of the codebase. One advantage of ‘live at HEAD’ is that I am more likely to be able to talk to the experts about changes that break my code.

For the llvm-project, this means automating a dogfood build where the project can build itself. This reduces dependencies on other code. It is possible that rogue llvm-project code could break this cycle, so restarting from a known position may be necessary. I have used Docker and Python during my experimentation.

The problem:
My environment is aarch64-linux-gnu, where writing C++ usually requires:

  • Binary utilities such as ld
  • GNU Compiler Collection including g++, libstdc++, libgcc, libatomic

I am trying to replace these with the llvm-project environment:

  • llvm tools, clang++, lld
  • libc++ and compiler-rt

Many organisations have made this transition, but I would like to remain in the <arch>-<vendor>-linux-gnu environment, which seems to be a path less travelled. I have achieved my goal by using a combination of ‘project only’ builds and ‘runtime only’ builds, but it required some patches and raises some issues. I am open to all feedback.

There are two circular dependencies within the code that I can see:

  • clang++ <-> libc++
  • libc++ <-> compiler-rt

The first is broken by the careful work of your developers, who provide the ability to build clang++ using the GNU Compiler Collection:

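import subprocess

# Assumes the current directory is the root of an llvm-project checkout.
build = 'build-functional-clang'
STAGING = '/staging'

# Configure a 'project only' build of clang and lld with the host toolchain.
subprocess.run(
    [
        'cmake',
        '-G', 'Ninja',
        '-S', 'llvm',
        '-B', build,
        '-DCMAKE_BUILD_TYPE=Release',
        '-DCMAKE_INSTALL_PREFIX=' + STAGING,
        '-DLLVM_ENABLE_PROJECTS=clang;lld',
        '-DLLVM_TARGETS_TO_BUILD=Native',
    ],
    check=True,
)
# Build everything, then install it into the staging area.
subprocess.run(['ninja', '-C', build], check=True)
subprocess.run(['ninja', '-C', build, 'install'], check=True)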

Where

  • ‘build’ is set to ‘build-functional-clang’, the first of multiple builds
  • STAGING is set to ‘/staging’. I am installing everything into a staging area because many tools, libraries and headers are located relative to clang. This wastes both time and space, but I haven’t found a way around this yet.

Sorry for the Python code; I don’t want to introduce errors by transcribing the builds, but I think you can see this is a simple ‘project only’ build. The code builds functional llvm tools, clang and lld, but using ‘ldd -v’ reveals that they use one or more of the GNU Compiler Collection’s libraries.
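A minimal sketch of how that ldd check could be automated. The function names and the list of library name prefixes are my own assumptions for illustration, not part of the original build scripts, and plain ‘ldd’ is used here rather than ‘ldd -v’ to keep the parsing simple.

```python
import subprocess

# Name prefixes of runtime libraries that ship with the GNU Compiler
# Collection (an assumed list; extend it to suit your distribution).
GNU_RUNTIME_PREFIXES = ('libstdc++', 'libgcc_s', 'libatomic')


def gnu_runtime_deps(ldd_output):
    """Return the GNU runtime libraries named in the text printed by ldd."""
    found = []
    for line in ldd_output.splitlines():
        fields = line.split()
        # Dependency lines look like: "libstdc++.so.6 => /path (0x...)".
        if len(fields) >= 2 and fields[1] == '=>':
            name = fields[0]
            if name.startswith(GNU_RUNTIME_PREFIXES) and name not in found:
                found.append(name)
    return found


def check_binary(path):
    """Run ldd on one binary and return the GNU runtime libraries it loads."""
    result = subprocess.run(
        ['ldd', path], check=True, capture_output=True, text=True
    )
    return gnu_runtime_deps(result.stdout)
```

Calling check_binary('/staging/bin/clang') after the install step would then list whichever of these libraries the staged clang still loads; an empty list would mean none of them appear.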

I am not sure whether you have a question here. Do you want to build the LLVM tools without any GNU libraries?

Yes, I am trying to create a build which is least dependent on other projects. As the llvm-project does not yet have a working libc, this seems to be the minimum dependency. However, I do not have enough experience to know if the decisions I have made to make it work are sound. So I am posting how I did it with the hope that someone will agree/disagree and hopefully apply the three changes to the code this approach needs. Dogfood is a reference to companies that use their own products in-house.

What you’re describing is already supported by the LLVM CMake build; specifically, it can be achieved by combining a 2-stage build with a bootstrapping build. This is (partially) documented on the Advanced Build Configurations page of the LLVM 16.0.0git documentation.

We use this, for example, to build the Fuchsia Clang toolchain, which is fully self-contained with the exception of the C library. That is, our toolchain is built against compiler-rt, libunwind and libc++, and ships compiler-rt, libunwind and libc++ (for multiple targets). The CMake cache files we use are part of LLVM; see Fuchsia.cmake and Fuchsia-stage2.cmake in llvm/llvm-project on GitHub (at commit f17639ea0cd30f52ac853ba2eb25518426cc3bb8).
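As a rough illustration of the bootstrapping build mentioned above, in the same Python style as the first post: this sketch only turns on the documented CLANG_ENABLE_BOOTSTRAP option and builds the ‘stage2’ target. The function names and the build directory are hypothetical, and on its own this does not switch the runtimes to compiler-rt, libunwind and libc++; that is what the cache files add on top.

```python
import subprocess


def bootstrap_configure_args(build_dir):
    """Return a cmake command line for a two-stage (bootstrap) clang build."""
    return [
        'cmake',
        '-G', 'Ninja',
        '-S', 'llvm',
        '-B', build_dir,
        '-DCMAKE_BUILD_TYPE=Release',
        '-DLLVM_ENABLE_PROJECTS=clang;lld',
        '-DLLVM_TARGETS_TO_BUILD=Native',
        # Stage 1 is built by the host compiler; stage 2 is built by stage 1.
        '-DCLANG_ENABLE_BOOTSTRAP=ON',
    ]


def run_bootstrap(build_dir):
    """Configure, then build the second-stage compiler with the first."""
    # Assumes the current directory is the root of an llvm-project checkout.
    subprocess.run(bootstrap_configure_args(build_dir), check=True)
    subprocess.run(['ninja', '-C', build_dir, 'stage2'], check=True)
```

For example, run_bootstrap('build-bootstrap') would configure and build both stages in one build tree.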

Thanks for your comments, Petrhosek; you and Louis are at the top of my shortcuts at the moment.

I did indeed start out studying the Fuchsia cache code, but it did not quite work in the <arch>-<vendor>-linux-gnu environment.

I tried for a long time tweaking flags using a bootstrap build, but although close to what I wanted, each flag I tried moved me closer in one aspect and further away in another. I found it deeply frustrating and difficult to debug as I am sure you are aware.

When Louis wrote the article in the libc++ 15.0.0git documentation, I used his ‘default build’ as my ‘runtimes only’ build and was able to make much faster progress in understanding the problems. I hope you can find the time to read the next 3 posts, which highlight the problems I encountered, and can comment on whether I have come to the right solutions to fix them.
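For readers who have not seen it, a ‘runtimes only’ build along the lines of the libc++ default build can be sketched as below. Pointing CMAKE_C_COMPILER and CMAKE_CXX_COMPILER at the staged clang, installing back into the staging area, and the function names are assumptions of this sketch rather than the exact configuration used in the later posts.

```python
import subprocess


def runtimes_configure_args(build_dir, staging):
    """Return a cmake command line for a 'runtimes only' build."""
    clang = staging + '/bin/clang'
    return [
        'cmake',
        '-G', 'Ninja',
        '-S', 'runtimes',
        '-B', build_dir,
        # Assumption: compile the runtimes with the previously staged clang.
        '-DCMAKE_C_COMPILER=' + clang,
        '-DCMAKE_CXX_COMPILER=' + clang + '++',
        '-DCMAKE_INSTALL_PREFIX=' + staging,
        '-DLLVM_ENABLE_RUNTIMES=libcxx;libcxxabi;libunwind',
    ]


def build_runtimes(build_dir, staging):
    """Configure, build and install libc++, libc++abi and libunwind."""
    # Assumes the current directory is the root of an llvm-project checkout.
    subprocess.run(runtimes_configure_args(build_dir, staging), check=True)
    subprocess.run(
        ['ninja', '-C', build_dir, 'cxx', 'cxxabi', 'unwind'], check=True
    )
    subprocess.run(
        ['ninja', '-C', build_dir,
         'install-cxx', 'install-cxxabi', 'install-unwind'],
        check=True,
    )
```

Keeping the runtimes in their own build directory, separate from the ‘project only’ build, is what makes each failure easier to isolate.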