Tooling RFC: Find build root from source root with a symlink?

Many projects/build systems have the concept of a “build tree” that can be separate from the source tree. (e.g. CMake supports this, and LLVM recommends this approach).

Tools sometimes need to find the build tree.
I’d like to suggest a convention for the Tooling library to support:
If $SRC/buildroot exists, it’s the default build directory for the source tree $SRC.
Otherwise, the build directory is $SRC itself.
$SRC/buildroot may be a symlink.
Configuration files like compile_commands.json are searched for in the build directory.
(Bikeshedding the “buildroot” string is welcome. “build” will conflict too often, sadly).
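
As a rough illustration of the rule above (not the Tooling implementation), the lookup for a single source root could be sketched in Python as follows. The function name default_build_dir and the BUILDROOT constant are hypothetical, chosen only for this example.

```python
import os

# Name proposed in this thread; the exact string is still open to bikeshedding.
BUILDROOT = "buildroot"

def default_build_dir(src):
    """Return SRC/buildroot if it exists, otherwise SRC itself."""
    candidate = os.path.join(src, BUILDROOT)
    # os.path.isdir follows symlinks, so a symlinked buildroot is accepted
    # in the same way as a physical directory.
    if os.path.isdir(candidate):
        return candidate
    return src
```

A dangling symlink fails the isdir check in this sketch, so the lookup falls back to $SRC in that case. Whether that is the desired behaviour is an assumption here, not something the proposal specifies.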

Implications:

  • for /foo/bar/baz.cc, we’d search for compile_commands.json in {/foo/bar/buildroot/,/foo/bar/, /foo/buildroot/, etc}. This is backwards-compatible.
  • where users currently symlink compile_commands.json itself, they could create the buildroot symlink instead. This would work in the same way, and enable new cases.
  • if we implement e.g. a Ninja-backed compilation DB that doesn’t need compile_commands.json, the same symlink convention would work.
  • it provides a simple model for multi-configuration-aware tools: a configuration is defined by a build dir, there’s a default configuration, and tools can let the user override the build dir. For example, clangd can write its index files into the build dir instead of the source dir, which is multi-configuration-friendly (tinyurl.com/clangd-automatic-index)
  • non-clang tools that consume compile_commands.json will need to be updated over time.
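
To make the search order in the first bullet concrete, here is a minimal Python sketch of walking up from a source file and probing buildroot before each directory itself. The helper names candidate_dirs and find_compilation_database are made up for this example and are not part of any existing tool.

```python
import os

BUILDROOT = "buildroot"  # proposed name, assumed for illustration

def candidate_dirs(source_file):
    """Yield directories to probe, nearest first: DIR/buildroot, then DIR."""
    d = os.path.dirname(os.path.abspath(source_file))
    while True:
        yield os.path.join(d, BUILDROOT)
        yield d
        parent = os.path.dirname(d)
        if parent == d:  # reached the filesystem root
            break
        d = parent

def find_compilation_database(source_file):
    """Return the first compile_commands.json found, or None."""
    for d in candidate_dirs(source_file):
        path = os.path.join(d, "compile_commands.json")
        if os.path.isfile(path):
            return path
    return None
```

On a POSIX system, candidate_dirs("/foo/bar/baz.cc") starts with /foo/bar/buildroot, /foo/bar, /foo/buildroot, matching the order given above. Because each plain directory is still probed, existing setups with compile_commands.json in or symlinked into the source tree keep working.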

What do you think?
Cheers, Sam

Is "buildroot" supposed to be a hardcoded constant?
Personally, I have never named any of my build dirs that. They
were always named "build".
Also, what if there are multiple build trees (build-clang-release,
build-gcc-release, ...)?

Roman.

Is “buildroot” supposed to be a hardcoded constant?

Yes, I don’t mind what the constant is, but this needs to be a constant for the discovery to work.

Personally, I have never named any of my build dirs that. They
were always named “build”.

Unfortunately people also name other types of directories “build”, such as those containing checked-in build scripts.
If you had a project set up with your build-dir as $SRC/build, then you’d need to ln -s $SRC/build $SRC/buildroot.

Also, what if there are multiple build trees - build-clang-release,
build-gcc-release, … ?

Source-based tools (clang-tidy, code completion, etc) need to be able to pick a default configuration without user interaction, for usability.
You should symlink one of these to $SRC/buildroot, to use as the default. Tools should provide some way to override this default.
(Possibly CMake etc could create this symlink if it doesn’t exist, i.e. when you create the first build dir).
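
A small Python sketch of the two ideas above: a user override that takes precedence over the default, and a build tool creating the symlink only when none exists yet. Both function names are hypothetical, and nothing here reflects actual CMake or clang behaviour.

```python
import os

BUILDROOT = "buildroot"  # proposed name, assumed for illustration

def choose_build_dir(src, override=None):
    """An explicit override wins; otherwise use SRC/buildroot, else SRC."""
    if override is not None:
        return override
    default = os.path.join(src, BUILDROOT)
    return default if os.path.isdir(default) else src

def ensure_default_symlink(src, build_dir):
    """Point SRC/buildroot at build_dir unless something is already there.

    Returns True if the link was created, False if an existing entry was kept.
    """
    link = os.path.join(src, BUILDROOT)
    if os.path.lexists(link):  # also true for a dangling symlink
        return False
    os.symlink(build_dir, link, target_is_directory=True)
    return True
```

With build-clang-release configured first and build-gcc-release second, the first call creates the link and the second leaves it alone, so the first build dir stays the default until the user repoints the symlink or passes an override.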

+1, I’ve been thinking about this as well, since as a heavy CMake user I end up setting --compile-commands-dir all the time. And I’m on Windows, where symlinks are generally suspect. Having clangd do this for us would definitely be helpful.

Though, one for the bikeshed. As an example, the llvm .gitignore suggests using /build, as do many other projects. I’m just wondering if the cost of conflicts outweighs the convenience of not having to change.

Doug.

+1, I’ve been thinking about this as well, since as a heavy CMake user I end up setting --compile-commands-dir all the time. And I’m on Windows, where symlinks are generally suspect. Having clangd do this for us would definitely be helpful.

Just trying to unpack this a bit:

  • today, it’s possible to get this to work by symlinking src/compile_commands.json → build/compile_commands.json. I’m proposing to generalize this by supporting symlinking src/buildroot → build/. But if symlinks don’t work for you now, I don’t think this would fix that problem.
  • you could indeed put your physical build dir in src/buildroot, though obviously this limits your flexibility.

Though, one for the bikeshed. As an example, the llvm .gitignore suggests using /build, as do many other projects. I’m just wondering if the cost of conflicts outweighs the convenience of not having to change.

So we could check src/build and if there’s no CDB there, fall back to checking src/.
The problem is that I don’t know what projects that already have a src/build directory and also use an external build dir are supposed to do. Chromium is one such project. We would probably also need a fallback to src/buildroot for them, which increases the number of stat calls we need to make when walking up from your source files to find the root.
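
A sketch of that fallback variant, in Python for illustration only. Here a candidate directory counts only if it actually contains compile_commands.json, so a src/build that holds checked-in scripts is skipped. The function name and the probe order (build before buildroot) are assumptions, not something settled in this thread.

```python
import os

def resolve_build_dir(src, names=("build", "buildroot")):
    """Return the first SRC/<name> containing compile_commands.json, else SRC.

    Each extra entry in `names` costs one more filesystem probe per
    directory level visited during the upward walk.
    """
    for name in names:
        d = os.path.join(src, name)
        if os.path.isfile(os.path.join(d, "compile_commands.json")):
            return d
    return src
```

Note that this variant keys off the compilation database file itself, so it would not cover a database that does not use compile_commands.json, such as the Ninja-backed one mentioned earlier.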

(I put a basic implementation in https://reviews.llvm.org/D53145 in case anyone is curious what this would look like)

Consider using a dot-prefixed name such as ‘.buildroot’ to hide it from a normal directory listing.