Unifying CMake variable names used in checks across subprojects

We’ve been using the runtimes build for a while now and we’re very happy with it. However, as the number of targets grows, it can be fairly slow, and I have noticed that we now spend more time in CMake than in Ninja. There are various ways we could improve things, such as eliminating unnecessary checks.

When running checks like check_c_compiler_flag, check_cxx_compiler_flag or check_library_exists, CMake caches the resulting variable and doesn’t run the check again. The problem is that in LLVM, each subproject uses different variable names for results of these checks. For example, most subprojects check if pthread is available and store the result in:

COMPILER_RT_HAS_LIBPTHREAD (compiler-rt)
LIBCXX_HAS_PTHREAD_LIB (libc++)
LIBCXXABI_HAS_PTHREAD_LIB (libc++abi)
LIBUNWIND_HAS_PTHREAD_LIB (libunwind)
HAVE_LIBPTHREAD (llvm)

This means that even though this check would ideally be performed just once (per target) and reused everywhere, it’s performed 5 times. The same is true for most flags and library checks.
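To make the caching behavior concrete, here is a minimal sketch of a standalone CMakeLists.txt. The project name and the choice of pthread_create as the probed symbol are assumptions for illustration; the variable names are the ones listed above.

```cmake
cmake_minimum_required(VERSION 3.13)
project(cache_demo C)  # hypothetical project name

include(CheckLibraryExists)

# First call: runs a try-compile against libpthread and stores the
# result in the cache under COMPILER_RT_HAS_LIBPTHREAD.
check_library_exists(pthread pthread_create "" COMPILER_RT_HAS_LIBPTHREAD)

# Different result variable: there is no cache entry under this name,
# so the same try-compile runs a second time.
check_library_exists(pthread pthread_create "" LIBCXX_HAS_PTHREAD_LIB)

# Same result variable as the first call: the cached value is reused
# and no new try-compile is performed.
check_library_exists(pthread pthread_create "" COMPILER_RT_HAS_LIBPTHREAD)
```

On a fresh build directory, the configure output should show the check being performed for the first two calls only. If every subproject used the same result variable, only the first call would do any work.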

I think that this is really unnecessary and could be easily improved by unifying CMake variable names used in checks across subprojects to benefit from caching.

I’ve looked at naming conventions used across all subprojects and I’m proposing the following:

C_SUPPORTS_${mangled_name}_FLAG for check_c_compiler_flag
CXX_SUPPORTS_${mangled_name}_FLAG for check_cxx_compiler_flag
HAVE_${mangled_name} for check_library_exists

Note: It’d be more consistent for check_library_exists to use HAS_${mangled_name}_LIB but that’s going to cause more churn in LLVM so that’s something to consider.
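As a rough sketch of how the proposed naming could be applied, the hypothetical helper below derives the result variable from the flag and runs the check under that name. The function name check_cxx_flag_shared and the mangling rule (strip leading dashes, replace non-alphanumeric characters with underscores, uppercase) are assumptions and not part of the proposal. It assumes a project with the CXX language enabled.

```cmake
include(CheckCXXCompilerFlag)

# Hypothetical helper: check `flag` under a cache variable whose name
# depends only on the flag, so every subproject shares the result.
# The name of that variable is returned in `out_var`.
function(check_cxx_flag_shared flag out_var)
  # Assumed mangling: "-fno-exceptions" becomes "FNO_EXCEPTIONS".
  string(REGEX REPLACE "^-+" "" mangled "${flag}")
  string(REGEX REPLACE "[^A-Za-z0-9]" "_" mangled "${mangled}")
  string(TOUPPER "${mangled}" mangled)

  set(result_var "CXX_SUPPORTS_${mangled}_FLAG")
  # The result is cached, so later calls with the same flag are free.
  check_cxx_compiler_flag("${flag}" ${result_var})
  set(${out_var} "${result_var}" PARENT_SCOPE)
endfunction()

# Usage: the result is cached as CXX_SUPPORTS_FNO_EXCEPTIONS_FLAG.
check_cxx_flag_shared(-fno-exceptions no_exceptions_var)
if(${no_exceptions_var})
  message(STATUS "-fno-exceptions is supported")
endif()
```

One caveat with this particular mangling is that flags differing only in case (or only in punctuation) would map to the same variable, so a real implementation would need to handle such collisions.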

This change should be mostly invisible to LLVM developers (except for the handful of build maintainers), but it should considerably speed up the runtimes build and hopefully pave the way to eventually hoist most of the common CMake logic into a shared location.

I’m happy to implement this change, but I want to get your opinion on the proposal as well as the proposed naming.

From the “not largely affected” camp:

  • the churn doesn’t feel that major for HAS_ and …
  • the uniformity feels nice

and in general feels nice and in pursuit of the longer term goals here.

-eric

IMO, these issues are a manifestation of the fact that we basically have (at least) 4 times the same overall logic, once for each runtime project: compiler-rt, libunwind, libcxxabi, libcxx.

At the end of the day, what we’re trying to achieve is to link against the right system libraries when building the various runtimes. Would it make sense to instead bundle together the logic of searching for these libraries and adding the right compiler flags? We could use interface targets to achieve that. IOW, from libc++’s CMake, I’d love to just be able to write:

target_link_libraries(cxx PUBLIC runtimes-system-libraries)

This would add -lpthread, -lgcc -lgcc_s, -lSystem or whatever else is needed on the system. I think this approach would provide more build system simplification and be more robust in the long term than relying on a naming convention to achieve sharing. What do you think?
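A minimal sketch of what the defining side of such an interface target might look like, assuming it lives in some shared CMake file that every runtime includes. Only the target name runtimes-system-libraries comes from the message above; the pthread check and its result variable are illustrative, and a real version would cover the other libraries as well.

```cmake
include(CheckLibraryExists)

# An INTERFACE library builds nothing itself; it only carries usage
# requirements (libraries, flags) to whatever links against it.
add_library(runtimes-system-libraries INTERFACE)

# Probe once, in one place, under one cache variable.
check_library_exists(pthread pthread_create "" HAVE_LIBPTHREAD)
if(HAVE_LIBPTHREAD)
  target_link_libraries(runtimes-system-libraries INTERFACE pthread)
endif()

# Each runtime would then only need the single line quoted above, e.g.
# in libc++'s CMake (assuming a target named cxx exists there):
#   target_link_libraries(cxx PUBLIC runtimes-system-libraries)
```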

Louis

Using more interface libraries is definitely the right direction and a modern way to use CMake. I’m not sure if we can get to a single interface target since different runtimes have different requirements. I was assuming that we would have one interface target per dependency and use the existing CMake support where it already exists, for example use the FindThreads module to handle pthreads.
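For the pthreads case, a sketch of the per-dependency approach using CMake’s FindThreads module could look like the following. The consuming target name cxx is an assumption carried over from the earlier example.

```cmake
# Prefer the -pthread compiler/linker flag over -lpthread where available.
set(THREADS_PREFER_PTHREAD_FLAG ON)

# FindThreads defines the imported target Threads::Threads.
find_package(Threads REQUIRED)

# A runtime that needs threads links against the imported target
# instead of running its own check (assumes a target named cxx exists):
#   target_link_libraries(cxx PRIVATE Threads::Threads)
```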

I mostly want to ensure that we’re not letting the perfect be the enemy of the good. We’ve been talking about more major CMake refactorings for some time, but we haven’t made much progress so far, partially because nobody has a clear idea what the end state is going to look like. I think that this proposal can be implemented pretty quickly (it’s mostly just a bunch of grep & sed) and while it’s not the end state we want, it’s a stepping stone which would make an immediate impact on users. After this change, we can start introducing interface targets and later factoring those out once we make more progress on setting up the common CMake infrastructure. Does that make sense?

Sure, I agree. It is indeed difficult to make time for (and justify) improving the build system. I’ll take a look if you send a patch.

Louis

I’d be interested to know what you think of this proposal, which is along the same lines: https://reviews.llvm.org/D85140

Steve

I think that’s interesting, but it’s somewhat orthogonal to what I was talking about IIUC. Your proposal revolves around standardizing some of the parameters used to customize the build of various LLVM projects, whereas what I was suggesting is standardizing how different LLVM projects (only the runtimes to be specific) link against base system libraries.

I also don’t think that a standardization like the one proposed in D85140 makes sense for all projects. For example, it doesn’t strike me that it makes sense for the runtime projects to have a (e.g.) LIBCXX_INSTALL_UTILS CMake variable. It might make sense for some other projects though.

Louis