RFC: Supporting clang-cl /MP

Hi folks,

With the recent support for Clang in Visual Studio 2019 [1], I thought we should revive and eventually land the /MP proposal I made a while ago [2].

As it stands now, building with clang-cl is much slower than building with MSVC when driven by MSBuild; see the timings in the summary [2].

With this patch, the situation is greatly improved: clang-cl /MP performance is better in all cases [5].

I think we should add it; it would benefit a lot of users.

Most of the changes in the demo patch in [2] are: modernization of the process launching API (ProcessInfo), and support for sys::WaitMany, which I’d like to commit separately and incrementally.

Support for /MP is very small and mostly located in clang/trunk/lib/Driver/Compilation.cpp in the demo patch.

+1 for incremental patches

We could subsequently add a new flag -j to the regular clang driver, which mimics /MP. Further down the road, I’d like to discuss optional support for concrt multithreading which, from some preliminary testing, would be much faster than the current cc1 sub-process invocation, at least on Windows 10 1703+.

Some people feel strongly that the compiler should not have a hard dependency on threading libraries, so I’m not sure we should go this far. The -cc1 process separation is mainly used for crash recovery and for compiler developers to separate the filling in of a bunch of default header search paths and flags from the user interface. I think as long as we implement some kind of crash recovery mechanism, I’d be in favor of eliminating the -cc1 process for most compiles.

Hi folks,

With the recent support for Clang in Visual Studio 2019 [1], I thought we should revive and eventually land the /MP proposal I made a while ago [2].

As it stands now, building with clang-cl is much slower than building with MSVC when driven by MSBuild; see the timings in the summary [2].

With this patch, the situation is greatly improved: clang-cl /MP performance is better in all cases [5].


When the /MP option is active (the default) in a Visual Studio project [3], MSBuild issues commands such as:

clang-cl /MP file-A.cpp file-B.cpp file-C.cpp

This creates a root clang-cl process, which then launches a cc1 sub-process for each .cpp file in the list, one after another. This is slow and does not use the full potential of modern multi-core machines.

A workaround is to increase the number of parallel builds [4], but that is sub-optimal and is still slower than MSVC.

The /MP proposal I made in [2] simply maintains a list of running cc1 sub-processes, one per hardware core (or up to the number of cores given as an option to /MP).

Waiting for termination in the root clang-cl process is done through a single OS primitive (see sys::WaitMany), which temporarily suspends the root process until at least one cc1 sub-process finishes. This ensures no cycles are wasted waiting, and avoids useless context switches to the root process.
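
To make the scheduling idea above concrete, here is a minimal sketch. It is not the code from the patch in [2]. It assumes a POSIX system, where a blocking waitpid(-1, ...) stands in for what the proposal calls sys::WaitMany (on Windows the analogous primitive would presumably be WaitForMultipleObjects). The names PoolResult, spawnJob, reapOne and runJobs are made up for illustration.

```cpp
#include <algorithm>
#include <cerrno>
#include <cstddef>
#include <string>
#include <vector>

#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical summary type for this sketch; not part of the patch in [2].
struct PoolResult {
  std::size_t Launched = 0;    // jobs that were started successfully
  std::size_t Failed = 0;      // jobs that could not start or exited non-zero
  std::size_t PeakRunning = 0; // most children alive at the same time
};

// Start one child process for Argv. Returns its pid, or -1 on failure.
static pid_t spawnJob(const std::vector<std::string> &Argv) {
  if (Argv.empty())
    return -1;
  std::vector<char *> Raw;
  for (const std::string &Arg : Argv)
    Raw.push_back(const_cast<char *>(Arg.c_str()));
  Raw.push_back(nullptr);

  pid_t Pid = fork();
  if (Pid == 0) {
    execvp(Raw[0], Raw.data());
    _exit(127); // only reached if exec failed
  }
  return Pid;
}

// Sleep in the kernel until any one child exits, then record its status.
static void reapOne(PoolResult &Result, std::size_t &Running) {
  int Status = 0;
  pid_t Pid;
  do {
    Pid = waitpid(-1, &Status, 0);
  } while (Pid < 0 && errno == EINTR);

  if (Pid < 0) { // nothing left to wait for; avoid spinning
    Running = 0;
    return;
  }
  --Running;
  if (!WIFEXITED(Status) || WEXITSTATUS(Status) != 0)
    ++Result.Failed;
}

// Run every job, keeping at most MaxParallel children alive at once.
// MaxParallel == 0 means "one per online core" (assumption: the platform
// provides _SC_NPROCESSORS_ONLN, as Linux, macOS and the BSDs do).
PoolResult runJobs(const std::vector<std::vector<std::string>> &Jobs,
                   std::size_t MaxParallel) {
  if (MaxParallel == 0) {
    long Cores = sysconf(_SC_NPROCESSORS_ONLN);
    MaxParallel = Cores > 0 ? static_cast<std::size_t>(Cores) : 1;
  }
  PoolResult Result;
  std::size_t Running = 0;
  for (const auto &Job : Jobs) {
    while (Running >= MaxParallel) // pool is full: block until a slot frees up
      reapOne(Result, Running);
    if (spawnJob(Job) > 0) {
      ++Running;
      ++Result.Launched;
      Result.PeakRunning = std::max(Result.PeakRunning, Running);
    } else {
      ++Result.Failed;
    }
  }
  while (Running > 0) // drain the remaining children
    reapOne(Result, Running);
  return Result;
}
```

Calling runJobs(Jobs, 0) with one argument vector per source file would keep at most one child per online core busy. The parent makes no progress, and burns no cycles, between a launch and the next child exit, which is the property the paragraph above is after.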


Most of the changes in the demo patch in [2] are: modernization of the process launching API (ProcessInfo), and support for sys::WaitMany, which I’d like to commit separately and incrementally.

Support for /MP is very small and mostly located in clang/trunk/lib/Driver/Compilation.cpp in the demo patch.

We could subsequently add a new flag -j to the regular clang driver, which mimics /MP. Further down the road, I’d like to discuss optional support for concrt multithreading which, from some preliminary testing, would be much faster than the current cc1 sub-process invocation, at least on Windows 10 1703+.

(Folks working on Microsoft’s standard library said they’ve moved off concrt, and the one use of it in LLVM is incorrect according to them, see https://bugs.llvm.org/show_bug.cgi?id=41198)