Tablegen ridiculously slow when compiling for Debug

Hi all,

On LLVM version 7.0.1, incremental builds are very fast for both Release and Debug. I’m compiling with Xcode.

I recently downloaded LLVM 9.0 from the LLVM-mirror GitHub repository and found that incremental “Debug” builds take a ridiculously long time because Tablegen takes ages (literally more than 10 minutes) to generate files. This makes it totally unusable for debugging. However, incremental “Release” builds only take a few seconds.

Why is that? Any suggestions?

Thanks.

John Lluch

Hi,

Hi Florian,

Ok, I ran this:

cmake -S LLVM -DCMAKE_INSTALL_PREFIX=INSTALL -DLLVM_OPTIMIZED_TABLEGEN=On -G Xcode

I compiled it again from clean, and the situation is worse than before. Incremental builds spend an incredible amount of time stuck running Tablegen scripts for all targets. Now this happens in both Release and Debug configurations. Just before this, at least Release compiled fine, but that’s no longer the case.

Any other suggestions? What could actually cause this?

Thanks
John

Maybe try building LLVM as shared objects…

Hi Praveen,

Please, can you elaborate on this? What do you mean by “building as shared objects”?

Thanks,

John

The CMake BUILD_SHARED_LIBS option builds LLVM as .so libraries rather than .a archives. Linking then uses less memory, so you can increase the number of link threads and reduce your build time.

See https://llvm.org/docs/CMake.html

Hi Praveen,

Thanks for the tip, but Xcode seems to spend all the time running tablegen “custom shell scripts”, one at a time, not linking. Linking is actually very fast, possibly less than a second. The “scripts” that take longest are “AArch64CommonTableGen” and “AMDGPUCommonTableGen”. As I said, this is on LLVM 9.0.

However, on LLVM 7.0.1, the same process takes just 5-6 seconds in total, with individual “scripts” taking significantly less than 1 second each. There must be some difference between LLVM 9.0 and LLVM 7.0 that might cause this (?)

John

Hi Joan :)
Oh, I don’t have any experience with Xcode or previous versions of LLVM, though…

Are you saying that the TableGen execution isn't parallelized? That seems like an obvious Xcode-specific problem...

The TableGen executions are parallelized with cmake/ninja just fine.

Cheers,
Nicolai

This is also the case with the Visual Studio generators. Custom commands in a single CMake file essentially get written out line by line into a single batch file that gets processed as a custom build step. In the VS case this means that it can, for example, run X86 and AArch64 tablegen steps in parallel with each other, but all of the individual X86 invocations get processed serially. I can well imagine that the Xcode situation is similar, although I have no experience with it myself to know for sure.

As previously mentioned, the best solution is probably to try to adjust your workflow to use the Ninja (https://ninja-build.org) CMake generator if at all possible. It’s a bit of an adjustment but it does work very nicely with the LLVM build system.
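
For anyone who has not used the Ninja generator before, a minimal sketch of what that workflow might look like follows. The directory names `llvm` and `build-ninja` are hypothetical placeholders, Ninja itself has to be installed separately, and the `-S`/`-B` options assume a reasonably recent CMake. The script only configures and builds when the tools and the checkout are actually found.

```shell
#!/bin/sh
# Hypothetical locations; adjust to your own checkout.
SRC_DIR="${SRC_DIR:-llvm}"              # directory with LLVM's top-level CMakeLists.txt
BUILD_DIR="${BUILD_DIR:-build-ninja}"   # separate tree, so an existing Xcode tree is untouched

if command -v cmake >/dev/null 2>&1 && command -v ninja >/dev/null 2>&1 \
   && [ -f "$SRC_DIR/CMakeLists.txt" ]; then
  # Configure a Debug build with the Ninja generator.
  cmake -G Ninja -S "$SRC_DIR" -B "$BUILD_DIR" \
        -DCMAKE_BUILD_TYPE=Debug \
        -DLLVM_OPTIMIZED_TABLEGEN=ON
  # Build; Ninja runs independent build steps in parallel by default.
  cmake --build "$BUILD_DIR"
else
  echo "skipped: need cmake, ninja, and an LLVM checkout at '$SRC_DIR'"
fi
```

Because the Ninja tree lives in its own directory, it can sit next to an Xcode-generated project that is kept only for browsing and editing.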

-Greg

Hi Greg,

I tried to set up Ninja on my Mac before, but I must have done something wrong and I didn’t manage to get it to work. I’m not at all familiar with the procedures involved. I may try again to see if I have some luck, though. It’s a pity that LLVM is not particularly friendly to familiar IDEs such as Xcode on Macs and Visual Studio on Windows.

John

Hey Joan,

When looking for build support, it is really useful to include a bunch of information about your build up front. Knowing that you are on macOS and using the Xcode generator is really useful.

On macOS, BUILD_SHARED_LIBS won’t really help much because the default linker (ld64) is pretty good. Using an IDE generator and setting LLVM_OPTIMIZED_TABLEGEN will kill your release builds.

In general Xcode takes 2x-3x longer than Ninja for incremental builds, and 1.5x-2x as long for clean builds. Lots of people use the Xcode generator to create a project file for navigation and editing, but most people I know doing LLVM development on macOS use Ninja for their builds.

-Chris

[resending to the whole list]

I wonder if we can stop rebuilding TD files unconditionally, i.e. generate dependencies for TD files based on include directives and just allow the build system do its job? Would that solve most of the build time issues?

Thanks,

Slava

If someone can manage it, it wouldn’t be a bad thing; it would obviously open up more parallelism. (I don’t know how much of LLVM can be built before you hit everything that needs tblgen to run; I guess libSupport and some other bits.)

In CMake 3.7 and later the Ninja generator can handle depfiles which gives us correct and accurate dependencies for tablegen, and we do use that support if it is available.

I’m surprised CMake has never extended that support to the Makefile generator, but unsurprised it isn’t supported in the IDE generators. I’m reasonably confident that you can’t add that support to Xcode without treating tablegen as an extra compiler, which (I believe) requires an Xcode plugin. Even if that isn’t the case the Xcode build system’s extensibility is largely undocumented and I’m sure it would be very challenging to extend it in this way.

I think it would be possible to add that support to CMake’s MSBuild generator for Visual Studio, but I’m not sure why you would. It seems like Microsoft’s preferred approach to using CMake with Visual Studio is with ninja as the build tool via the CMake Server integration.

-Chris

Much of this has been discussed (over many, many years) on 28222 – llvm-tblgen is still too slow in the default debug build configuration

Some of the issues that were identified included:

1 - Poor tablegen dependency handling leading to unexpected rebuilds.

2 - Debug STL iterator checks taking an insane amount of time (might be MSVC specific).

3 - Lack of parallelization of custom commands (Xcode and VS builds) - VS at least has a recent (VS2017+?) ‘build custom tools in parallel’ option that can be enabled per project file - we should investigate setting that automatically.

4 - A lot of O(N^2), or worse, code that has built up over the years.

5 - Poor STL type selection resulting in excessive iteration/access times.

Running a profiler every so often helps find some quick improvements, but it’s not really fixing these core problems.

Simon.

I think the O(N^2) behavior is probably the worst culprit, but I don’t know enough about Tablegen to fix it. And I think that speaks to another problem: not many people do. It would be great if someone knowledgeable about Tablegen could look at it with the specific aim of improving the algorithmic complexity.

There are other issues that come in WRT CMake + Xcode. We don’t see a lot of people using that configuration, so we don’t often discuss them.

It isn’t that Xcode doesn’t parallelize custom commands, it doesn’t parallelize targets unless you set a setting on the scheme. CMake has limited support for generating schemes which was only added in CMake 3.9. I had briefly looked at adding scheme support to our build, but I didn’t because schemes don’t really match super well to LLVM development workflows, so it would really need to be added by an Xcode user wanting to define a development flow for Xcode.
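
Separately from scheme settings, `xcodebuild` on the command line has a `-parallelizeTargets` flag and a `-jobs` option. The sketch below shows how one might try them against a CMake-generated project. The tree name `build-xcode`, the project name `LLVM.xcodeproj`, and the job count are assumptions, and nothing in this thread confirms that this actually speeds up the tablegen steps; it also does not change how builds started from inside the IDE behave.

```shell
#!/bin/sh
# Hypothetical tree created earlier with `cmake -G Xcode`.
XCODE_BUILD_DIR="${XCODE_BUILD_DIR:-build-xcode}"
PROJECT="$XCODE_BUILD_DIR/LLVM.xcodeproj"   # assumed name of the generated project
JOBS="${JOBS:-8}"                           # assumed number of concurrent build operations

if command -v xcodebuild >/dev/null 2>&1 && [ -d "$PROJECT" ]; then
  # ALL_BUILD is the aggregate target that CMake's IDE generators create.
  xcodebuild -project "$PROJECT" -target ALL_BUILD -configuration Debug \
             -parallelizeTargets -jobs "$JOBS"
else
  echo "skipped: need xcodebuild and a generated project at '$PROJECT'"
fi
```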

Additionally, the Xcode project format doesn’t support adding custom tools in a declarative format. There are two routes that work: either you embed them as shell scripts, or you wrap them in Makefiles and create Makefile build targets (CMake does the latter). In either case Xcode is very limited in its ability to track dependencies in and out of shell and Makefile tasks, which frequently results in “throw hands in the air and rerun everything”.

CMake generating Makefile targets and not having historical support for Xcode schemes combined to make the Xcode build of LLVM much slower than it should be, and there isn’t really a good solution to the problem that the LLVM or CMake communities can drive.

-Chris

This is a useful piece of information, Chris! I will try it.

Unfortunately, this does not help resolve Joan’s problem with Xcode.

Thanks,

Slava

Much of this has been discussed (over many, many years) on 28222 – llvm-tblgen is still too slow in the default debug build configuration

Some of the issues that were identified included:

1 - Poor tablegen dependency handling leading to unexpected rebuilds.

2 - Debug STL iterator checks taking an insane amount of time (might be MSVC specific).

3 - Lack of parallelization of custom commands (Xcode and VS builds) - VS at least has a recent (VS2017+?) 'build custom tools in parallel' option that can be enabled per project file - we should investigate setting that automatically.

4 - A lot of O(N^2), or worse, code that has built up over the years.

5 - Poor STL type selection resulting in excessive iteration/access times.

Let me add:

6 - The TableGen frontend is re-run for every TableGen backend invocation.

It might help to invoke the TableGen executable only once for each LLVM backend, having the TableGen frontend parse everything and instantiate and resolve records only once, and then run all backends in the same process.

This requires re-architecting how the backends are driven, and it does have the disadvantage that the backends can no longer run in parallel -- unless we take a careful look at making that possible -- so it's not actually an easy fix.

Cheers,
Nicolai