Cmake-gen'd parallel make breaks on native tablegen

Hello backend devs,

Been working on a parallelization bug in the cmake configs, where parallel makefile builds of llvm with clang and native tablegen would usually break during linking in either either the clang_tblgen or llvm_tblgen targets. Looking at the cmake command that generates the tablegen makefile commands (I think) (cmake/modules/TableGen.cmake:113-117):

      add_custom_command(OUTPUT ${${project}_TABLEGEN_EXE}
        COMMAND ${CMAKE_COMMAND} --build ../.. --target ${target} --config Release
        DEPENDS CONFIGURE_LLVM_NATIVE ${target}
        WORKING_DIRECTORY ${LLVM_NATIVE_BUILD}
        COMMENT "Building native TableGen...")

My guess is that the clang and llvm tablegen builds are trying to place their output into the same folder, so depending on when the resulting library gets linked/moved one of the threads could be left dangling with a partially-completed library or no library. The error was either "Couldn't stat output file: ../LLVMSupport.a No such file or directory" or something about an archive member extending past the end of the file, and the cmake progress messages before linking were printed twice. Once for each thread, I'd guess? Sort of surprised that they seem to share so much code (or at least filenames)

First thing I could think of to fix this is adding another layer or two of CMakeLists, so the tablegen creation command would be invoked in different folders. I'm not convinced that that is the best fix, and I couldn't figure out how to move the builds into different folders with only minor changes to the existing cmake files.

Thoughts?

-Alex

(CrossCompile.cmake also has a typo in the first line: "toochain" --> "toolchain", if that is something to worry about)

Hello backend devs,

Been working on a parallelization bug in the cmake configs, where parallel makefile builds of llvm with clang and native tablegen would usually break during linking in either either the clang_tblgen or llvm_tblgen targets. Looking at the cmake command that generates the tablegen makefile commands (I think) (cmake/modules/TableGen.cmake:113-117):

     add_custom_command(OUTPUT ${${project}_TABLEGEN_EXE}
       COMMAND ${CMAKE_COMMAND} --build ../.. --target ${target} --config Release
       DEPENDS CONFIGURE_LLVM_NATIVE ${target}
       WORKING_DIRECTORY ${LLVM_NATIVE_BUILD}
       COMMENT "Building native TableGen...")

My guess is that the clang and llvm tablegen builds are trying to place their output into the same folder, so depending on when the resulting library gets linked/moved one of the threads could be left dangling with a partially-completed library or no library.

If you build with LLVM_OPTIMIZED_TABLEGEN the native tablegen is a fully generated build subtree. Make seems to have some problems with thread scheduling with these builds. I haven’t diagnosed the issue, but it isn’t that the builds are writing to the same directory. It happens regardless of whether you’re building llvm with or without clang, and the llvm & clang tablegen executables are named differently so they don’t conflict.

The error was either "Couldn't stat output file: ../LLVMSupport.a No such file or directory" or something about an archive member extending past the end of the file, and the cmake progress messages before linking were printed twice. Once for each thread, I'd guess? Sort of surprised that they seem to share so much code (or at least filenames)

First thing I could think of to fix this is adding another layer or two of CMakeLists, so the tablegen creation command would be invoked in different folders. I'm not convinced that that is the best fix, and I couldn't figure out how to move the builds into different folders with only minor changes to the existing cmake files.

I don’t think that will work. As I said above, they are building to different directories. You can note that in the custom command setting WORKING_DIRECTORY to ${LLVM_NATIVE_BUILD}, which is a nested fully-independent cmake build tree.

-Chris

That sounds much more reasonable than more directories. I’ll see if I can’t hack something together…

I still think you have a point; I would have expected cmake to catch that dependency. Strange thing is that just that library had an independent cmake build going, with different percents showing and not being verbose when the main config had it on.

-Alex

Alright, got something thrown together. Here it is inline (also attached if that's more convenient):

diff --git a/cmake/modules/TableGen.cmake b/cmake/modules/TableGen.cmake
index 452a728..be6729d 100644
--- a/cmake/modules/TableGen.cmake
+++ b/cmake/modules/TableGen.cmake
@@ -107,9 +107,18 @@ macro(add_tablegen target project)
       endif()
       set(${project}_TABLEGEN_EXE ${${project}_TABLEGEN_EXE} PARENT_SCOPE)

+ if(NOT TARGET NATIVE_LIB_LLVMSUPPORT)
+ add_custom_command(OUTPUT LIB_LLVMSUPPORT
+ COMMAND ${CMAKE_COMMAND} --build . --target LLVMSupport --config Release
+ DEPENDS CONFIGURE_LLVM_NATIVE
+ WORKING_DIRECTORY ${LLVM_NATIVE_BUILD}
+ COMMENT "Building libLLVMSupport for native TableGen...")
+ add_custom_target(NATIVE_LIB_LLVMSUPPORT DEPENDS LIB_LLVMSUPPORT)
+ endif(NOT TARGET NATIVE_LIB_LLVMSUPPORT)

TableGen.cmake.patch (1.18 KB)

Probably should have checked earlier... Do patches go directly to llvm-commits for review instead of getting reviewed here first?

ccing llvm-commits too in case

Probably should have checked earlier... Do patches resulting from conversations here go directly to llvm-commits for review?

ccing llvm-commits too in case

Yes, CCing llvm-commits is enough.

Probably should have checked earlier... Do patches go directly to llvm-commits for review instead of getting reviewed here first?

ccing llvm-commits too in case

Alright, got something thrown together. Here it is inline (also attached if that's more convenient):

diff --git a/cmake/modules/TableGen.cmake b/cmake/modules/TableGen.cmake
index 452a728..be6729d 100644
--- a/cmake/modules/TableGen.cmake
+++ b/cmake/modules/TableGen.cmake
@@ -107,9 +107,18 @@ macro(add_tablegen target project)
     endif()
     set(${project}_TABLEGEN_EXE ${${project}_TABLEGEN_EXE} PARENT_SCOPE)

+ if(NOT TARGET NATIVE_LIB_LLVMSUPPORT)

Instead of doing this you could move creating this out of the macro. Then it will be executed when the file is included which should only be once.

diff --git a/cmake/modules/TableGen.cmake b/cmake/modules/TableGen.cmake
index 452a728..b24197b 100644
--- a/cmake/modules/TableGen.cmake
+++ b/cmake/modules/TableGen.cmake
@@ -70,6 +70,13 @@ function(add_public_tablegen_target target)
   set(LLVM_COMMON_DEPENDS ${LLVM_COMMON_DEPENDS} ${target} PARENT_SCOPE)
endfunction()

+add_custom_command(OUTPUT LIB_LLVMSUPPORT
+ COMMAND ${CMAKE_COMMAND} --build . --target LLVMSupport --config Release
+ DEPENDS CONFIGURE_LLVM_NATIVE
+ WORKING_DIRECTORY ${LLVM_NATIVE_BUILD}
+ COMMENT "Building libLLVMSupport for native TableGen...")
+add_custom_target(NATIVE_LIB_LLVMSUPPORT DEPENDS LIB_LLVMSUPPORT)

It should probably be inside an `if(LLVM_USE_HOST_TOOLS)` block.

That way the extra target and command won’t get added unless needed and CMake won’t spew dev warnings. Adding a target with a nonexistent dependency makes CMake dump a bunch of developer warnings.

Other than that this looks good. Does it resolve the issue for you?

-Chris

diff --git a/cmake/modules/TableGen.cmake b/cmake/modules/TableGen.cmake
index 452a728..cb06450 100644
--- a/cmake/modules/TableGen.cmake
+++ b/cmake/modules/TableGen.cmake
@@ -70,6 +70,15 @@ function(add_public_tablegen_target target)
   set(LLVM_COMMON_DEPENDS ${LLVM_COMMON_DEPENDS} ${target} PARENT_SCOPE)
endfunction()

+if(LLVM_USE_HOST_TOOLS)
+ add_custom_command(OUTPUT LIB_LLVMSUPPORT
+ COMMAND ${CMAKE_COMMAND} --build . --target LLVMSupport --config Release
+ DEPENDS CONFIGURE_LLVM_NATIVE
+ WORKING_DIRECTORY ${LLVM_NATIVE_BUILD}
+ COMMENT "Building libLLVMSupport for native TableGen...")
+ add_custom_target(NATIVE_LIB_LLVMSUPPORT DEPENDS LIB_LLVMSUPPORT)
+endif(LLVM_USE_HOST_TOOLS)

Alright, this version works for me.

Anything else that needs to be done?

-Alex

Alright, this version works for me.

Anything else that needs to be done?

LGTM. Do you have commit access?

-Chris

Don't believe so. First contribution for me.

-Alex

Ok. I will commit this for you this morning.

Thanks for the patch, and welcome to the community!

-Chris

Looks like the LLVMSupport patch didn't get everything -- build failed in the
same way on libLLVMTableGen. Problem/solution looked the same as for
LLVMSupport, so just tweaked the previous patch, and consecutive builds
seem to work fine on my machine.

Hope this won't turn into a game of whack-a-bug. I would guess having each
tablegen build in its own directory would solve this and any related bugs, but
that's a change that would take a bit more time.

-Alex

llvmtablegen.patch (1.29 KB)

Looks good to me!

I can commit this for you today.

Thanks!
-Chris

Alex,

Sorry for the delay on this, I’ve committed your patch r251138.

Thanks,
-Chris

Not a problem at all

Thanks for your time!

-Alex