MLIR with empty LLVM_TARGETS_TO_BUILD

Hi all,

Most of the MLIR codebase does not need any LLVM targets to be configured in order to build and pass the check-mlir test suite (i.e., -DLLVM_TARGETS_TO_BUILD="" should almost work). Currently only 6 of the 418 test cases rely on target info (listed below), and check-mlir would pass if these were skipped when no targets are configured. Given the significant build time saved by an empty target list, would it be meaningful and easy to exclude these tests when LLVM_TARGETS_TO_BUILD is empty?

Failing Tests (6):
  MLIR :: mlir-cpu-runner/bare_ptr_call_conv.mlir
  MLIR :: mlir-cpu-runner/linalg_integration_test.mlir
  MLIR :: mlir-cpu-runner/sgemm_naive_codegen.mlir
  MLIR :: mlir-cpu-runner/simple.mlir
  MLIR :: mlir-cpu-runner/unranked_memref.mlir
  MLIR :: mlir-cpu-runner/utils.mlir

I think it would make sense. We already have a mechanism that disables the tests depending on mlir-cuda-runner and mlir-rocm-runner based on CMake flags. We can do the same for CPU.

+1 from me. Ideally it would use the same disable mechanism that the CUDA/ROCM tests use.
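
For readers unfamiliar with how lit directories get disabled, here is a minimal sketch of the kind of gate being discussed. In a real lit.local.cfg, lit injects the `config` object and setting `config.unsupported = True` marks every test in that directory as unsupported. The helper name `apply_cpu_runner_gate` and the `targets_to_build` parameter are hypothetical, and the stand-in `config` object exists only so the snippet runs on its own; the actual fix may be wired differently.

```python
from types import SimpleNamespace


def apply_cpu_runner_gate(config, targets_to_build):
    """Mark a test directory unsupported when no LLVM targets are configured.

    `targets_to_build` is a hypothetical string carrying the value of
    LLVM_TARGETS_TO_BUILD as seen by the test configuration.
    """
    # An empty (or whitespace-only) list means there is no host target
    # available, so tests that JIT-compile for the CPU cannot run.
    if not targets_to_build.strip():
        config.unsupported = True
    return config


# Stand-in for the object that lit normally injects into lit.local.cfg.
config = SimpleNamespace(unsupported=False)
apply_cpu_runner_gate(config, "")
print(config.unsupported)
```

With a non-empty list such as "X86" the flag is left untouched, so the tests in that directory run as before.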

As a side note, @aartbik started a set of integration tests that also depend on the CPU runner. They are disabled by default, but we should make sure they cannot be enabled if we don’t have the host target.

Thanks for pointing that out, @ftynse!

Just for completeness, the suite is enabled with -DMLIR_INCLUDE_INTEGRATION_TESTS=ON during setup. After that, you can run the suite with the target check-mlir-integration. Future contributions may go beyond just cpu-runner, but right now all suites depend on that.

Should be fixed in bc14c77a1e88; ninja check-mlir goes from 1848 actions to 1650 when going from X86 to no target (2m05s on my machine).

And here is the list of the top 20 slow targets on my machine:

43742 lib/Passes/CMakeFiles/LLVMPasses.dir/PassBuilder.cpp.o
33699 lib/CodeGen/CMakeFiles/LLVMCodeGen.dir/CodeGenPrepare.cpp.o
33393 lib/Transforms/Vectorize/CMakeFiles/LLVMVectorize.dir/LoopVectorize.cpp.o
32415 lib/Analysis/CMakeFiles/LLVMAnalysis.dir/ScalarEvolution.cpp.o
32381 tools/mlir/lib/Dialect/LLVMIR/CMakeFiles/obj.MLIRLLVMIR.dir/IR/LLVMDialect.cpp.o
31103 lib/Bitcode/Reader/CMakeFiles/LLVMBitReader.dir/BitcodeReader.cpp.o
30694 tools/mlir/lib/Dialect/StandardOps/CMakeFiles/obj.MLIRStandardOps.dir/IR/Ops.cpp.o
29857 lib/Transforms/Vectorize/CMakeFiles/LLVMVectorize.dir/SLPVectorizer.cpp.o
29041 tools/mlir/lib/Dialect/SPIRV/CMakeFiles/obj.MLIRSPIRV.dir/SPIRVOps.cpp.o
28506 tools/mlir/test/lib/Dialect/Test/CMakeFiles/obj.MLIRTestDialect.dir/TestDialect.cpp.o
28473 lib/Transforms/Scalar/CMakeFiles/LLVMScalarOpts.dir/NewGVN.cpp.o
27998 tools/mlir/lib/Dialect/SPIRV/CMakeFiles/obj.MLIRSPIRV.dir/SPIRVDialect.cpp.o
27077 lib/Transforms/Utils/CMakeFiles/LLVMTransformUtils.dir/SimplifyCFG.cpp.o
27033 lib/Transforms/IPO/CMakeFiles/LLVMipo.dir/AttributorAttributes.cpp.o
26847 lib/CodeGen/CMakeFiles/LLVMCodeGen.dir/MachinePipeliner.cpp.o
26477 tools/mlir/lib/Dialect/SPIRV/Serialization/CMakeFiles/obj.MLIRSPIRVSerialization.dir/Deserializer.cpp.o
26185 lib/Transforms/Scalar/CMakeFiles/LLVMScalarOpts.dir/SimpleLoopUnswitch.cpp.o
25105 lib/Bitcode/Writer/CMakeFiles/LLVMBitWriter.dir/BitcodeWriter.cpp.o
25037 lib/Transforms/Scalar/CMakeFiles/LLVMScalarOpts.dir/LoopStrengthReduce.cpp.o
24728 tools/mlir/lib/Dialect/SPIRV/Serialization/CMakeFiles/obj.MLIRSPIRVSerialization.dir/Serializer.cpp.o
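
The thread does not say how this list was produced. One way to get a similar ranking is to read the `.ninja_log` file in the build directory, whose entries are tab-separated (start time, end time, mtime, output path, command hash) with times in milliseconds. The sketch below assumes that format; the function name `slowest_targets` and the sample data are made up for illustration.

```python
def slowest_targets(log_text, top=20):
    """Return the `top` slowest outputs from the text of a .ninja_log file."""
    durations = {}
    for line in log_text.splitlines():
        # Skip blank lines and the "# ninja log vN" header.
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 4:
            continue
        start_ms, end_ms, output = int(fields[0]), int(fields[1]), fields[3]
        # A later entry for the same output replaces the earlier one.
        durations[output] = end_ms - start_ms
    # Sort by duration, longest first.
    return sorted(durations.items(), key=lambda kv: kv[1], reverse=True)[:top]


# Made-up sample log, only to show the expected input shape.
sample = "# ninja log v5\n0\t500\t0\ta.o\th1\n100\t2100\t0\tb.o\th2\n"
for output, ms in slowest_targets(sample, top=2):
    print(ms, output)
```

On a real build tree this would be called with the contents of the `.ninja_log` file. Note that the durations are wall-clock times measured during a parallel build, so they depend on how loaded the machine was.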

I think many of these files are just big, and thus take long. The file that I think has the highest compile time for its size in MLIR is lib/Dialect/Linalg/Transforms/Loops.cpp - for just 746 lines, it takes nearly 11.5s to compile on a fast workstation (3.7 GHz Skylake-based Core i7) in release mode and with Clang 10.0! It’s probably due to the specific C++ patterns or templates used there - I was planning to post on it separately or file a bug/feature request. A time like that could be a drain on productivity. Files of similar size in MLIR build nearly 5x faster on the same system/config. Any compile time improvements will be welcome!

If you don't use it yet, I also encourage you to try ccache: it helps a lot with iterative builds.

Yes, thanks. However, ccache won't help in cases like lib/Dialect/Linalg/Transforms/Loops.cpp, where every new change to the file requires the same long rebuild.

Indeed, it does not help with the part of the codebase you're changing.