See RFC: Future of Windows pre-commit CI - #77 by philnik for context.
Here is an example of a pre-merge run: Github pull requests #56855
The interesting part is the lit execution time when running ninja check-flang on Linux and Windows:
# Linux
Testing Time: 5.98s
Total Discovered Tests: 2992
Unsupported : 137 (4.58%)
Passed : 2840 (94.92%)
Expectedly Failed: 15 (0.50%)
# Windows
Testing Time: 1220.35s
Total Discovered Tests: 2983
Unsupported : 195 (6.54%)
Passed : 2771 (92.89%)
Expectedly Failed: 17 (0.57%)
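To make the gap easier to quantify, here is a minimal sketch that parses the two summaries above and computes the wall-clock time amortized over each executed (non-unsupported) test. The helper names `parse_summary` and `seconds_per_executed_test` are made up for illustration and are not part of lit. Since lit runs tests in parallel, this is an amortized figure, not the duration of any single test.

```python
import re

# Summary lines copied from the two lit runs above.
LINUX = """\
Testing Time: 5.98s
Total Discovered Tests: 2992
Unsupported : 137 (4.58%)
Passed : 2840 (94.92%)
Expectedly Failed: 15 (0.50%)
"""
WINDOWS = """\
Testing Time: 1220.35s
Total Discovered Tests: 2983
Unsupported : 195 (6.54%)
Passed : 2771 (92.89%)
Expectedly Failed: 17 (0.57%)
"""


def parse_summary(text):
    """Map each 'Label: value' line to its first numeric value."""
    fields = {}
    for line in text.splitlines():
        m = re.match(r"\s*([^:]+?)\s*:\s*([\d.]+)", line)
        if m:
            fields[m.group(1)] = float(m.group(2))
    return fields


def seconds_per_executed_test(summary):
    """Wall-clock time divided by the number of tests that actually ran."""
    executed = summary["Total Discovered Tests"] - summary["Unsupported"]
    return summary["Testing Time"] / executed


if __name__ == "__main__":
    for name, text in (("Linux", LINUX), ("Windows", WINDOWS)):
        s = parse_summary(text)
        print(f"{name}: {seconds_per_executed_test(s) * 1000:.1f} ms per executed test")
```

With the numbers above this comes out to roughly 2 ms per executed test on Linux versus roughly 440 ms on Windows.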
Slowest tests:
# Linux
Slowest Tests:
--------------------------------------------------------------------------
4.03s: Flang :: Lower/OpenMP/threadprivate-real-logical-complex-derivedtype.f90
1.90s: Flang :: Lower/OpenMP/FIR/threadprivate-real-logical-complex-derivedtype.f90
1.71s: Flang :: Fir/convert-to-llvm.fir
1.27s: Flang :: Lower/array.f90
1.05s: Flang :: Intrinsics/math-codegen.fir
1.04s: Flang :: Fir/dispatch.f90
0.95s: Flang :: Driver/omp-driver-offload.f90
0.92s: Flang :: Lower/select-type.f90
0.88s: Flang :: Driver/default-optimization-pipelines.f90
0.80s: Flang :: Driver/use-module.f90
0.80s: Flang :: Driver/optimization-remark.f90
0.76s: Flang :: Lower/OpenMP/FIR/rtl-flags.f90
0.74s: Flang :: Lower/OpenMP/rtl-flags.f90
0.72s: flang-OldUnit :: Evaluate/real.test
0.68s: Flang :: Lower/Arm/arm-sve-vector-bits-vscale-range.f90
0.67s: flang-OldUnit :: Evaluate/integer.test
0.66s: Flang :: Lower/RISCV/riscv-vector-bits-vscale-range.f90
0.65s: Flang :: Driver/falias-analysis.f90
0.65s: Flang :: Lower/allocatable-polymorphic.f90
0.62s: Flang :: Driver/fopenmp.f90
# Windows
Slowest Tests:
--------------------------------------------------------------------------
18.49s: Flang :: Driver/omp-driver-offload.f90
8.15s: Flang :: Lower/OpenMP/threadprivate-real-logical-complex-derivedtype.f90
7.39s: Flang :: Driver/fopenmp.f90
6.15s: Flang :: Intrinsics/math-codegen.fir
4.30s: Flang :: Driver/linker-flags.f90
4.26s: Flang :: Driver/fveclib.f90
4.11s: Flang :: Lower/OpenMP/FIR/threadprivate-real-logical-complex-derivedtype.f90
4.10s: Flang :: Driver/default-optimization-pipelines.f90
4.06s: Flang :: Driver/pic-flags.f90
4.04s: Flang :: Driver/aarch64-sve-vector-bits.f90
3.98s: Flang :: Semantics/modfile07.f90
3.83s: Flang :: Driver/gcc-toolchain-install-dir.f90
3.70s: Flang :: Driver/optimization-remark.f90
3.46s: Flang :: Driver/fsave-optimization-record.f90
3.29s: Flang :: Driver/mlir-debug-pass-pipeline.f90
3.18s: Flang :: Driver/use-module.f90
3.16s: Flang :: Driver/fdefault.f90
3.13s: Flang :: Driver/fixed-line-length.f90
3.08s: Flang :: Fir/convert-to-llvm.fir
3.05s: Flang :: Driver/falias-analysis.f90
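For the tests that show up in both lists, the per-test slowdown can be computed directly. The sketch below uses a hand-picked subset of the lines above as input; `parse_slowest` and `slowdowns` are hypothetical helper names, not lit functionality.

```python
import re

# A subset of the "Slowest Tests" lines above (tests present in both lists).
LINUX_SLOWEST = """\
4.03s: Flang :: Lower/OpenMP/threadprivate-real-logical-complex-derivedtype.f90
1.71s: Flang :: Fir/convert-to-llvm.fir
1.05s: Flang :: Intrinsics/math-codegen.fir
0.95s: Flang :: Driver/omp-driver-offload.f90
0.62s: Flang :: Driver/fopenmp.f90
"""
WINDOWS_SLOWEST = """\
18.49s: Flang :: Driver/omp-driver-offload.f90
8.15s: Flang :: Lower/OpenMP/threadprivate-real-logical-complex-derivedtype.f90
7.39s: Flang :: Driver/fopenmp.f90
6.15s: Flang :: Intrinsics/math-codegen.fir
3.08s: Flang :: Fir/convert-to-llvm.fir
"""


def parse_slowest(text):
    """Return {test name: seconds} for lines shaped like '4.03s: Suite :: path'."""
    times = {}
    for line in text.splitlines():
        m = re.match(r"\s*([\d.]+)s:\s+(.*\S)", line)
        if m:
            times[m.group(2)] = float(m.group(1))
    return times


def slowdowns(linux, windows):
    """Windows/Linux time ratio for tests in both maps, largest first."""
    common = linux.keys() & windows.keys()
    ratios = [(name, windows[name] / linux[name]) for name in common]
    return sorted(ratios, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    result = slowdowns(parse_slowest(LINUX_SLOWEST), parse_slowest(WINDOWS_SLOWEST))
    for name, ratio in result:
        print(f"{ratio:5.1f}x  {name}")
```

On this subset the ratios range from a bit under 2x (Fir/convert-to-llvm.fir) to about 19x (Driver/omp-driver-offload.f90), with the Driver tests showing the largest slowdowns.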
Here is the cmake invocation for reference (release+asserts):
cmake -S C:/ws/src/llvm -B C:/ws/src/build -D 'LLVM_ENABLE_PROJECTS=clang;flang;llvm;mlir' -G Ninja -D CMAKE_BUILD_TYPE=Release -D LLVM_ENABLE_ASSERTIONS=ON -D LLVM_BUILD_EXAMPLES=ON -D COMPILER_RT_BUILD_LIBFUZZER=OFF -D 'LLVM_LIT_ARGS=-v --xunit-xml-output C:/ws/src/build/test-results.xml --timeout=1200 --time-tests' -D COMPILER_RT_BUILD_ORC=OFF -D CMAKE_C_COMPILER_LAUNCHER=sccache -D CMAKE_CXX_COMPILER_LAUNCHER=sccache -D MLIR_ENABLE_BINDINGS_PYTHON=ON -D CMAKE_EXE_LINKER_FLAGS=/MANIFEST:NO -D CMAKE_MODULE_LINKER_FLAGS=/MANIFEST:NO -D CMAKE_SHARED_LINKER_FLAGS=/MANIFEST:NO
For comparison, during the same run the MLIR tests ran in 13.13s on Linux and 34.81s on Windows.
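Put as slowdown factors, the difference between the two suites is stark. This is just arithmetic on the wall-clock times quoted in this post; the dictionary layout and the `slowdown` helper are illustrative.

```python
# Wall-clock lit times in seconds, as reported in this post.
TIMES = {
    "Flang": {"linux": 5.98, "windows": 1220.35},
    "MLIR": {"linux": 13.13, "windows": 34.81},
}


def slowdown(times):
    """How many times longer the Windows run took than the Linux run."""
    return times["windows"] / times["linux"]


if __name__ == "__main__":
    for suite, times in TIMES.items():
        print(f"{suite}: {slowdown(times):.1f}x slower on Windows")
```

That is roughly 204x for Flang against roughly 2.7x for MLIR, so the Flang slowdown is far larger than what a generally slower Windows machine would explain on its own.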
What should we do about this right now? Flang seems to have become the bottleneck for pre-merge testing on Windows.