A good way to find missing dependencies is to delete the build directory, start fresh, and build only the target that broke non-deterministically:
ninja tools/mlir/lib/Dialect/Linalg/IR/CMakeFiles/obj.MLIRLinalg.dir/LinalgOps.cpp.obj
This is a bit strange: I would expect a missing include file instead of this error. If you can get the build directory into this state, I’d be interested in what you find in tools/mlir/include/mlir/Dialect/Linalg/IR/LinalgNamedStructuredOps.yamlgen.cpp.inc and tools/mlir/include/mlir/Dialect/Linalg/IR/LinalgStructuredOps.h.inc.
Because the DepthwiseConv2DInputNhwcFilterHwcPolyOp should be declared through:
This is very strange and seems like the kind of thing that triggers when a multi-level tool dependency is declared incorrectly in CMake, in combination with an incremental rebuild (it is easy to misconfigure CMake so that dependent-of-dependent rules do not trigger on change).
I’m not familiar with buildbot setups: do they reuse the build directory in some way?
Oh, I didn’t understand originally that this was about an incremental build: setting up a bot this way seems a bit fragile to me. CMake isn’t bulletproof on incremental builds, in particular across revisions (stale generated files can be left behind; they are never cleaned up, but header includes can still pick them up).
I spent some time on this today, and it seems the problem is indeed that CMake also requires a file dependency, not just a target dependency, once you have dependent custom_commands (we run yaml-gen and then tablegen).
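To illustrate the distinction, here is a minimal sketch of two chained custom commands. It is not the actual LLVM/MLIR build code: the file names (ops.yaml, ops.td, ops.h.inc), the target names (gen_td, gen_inc), and the use of a plain file copy as a stand-in for yaml-gen and tablegen are all made up for the example.

```cmake
cmake_minimum_required(VERSION 3.13)
project(chained_codegen_sketch NONE)

set(GEN_TD  ${CMAKE_CURRENT_BINARY_DIR}/ops.td)     # hypothetical intermediate file
set(GEN_INC ${CMAKE_CURRENT_BINARY_DIR}/ops.h.inc)  # hypothetical final output

# Step 1 (stand-in for yaml-gen): produce the intermediate file.
add_custom_command(
  OUTPUT  ${GEN_TD}
  COMMAND ${CMAKE_COMMAND} -E copy ${CMAKE_CURRENT_SOURCE_DIR}/ops.yaml ${GEN_TD}
  DEPENDS ${CMAKE_CURRENT_SOURCE_DIR}/ops.yaml
)
add_custom_target(gen_td DEPENDS ${GEN_TD})

# Step 2 (stand-in for tablegen): consume the intermediate file.
add_custom_command(
  OUTPUT  ${GEN_INC}
  COMMAND ${CMAKE_COMMAND} -E copy ${GEN_TD} ${GEN_INC}
  DEPENDS gen_td     # target dependency: only orders step 1 before step 2
          ${GEN_TD}  # file dependency: reruns step 2 when the file changes
)
add_custom_target(gen_inc ALL DEPENDS ${GEN_INC})
```

With only `gen_td` listed in the second DEPENDS, step 1 would still run first, but a regenerated ops.td would not by itself cause ops.h.inc to be rebuilt. That is the kind of stale output an incremental build can then trip over.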
Section five in the following blog explains the problem:
The second custom_command is well hidden in LLVM in our case. I implemented the following hack which seems to work:
Update llvm/cmake/modules/TableGen.cmake to add LINALG_DEPS to the dependencies:
Buildbots can be configured either way: clean before every build, or not. The build is always cleaned if any CMakeLists.txt changes, so there should be no problem with stale files.
I always set up my buildbots with incremental builds. This reduces the typical build time from more than an hour to minutes. That means a build does not have to coalesce as many commits and you get faster responses, and honestly, always recompiling everything feels like a waste of resources. It also helps identify problems like this.
IMO: ccache is just more principled and robust from this point of view, and very efficient. This is what we use for Buildbot and the build is frequently <5 min.
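As a rough sketch of what a ccache setup can look like (an assumption about one way to configure it, not the actual bot configuration), a CMake initial-cache script can set the standard compiler-launcher variables. The file name ccache-init.cmake is hypothetical, and ccache is assumed to be installed and on PATH.

```cmake
# ccache-init.cmake (hypothetical file name)
# Load with: cmake -C ccache-init.cmake <source-dir>
# Route every C and C++ compile through ccache, so that a clean build
# reuses cached object files instead of recompiling unchanged sources.
set(CMAKE_C_COMPILER_LAUNCHER   ccache CACHE STRING "C compiler launcher")
set(CMAKE_CXX_COMPILER_LAUNCHER ccache CACHE STRING "C++ compiler launcher")
```

LLVM's own build also has an LLVM_CCACHE_BUILD option that serves the same purpose.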
Independently of how buildbots should be configured, don’t you agree that incremental builds should work as well? That’s what every developer is using, and occasional failures will cost a lot of developer time.
Yes, within the limits of the tools. I typically blow away my cmake build dir and start over in O(month). It is not always glitch free – I suspect that people are just running ninja again when this issue happens for them locally. Thanks for raising it as a real concern.
Right, I’m building incrementally all the time, but CMake is intrinsically limited in terms of incremental builds: you should just keep these limitations in mind, because there isn’t much we can do about them. I guess bots may not break frequently because we don’t modify the CMake structure too often: I just don’t want to be on the receiving end of debugging such issues when they occur.
And of course we should fix bugs when we find them, thanks @gysit