Major ARM bots failure


I'm not sure what happened, but some commit in this build made Clang
run forever on ARM:

The missing commits are to the sanitizers and they are not even
checked out on this bot, so I don't think it has anything to do with
that. It's also unlikely to be something on the Hexagon back-end,
since all changes are self-contained.

Any ideas?

I had to completely turn off all bots, since the tests time out and
the next build starts while the previous one is still running at 100%
CPU, so the processes accumulate and kill the board.


I'm not sure if it helps, but the clang builder for mips has been dying of timeouts since a build that has only one commit in the blamelist (r223478 - LLVMContext: Store APInt/APFloat directly into the ConstantInt/FP DenseMaps.). I see the same commit in the build you linked to.

r223478 seems to be responsible for PR21770:

Reverted for now. Not sure what's going on there. Sorry for the breakage.

- Ben

No worries, at least that was easy to spot. Huzzah for buildbots! :slight_smile:


clang-cmake-mips didn't turn green immediately after the revert. I just checked on the machine and it appears that this was because of a large number of leftover processes from the build that timed out. After rebooting, it's turned green.

I'm curious about the reason these processes didn't die when buildbot timed out the build. Does lit clean up its subprocesses when it's killed?

:slight_smile: Yep, these things happen sometimes. Also, it's given me the perfect example of why continuous builds are better than the nightly buildbots we use internally at the moment. Once we're past the 3.5.1 release, I really need to push our upgrades along.

The same happened to our bots. It’s because the processes don’t die, so buildbot thinks the build is dead and starts a new cycle.

I had to stop the bot, kill all remaining processes and restart the bot. That did it.

Maybe we need an extra step on the bots to make sure the previous instance is not running, or a signal to kill it when the master gives up on timeout.
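
For what it's worth, here is a minimal sketch of the "kill it on timeout" idea, assuming a POSIX worker and a wrapper script around the build step. This is not buildbot's or lit's own mechanism, and the name `run_with_group_timeout` is made up for illustration. It starts the command in its own session so that every descendant shares one process group, and on timeout it signals the whole group rather than only the top-level process.

```python
import os
import signal
import subprocess


def run_with_group_timeout(cmd, timeout_s):
    """Run cmd in its own session; on timeout, kill its whole process group.

    Hypothetical helper for illustration. Returns the exit code, or None
    if the timeout was hit. POSIX only.
    """
    # start_new_session=True makes the child a session leader, so its
    # process group ID equals its PID and its descendants inherit that group.
    proc = subprocess.Popen(cmd, start_new_session=True)
    try:
        return proc.wait(timeout=timeout_s)
    except subprocess.TimeoutExpired:
        try:
            # Signal every process in the group, not just the leader.
            os.killpg(proc.pid, signal.SIGKILL)
        except ProcessLookupError:
            pass  # the group already exited on its own
        proc.wait()  # reap the leader so it does not linger as a zombie
        return None


if __name__ == "__main__":
    # The shell spawns a background child; both are killed after 1 second.
    print(run_with_group_timeout(["sh", "-c", "sleep 60 & sleep 60"], 1))
```

One caveat: a descendant that moves itself into a new session or process group escapes this, so a check at the start of the next build for leftover processes would still be a useful second line of defence.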