I assume you’re getting emails in addition to the chat spam? Or are these bots sending chat spam but not email? If that’s the case, yeah, I’d rather have a consistent notification experience - and disable all notifications from a bot if some notifications are disabled (e.g. if it’s not good enough to be sending email, then it shouldn’t be spamming the IRC channel either).
llvmbb’s job is to inform people of build breaks. However, it seems to trigger for a big list of bots, and at least one of them seems to always be broken,
If a bot is always broken it shouldn’t be sending email/notifications - generally they are configured only to send email on green>red and red>green transitions, so if it’s already broken you shouldn’t be blamed for it. If you are seeing bot spam or emails from a bot that’s already red, please email llvm-dev and the bot maintainer and ask the bot to be reconfigured or disabled.
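To make the "only notify on transitions" behaviour concrete, here is a minimal sketch in plain Python. It is not the actual buildbot or zorg configuration; the names `should_notify` and `notifications` and the status strings are made up for illustration, and it assumes a bot with no history stays quiet unless its first build is red.

```python
from typing import Iterable, Iterator, Optional, Tuple

# Hypothetical status labels; real buildbot result codes differ.
GREEN = "green"
RED = "red"


def should_notify(previous: Optional[str], current: str) -> bool:
    """Return True only when the build status changed between two builds."""
    if previous is None:
        # Assumption: with no prior build, only a red first build is worth a ping.
        return current == RED
    # green -> red and red -> green both notify; red -> red stays silent.
    return previous != current


def notifications(history: Iterable[str]) -> Iterator[Tuple[int, str]]:
    """Yield (build index, status) for each build that would send a notification."""
    previous = None
    for index, status in enumerate(history):
        if should_notify(previous, status):
            yield index, status
        previous = status


if __name__ == "__main__":
    # A bot that breaks once and stays red for three builds pings twice, not four times.
    for index, status in notifications([GREEN, RED, RED, RED, GREEN]):
        print(f"build {index}: went {status}")
```

Under this scheme a bot that is already red produces no further notifications, so anyone getting pinged by a permanently red bot is seeing a misconfiguration rather than the intended behaviour.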
If a bot is regularly flakey (& thus sending email/notifications that are false-positives/that no one can act on) please also send email asking for the bot to be reconfigured or disabled. (or, if you want to be a bit more punchy - send a patch to the zorg repository to have the bot disabled & explain why you’re proposing that)
I agree with this in the abstract, but I get pinged completely reliably at least twice after every single one of my commits. This isn’t something that sometimes happens, it’s something that always happens.
Could you point to specific buildbots/email when that comes up to help improve things both on IRC and email/mailing lists, etc?
Just land a change. Or look at IRC scrollback. Given how easy it is to find these problems, it doesn’t seem like there’s a lot of appetite for improving this.
I think there’s appetite for changing it in some way - no one enjoys the current state of things. But often people assume it’s not changeable, whereas I think it is - and I think it’s important that it be changed because if we silence all the bots, then quality is likely to go down. Silencing the IRC bot may still be good - folks should be getting buildbot fail email which is more targeted and not spamming the channel for people who aren’t to blame (heck, the bots could send private messages instead, I guess?).
But improving signal/noise should benefit the email, and the bot spam (whichever channel it’s in).
Hence me asking about removing llvmbb (…and so far everyone seems to be in favor).
In this case, from my IRC scrollback (there’s more people on the blamelist, spread over several follow-on IRC messages):
build #13975 of clang-ppc64le-linux-multistage is complete: Failure [failed ninja check 1] Build details are at http://lab.llvm.org:8011/builders/clang-ppc64le-linux-multistage/builds/13975 blamelist: LLVM GN Syncbot <firstname.lastname@example.org>, Nico Weber <email@example.com>
That doesn’t look like the “always be broken” case. It was green on the build prior to this one ( http://lab.llvm.org:8011/builders/clang-ppc64le-linux-multistage/builds/13974 )
Looks like the buildbot triggered correctly and only took the 2 revisions you committed. The test did pass at the prior revision and did fail at that revision - perhaps either the buildbot or the test is flakey? Interestingly, the test failed in stage 1 at 13975, then failed in stage 2 at 13976, then passed again in 13977. Both failures were for the same reason: “/home/buildbots/ppc64le-clang-multistage-test/clang-ppc64le-multistage/stage2/tools/clang/test/Driver/Output/target-override.c.script: line 5: /home/buildbots/ppc64le-clang-multistage-test/clang-ppc64le-multistage/stage2/tools/clang/test/Driver/Output/testbin/i386-clang: No such file or directory” - perhaps some problem with creating the symlink?
Started an llvm-dev thread to discuss that separately in more detail.
build #24132 of clang-with-thin-lto-ubuntu is complete: Failure [failed test-stage1-compiler] Build details are at http://lab.llvm.org:8011/builders/clang-with-thin-lto-ubuntu/builds/24132 blamelist: Nico Weber <firstname.lastname@example.org>, Matt Arsenault <Matthew.Arsenault@amd.com>, Eric Astor <email@example.com>, Craig Topper <firstname.lastname@example.org>, Alina
Also green on the prior build ( http://lab.llvm.org:8011/builders/clang-with-thin-lto-ubuntu/builds/24131 ).
Went green again after a revert here: http://lab.llvm.org:8011/builders/clang-with-thin-lto-ubuntu/builds/24140 which matches the commit that made the bot go red - so this looks to be a bot doing what it’s meant to do. (varying levels of quality, and 2 hour cycle time isn’t ideal by any means, though it found this failure in 5 minutes once it started (but that could be 2 hours after a commit))
What do you think we should do with bots like this? Should long cycle time/long blame list bots (not always the same thing) produce no notifications, and require them to be triaged by the bot owner who then manually sends email/follow-up once a rough guess of blame has been made & checked that it hasn’t already been possibly diagnosed, discussed and fixed due to a faster bot or other means?
build #2255 of lld-x86_64-win is complete: Failure [failed test-check-all] Build details are at http://lab.llvm.org:8011/builders/lld-x86_64-win/builds/2255 blamelist: LLVM GN Syncbot <email@example.com>, Eric Astor <firstname.lastname@example.org>, Craig Topper <email@example.com>, Alina Sbirlea <firstname.lastname@example.org>, Nico Weber <email@example.com>, Amara
Also green on the prior build ( http://lab.llvm.org:8011/builders/lld-x86_64-win/builds/2254 ), and went back to green on the following build.
Possibly this was related to the same commit/revert as in the previous bot in this list. It’s a fairly fast bot, went red on a build including the revision that committed the xor issue, and green on the next build that included a revert of that patch. I couldn’t say for sure, though.
I also got email with pointers to:
Was red for a few builds then green again here: http://green.lab.llvm.org/green/job/clang-stage1-RA/14183/
Looks like the build that went red and the build that went green (& the fact that the failure was related to libfuzzer) correlates well with this commit: https://github.com/llvm/llvm-project/commit/2665425908e00618074e42155ec922a37f7c9002 and this revert: https://github.com/llvm/llvm-project/commit/7139736261e047e9cca030e2ee5912bf2a16f816
Chances are that there’s something genuinely broken somewhere (maybe compiler-rt?), but asking for concrete bots distracts from the point that there’s something broken on every single commit, which means the bot effectively just lets you know that you committed something in the last few hours.
They also contain information about failures - yeah, they might not be yours, but they are often/usually someone’s, not just flakey bot failures. If you’re suggesting all the bots are unactionable - then perhaps we should turn off all notifications on all of them? I have certainly considered that - and then only enabling bots that are fast/high signal-to-noise/small blame list. Though I imagine that’s a bigger discussion.
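As a rough illustration of what "only enabling bots that are fast/high signal-to-noise/small blame list" could look like, here is a small Python sketch. Everything in it is hypothetical: the `BotStats` fields, the thresholds, and the bot names are placeholders, not measurements from any real builder.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class BotStats:
    name: str
    cycle_minutes: float        # typical time from commit to result
    blamelist_size: int         # typical number of commits per build
    false_positive_rate: float  # fraction of red builds nobody could act on


def is_high_signal(
    bot: BotStats,
    max_cycle_minutes: float = 30.0,       # assumed threshold
    max_blamelist: int = 5,                # assumed threshold
    max_false_positive_rate: float = 0.05, # assumed threshold
) -> bool:
    """Return True if the bot is fast, has short blame lists, and is rarely flaky."""
    return (
        bot.cycle_minutes <= max_cycle_minutes
        and bot.blamelist_size <= max_blamelist
        and bot.false_positive_rate <= max_false_positive_rate
    )


def bots_allowed_to_notify(bots: List[BotStats]) -> List[str]:
    """Names of bots that would keep automatic notifications enabled."""
    return [bot.name for bot in bots if is_high_signal(bot)]


if __name__ == "__main__":
    # Placeholder bots with invented numbers, purely for demonstration.
    bots = [
        BotStats("example-fast-bot", 12.0, 2, 0.01),
        BotStats("example-slow-bot", 120.0, 15, 0.02),
        BotStats("example-flaky-bot", 10.0, 2, 0.30),
    ]
    print(bots_allowed_to_notify(bots))
```

Bots that fail the filter would not be ignored, just moved to manual triage by their owners, which is the option raised earlier for long cycle time or long blame list bots.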