Can we remove llvmbb from IRC?

Hi,

llvmbb’s job is to inform people of build breaks. However, it seems to trigger for a big list of bots, and at least one of them seems to always be broken, and the broken bots tend to have cycle times of several hours. So if you’re on IRC and you commit something, you get pinged by llvmbb for hours afterwards.

Does anyone think llvmbb is useful?

The best thing I’ve heard about llvmbb is that it’s easy to just “/ignore llvmbb”, but if that’s what everybody does, then why not just not have it in the first place?

Nico

Check out the #llvm-build channel. In the main IRC channel, I block llvmbb, then just also listen in for my name in #llvm-build. The bot has a different name there, so it is still possible to block it in #llvm.

+cfe-dev again :blush:

Hi Nico,

llvmbb’s job is to inform people of build breaks. However, it seems to trigger for a big list of bots, and at least one of them seems to always be broken,

If a bot is always broken it shouldn’t be sending email/notifications - generally they are configured only to send email on green>red and red>green transitions, so if it’s already broken you shouldn’t be blamed for it. If you are seeing bot spam or emails from a bot that’s already red, please email llvm-dev and the bot maintainer and ask the bot to be reconfigured or disabled.
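To make the "only on transitions" rule concrete, here is a minimal sketch in Python. It is not the actual buildbot or zorg configuration; the function name and the True/False encoding of green/red are assumptions made for illustration, and the sketch assumes that the very first build in a history never notifies.

```python
from typing import Iterable, List, Tuple


def transitions_to_notify(results: Iterable[bool]) -> List[Tuple[int, str]]:
    """Return (build_index, kind) pairs for state changes only.

    `results` is a sequence of build outcomes: True for green, False for red.
    (Hypothetical encoding, chosen for this sketch.)
    """
    notifications = []
    previous = None  # state before the first build is unknown
    for index, passed in enumerate(results):
        if previous is True and not passed:
            notifications.append((index, "green>red"))
        elif previous is False and passed:
            notifications.append((index, "red>green"))
        # A red build following a red build produces nothing.
        previous = passed
    return notifications


history = [True, False, False, False, True, True, False]
print(transitions_to_notify(history))
# [(1, 'green>red'), (4, 'red>green'), (6, 'green>red')]
```

Under this rule the two red builds at indexes 2 and 3 stay silent, which is why a bot that is already red should not be blaming anyone new.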

If a bot is regularly flakey (& thus sending email/notifications that are false-positives/that no one can act on) please also send email asking for the bot to be reconfigured or disabled. (or, if you want to be a bit more punchy - send a patch to the zorg repository to have the bot disabled & explain why you’re proposing that)

and the broken bots tend to have cycle times of several hours.

Long cycle times are a real problem - that might be best left to another discussion about buildbot maintenance - I would be for a policy that says bot windows shouldn’t be longer than, say, an hour or maybe less. (So, e.g., if you have a bot that’s just going to take 5 hours to run, then you need 5 machines that each pick up work every hour, so the blame lists are smaller.) This doesn’t solve the problem of being notified 5 hours later about a breakage that was caused by someone else who committed a few minutes before or after you. Solving that problem will require a much greater investment in infrastructure to chain buildbots, possibly passing built artefacts from one buildbot to another, etc.
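The arithmetic behind the "5 machines for a 5 hour bot" suggestion can be sketched as follows. This is a rough model only: it assumes commits arrive at a steady rate and that machines are staggered evenly, and the commit rate of 10 per hour is a made-up number, not a measurement.

```python
import math


def machines_needed(build_hours: float, window_hours: float) -> int:
    """Machines required so that a new build starts every `window_hours`."""
    return math.ceil(build_hours / window_hours)


def commits_per_blamelist(commits_per_hour: float,
                          build_hours: float,
                          machines: int) -> float:
    """Average blame list size when `machines` builders are evenly staggered.

    Each build covers the commits that landed since the previous build
    started, i.e. build_hours / machines hours of commits.
    """
    return commits_per_hour * build_hours / machines


print(machines_needed(5, 1))              # 5
print(commits_per_blamelist(10, 5, 1))    # 50.0 with a single machine
print(commits_per_blamelist(10, 5, 5))    # 10.0 with five staggered machines
```

The blame list shrinks by the number of machines, but the time from commit to result is still the full build time, which is the remaining problem described above.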

So if you’re on IRC and you commit something, you get pinged by llvmbb for hours afterwards.

Does anyone think llvmbb is useful?

I sometimes find it useful, but happy to move to llvm-build to get those notifications. Other folks might not know to do that, though.

I agree with this in the abstract, but I get pinged completely reliably at least twice after every single one of my commits. This isn’t something that sometimes happens, it’s something that always happens.

Could you point to specific buildbots/email when that comes up to help improve things both on IRC and email/mailing lists, etc?

Just land a change :slight_smile: Or look at IRC scrollback. Given how easy it is to find these problems, it doesn’t seem like there’s a lot of appetite for improving this. Hence me asking about removing llvmbb (…and so far everyone seems to be in favor).

In this case, from my IRC scrollback (there’s more people on the blamelist, spread over several follow-on IRC messages):

build #13975 of clang-ppc64le-linux-multistage is complete: Failure [failed ninja check 1] Build details are at http://lab.llvm.org:8011/builders/clang-ppc64le-linux-multistage/builds/13975 blamelist: LLVM GN Syncbot <llvmgnsyncbot@gmail.com>, Nico Weber <thakis@chromium.org>

build #24132 of clang-with-thin-lto-ubuntu is complete: Failure [failed test-stage1-compiler] Build details are at http://lab.llvm.org:8011/builders/clang-with-thin-lto-ubuntu/builds/24132 blamelist: Nico Weber <thakis@chromium.org>, Matt Arsenault <Matthew.Arsenault@amd.com>, Eric Astor <epastor@google.com>, Craig Topper <craig.topper@intel.com>, Alina

build #2255 of lld-x86_64-win is complete: Failure [failed test-check-all] Build details are at http://lab.llvm.org:8011/builders/lld-x86_64-win/builds/2255 blamelist: LLVM GN Syncbot <llvmgnsyncbot@gmail.com>, Eric Astor <epastor@google.com>, Craig Topper <craig.topper@intel.com>, Alina Sbirlea <asbirlea@google.com>, Nico Weber <thakis@chromium.org>, Amara

I also got email with pointers to:
http://green.lab.llvm.org/green//job/clang-stage1-RA/14180/consoleFull#-1417328700a1ca8a51-895e-46c6-af87-ce24fa4cd561

Chances are that there’s something genuinely broken somewhere (maybe compiler-rt?), but asking for concrete bots distracts from the point that there’s something broken on every single commit, which makes the bot just let you know that you committed something in the last few hours.

I assume you’re getting emails in addition to the chat spam? Or are you not/are these bots sending chat spam but not email? If that’s the case, yeah, I’d rather have a consistent notification experience - and disable all notifications from a bot if some notifications are disabled (eg: if it’s not good enough to be sending email, then it shouldn’t be spamming the IRC channel either)

Just land a change :slight_smile: Or look at IRC scrollback. Given how easy it is to find these problems, it doesn’t seem like there’s a lot of appetite for improving this.

I think there’s appetite for changing it in some way - no one enjoys the current state of things. But often people assume it’s not changeable, whereas I think it is - and I think it’s important that it be changed because if we silence all the bots, then quality is likely to go down. Silencing the IRC bot may still be good - folks should be getting buildbot fail email which is more targeted and not spamming the channel for people who aren’t to blame (heck, the bots could send private messages instead, I guess?).

But improving signal/noise should benefit the email, and the bot spam (whichever channel it’s in).

Hence me asking about removing llvmbb (…and so far everyone seems to be in favor).

In this case, from my IRC scrollback (there’s more people on the blamelist, spread over several follow-on IRC messages):

build #13975 of clang-ppc64le-linux-multistage is complete: Failure [failed ninja check 1] Build details are at http://lab.llvm.org:8011/builders/clang-ppc64le-linux-multistage/builds/13975 blamelist: LLVM GN Syncbot <llvmgnsyncbot@gmail.com>, Nico Weber <thakis@chromium.org>

That doesn’t look like the “always be broken” case. It was green on the build prior to this one ( http://lab.llvm.org:8011/builders/clang-ppc64le-linux-multistage/builds/13974 )

Looks like the buildbot triggered correctly and only took the 2 revisions you committed. The test did pass at the prior revision and did fail at that revision - perhaps either the buildbot or the test is flakey? (Interestingly, the test failed in stage 1 at 13975, then failed in stage 2 at 13976 - then passed again in 13977. Both failures were for the same reason: “/home/buildbots/ppc64le-clang-multistage-test/clang-ppc64le-multistage/stage2/tools/clang/test/Driver/Output/target-override.c.script: line 5: /home/buildbots/ppc64le-clang-multistage-test/clang-ppc64le-multistage/stage2/tools/clang/test/Driver/Output/testbin/i386-clang: No such file or directory” - perhaps some problem with creating the symlink?)

Started an llvm-dev thread to discuss that separately in more detail.

build #24132 of clang-with-thin-lto-ubuntu is complete: Failure [failed test-stage1-compiler] Build details are at http://lab.llvm.org:8011/builders/clang-with-thin-lto-ubuntu/builds/24132 blamelist: Nico Weber <thakis@chromium.org>, Matt Arsenault <Matthew.Arsenault@amd.com>, Eric Astor <epastor@google.com>, Craig Topper <craig.topper@intel.com>, Alina

Also green on the prior build ( http://lab.llvm.org:8011/builders/clang-with-thin-lto-ubuntu/builds/24131 ).
Went green again after a revert here: http://lab.llvm.org:8011/builders/clang-with-thin-lto-ubuntu/builds/24140 which matches the commit that made the bot go red - so this looks to be a bot doing what it’s meant to do. (varying levels of quality, and 2 hour cycle time isn’t ideal by any means, though it found this failure in 5 minutes once it started (but that could be 2 hours after a commit))

What do you think we should do with bots like this? Should long cycle time/long blame list bots (not always the same thing) produce no notifications, and require them to be triaged by the bot owner who then manually sends email/follow-up once a rough guess of blame has been made & checked that it hasn’t already been possibly diagnosed, discussed and fixed due to a faster bot or other means?

build #2255 of lld-x86_64-win is complete: Failure [failed test-check-all] Build details are at http://lab.llvm.org:8011/builders/lld-x86_64-win/builds/2255 blamelist: LLVM GN Syncbot <llvmgnsyncbot@gmail.com>, Eric Astor <epastor@google.com>, Craig Topper <craig.topper@intel.com>, Alina Sbirlea <asbirlea@google.com>, Nico Weber <thakis@chromium.org>, Amara

Also green on the prior build ( http://lab.llvm.org:8011/builders/lld-x86_64-win/builds/2254 ), and went back to green on the following build.
Possibly this was related to the same commit/revert as in the previous bot in this list. It’s a fairly fast bot, went red on a build including the revision that committed the xor issue, and green on the next build that included a revert of that patch. I couldn’t say for sure, though.

I also got email with pointers to:

http://green.lab.llvm.org/green//job/clang-stage1-RA/14180/consoleFull#-1417328700a1ca8a51-895e-46c6-af87-ce24fa4cd561

Was red for a few builds then green again here: http://green.lab.llvm.org/green/job/clang-stage1-RA/14183/

Looks like the build that went red and the build that went green (& the fact that the failure was related to libfuzzer) correlates well with this commit: https://github.com/llvm/llvm-project/commit/2665425908e00618074e42155ec922a37f7c9002 and this revert: https://github.com/llvm/llvm-project/commit/7139736261e047e9cca030e2ee5912bf2a16f816

Chances are that there’s something genuinely broken somewhere (maybe compiler-rt?), but asking for concrete bots distracts from the point that there’s something broken on every single commit, which makes the bot just let you know that you committed something in the last few hours.

They also contain information about failures - yeah, they might not be yours, but they are often/usually someone’s, not just flakey bot failures. If you’re suggesting all the bots are unactionable - then perhaps we should turn off all notifications on all of them? I have certainly considered that - and then only enabling bots that are fast/high signal-to-noise/small blame list. Though I imagine that’s a bigger discussion.

I assume you’re getting emails in addition to the chat spam? Or are you not/are these bots sending chat spam but not email? If that’s the case, yeah, I’d rather have a consistent notification experience - and disable all notifications from a bot if some notifications are disabled (eg: if it’s not good enough to be sending email, then it shouldn’t be spamming the IRC channel either)

I received a single email for the greendragon bot. The rest was IRC only. (The greendragon bot didn’t send an IRC ping I think.)

What do you think we should do with bots like this? Should long cycle time/long blame list bots (not always the same thing) produce no notifications, and require them to be triaged by the bot owner who then manually sends email/follow-up once a rough guess of blame has been made & checked that it hasn’t already been possibly diagnosed, discussed and fixed due to a faster bot or other means?

My personal opinion is that we shouldn’t have any bots that take more than an hour to cycle send any notifications.

I’m not on IRC anymore, so my opinion matters less, but I think it’s time to shut it down. It is a relic from a different time when there were fewer bots, fewer contributors, and more IRC users. These days it generates too many notifications and the audience isn’t as well targeted.

Fair enough - it is very noisy.

I’m surprised these aren’t all producing corresponding emails, though (at least those only go to the people on the blame list - but that seemed to be the case Nico was citing - though the general “I’m not even to blame but this is adding a lot of noise to the channel/making it hard to have conversations” is a broad/broader problem). And still seems important to improve the signal/noise ratio.

Sent https://reviews.llvm.org/D87100 to do that. & also looking into the IRC vs. email notification configuration. Hopefully those configurations can be unified. I don’t think it’s any more appropriate to send IRC notifications than email notifications. (I guess now that the IRC notifications will be fairly opt-in, maybe - but I still would rather there be less noise to make the signal stand out, so if a bot isn’t producing accurate enough info to send mail, then maybe not IRC either)

Thanks, David, for proposing the patch.

but I still would rather there be less noise to make the signal stand out, so if a bot isn’t producing accurate enough info to send mail, then maybe not IRC either

Fair enough.

The IRC notifier is a bot, so if somebody is interested in notifications from a particular builder, they could subscribe to those. And by default we could send notifications to #llvm-build only from faster bots with shorter blame lists. That would discriminate against heavier and slower bots, but it seems those get buried under noise anyway.
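As a rough illustration of such a default, the selection could look like the sketch below. Everything in it is hypothetical: the `Bot` record, the bot names, the numbers, and the blame list limit of 10 are invented for the example. Only the one hour cycle limit echoes the figure suggested earlier in this thread.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Bot:
    name: str              # hypothetical builder name
    cycle_minutes: float   # typical time for one full build cycle
    blamelist_size: int    # typical number of commits per build


def default_notifiers(bots: List[Bot],
                      max_cycle_minutes: float = 60,
                      max_blamelist: int = 10) -> List[str]:
    """Names of bots that would notify the channel by default."""
    return [
        bot.name
        for bot in bots
        if bot.cycle_minutes <= max_cycle_minutes
        and bot.blamelist_size <= max_blamelist
    ]


bots = [
    Bot("fast-bot", cycle_minutes=20, blamelist_size=3),
    Bot("slow-bot", cycle_minutes=300, blamelist_size=40),
    Bot("fast-but-batched-bot", cycle_minutes=45, blamelist_size=25),
]
print(default_notifiers(bots))  # ['fast-bot']
```

Bots filtered out here would still be available to anyone who subscribes to them explicitly.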

David has already said this, but I want to repeat it. The bots report only state changes. If a bot is red, it does not report the failures that follow the first one until it goes green again. If anyone sees otherwise, please let me know, so I can troubleshoot.

Thanks

Galina