Buildbot started commenting PRs on buid failures

Buildbot now is commenting PRs when built a single commit and detected a build or tests failure (when a builder goes from green to red).

Here is an example of such comment:
https://github.com/llvm/llvm-project/pull//95942#issuecomment-2183087642

We are still tuning up the relevant log chunk extraction, but the functionality is there and running.

Please let me know if you see issues.

10 Likes

This is great! Thanks :slight_smile:

I see LLDB flaky tests creating some noise:

Could this be fixed? I remember Windows being a problematic host for LLDB test-suite. Is this still an issue?

Another set of bots that seem relatively flaky are the clang-cuda-* ones(https://lab.llvm.org/buildbot/#/builders/69

It might be a good idea to restrict commenting to bots that have short blamelists and are known to be quite reliable.

Isn’t this the case already since the initial message in the thread said “commenting PRs when when built a single commit”? (I understand this as a blamelist of 1)

Thanks for mentioning this, Florian.

I’d rather use this occasion to make the bots reliable. We can move to the staging till fixed those bots proven unreliable and not getting enough love and care, or make them silent. There is not much value in reporting flaky failures to any communication channel, IRC and mails included.

@Artem-B could you take a look at the mentioned CUDA builders, please?

2 Likes

Here is the example of multiple false negatives from clang-cuda-* builders:

Is anybody looking at the issue?

I believe @Artem-B maintains that bot, but he’s on vacation for the month last I heard.

I’ll make those builders silent then.

We can always make them reporting failures again once they are reliable.

WDYT about changing the test output a bit on the bots to improve the signal/noise ratio on test failures? Something like:

diff --git a/llvm/utils/lit/lit/cl_arguments.py b/llvm/utils/lit/lit/cl_arguments.py
index b9122d07afd8..1c334c328153 100644
--- a/llvm/utils/lit/lit/cl_arguments.py
+++ b/llvm/utils/lit/lit/cl_arguments.py
@@ -84,6 +84,13 @@ def parse_args():
         help="Deprecated alias for -v.",
         action="store_true",
     )
+    format_group.add_argument(
+        "-m",
+        dest="minimalOutput",
+        help="Minimize output."
+        " Prints a '.' for each passing test to minimize output for CI jobs",
+        action="store_true",
+    )
     format_group.add_argument(
         "-a",
         "--show-all",
diff --git a/llvm/utils/lit/lit/display.py b/llvm/utils/lit/lit/display.py
index 7de5a298d230..809bfbd117ab 100644
--- a/llvm/utils/lit/lit/display.py
+++ b/llvm/utils/lit/lit/display.py
@@ -84,6 +84,7 @@ class Display(object):
         self.progress_predictor = ProgressPredictor(tests) if progress_bar else None
         self.progress_bar = progress_bar
         self.completed = 0
+        self.last_was_minimal = False
 
     def print_header(self):
         if self.header:
@@ -115,6 +116,16 @@ class Display(object):
             self.progress_bar.clear(interrupted)
 
     def print_result(self, test):
+        if self.opts.minimalOutput and not test.isFailure():
+            sys.stdout.write('.')
+            sys.stdout.flush()
+            self.last_was_minimal = True
+            return
+
+        if self.last_was_minimal:
+            print("")
+            self.last_was_minimal = False
+
         # Show the test result line.
         test_name = test.getFullName()
         print(

Failures would stick out a lot better in a sea of dots, and we could even tune that to print only every Nth pass so as to stay under buildkite’s 2MB display limit.

1 Like

This issue is not specific to just the GitHub reporting, but also the emails: I think that currently buildbot sends a mail/comment if the build failed, but builds for previous commits have not yet finished. This means that even though a failure was actually introduced in a previous commit, you’ll still get notified.

This is possible only for builders with multiple workers right? Otherwise builds are always in-order.

Yes-ish. There also appears to be a variant of this issue where builds get started in the wrong order. For example for Buildbot and Buildbot the build order is inverted from the commit order.

2 Likes

Thanks for mentioning this, Nikita. Looks like a bug in buildbot. We will take a look. Buildbot could run builds on a single worker in order different to the order of commits · Issue #219 · llvm/llvm-zorg · GitHub.

As of the case of multiple workers, not sure how much we can do about that. The only thing that comes to mind is to delay the reporting… I’ll think of it. Though, for now, we can either live with this issue or disable reporting for multiple workers. I prefer live with this issue, not strongly, though.

I reverted a change and created a PR for the revert (PR#98851), but just now noticed that when it was reapplied (by reverting my change), a buildbot failure of the revert of my revert caused a notification to appear in the PR I had created (see the linked PR). The two “failures” mentioned in the PR are not actually from my revert, but the revert of my revert.

Thanks for mentioning this, @dyung. I’ll take a look.