Documentation for triaging Github issues?

Folks like @Endill have been working hard to keep the Github issues in good shape, and I’ve been making work for them going through issues in lldb and forgetting to update things :slight_smile:

To help me and others, is there any documentation on how we should process issues? We have a lot of labels so sometimes you don’t know you’ve missed adding one because you didn’t know it existed. Something covering the top 10 generic labels maybe and the basic flow from new → assigned to a sub-project.

Perhaps this does exist and I’ve missed it.

1 Like

I’m not aware of a general document/process for handling issues. The initial labeling has been done by @EugeneZelenko and he’s asked for volunteers to help. Some description of what he actually does would be good to have written down, because of bus factor if nothing else.

It may be not easy to create universal document. There are many factors to consider.

If stack trace is present, label is defined by top of stack. If not, LLVM part should be taken in consideration. Most relevant label should be used and new issue label should be removed after classification. Sometimes additional labels are needed.

After issue is closed label should be adjusted according to commit(s), i.e. root cause.

For release backport release:backport label should be added when /cherry-pick or /branch command are present in issue and release:merged label should be added after patch merged.

1 Like

I’m not aware of such documentation, but the highlights are (not covering project-specific ones):

  1. Use confirmed for issues with bugzilla label if they are still relevant today (e.g. still reproduces).
  2. When you close an issue, pay attention to closing as completed (i.e. done, resolved, fixed) and closing as not planned (i.e. not a bug, won’t fix, works for me, duplicate, stale)
  3. When closing an issue as not planned, be sure to add the most appropriate label with reason. Options are: worksforme, wontfix, invalid, duplicate, incomplete
  4. We have labels for outcomes: crash-on-valid, crash-on-invalid, hang, miscompilation, accepts-invalid, rejects-valid. (There is also crash label, but its target audience are people doing triage for project they are not experienced with. Experts are encouraged to make a judgement between crash-on-valid and crash-on-invalid.)
  5. I believe good first issue should come with some analysis and points of entry into codebase.
  6. new issue should be removed when other labels are added.
  7. waiting-for-author is useful to track which issues can be closed (often as incomplete), because author hasn’t got back to us in a while.
  8. documentation is a single label for documentation issues. Meant to be used together with a project-specific label.
  9. metabug for (usually) long-lived issues that are used to collect references to a group of similar or related issues.
  10. enhancement for issues that are not really issues, but e.g. feature requests. Recently we tried to improve description of this one: Improving things as opposed to bug fixing, e.g. new or missing feature .
  11. question is obviously for questions.
  12. bug label exists, and widely used if you look at numbers, but I consider it redundant, because we are in bug tracker context, after all.
  13. extension:* labels were recently invented for Clang, but this group might become useful for other subprojects as well. Currently it consists of extension:gnu, extension:microsoft, extension:clang, and extension:borland.
  14. Other useful labels: regression, quality-of-implementation.

We strive to avoid proliferation of labels, e.g. clang should never go together with more specific clang:frontend label. My personal soft limit is 5 labels. After that point I think twice before adding another label.

Be sure to check labels specific to your project here: Labels · llvm/llvm-project · GitHub !

I believe PRs and issues should be treated the same way when it comes to labels (if possible, of course).

I’ll give it a try, but it’s not a step-by-step instruction with particular order. When I look at issue, I try to do the following:

  • Find or create Compiler Explorer link.
    • Ask for more context if CE link is possible, but no enough info provided
      • Add waiting-for-author
    • Mark as regression
    • Mark as confirmed
  • Add subproject/infrastructure/cmake label
  • Add crash/hang/miscompilation label
  • Add appropriate backend label
  • Mark as question or enhancement
    • I often leave enhancement to people with more experience in respective subproject

If I end up doing anything (leaving a comment or labeling), I remove new issue. To be honest, I feel embarrassed each time I look at 273 Open 50 Closed “new” issues, which date back months and years. So recently I’ve been draining this queue.

3 Likes

IMHO, it would really help if all tickets were also labeled with something like release-<N> vs main:

  • release-<N> for issues reproducible using a particular release LLVM (N),
    • For example, for an issue that can only be reproduced with clang-14 one would use release-14
    • IMHO this would be super helpful as many bug reports refer to some previous releases of LLVM. These are rarely fixed unless reproducible with main.
  • main for issues reproducible with top-of-tree LLVM.
    • Issues like this should be accompanied with a Git SHA (one won’t be able to reproduce them otherwise),
    • I’d expect issues reproducible with main to be more likely to be taken on.

Triaging would be much easier with the above.

Not sure what the utility of such issues would be.
If the issue is not reproducible in main, then it can be considered fixed. If it’s still reproducible on the latest release but not in main, then you’d likely want to find/bisect the fixing commit and backport it into the patch level release (if there is going to be one).

The “reproducible on main” label also serves little point due to above. Either the bug is already fixed on main at the time of the creation, or it wasn’t and having such label provides no additional info.

tl;dr: I think “the issue is open” should mean “reproducible on main” in 90%+ of cases (for functional bugs at least) and there is no need to label closed issues.

1 Like

Labeling something as reproducible means someone has gone to the trouble to try to reproduce it, and there’s no need to spend further effort re-reproducing it. You could argue that a comment on the issue would provide the same information, which is true if your are reading the issue but not if you are doing a search for issues that have (or have not) been reproduced. Labels let you search more conveniently.

While the upstream LLVM project really doesn’t care about anything prior to the most recent release, this is not necessarily true for downstream vendors, and it can be a service to them to indicate that a given older release has (or fixes) the issue.

Labels like this are also useful for researchers or archaeologists trying to understand the history of the project.

Also if it is crash bug, try out the bug (if it has a reproducer) with assertion build of clang since often it will crash in a different place and usually the point of assertion is helpful in debugging.

Also if we have a reproducer adding the assertion message and backtrace to the bug report will help folks find duplicate or related problem. This is a good example: Clang crashes on variable template instantiated with a member of a template class · Issue #65153 · llvm/llvm-project · GitHub

Finding duplicates is super helpful. This prevents wasted effort in debugging a problem others have already put effort into. You may find simpler reproducers or helpful details in other reports. Having a good collection of reproducers is also helpful when constructing a set of regression tests.

confirmed means “reproducible on main” among other things. Having a label that means just that is not very useful, because main is a moving target, and fast at that.

My experience suggests that labels you propose are not going to be helpful. @danilaml outlined the reason pretty well.

1 Like

Thank you for your excellent addition!

Personally, I’m struggling with this a lot. As someone who do batch triaging without deep dive, I just lack the understanding of the issue and insight into how Clang works in order to connect the dots. Searching for assertion message helps (and becoming increasingly more helpful as we build a pool of triaged issues), but I’m reluctant to declare issue a duplicate based on just that, so I’ve been relying on more experienced contributors to do that (thank you!). Maybe your insight could help here, if it works in circumstances I just described.

My suggestion was to clearly differentiate issues that:

  • can be reproduced with main (*), vs
  • issues discovered while using a particular release of LLVM.

As there no guidelines that would say:

Only submit a bug report if this can be reproduced with main.

I thought that it would be good to have a mechanism to prevent people from sending reports saying something like

fails with clang-15

Even confirmed could mean:

Yes, I confirm that it fails with clang-15.

How do we avoid situations like that?

In particular:

that’s not obvious to me, confirmed could mean many things.

That is understandable, it is not always obvious. We rarely get the almost identical code with the same assertion and identical backtrace. Often the code is not obviously doing the same thing and while the assertion matches the backtrace can vary in minor or major ways. I often can’t tell without deeper investigation and in those cases I merely point out that a bug maybe be related.

There are only 2 scenarios here:

  • The issue is reproducible on main
  • Someone has a compelling case for a backport fix to N-1.x

Bugs that are no longer reproducible just get closed. We sometimes try to pinpoint the commit or version that fixed the bug if that’s helpful to someone but otherwise there is no distinction between a fixed bug and one that reproduces with a specific version.

2 Likes

Thanks for clarifying, this makes sense to me.

I still feel that we assume that everyone reporting a bug checks main before opening an issue. But in my experience that’s not always the case (e.g. many “end users” that want to report a bug would never build a compiler themselves). So I’d like to be able to hint to people:

If possible, please check main before reporting an issue.

I’m not sure how to achieve this, but I appreciate that “labels” is not the mechanism that want to use for this.

Going back to confirmed - that is only going to be valid for a specific revision that the person triaging checks. That would be main at the point of checking. However, a lot can happen in 6 months in LLVM. So shouldn’t such label be time-limited? For example, removed auto-magically after 6 months of no updates?

I don’t think we need to time-limit such labels. Supposedly, when the dev that wants to work on some confirmed issue wants to pick it up, they would check if it’s still an issue on the latest build and close if it was fixed/no longer reproduces. I’d do that regardless of any labels (since the first step of working on a bug is usually trying to reproduce it on your machine). Unless we can guarantee that there will be volunteers that will timely re-add “confirmed” labels (at which point it’s no different than re-confirming them and removing the label after the fact).

If the issue is isolated enough, usually you can just plug it in compiler explorer which allows you to check if it exists on most recent releases as well as main. Otherwise, asking the bug reporter to at least check with the most recent release of the compiler available would be one of the first steps of the triaging (the older their release the higher chance that the issue was fixed/is not reproduced by main).

Also, this is mostly for bug/crash/miscompile kind of issues. There is also “missing optimization”/“missing feature”/“behavior divergence (i.e. w.r.t. gcc)” and others.

I’d hope that someone adding a confirmed label would provide some clue what revision they tried it with. If nothing else, the issue history should log when the label was added, giving a likely time-frame for when it was reproducible. So, I agree that expiring the confirmed label isn’t appropriate.

Speaking of compiler explorer check, I have an idea: Can we do some automatic check leveraging compiler explorer, or maybe GitHub actions as well.
The exception is we can avoid tedious verify + close work by manual. I think it can also be triggered whenever we have a public release. By this way, we can also automatic label it “fixed-in-release-xxx” or “also-failed-in-release-xxxx” etc.

In Clang we’ve seen couple of regressions on crashes fixed long ago, so I discussed that privately with other Clang contributors. My idea is to turn CE reproducers we have in issues into regression tests in our test suite, including tests that crash or fail. Then, if one of failing tests stops failing (or fails in a different way), CI will complain. Bug report where the test came from will be recorded in the test, so it’s easy to trace it back and close if appropriate.

The biggest concern I’ve heard is that we talk about adding thousands new tests, which will overlap with existing tests to some extent, hence making out tests run even longer than they are. I had an idea that we can mitigate this by checking whether new test increases our coverage, but the result was that we need to back it up with very decent hardware, otherwise it’s not very usable at scale: I have hundreds of such CE links prepared and counting, but on my hardware it takes about half an hour to run Clang test suite, and summarize all the coverage data. But we need to do it twice to compare coverage results.

1 Like

I’m not super familiar with how the coverage stuff works, but once you have coverage data for the existing suite, wouldn’t be sufficient to run each new test once and see if it hits anything previously uncovered?

That’s correct, but collecting coverage data even once is not too fast, and is mostly single-threaded.