[analyzer] Bugzilla Database Cleanup Policy

Hi all, I looked through the Bugzilla database for the static analysis component and was wondering what cleanup policy, if any, exists for long-standing bugs. I found 620 bugs today. While I did not systematically look at each and every one of those :) I noticed in passing that many were in one of the following states:

  1. A duplicate
  2. An issue that had already been solved
  3. An issue that’s not concrete, or lacks enough information to start with.
  4. An issue whose originator cannot be contacted for further clarification (this applies to some, maybe many, of the above).

Most of these are Assigned to Ted (especially the ones filed before 2018).

Artem and/or Devin: Is there a policy we’re following if we want to just start going through these issues, triaging and cleaning up the easier ones?

May I suggest the following?

  1. Maybe for the older ones, we can prove they are fixed and close them, documenting how they were proven to be fixed in the bug, leaving an audit trail?

  2. For ones that are vague or lack a reproducer, start a discussion on the mailing list and attempt to contact the originator? Then, after an appropriate time, close the bug as not reproducible?

  3. Mark duplicates in favor of a more complete description of the issue?

Please let me know if you have strong preferences on how to initiate a cleanup, and I’m happy to follow those. I’m also willing to lead and contribute to a cleanup effort.


I tried to clean up Bugzilla bugs about a year ago. 620 doesn’t sound like a lot, but i gave up after about 20 or so.

A lot of the early bugs are Objective-C-related because that’s where it all began - the retain count checker. We basically had one checker and people called it “The Checker”. There was also no interprocedural analysis at all.

I don’t think there’s an existing policy so let’s try to come up with something.

It’s pretty unlikely that you’ll get replies on 10-year-old bugs. You can try to ping the bug (all CCd people including the author will receive an email notification) but if it ends up having insufficient information there’s not much we can do.

Generally, i think it’s much better to start with new bugs and work backwards. Fresh bugs are more likely to be relevant, the author is more likely to be available for discussion, and addressing them quickly will make them happy.

Having a reproducer is a must for a good bug report. It doesn’t have to be small, especially given that false positives can’t be automatically reduced. We also shouldn’t ask people to reduce by hand as long as they’re allowed to provide a full preprocessed file: not only do we have enough tools to debug an unreduced bug, but it’s also very easy to accidentally remove essential bits of the puzzle when you’re reducing by hand.

If your best effort to reproduce fails and the author is not responding, closing an old bug as “works for me” is always a valid option. I don’t think there’s much value in building an ancient clang to reproduce the issue and bisecting to find the exact commit that fixed it.
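
A rough sketch of that triage order (newest bugs first, a reproducer is required, “works for me” when the best effort fails and the author is silent). The `Bug` record and its fields are hypothetical and only for illustration; this is not a Bugzilla schema or API.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum


class Action(Enum):
    DEBUG = "debug the root cause"
    ASK_AUTHOR = "ask the author for a reproducer or details"
    CLOSE_WORKSFORME = "close as works for me"


@dataclass
class Bug:
    # Hypothetical fields for illustration only.
    bug_id: int
    filed: date
    has_reproducer: bool
    reproduces_on_trunk: bool
    author_responsive: bool


def triage_order(bugs):
    """Start with the freshest bugs and work backwards."""
    return sorted(bugs, key=lambda b: b.filed, reverse=True)


def triage(bug):
    """Pick the next step for one bug report."""
    if bug.has_reproducer and bug.reproduces_on_trunk:
        return Action.DEBUG
    if bug.author_responsive:
        return Action.ASK_AUTHOR
    # Best effort failed and nobody answers: no bisecting of ancient builds.
    return Action.CLOSE_WORKSFORME
```

Whether a bug counts as “responsive” or “reproduces” is a human judgment here; the sketch only records the order in which those questions get asked.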

Once a reproducer is obtained, the next step is to debug the bug. This step is not absolutely necessary, as whoever finds the bug report will be able to do that anyway, but it can often be done much faster than fixing the bug, and it’s also the only way to properly categorize the bug report (find duplicates, assign to umbrella bugs, etc.). It’s usually very hard to guess the root cause just by looking at the report, but exploded graph debugging usually yields the exact answer. So i usually try to do that, especially when the report is about something that i thought was working perfectly.

As for categorization, i’m making “umbrella” bugs for large issues that affect many users and get reported often. I tag these bugs as [Umbrella] and for now there are three of them (you’ve already seen two). The individual instances are duped to them, and the dupe count is supposed to indicate how big of a problem it is (i don’t think it’s actually working though).
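
To illustrate what the dupe count is meant to measure, here is a minimal sketch that tallies duplicates per umbrella bug. It assumes a plain mapping from a bug id to the id it was duped to (a made-up input format, not something Bugzilla exports in this shape) and follows chains of duplicates up to the final bug.

```python
from collections import Counter


def dupe_counts(dup_of):
    """Count how many bugs end up duped to each final bug.

    dup_of maps a bug id to the id it was marked a duplicate of
    (hypothetical input format).
    """
    def root(bug_id):
        seen = set()
        # Follow the chain of duplicates; stop if we hit a cycle.
        while bug_id in dup_of and bug_id not in seen:
            seen.add(bug_id)
            bug_id = dup_of[bug_id]
        return bug_id

    return Counter(root(bug_id) for bug_id in dup_of)
```

For example, with `{101: 1, 102: 1, 103: 102, 201: 2}`, bug 1 collects three dupes (103 reaches it through 102) and bug 2 collects one.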

Finally, please cc me if you find something interesting ^.^