Update on Bugzilla migration

Dear fellow LLVM contributors and users,

I apologize for the radio silence on the Bugzilla migration, and I
appreciate your patience on this topic. It turned out
that the overall migration process is quite non-trivial given the
amount of data and all the restrictions, rules and regulations we
have to take into account. However, I now believe we are very close to
the cut-off point. Here is the sequence of events that will likely
happen later this month:

1. I expect that we will perform a test migration of ~3k Bugzilla
PRs into a dummy project later this week. Please let me know if you
encounter any side effects of this migration, e.g. excessive
notifications being sent.
2. After the migration we will perform the final check of the
contents: labels, users, cross-links, attachments, etc.
3. If step 2 yields content of adequate quality, we will
proceed to the next step. Otherwise we will have to fix the issues
that were found and redo steps 1-2. We will communicate the outcome of
the test migration to the community. We will likely not be able to
make big cosmetic changes at that point; however, we will try to fix
bugs / usability problems should they appear. Note that the first ~400
bugs will be lost during the migration.
4. After step 3 we will decide on the roadmap of the actual migration.
I estimate that we will need to shut down Bugzilla or put it into
read-only mode for approximately 48-72 hours. We will communicate the
proposed roadmap to the community.
5. After the real migration Bugzilla will be put into read-only mode
for the next few months to allow possible migration of any missed data.

Thanks for all the work and updates!

Dear All,

Here is the status update: the test migration was attempted. A first set
of show-stoppers was identified; the majority of them are
limitations on the GitHub side. We are working on circumventing them,
however, there is no ETA yet.

I will keep you informed as soon as new information becomes available.

Thank you for your patience and understanding.

Thanks a lot for working on the transition!

Hello Michael,

Because in such a case we would lose all releases. It is not possible to
re-create them using dates in the past.

Dear Fellow LLVMers,

I believe we were able to work around the majority of GitHub
deficiencies (at least those that were show-stoppers). We are checking
the results. Hopefully I will be able to return to you with the final
migration roadmap soon.

Stay tuned!

Awesome to hear - really looking forward to better supported tooling/more consistency across source versioning and other infrastructure (bugs in this case)!

Dear All,

Over the weekend we tried to perform a "dry-run" migration –
conversion of all 51k+ bugzilla issues to a temporary GitHub project.

Unfortunately, the migration failed due to some obscure error on the
GitHub side. So far, GitHub is unable to tell us what the problem is,
how to solve or work around it, and how to proceed with the migration
(not to mention how to prevent similar issues during the real migration).
So far this is the real show-stopper.

We will continue pushing, however, I do not have any ETA on when we
will be able to continue with the bugzilla migration.

I'm sorry to disappoint you, but sometimes things are beyond my control.

Could you give a rough summary/details on the interactions with GitHub? Are they responsive? Do they expect to provide more detail on some sort of timeline (not a guarantee, but more than a “We don’t know, and we don’t have any particular plans/expectation that we’ll ever know, what’s wrong here”)?

I've raised this point once before, but I think it's time to raise it again.

I believe we should drop the goal of keeping bug numbers in sync between GitHub and the legacy LLVM bug database. We do need a one-to-one mapping, but the numbers can be distinct. This requires a bit of extra ugliness in terms of needing to add a comment to every bug (both copies) with a link to the other, but this is a minimal badness, and stops mattering fairly quickly after the transition.

Continuing to hold back the transition of new bugs to github is causing real immediate harm. I strongly believe we are better off moving now with an imperfect system than waiting any longer.

Philip

I’ve raised this point once before, but I think it’s time to raise it again.

I believe we should drop the goal of keeping bug numbers in sync between
github and the legacy llvm bug database. We do need a one-to-one
mapping, but the numbers can be distinct. This requires a bit of extra
ugliness in terms of needing to add a comment to every bug (both copies)
with a link to the other, but this is a minimal badness, and stops
mattering fairly quickly after the transition.

It is even better than this: if we can generate a map of old IDs to new IDs when doing the conversion, it really isn’t difficult to keep the existing URL working (redirecting to the right migrated GitHub issue).
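To make the redirect idea concrete, here is a minimal sketch of the lookup such a redirect would need. Everything named here is an assumption for illustration: the `ID_MAP` contents, the `show_bug.cgi?id=N` URL shape, and the target repository URL are placeholders, not the output of any actual migration tool.

```python
from typing import Dict, Optional
from urllib.parse import parse_qs, urlparse

# Hypothetical mapping emitted by the conversion: Bugzilla ID -> GitHub issue number.
ID_MAP: Dict[int, int] = {1234: 51234, 1235: 51240}

# Assumed target; the thread does not name the final repository.
GITHUB_ISSUE_URL = "https://github.com/example-org/example-repo/issues/{number}"


def redirect_target(old_url: str, id_map: Dict[int, int]) -> Optional[str]:
    """Return the GitHub URL for an old Bugzilla URL, or None if unmapped."""
    # Assumes old links look like .../show_bug.cgi?id=<number>.
    ids = parse_qs(urlparse(old_url).query).get("id", [])
    if len(ids) != 1 or not (ids[0].isascii() and ids[0].isdigit()):
        return None
    new_number = id_map.get(int(ids[0]))
    if new_number is None:
        return None  # bug not migrated (yet): keep serving the old page
    return GITHUB_ISSUE_URL.format(number=new_number)
```

A web server in front of the old tracker could call something like this and answer with an HTTP redirect whenever the result is not `None`.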

I may be missing something about other advantages of mapping 1-1?

Continuing to hold back the transition of new bugs to github is causing
real immediate harm. I strongly believe we are better off moving now
with an imperfect system than waiting any longer.

Is the ID mapping really the only issue keeping us back though?

It seems to be a major one. If nothing else, without it we could migrate a subset of bugs which happen to migrate cleanly, and then come back and handle the ones with issues at an arbitrarily later point. Or we could simply close creation of new bugzilla bugs, and start all new traffic on github without waiting for a migration at all. The whole reason we’re not doing that (seems to be) is that we want to preserve the low bug numbers for 1-to-1 correspondence purposes.

Maybe we can close bugzilla bug creation, create placeholder issues for the existing bugzilla bugs, create new bugs on GitHub and fill those placeholders asynchronously?

Best, Mara

I was about to suggest the same thing as Mara (email came in just as I was about to hit send :) ). It should be pretty easy to do this:

  1. For each bugzilla bug number, create a GitHub issue with a name like “placeholder for bugzilla migration” and a link to the corresponding bugzilla bug (and maybe some explanation of what’s going on so it’s less weird to people coming across it)
  2. To avoid these issues clogging up the GitHub, close them all to start with (they can be reopened later if they correspond to an open issue on bugzilla)
  3. Stop new issue creation on Bugzilla
  4. Add GitHub issues in the same way as (1) for any new Bugzilla bugs that were filed between (1) and (3). Technically you could start by stopping new issue creation on Bugzilla, but it’s probably nice to minimize the window where there’s no way to file issues.
  5. Open up the GitHub for new issue creation
  6. Figure out how to migrate all the Bugzilla bugs without time pressure
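Steps (1), (2) and (4) above could be planned roughly as follows. This is only a sketch under stated assumptions: it builds the placeholder data in memory and makes no network calls, the dictionary fields are illustrative rather than a GitHub API schema, and the Bugzilla URL pattern is a placeholder. It also does not enforce that placeholders are created in ID order without gaps, which would be needed if the goal is for issue numbers to match bug numbers.

```python
from typing import Dict, Iterable, List

# Assumed Bugzilla URL pattern; adjust to the real instance.
BUGZILLA_BUG_URL = "https://bugs.example.org/show_bug.cgi?id={bug_id}"


def placeholder_issue(bug_id: int) -> Dict[str, str]:
    """Describe one placeholder issue (steps 1 and 2): linked and closed."""
    url = BUGZILLA_BUG_URL.format(bug_id=bug_id)
    return {
        "title": f"Placeholder for Bugzilla bug {bug_id}",
        "body": (
            "This issue reserves a slot for a bug that is still being "
            f"migrated. The original report is at {url}."
        ),
        "state": "closed",  # closed up front so placeholders do not clog the list
    }


def plan_placeholders(bug_ids: Iterable[int]) -> List[Dict[str, str]]:
    """Placeholders for every known bug, in ascending ID order."""
    return [placeholder_issue(bug_id) for bug_id in sorted(set(bug_ids))]


def late_arrivals(snapshot_ids: Iterable[int], final_ids: Iterable[int]) -> List[int]:
    """Step 4: bugs filed after the first pass but before Bugzilla was frozen."""
    return sorted(set(final_ids) - set(snapshot_ids))
```

For example, `plan_placeholders(late_arrivals(first_pass_ids, ids_at_freeze))` would give the extra placeholders needed for step (4), where both arguments are hypothetical ID lists exported from Bugzilla.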

Is there somewhere that has more details about the progress/problems/plans/etc for this migration? (a bug, some other mailing list, discourse-group, whatever…)

There was a previous email thread on llvm-dev a while (6 months? more?) back. But that's all I know of.

Philip

    Is the ID mapping really the only issue keeping us back though?

    It seems to be a major one. If nothing else, without it we could migrate a subset of bugs which happen to migrate cleanly, and then come back and handle the ones with issues at an arbitrarily later point. Or we could simply close creation of *new* bugzilla bugs, and start all new traffic on github without waiting for a migration at all. The whole reason we're not doing that (seems to be) is that we want to preserve the low bug numbers for 1-to-1 correspondence purposes.

Is there somewhere that has more details about the progress/problems/plans/etc for this migration? (a bug, some other mailing list, discourse-group, whatever...)

Current status is we are waiting until the middle of next week to get a response
from GitHub to some of our questions. If we don't hear anything by then, we will decide
the best way to proceed. Anton has done some test migrations and to me they look
pretty good, but he can fill in more details about what the issues have been so far.

-Tom

Thank you for your suggestion. The key point here is 6. As far as I
can see, after 1-5 we could lose many migration options. To give
you some idea of the possible problems: every comment in a migrated
issue will trigger an email notification to everyone involved. Therefore,
for some active contributors, an inaccurate and dirty migration of
even a subset of issues would result in an enormous number of emails
being sent. We are trying to avoid all these and similar issues during the
migration.

Hope this helps.

Dear All,

It seems to be a major one. If nothing else, without it we could migrate a subset of bugs which happen to migrate cleanly, and then come back and handle the ones with issues at an arbitrarily later point. Or we could simply close creation of *new* bugzilla bugs, and start all new traffic on github without waiting for a migration at all. The whole reason we're not doing that (seems to be) is that we want to preserve the low bug numbers for 1-to-1 correspondence purposes.

Thank you for your valuable and outdated suggestions. Please stay
tuned for the progress updates that are posted periodically on the
mailing lists. The situation is much more complex than you could
imagine. We are waiting for responses from GitHub on several critical
issues that were found during the test migrations, as I already
emailed. Note that these issues affect not only the existing migration
attempts but also our possible future use of GitHub issues.

Thank you for understanding.

Anton,

Over the weekend we tried to perform a “dry-run” migration –
conversion of all 51k+ bugzilla issues to a temporary GitHub project.

Do we feel there could be value in sharing access to this temporary GitHub project with a wider audience so we can see how the issues would look? This might raise interest and excitement but also might help to ensure we don’t land something that people find hard to work with.

For example, I’m only interested in a very specific subset of the “bugs” (clang-format). How easy will it be for me to see just those? Are separate areas just labels? Can we filter down to see just the areas we want?

I’d personally like to see how easy it’s going to be to find those I want and to see “what is new” in that area and “what I might work on next”.

For me I’d like people to specifically log bugs against “clang-format” and not the old “Formatter” that it was in bugzilla so I’m interested to understand how in the future it might look and if there is any opportunity to “clean” the data as we migrate.

For example, do we plan on taking “[clang-format]” / “[clang-tidy]” type tags from the subject lines and converting them into labels or projects?
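As a rough illustration of what such a clean-up could look like, the sketch below pulls leading `[tag]` prefixes out of a subject line and renames an old component to its preferred label. The function names and the alias table are hypothetical; only the “Formatter” to “clang-format” rename comes from the request above, and nothing here reflects what the migration scripts actually do.

```python
import re
from typing import Dict, List, Tuple

# Matches one "[tag]" at the current position, allowing leading whitespace.
_LEADING_TAG = re.compile(r"\s*\[([^\[\]]+)\]")

# Hypothetical rename table: old Bugzilla component -> preferred label.
COMPONENT_ALIASES: Dict[str, str] = {"Formatter": "clang-format"}


def split_subject_tags(subject: str) -> Tuple[List[str], str]:
    """Split '[a] [b] text' into (['a', 'b'], 'text'); only leading tags count."""
    tags: List[str] = []
    pos = 0
    while True:
        match = _LEADING_TAG.match(subject, pos)
        if match is None:
            break
        tags.append(match.group(1).strip())
        pos = match.end()
    return tags, subject[pos:].strip()


def labels_for(component: str, subject: str) -> List[str]:
    """Combine the (renamed) component and subject tags into a sorted label list."""
    tags, _ = split_subject_tags(subject)
    labels = {COMPONENT_ALIASES.get(component, component), *tags}
    return sorted(labels)
```

With this, a bug filed under the old “Formatter” component with the subject “[clang-format] Wrong indent” would end up with the single label `clang-format`, so filtering by label would show old and new reports together.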

Just my 2p worth.

MyDeveloperDay.