IMPORTANT: LLVM Bugzilla migration

Dear Fellow LLVM'ers,

I'm happy to announce that we were able to find workarounds for the
vast majority of GitHub limitations and our dry import went
successfully.

Therefore, the following migration roadmap is proposed:

1. We will put Bugzilla in read-only mode on Wednesday, November 24
23:59 Pacific Time. This will be the last chance to submit the
bugzilla username => github username mapping.
2. We will download the data from Bugzilla and prepare the final
"migration dump" on Thursday, November 25.
3. We will perform the actual migration on November 26 - November 27
and verify the results.
4. If everything goes smoothly, we will open the LLVM GitHub repo
for issues no later than Monday, November 29. Bugzilla will remain
in read-only mode after the migration.

Please DO NOT submit any issues to LLVM github repo during the
migration as it might interfere with the migration process.

Please let me know if there are any objections to this schedule.

Hi Anton,
Perhaps I have missed it, but how do I submit the Bugzilla to GitHub username mapping?
Warm regards,
Deep

https://forms.gle/tNg86K6T7y2YfsBx9

More details can be found in https://lists.llvm.org/pipermail/llvm-dev/2021-March/149441.html

Thanks

Phoebe (Pengfei)

awesome, really glad to hear! Thanks for all the hard work!

(I guess this was probably answered somewhere else/documented somewhere: But will the llvm.org/PRXXXX links be redirected to the github issues at some point as part of this migration?)

Hi David,

Yes. The PR links will forward to new GitHub issues. The old bugzilla
links will be available via llvm.org/bzXXXX links (these are live now,
btw).
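
For readers unfamiliar with the two short-link schemes mentioned above, here is a minimal sketch of how such forwarding could be resolved. It is not the actual llvm.org configuration: the `resolve` function, the `pr_to_issue` lookup table, and both target URL formats are assumptions made for illustration, and the thread does not say whether issue numbers will match the old bug numbers.

```python
import re

# Assumed target URL formats, for illustration only.
BUGZILLA_URL = "https://bugs.llvm.org/show_bug.cgi?id={}"
GITHUB_ISSUE_URL = "https://github.com/llvm/llvm-project/issues/{}"


def resolve(path, pr_to_issue):
    """Map a short-link path such as /PR100 or /bz100 to a full URL.

    pr_to_issue is a hypothetical table from old Bugzilla bug numbers
    to new GitHub issue numbers. Returns None when the path is not a
    short link or the bug has no known GitHub issue.
    """
    match = re.fullmatch(r"/(PR|bz)(\d+)", path)
    if match is None:
        return None
    kind, number = match.group(1), int(match.group(2))
    if kind == "bz":
        # bz links keep pointing at the read-only Bugzilla instance.
        return BUGZILLA_URL.format(number)
    issue = pr_to_issue.get(number)
    if issue is None:
        return None
    return GITHUB_ISSUE_URL.format(issue)


# Made-up mapping: bug 100 became issue 105.
print(resolve("/PR100", {100: 105}))
print(resolve("/bz100", {100: 105}))
```

Keeping the lookup table separate from the URL formats means the sketch does not depend on the two numbering schemes lining up.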

While I find important information here about migrating
away from Bugzilla in this thread, I kind of lack information
about what we are migrating to.

I've understood that we are migrating from Bugzilla to some
GitHub supported issue handling, but I have zero experience
with issue handling in GitHub.

Here are some of the questions that popped up:

- Where will I find the new tool for PRs?

- Can I find and look at the result of the dry import already
  today to familiarize and learn about it?

- And are there any guidelines specific for the LLVM project
  related to how to deal with PRs in GitHub?

- Will it be possible to set up email notifications similar
  to Bugzilla (e.g. make sure I'm automatically CC:ed on all
  issues that my co-workers are involved in)?

Regards,
Björn

Hello

You could certainly check GitHub docs:
https://docs.github.com/en/issues/tracking-your-work-with-issues/about-issues

- Where will I find the new tool for PRs?

Code reviews are still happening on Phabricator for now.

-- Tobias

All sounds great to me! I wouldn’t be surprised if we need to add some documentation on how to use Github Issues specifically related to LLVM (e.g. what labels to apply etc), although I can’t really say what that content may be. I believe our bugzilla documentation is virtually non-existent (I apologise if I’m wrong), so this isn’t a blocker though.

James

Thanks, James.

Yes, certainly any patches to import our current documentation wrt the
workflows are very welcome :)

Looking at the PDF you posted on the other thread, it appears that users who did not fill out the migration google-sheet have their comments migrated with no author-attribution of any kind (not name, not username, or even “Anonymous LLVM Contributor #123”). Is that correct?

Thus, it seems pretty important that at least all of the active contributors have filled out the sheet before the migration – have they? What % of contributors will be converted with comment attribution? (or maybe better: what percent of comments are by someone who filled out the migration sheet vs anonymous?)

I imagine it’d be pretty easy for folks to forget that they didn’t fill out the sheet, since there’s no way to verify whether you did or not.

Looking at the PDF you posted on the other thread, it appears that users who did not fill out the migration google-sheet have their comments migrated with no author-attribution of any kind (not name, not username, or even "Anonymous LLVM Contributor #123"). Is that correct?

That's correct. All such contributions will be attributed to
"llvmbot". There is no other way we can enter "anonymous" data to
github. However, we do provide the backlink to the original bz issue.
We have to anonymize the data in order to comply with some regulations
and there is no way to save the attribution without the explicit
consent.
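
The attribution rule described above can be sketched roughly as follows. This is an illustration only, not the actual migration tooling (which is not shown in this thread): the function names, the shape of `consent_map`, the example accounts, and the use of the llvm.org/bzXXXX scheme for the backlink are all assumptions.

```python
# "llvmbot" is the fallback author named in the thread.
FALLBACK_ACCOUNT = "llvmbot"


def github_author(bugzilla_user, consent_map):
    """Return the GitHub account a migrated comment is attributed to.

    consent_map (hypothetical structure) holds only the users who
    submitted the mapping form, keyed by Bugzilla username. Anyone
    absent from it has given no consent, so no identity is carried over.
    """
    return consent_map.get(bugzilla_user, FALLBACK_ACCOUNT)


def backlink(bug_id):
    """Link to the original Bugzilla entry (URL scheme assumed)."""
    return f"https://llvm.org/bz{bug_id}"


# Made-up form submissions.
consent_map = {"alice@example.com": "alice-gh"}

print(github_author("alice@example.com", consent_map))
print(github_author("bob@example.com", consent_map))
print(backlink(1234))
```

The lookup is deliberately one-way: the fallback branch never sees or stores anything derived from the unmapped user's identity.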

Thus, it seems pretty important that at least all of the active contributors have filled out the sheet before the migration -- have they?

There are more than 1k users with commit access to LLVM repo.
Approximately half of them filled the survey. However, there is no
way we can force the contributor to give consent.

What % of contributors will be converted with comment attribution? (or maybe better: what percent of comments are by someone who filled out the migration sheet vs anonymous?)

I do not have such information. There are no plans to make such an estimation.

I imagine it'd be pretty easy for folks to forget that they didn't fill out the sheet, since there's no way to verify whether you did or not.

I emailed all contributors that were active (submitted at least one bug /
comment) over the last 5 years but who did not fill the bz survey back
in April (after the deadline for the form expired). So, all of them
were notified. We also notified multiple times over the mailing list.
I think it's enough – everyone who cared about "saving" the
contributions would've filled the form by now. So far there are ~2450
entries there.

We occasionally have issues created by people that don’t read this list, so it’d be better if we could have a way to disable new issues from GH’s own settings.

I haven’t found a way on my own repos, but perhaps the migration process can do that as part of the setup / teardown.

–renato

Looking at the PDF you posted on the other thread, it appears that users who did not fill out the migration google-sheet have their comments migrated with no author-attribution of any kind (not name, not username, or even “Anonymous LLVM Contributor #123”). Is that correct?
That’s correct. All such contributions will be attributed to
“llvmbot”. There is no other way we can enter “anonymous” data to
github. However, we do provide the backlink to the original bz issue.
We have to anonymize the data in order to comply with some regulations
and there is no way to save the attribution without the explicit
consent.

If we can attribute it to an anonymous entity, e.g. by putting “Anonymous LLVM Contributor 123 wrote:” at the top of a comment by llvmbot, at least readers can understand whether two comments on a bug are from the same person or from different people, for example. Can we at least do something like that?

Thus, it seems pretty important that at least all of the active contributors have filled out the sheet before the migration – have they?
There are more than 1k users with commit access to LLVM repo.
Approximately half of them filled the survey. However, there is no
way we can force the contributor to give consent.

Certainly I wasn’t asking to force anyone. Rather, I wish to ensure that the people aren’t going to be surprised and unhappy when they realize they were omitted! I’d like to make sure folks are given every opportunity to fix that, before it’s too late.

I expect nearly everyone who is actively interacting with LLVM bugzilla wants to be included in the migration mapping. Certainly there’s going to be a rare person who actively doesn’t want it, and a long tail of people who are no longer active in the community, or not easily contacted, who will be excluded because they’re unresponsive. But, if there are many people who are currently active in the community (e.g. active on bugzilla or making commits), and yet have not filled out the survey, I think that indicates that we have a problem with outreach.

And, if such a problem exists, I think we ought to address that problem before migration.

everyone who cared about “saving” the contributions would’ve filled the form by now.

I very much doubt it’s true that everyone who cares will have filled it out already. I mean, just speaking for myself…I think I filled out the form? But maybe I only intended to, but forgot to get around to it? Who knows. Assuming I actually did, I’m certain there are more people in the same situation who actually did not.

And on a more general note –

It worries me how little information has been communicated with the community overall for this project, especially now that the migration is supposed to happen imminently! I completely understand how painful it can be to do a migration like this, and how it can feel annoying to have people bugging you about things, yet not actively helping to complete the migration. But…we are going to be stuck with this conversion for a long time, so it is important to validate that people are going to be overall happy with it, right?

Some other questions that pop into my mind:

  • Has a full test migration been done now? Or is it impossible to do a full test migration?

  • What happens if the migration fails in the middle due to an unforeseen error?

  • What sorts of verifications of correctness have been done on the migration output?

  • What problematic cases are “known issues” which have been deemed unimportant and therefore ignored?

  • How are comments migrated? (e.g. it would appear that some effort was put into ensuring that english prose gets migrated as variable-width text, and code sections get migrated as fixed-width text?)

  • How is all the other data migrated? Is all of it migrated? Or are certain fields deemed useless, so we don’t migrate them?

  • Can we let additional people have access to a test migration, to verify that it seems reasonable? Even if it can’t be made public, can the migration be done to a private repository which can be opened to other llvm community members? (A couple PDFs is certainly better than nothing, but…) What’s preventing the ability to do that?

  • Is there code that implements this migration that folks can look at and/or suggest changes to?

Or, really, more fundamentally – where’s the document describing the plan? I think that’s the root of the issue – there should be a plan written up, not just one-off answers to those questions that popped up in my mind I listed above, but a document fully describing the final plan for this migration, and why each choice made there is thought to be the best option (or, at least, best practical option, if not actually the best).

I note that people posted suggestions on the previous thread which were dismissed as outdated and unhelpful – but that’s because nobody else has been told any of the information about what tradeoffs have already been considered and dismissed, and what the actual problems are/were. Having a written plan would also help ensure people aren’t going to give unhelpful redundant suggestions…

If we can attribute it to an anonymous entity, e.g. by putting "Anonymous LLVM Contributor 123 wrote:" at the top of a comment by llvmbot, at least readers can understand whether two comments on a bug are from the same person or from different people, for example. Can we at least do something like that?

We do this for issues. They are marked as submitted by "LLVM Bugzilla
Contributor".

And, if such a problem exists, I think we ought to address that problem before migration.

They had more than half a year to submit a survey and received
multiple notifications. We are not going to delay the migration due to
this.

I very much doubt it's true that everyone who cares will have filled it out already. I mean, just speaking for myself...I think I filled out the form? But maybe I only intended to, but forgot to get around to it? Who knows. Assuming I actually did, I'm certain there are more people in the same situation who actually did _not_.

Well, you can simply go and submit it once again. We will certainly
take care of dups.

Some other questions that pop into my mind:

Great! Thanks for the questions. Probably they should have been asked 2
years ago. You will be able to check the results by yourself after the
migration.

Hi Renato

We occasionally have issues created by people that don't read this list, so it'd be better if we could have a way to disable new issues from GH's own settings.
I haven't found a way on my own repos, but perhaps the migration process can do that as part of the setup / teardown.

Well... you'd assume that there is a way of doing this. But apparently
there is none :) There is a single switch to turn issues on and off, so
it is not possible to "hide" issues but still work on them. And at
some stage we will need to enable them in order to do some migration
steps. The only way to "protect" the issues is to limit the access to
the repo, but I doubt we can do this. Nothing bad will happen, frankly
speaking, the result will be a bit inaccurate though.

Well, the issues in Bugzilla are called PR:s (Problem Reports?). So I was actually asking about where I would file/read problem reports after the migration. I have browsed around looking at https://github.com/llvm/llvm-project but I still can't find any llvm-project "issues" when looking at that page (I'm guessing something suddenly will appear on that page later, but I do not really know since I'm not familiar with github issues).

Maybe I won't be able to get a URL to where I will find bugs in the future until after the migration. That however makes it hard to help out to give any feedback in case I see some troubles with the migration beforehand. Except the feedback that I've already given by saying that I lack information about where to find bug reports at the end of this week.

I'm also a bit curious to know if I automatically still will get notifications for all the issues that I am subscribing to (as well as all the people I'm stalking in Bugzilla by using the "User Watching" feature). Or if I would need to configure that myself in github somehow. I think that I'd prefer to set up such subscriptions before the migration takes place to avoid any information loss. But since there are no issues in the llvm-project in github today, I do not really know if that even is possible (or if it automatically will be part of the migration - I actually doubt that it will be handled automatically and therefore I'm even more anxious to find out how to do it).

Btw, I'm sorry if I've totally missed out on this information in all the various discussions that's been ongoing the last couple of years. But I just got this IMPORTANT information about the migration happening this week, and it will take some time to read up on all old mail threads etc to get a summary of what actually is migrated or not and what the final plan looks like.

Regards,
Björn

Hello Bjoern,

Well, the issues in Bugzilla are called PR:s (Problem Reports?). So I was actually asking about where I would file/read problem reports after the migration. I have browsed around looking at https://github.com/llvm/llvm-project but I still can't find any llvm-project "issues" when looking at that page (I'm guessing something suddenly will appear on that page later, but I do not really know since I'm not familiar with github issues).

Right. The issues will be in the standard place. They are disabled now
for obvious reasons.

Maybe I won't be able to get a URL to where I will find bugs in the future until after the migration. That however makes it hard to help out to give any feedback in case I see some troubles with the migration beforehand. Except the feedback that I've already given by saying that I lack

We will certainly update the links in the docs and put the banner on
bugzilla to redirect users. Please rest assured that this information
will not be hidden :)

I'm also a bit curious to know if I automatically still will get notifications for all the issues that I am subscribing to (as well as all the people I'm stalking in Bugzilla by using the "User Watching" feature). Or if I would need to configure that myself in github somehow.

If you were CC'ed on the bugzilla issues, then you will be mentioned
on the issue after the migration, so you should be able to receive the
notifications. Our aim (and this is what we spent enormous amounts of
time on over the last 2 years) is to preserve as much information as
possible during the migration and to ensure that the old bz
functionality is somehow modelled via github after the migration
(provided all kinds of limitations github imposes).

If we can attribute it to an anonymous entity, e.g. by putting “Anonymous LLVM Contributor 123 wrote:” at the top of a comment by llvmbot, at least readers can understand whether two comments on a bug are from the same person or from different people, for example. Can we at least do something like that?
We do this for issues. They are marked as submitted by “LLVM Bugzilla
Contributor”.

As I said, the purpose would be to allow disambiguating multiple anonymous contributors, e.g. by suffixing a unique number to each anonymous contributor. The reply misses that point.
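
To make the suggestion concrete, here is a minimal sketch of one way to number anonymous contributors. It is a hypothetical illustration, not part of the migration tooling; the class and function names are made up, and whether such numbering would satisfy the regulations mentioned earlier in the thread is not something the sketch addresses.

```python
class PseudonymAssigner:
    """Hands out sequential labels to unmapped users.

    Numbers follow order of first appearance, so a label carries no
    information derived from the underlying Bugzilla account.
    """

    def __init__(self):
        self._numbers = {}

    def label_for(self, bugzilla_user):
        # Reuse the number already given to this user, or assign the next one.
        if bugzilla_user not in self._numbers:
            self._numbers[bugzilla_user] = len(self._numbers) + 1
        return f"Anonymous LLVM Contributor {self._numbers[bugzilla_user]}"


def render_comment(assigner, bugzilla_user, text):
    """Prefix a migrated comment body with its anonymous label."""
    return f"{assigner.label_for(bugzilla_user)} wrote:\n\n{text}"


# One assigner per bug keeps labels from linking comments across bugs.
per_bug = PseudonymAssigner()
print(render_comment(per_bug, "user-a", "Reproduces on trunk."))
print(render_comment(per_bug, "user-b", "Works for me."))
print(render_comment(per_bug, "user-a", "Attached a reduced test case."))
```

Creating a fresh assigner for each bug is a design choice in this sketch: readers can still tell two commenters apart within one bug, while the same person gets unrelated numbers on different bugs.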

And, if such a problem exists, I think we ought to address that problem before migration.
They had more than half a year to submit a survey and received
multiple notifications. We are not going to delay the migration due to
this.

My understanding from what you said is that you have sent a single notification to each user back in April. (Plus a mailing list post, before that, in March.) If that is enough to capture most active users, great! But it sounds like it was not. You can’t blame the users if a large percentage of them have a problem. That points to a problem in the process, not the people.

Some other questions that pop into my mind:
Great! Thanks for the questions. Probably they should have been asked 2
years ago. You will be able to check the results by yourself after the
migration.

It feels to me like you’re being intentionally disingenuous here, and that makes me sad. My questions are about the migration plan/process/decisions as it is now finally implemented, not the initial ideas for migration from 2019. I don’t think that a request that the final plan be written down and reviewable by others is out-of-line or unexpected.

Until very recently, it seemed like it wasn’t even clear that a migration would be feasible under the proposed scheme at all, and that the tooling was still under active development. Now that it’s clear that it can be done (which is great news!), the next step I expected was a detailed writeup of the final characteristics of the implementation, and what things are expected to look like afterwards. Instead, at basically the first point where it’s known that this is actually feasible, it’s too late to ask any questions? There’s no documentation of what’s been implemented? No description even of what users should expect after migration? I do not understand this.

Certainly it’s possible for a project to turn out successfully without a written design, documentation, or review. But isn’t that unnecessarily risky?

Thanks for all your hard work! This issue tracker system has been a pain
for so many people for years.

I think having a small-scale review of the post-migration github
repository can be very useful. It can be 1% (or larger?) of the current
~53000 issues. People will have some idea what the repository will look
like, e.g. what portion of issues are anonymous.

James: "I imagine it'd be pretty easy for folks to forget that they didn't fill out the sheet, since there's no way to verify whether you did or not."

Agree. Deep Majumder asked the question in the thread. A colleague of
mine asked the same question.

I just visited the Bugzilla / GitHub username mapping form and re-submitted my mapping

There is no email confirmation.

A contributor may have multiple email addresses. They may not submit
their bugzilla email address or they may miss the notification to their
not-regularly-used email address (possibly due to closed registration for
a long time).

Yesterday I read "LLVM relicensing update & call for help" on the LLVM
Project Blog and notified 3 friends who were on the long-tail
spreadsheet. Two of them are still
contributing and told me that they missed the email notification because they
had changed their primary email address. I am going to ask them whether
they have submitted the mapping form...

I think bugs.llvm.org has a local patch to display the banner:
"New user self-registration is disabled due to spam. For an account please email bugs-admin@lists.llvm.org with your e-mail address and full name."

Could the banner be toggled a bit to remind the user and show the Bugzilla / GitHub username mapping if submitted?