Update on GitHub pull requests

check llvm-project/.git/config
it should be branchPrefix = users/<user>/spr/

2 Likes

You’ll also want this in your .git/config:

[spr]
        requireTestPlan = false

Hello,

I’ve read through the full thread and am confused about current Phabricator shutdown timeline. The original message in the thread states that no new patches should be submitted to Phabricator by Oct 1, later it was extended to Nov 15, and now to Dec 15 (basing on this message, if I interpret it correctly).

I have an accepted revision I’m finally ready to merge, am I still allowed to do this using old steps (rebase onto main and direct push) or do I need to migrate it to Github (seem to be discouraged) and merge from there?

I see a commit be811d161765 from Nov 29 that seem to follow old procedure.
Sorry if I missed something obvious.

You can still use those steps. No need to migrate it to Github.

Understood, thank you!

Seems like reviews.llvm.org now points to an archive, but I can’t find a discussion or announcement about this change.

2 Likes

See discussion at Can't access https://reviews.llvm.org/ and No emails from phabricator since Nov 15 ; basically, Phabricator broke, and the maintainers decided to switch to the static archive instead of trying to fix it.

So the current state is:

The archive is incomplete.
Moreover without a way to search the archive, or at the very list a dump of still-open reviews, a lot of work is going to be lost, and as you said, this does not appear to have been discussed.

1 Like

I saw these, but none of these are really clear on ā€œphab broke downā€.
The first one says that it is inaccessible and the answer is basically ā€œhere is an archiveā€: nowhere I see what happened (it could have been just turned off as well).

In the second thread, folks acknowledge the issue with Phab emails, but I don’t see an indication of other issues with Phab itself.

Yeah, I was planning to save my personal homepage with my list of reviews and on-going patches before it gets turned off, it’ll be impossible to find these anymore I’m afraid :cry:

Yes, could we get list of all reviews ? Even as an static list id + title/author.

I cannot access the Phabricator instance and I’ve asked whether we can make it live even if open for our local area network so that we can crawl more information.

(I crawled /Dxxx pages when I was about to go on vacation and did not have more time to polish it. The crawling process was painfully slow. I have a bit more information on Some reviews on reviews.llvm.org seem to be missing from the static archive - #4 by MaskRay)

I have collected differentials that are not ā€œClosedā€ at
https://gist.githubusercontent.com/MaskRay/798de69eb9e7ec7c3e98507265dc5514/raw/ .
It contains author and subscription mailing lists like *-commits (if present). I hope that can be a poor man’s replacement for /p/username and search functionality…

This was still my understanding of the agreed upon timeline. If a different timeline was posted somewhere, I missed it (in which case, please point me to it). I understand that hosting Phabricator has been a burden (though I don’t understand why or what the challenges have been), but please restore a working instance, at least until we can fully identify what to migrate and/or validate a complete and usable static archive. More transparency in what is happening and why would be much appreciated as well.

4 Likes

The update was posted here today: LLVM Phabricator Turndown

Given the security risks involved and our general lack of forensics capability, I strongly recommend that we don’t reactivate Phabricator ever again, and that any pending review to be migrated to Github PRs. Some comments will be missed, yes, but that’s a small price to pay for the peace of mind of not allowing attackers to make use of our infrastructure, even if briefly.

I strongly suggest we re-generate all keys/tokens from all apps we keep in cloud and self-managed instance, possibly even non-managed, even before we verify what happened.

If attackers have their hand in another instance / app owned by LLVM, then there’s no telling how long we’ll remain vulnerable and what other services will be compromised without our knowledge.

I also strongly suggest that people using email/password in Phabricator to review their passwords, especially if they shared those passwords with other services. However, if they were using auto-generate unique passwords or single-sign-on (like Google/Github), then it should be fine.

If there were other services in the same VM, then we also need to reset passwords, tokens and keys to those services, too and people need to be aware of which services those are, to make sure to reset their passwords to them, too.

Thank you @mehdi_amini and @rnk for the updates, and thank you @MaskRay for the hard work over the holidays!

When the community was first notified of the desire to transition, we were asked to identify critical needs and we explicitly said that the content of Phabricator needed to be retain long-term in some way. (Not the ability to update patches, comments, etc – but the ability to see old reviews and their comments, commits, etc.) So from the most recent announcement:

Regarding the status of the archive, I recommend that everyone review any of their remaining pending patches in Phabricator. Some users have reported that the archive is incomplete, and any efforts to validate the backup are appreciated. @maskray has compiled a complete list of unclosed Differentials in this CSV file , which you can search for your Phabricator username. It is possible to improve the archive in the future as the data still exists, but I don’t want to overpromise.

I think we need something stronger than hopes and wishes for improving the archive. The unclosed differentials are important for people’s current activities and is a good thing to focus on in the short term, but the full archive (is still a community requirement. This is still a critical need in Clang because of how often we need to go back in time to understand why a change was made. We can use the mailing list archives to a decent degree, at least for now, but those are known to be incomplete due to occasional mailing software glitches; the Phab reviews are the source of truth.

Do we have ideas on what the long-term plan is to restore that data in some form or fashion? Are there ways in which the community can proactively help restore that data?

7 Likes

It would be useful to have a date of last update in the csv file.
And I don’t think the csv is complete either

There are 8K+ needs revision PRs and 1.5K+ Accepted PRs in this file.

I feel like we have very little visibility over the amount of work we are losing, independently of the long term value of conversations and link.
It certainly concerns me a bit.

2 Likes

I generated the tsv file using the incomplete read-only archive, which was based on crawling the Phabricator instance between 2023-12-12 and 2023-12-21. So, yes, the tsv file is incomplete, but is the best I can provide (as the Phabricator instance is not accesible at present).

def process_html(html, diff):
    soup = BeautifulSoup(html, "html.parser")
    status = soup.select_one(".phui-tag-core").text
    title = soup.select_one(".phui-header-header").text
    author = soup.select_one(".phui-head-thing-view > strong").text
    sub = []
    for div in soup.select(".phui-handle.phui-link-person"):
        if 'commits' in div.text:
            sub.append(div.text)
    print(diff, status, title, author, ','.join(sub), sep='\t')

+1

Also thanks to @MaskRay and @rnk for doing what needed to be done. I know you both go above and beyond to prioritize the needs of the community at large, and I’m sure it weighed on you to take such a unilateral action. But the world is a dangerous place and I appreciate that you did so.

I don’t have a stake in the data loss, but having been a fly on the wall for the long turn down saga, I had mentally classified the phab instance as running on borrowed time for quite a while. It is no surprise to me that the ledge it was on crumbled before it was meant to.

2 Likes

Thanks for the recognition, I want to acknowledge, this did not go as planned.

Looking forward, I think we should prioritize:

  • Ensure that we have a complete archive. To Aaron’s point, it’s very valuable to be able to refer back to the reasoning behind past decisions, and that often happens in code review.
  • Transfer the ownership of the archive from Google to the LLVM Foundation. This is obviously a better long term home for the data at rest.

To evaluate completeness, is it just a matter of pulling the differentials by iterating from D1 D${max}? The one reported instance is D99750 which returns 404 not found. Is that the only failure mode that we’ve observed so far? It would be good to know the fraction of missing issues. That particular one can be found on llvm-commits, and it doesn’t look remarkable, like a large diff, or an issue with many comments.