Outdated Phabricator version on reviews.llvm.org breaks Google authentication since today

Mehdi, Fangrui: are you willing to take on maintenance?

Otherwise, Shoaib, the cost is currently:
~$300-350 / mo for sendgrid (300-350k emails / month)
~$2k / mo for cloud (we currently run on 1 machine O.O, plus storage & backup)

If y’all care about the workflow but don’t care about diffs in emails, we can switch to stock phab.
I can ask Phacility for a quote; stock phab comes with significantly different emails though.

Re: patch chains - perhaps we can ask github on support for that?
I think what would help is somebody providing a doc with the features that phab provides that github PR doesn’t, so we can get some consensus on what the diff is.

I can’t really provide a doc, but i can describe what I believe to be the biggest problem.

In a GH PR, comments are associated with commit hashes. If a commit hash ceases to exist, so do all comments associated with it. The comments are quite literally destroyed and irretrievable.

What this means for LLVM is that everyone will have to completely stop using history rewriting operations. No more rebase, squash, amend, etc.

A common theme in LLVM development is carefully curating patch histories for easy reviewability. For example, someone might tell you to split a patch up into two patches and have them reviewed separately. If there are dependent changes between the two, you have to choices:

  1. You can split this into two commits on a single branch.
  2. You can make 2 branches A and B, with B based off A. Request a PR from A into master and B into A.

If you do 1 and someone requests changes on A, you cannot rebase -i your fixes so that they still appear in the diff for A, because that destroys comments.

If you do 2 and someone requests changes on A, you then have to merge (not rebase) those changes into B. The problem is compounded the more pieces you have to split your patch into and it becomes quite a maintenance headache.

IMO, switching to GH PRs means abandoning LLVMs philosophy on incremental development for the above reason.

One more interesting quirk of GitHub PR: it is actually impossible to make a comment on lines in a PR that aren’t changed as part of the PR (eg surrounding code). See
https://github.com/isaacs/github/issues/284 which has been open a very long time

I can’t really provide a doc, but i can describe what I believe to be the biggest problem.

In a GH PR, comments are associated with commit hashes. If a commit hash ceases to exist, so do all comments associated with it. The comments are quite literally destroyed and irretrievable.

Either I’m misunderstand something, or this is just blatantly incorrect. Assuming we’re talking about pull request reviews here, review comments do not get lost, regardless of how you rebase the pull request branch.

What this means for LLVM is that everyone will have to completely stop using history rewriting operations. No more rebase, squash, amend, etc.

This is also incorrect. Most GitHub projects I work on use a rebase-oriented workflow without issue, so clearly that’s possible…

The only issue in this area that I’m aware of is that you can’t easily see what changed between two revisions of a pull request if you both a) rebase onto master and b) amend commits at the same time. For review purposes it is most useful if additional changes are not squashed into existing commits, but pushed on top. The squash is best performed when landing the changes.

A common theme in LLVM development is carefully curating patch histories for easy reviewability. For example, someone might tell you to split a patch up into two patches and have them reviewed separately. If there are dependent changes between the two, you have to choices:

  1. You can split this into two commits on a single branch.
  2. You can make 2 branches A and B, with B based off A. Request a PR from A into master and B into A.

If you do 1 and someone requests changes on A, you cannot rebase -i your fixes so that they still appear in the diff for A, because that destroys comments.

If you do 2 and someone requests changes on A, you then have to merge (not rebase) those changes into B. The problem is compounded the more pieces you have to split your patch into and it becomes quite a maintenance headache.

This is a real issue (minus the comment about destroying comments). The option 3. here is to open a PR that includes the base changes and say “review only the last commit, the rest are dependent changes”, which is how I have seen this handled commonly. GitHub allows you to limit the review view to individual commits.

IMO, switching to GH PRs means abandoning LLVMs philosophy on incremental development for the above reason.

One more interesting quirk of GitHub PR: it is actually impossible to make a comment on lines in a PR that aren’t changed as part of the PR (eg surrounding code). See
https://github.com/isaacs/github/issues/284 which has been open a very long time

This is a real issue.

Nikita

I can’t really provide a doc, but i can describe what I believe to be the biggest problem.

In a GH PR, comments are associated with commit hashes. If a commit hash ceases to exist, so do all comments associated with it. The comments are quite literally destroyed and irretrievable.

Either I’m misunderstand something, or this is just blatantly incorrect. Assuming we’re talking about pull request reviews here, review comments do not get lost, regardless of how you rebase the pull request branch.

Try this experiment:

  1. Create a PR
  2. Have someone leave a comment on a line of code. You should get an email.
  3. Make another change locally and squash it
  4. git push -f
  5. Go to your email and click the link for the comment that was made in step 2.

GH should tell you that the comment does not exist.

Now that i think about it, the comment is probably still visible in the web ui by clicking “Show Outdated”. But you cannot see the code that was requested to be changed. That comment has a link associated with it, and if you click that link you just get “the commit does not exist”

I can’t really provide a doc, but i can describe what I believe to be the biggest problem.

In a GH PR, comments are associated with commit hashes. If a commit hash ceases to exist, so do all comments associated with it. The comments are quite literally destroyed and irretrievable.

Either I’m misunderstand something, or this is just blatantly incorrect. Assuming we’re talking about pull request reviews here, review comments do not get lost, regardless of how you rebase the pull request branch.

Try this experiment:

  1. Create a PR
  2. Have someone leave a comment on a line of code. You should get an email.
  3. Make another change locally and squash it
  4. git push -f
  5. Go to your email and click the link for the comment that was made in step 2.

GH should tell you that the comment does not exist.

Now that i think about it, the comment is probably still visible in the web ui by clicking “Show Outdated”. But you cannot see the code that was requested to be changed. That comment has a link associated with it, and if you click that link you just get “the commit does not exist”

In other words, the context is gone. The state that the reviewer saw is no longer necessarily retrievable (via GitHub).
Perhaps that is mitigated by the e-mail archive, which could make it possible to reconstruct that context.

I can’t really provide a doc, but i can describe what I believe to be the biggest problem.

In a GH PR, comments are associated with commit hashes. If a commit hash ceases to exist, so do all comments associated with it. The comments are quite literally destroyed and irretrievable.

Either I'm misunderstand something, or this is just blatantly incorrect. Assuming we're talking about pull request reviews here, review comments do not get lost, regardless of how you rebase the pull request branch.
Try this experiment:

1. Create a PR
2. Have someone leave a comment on a line of code. You should get an email.
3. Make another change locally and squash it
4. git push -f
5. Go to your email and click the link for the comment that was made in step 2.

GH should tell you that the comment does not exist.

Now that i think about it, the comment is probably still visible in the web ui by clicking “Show Outdated”. But you cannot see the code that was requested to be changed. That comment has a link associated with it, and if you click that link you just get “the commit does not exist”
In other words, the context is gone. The state that the reviewer saw is no longer necessarily retrievable (via GitHub).
Perhaps that is mitigated by the e-mail archive, which could make it possible to reconstruct that context.
[Keane, Erich] Well, you get a preview of the line commented on/surrounding lines, but yes, the history disappears. That said, we explicitly discourage force-push on my downstream for exactly this reason. I don’t know if there is a way to, but it would be wonderful if we could just prevent force-pushes from happening on reviews.

Disagree. I usually want to be able to review the actual commits that
will be landed, not some imagined combination that may or may not line
up with what the author will end up squashing things to.

Remember, the goal is for authors to be able to have a review of
multiple commits in a patch series, i.e.

   Patch 1
   Patch 2

... and then make changes to both of those patches based on reviews.
If you just push delta commits, then in the best case, if people
properly use `git commit --squash`, you get:

   Patch 1
   Patch 2
   squash! Patch 1
   squash! Patch 2

or something along those lines, and things just become really messy to
keep track of. The Phabricator Stacks UI isn't ideal for this, but it
tends to work quite a bit better than what GitHub has.

Cheers,
Nicolai

When I looked into this possibility (GitHub PR) when we merged MLIR in the monorepo: just Herald rules was already a blocker and I couldn't find a replacement readily available on GitHub. The handling of stack of revisions on GitHub was another problem mentioned.

Fair enough - I haven’t investigated the technical issues with adopting GitHub PRs for LLVM. My experience with the Swift community has been extremely positive though, I’d be surprised if there is something particularly crazy that only LLVM needs that GitHub hasn’t encountered. That said, again, I don’t have any practical experience with this part of the stack,

I am very much relying on Herald rules to filter patches (=PRs).

We need some replacement, IMHO, as we have many more patches/commits than swift and the mono-repo is more diverse (I would imagine).

~ Johannes

Mehdi, Fangrui: are you willing to take on maintenance?

Sure, let’s work out a transition plan offline!

Otherwise, Shoaib, the cost is currently:
~$300-350 / mo for sendgrid (300-350k emails / month)
~$2k / mo for cloud (we currently run on 1 machine O.O, plus storage & backup)

If y’all care about the workflow but don’t care about diffs in emails, we can switch to stock phab.
I can ask Phacility for a quote; stock phab comes with significantly different emails though.

I think it is a flat $12k/year above 50 users? (see: https://www.phacility.com/pricing/ )

Re: patch chains - perhaps we can ask github on support for that?
I think what would help is somebody providing a doc with the features that phab provides that github PR doesn’t, so we can get some consensus on what the diff is.

Indeed, I’ll try to start a doc on this.

When I looked into this possibility (GitHub PR) when we merged MLIR in the monorepo: just Herald rules was already a blocker and I couldn't find a replacement readily available on GitHub. The handling of stack of revisions on GitHub was another problem mentioned.

Fair enough - I haven’t investigated the technical issues with adopting GitHub PRs for LLVM.  My experience with the Swift community has been extremely positive though, I’d be surprised if there is something particularly crazy that only LLVM needs that GitHub hasn’t encountered.  That said, again, I don’t have any practical experience with this part of the stack,

I am very much relying on Herald rules to filter patches (=PRs).

We need some replacement, IMHO, as we have many more patches/commits than swift and the mono-repo is more diverse (I would imagine).

Yes, but this part of the problem is trivial to solve. We can add a github workflow which triggers on pull-requests being uploaded or modified, which will @ people who ask to be, based on a configuration file matching commit paths/descriptions. There may even be something existing which does this, but even if not, it should be no more than a week to fully implement.

If resolving this would actually let us migrate to GitHub, I’d volunteer to implement it myself, but it seems like there’s other issues which are less trivial to solve that might also block it.

When I looked into this possibility (GitHub PR) when we merged MLIR in the monorepo: just Herald rules was already a blocker and I couldn't find a replacement readily available on GitHub. The handling of stack of revisions on GitHub was another problem mentioned.

Fair enough - I haven’t investigated the technical issues with adopting GitHub PRs for LLVM. My experience with the Swift community has been extremely positive though, I’d be surprised if there is something particularly crazy that only LLVM needs that GitHub hasn’t encountered. That said, again, I don’t have any practical experience with this part of the stack,

I am very much relying on Herald rules to filter patches (=PRs).

We need some replacement, IMHO, as we have many more patches/commits than
swift and the mono-repo is more diverse (I would imagine).

Yes, but this part of the problem is trivial to solve. We can add a github
workflow which triggers on pull-requests being uploaded or modified, which
will @ people who ask to be, based on a configuration file matching commit
paths/descriptions. There may even be something existing which does this,
but even if not, it should be no more than a week to fully implement.

That would be great :slight_smile:

Thanks. Some things that are high on my list:
Phabricator has persistent per-user inline comment collapsing.

Phabricator can show all inline comments over the lifetime of the patch in-line (even if the placement is not perfect) in the full diff view.

Relatedly, Phabricator doesn’t stop you continuing a comment chain for reasons I have yet to follow, which Github sometimes does.

Some others:

  1. I believe Github also doesn’t have an easy way to respond to multiple comments simultaneously, if you are not in “review” mode, (which is always the case if you are replying to out-of-line comments).

  2. Typically in our Phabricator, the description and summary of a patch is considered to be roughly what goes in the final commit message (same with Github). However, updating said commit message isn’t possible to my knowledge without force-pushing in Github (which as discussed elsewhere in the thread causes other problems).

  3. Marking comments as “Done” in Phabricator does not auto-hide them, whereas in Github marking a comment as “Resolved” does. Spoken from experience, this leads to situations in Github where a developer thinks they’ve resolved a reviewers comments, marks them as Resolved, but actually they weren’t. The reviewer in turn then has to browse the comments in the summary page, to see if they have any unaddressed comments that were supposedly resolved, which given the limited context available there is a non-trivial task sometimes.

There are probably others, but those are the ones not already mentioned that I can come up with so far.

P.S. Thanks Mehdi/Fangrui for stepping up! Whilst I’m not opposed to working with Github towards getting the feature parity with Phabricator sorted, I don’t want to switch until the two are on a par with each other, which in my opinion is not yet, so being forced to do so prematurely due to lack of maintainers would have made me very sad (P.P.S thank you Manuel for Phabricator in the first place!).

Relatedly, Phabricator doesn't stop you continuing a comment chain for
reasons I have yet to follow, which Github sometimes does.

Some others:
1) I believe Github also doesn't have an easy way to respond to multiple
comments simultaneously, if you are not in "review" mode, (which is always
the case if you are replying to out-of-line comments).
2) Typically in our Phabricator, the description and summary of a patch is
considered to be roughly what goes in the final commit message (same with
Github). However, updating said commit message isn't possible to my
knowledge without force-pushing in Github (which as discussed elsewhere in
the thread causes other problems).
3) Marking comments as "Done" in Phabricator does not auto-hide them,
whereas in Github marking a comment as "Resolved" does. Spoken from
experience, this leads to situations in Github where a developer thinks
they've resolved a reviewers comments, marks them as Resolved, but actually
they weren't. The reviewer in turn then has to browse the comments in the
summary page, to see if they have any unaddressed comments that were
supposedly resolved, which given the limited context available there is a
non-trivial task sometimes.

In practice, I’ve found that these two functionalities are used in different ways. In Phabricator, the implementer marks the comments as done. In GitHub, the reviewer marks comments are resolved. Actually, I would like to have both, but I’m not aware of a review software that has both.

Relatedly, Phabricator doesn’t stop you continuing a comment chain for
reasons I have yet to follow, which Github sometimes does.

Some others:

  1. I believe Github also doesn’t have an easy way to respond to
    multiple
    comments simultaneously, if you are not in “review” mode, (which is
    always
    the case if you are replying to out-of-line comments).
  2. Typically in our Phabricator, the description and summary of a patch
    is
    considered to be roughly what goes in the final commit message (same
    with
    Github). However, updating said commit message isn’t possible to my
    knowledge without force-pushing in Github (which as discussed elsewhere
    in
    the thread causes other problems).
  3. Marking comments as “Done” in Phabricator does not auto-hide them,
    whereas in Github marking a comment as “Resolved” does. Spoken from
    experience, this leads to situations in Github where a developer thinks
    they’ve resolved a reviewers comments, marks them as Resolved, but
    actually
    they weren’t. The reviewer in turn then has to browse the comments in
    the
    summary page, to see if they have any unaddressed comments that were
    supposedly resolved, which given the limited context available there is
    a
    non-trivial task sometimes.

In practice, I’ve found that these two functionalities are used in
different ways. In Phabricator, the implementer marks the comments as
done. In GitHub, the reviewer marks comments are resolved. Actually, I
would like to have both, but I’m not aware of a review software that has
both.

That might be a cultural thing then, and would need documenting under some sort of “Best Practices when Reviewing” doc, since I ran into this exact problem in our downstream Github repo, where we use PRs for reviews.

  1. Typically in our Phabricator, the description and summary of a patch is considered to be roughly what goes in the final commit message (same with Github). However, updating said commit message isn’t possible to my knowledge without force-pushing in Github (which as discussed elsewhere in the thread causes other problems).

you can definitely edit your PR description in GH. The thing is that your commit message and PR description are mostly unrelated.

Your PR description is initialized from the commit message of the first patch you push. But you can edit it via the GH UI by clicking the three dots.

When you go to finally merge your PR, you then have another chance to edit it (and you usually have to since it joins all commit messages together into a single string, whereas I almost always just want the original PR description)

It is not quite a full replacement for Herald rules, but Github supports labeling PRs depending on what files they modify:

https://github.com/actions/labeler

You can see it in action in the wasmtime repo, with its labels:

https://github.com/bytecodealliance/wasmtime/blob/main/.github/labeler.yml

and then using the subscribe-to-label action:

https://github.com/bytecodealliance/subscribe-to-label-action

people can get automatically CC’d to whatever areas they like. The only downside is that information lives in the repo itself, so people who want to be notified have to submit “fixes” to the repo.

I know Herald lets you automatically add reviewers for given paths; it seems like that functionality ought to be supportable through Github actions as well.

-Nathan

It is not quite a full replacement for Herald rules, but Github supports labeling PRs depending on what files they modify:

https://github.com/actions/labeler

You can see it in action in the wasmtime repo, with its labels:

https://github.com/bytecodealliance/wasmtime/blob/main/.github/labeler.yml

and then using the subscribe-to-label action:

https://github.com/bytecodealliance/subscribe-to-label-action

people can get automatically CC’d to whatever areas they like. The only downside is that information lives in the repo itself, so people who want to be notified have to submit “fixes” to the repo.

I know Herald lets you automatically add reviewers for given paths; it seems like that functionality ought to be supportable through Github actions as well.

From the documentation:

For the record,
Phabricator is even worse for that purpose as it doesn't really guarantee that what's been reviewed is actually what will be committed.
In fact, I've got bit by that before svn->git transition (error in reapplying the patch on svn for submission).
GitHub PRs solve this problem. In fact, there wouldn't be any need for the "getting commit rights" process and all those pings from the new contributors, since it'd be easy to make PR mergeable by anyone as long once they are approved.
Although some of the friction disappeared after the move to git and after I discovered moz-phab.

For the record,
Phabricator is even worse for that purpose as it doesn’t really guarantee that what’s been reviewed is actually what will be committed.
In fact, I’ve got bit by that before svn->git transition (error in reapplying the patch on svn for submission).
GitHub PRs solve this problem. In fact, there wouldn’t be any need for the “getting commit rights” process and all those pings from the new contributors, since it’d be easy to make PR mergeable by anyone as long once they are approved.

I think merging a PR on GitHub requires permission on the repo: a new contributor would require someone with “commit rights” to click on the merge button for them.