Outdated Phabricator version on reviews.llvm.org breaks Google authentication since today

r4nt · June 25, 2020, 8:12am

Mehdi, Fangrui: are you willing to take on maintenance?

Otherwise, Shoaib, the cost is currently:
~$300-350 / mo for sendgrid (300-350k emails / month)
~$2k / mo for cloud (we currently run on 1 machine O.O, plus storage & backup)

If y’all care about the workflow but don’t care about diffs in emails, we can switch to stock phab.
I can ask Phacility for a quote; stock phab comes with significantly different emails though.

Re: patch chains - perhaps we can ask github on support for that?
I think what would help is somebody providing a doc with the features that phab provides that github PR doesn’t, so we can get some consensus on what the diff is.

Zachary_Turner2 · June 25, 2020, 9:21am

I can’t really provide a doc, but i can describe what I believe to be the biggest problem.

In a GH PR, comments are associated with commit hashes. If a commit hash ceases to exist, so do all comments associated with it. The comments are quite literally destroyed and irretrievable.

What this means for LLVM is that everyone will have to completely stop using history rewriting operations. No more rebase, squash, amend, etc.

A common theme in LLVM development is carefully curating patch histories for easy reviewability. For example, someone might tell you to split a patch up into two patches and have them reviewed separately. If there are dependent changes between the two, you have to choices:

You can split this into two commits on a single branch.
You can make 2 branches A and B, with B based off A. Request a PR from A into master and B into A.

If you do 1 and someone requests changes on A, you cannot rebase -i your fixes so that they still appear in the diff for A, because that destroys comments.

If you do 2 and someone requests changes on A, you then have to merge (not rebase) those changes into B. The problem is compounded the more pieces you have to split your patch into and it becomes quite a maintenance headache.

IMO, switching to GH PRs means abandoning LLVMs philosophy on incremental development for the above reason.

One more interesting quirk of GitHub PR: it is actually impossible to make a comment on lines in a PR that aren’t changed as part of the PR (eg surrounding code). See
https://github.com/isaacs/github/issues/284 which has been open a very long time

nikic · June 25, 2020, 9:42am

I can’t really provide a doc, but i can describe what I believe to be the biggest problem.

In a GH PR, comments are associated with commit hashes. If a commit hash ceases to exist, so do all comments associated with it. The comments are quite literally destroyed and irretrievable.

Either I’m misunderstand something, or this is just blatantly incorrect. Assuming we’re talking about pull request reviews here, review comments do not get lost, regardless of how you rebase the pull request branch.

What this means for LLVM is that everyone will have to completely stop using history rewriting operations. No more rebase, squash, amend, etc.

This is also incorrect. Most GitHub projects I work on use a rebase-oriented workflow without issue, so clearly that’s possible…

The only issue in this area that I’m aware of is that you can’t easily see what changed between two revisions of a pull request if you both a) rebase onto master and b) amend commits at the same time. For review purposes it is most useful if additional changes are not squashed into existing commits, but pushed on top. The squash is best performed when landing the changes.

A common theme in LLVM development is carefully curating patch histories for easy reviewability. For example, someone might tell you to split a patch up into two patches and have them reviewed separately. If there are dependent changes between the two, you have to choices:

You can split this into two commits on a single branch.

You can make 2 branches A and B, with B based off A. Request a PR from A into master and B into A.

If you do 1 and someone requests changes on A, you cannot rebase -i your fixes so that they still appear in the diff for A, because that destroys comments.

If you do 2 and someone requests changes on A, you then have to merge (not rebase) those changes into B. The problem is compounded the more pieces you have to split your patch into and it becomes quite a maintenance headache.

This is a real issue (minus the comment about destroying comments). The option 3. here is to open a PR that includes the base changes and say “review only the last commit, the rest are dependent changes”, which is how I have seen this handled commonly. GitHub allows you to limit the review view to individual commits.

IMO, switching to GH PRs means abandoning LLVMs philosophy on incremental development for the above reason.

One more interesting quirk of GitHub PR: it is actually impossible to make a comment on lines in a PR that aren’t changed as part of the PR (eg surrounding code). See
https://github.com/isaacs/github/issues/284 which has been open a very long time

This is a real issue.

Nikita

Zachary_Turner2 · June 25, 2020, 10:04am

I can’t really provide a doc, but i can describe what I believe to be the biggest problem.

In a GH PR, comments are associated with commit hashes. If a commit hash ceases to exist, so do all comments associated with it. The comments are quite literally destroyed and irretrievable.

Either I’m misunderstand something, or this is just blatantly incorrect. Assuming we’re talking about pull request reviews here, review comments do not get lost, regardless of how you rebase the pull request branch.

Try this experiment:

Create a PR
Have someone leave a comment on a line of code. You should get an email.
Make another change locally and squash it
git push -f
Go to your email and click the link for the comment that was made in step 2.

GH should tell you that the comment does not exist.

Now that i think about it, the comment is probably still visible in the web ui by clicking “Show Outdated”. But you cannot see the code that was requested to be changed. That comment has a link associated with it, and if you click that link you just get “the commit does not exist”

reinterpretcast · June 25, 2020, 2:50pm

I can’t really provide a doc, but i can describe what I believe to be the biggest problem.

In a GH PR, comments are associated with commit hashes. If a commit hash ceases to exist, so do all comments associated with it. The comments are quite literally destroyed and irretrievable.

Either I’m misunderstand something, or this is just blatantly incorrect. Assuming we’re talking about pull request reviews here, review comments do not get lost, regardless of how you rebase the pull request branch.

Try this experiment:

Create a PR

Have someone leave a comment on a line of code. You should get an email.

Make another change locally and squash it

git push -f

Go to your email and click the link for the comment that was made in step 2.

GH should tell you that the comment does not exist.

Now that i think about it, the comment is probably still visible in the web ui by clicking “Show Outdated”. But you cannot see the code that was requested to be changed. That comment has a link associated with it, and if you click that link you just get “the commit does not exist”

In other words, the context is gone. The state that the reviewer saw is no longer necessarily retrievable (via GitHub).
Perhaps that is mitigated by the e-mail archive, which could make it possible to reconstruct that context.

erichkeane · June 25, 2020, 3:05pm

I can’t really provide a doc, but i can describe what I believe to be the biggest problem.

In a GH PR, comments are associated with commit hashes. If a commit hash ceases to exist, so do all comments associated with it. The comments are quite literally destroyed and irretrievable.

Either I'm misunderstand something, or this is just blatantly incorrect. Assuming we're talking about pull request reviews here, review comments do not get lost, regardless of how you rebase the pull request branch.
Try this experiment:

1. Create a PR
2. Have someone leave a comment on a line of code. You should get an email.
3. Make another change locally and squash it
4. git push -f
5. Go to your email and click the link for the comment that was made in step 2.

GH should tell you that the comment does not exist.

Now that i think about it, the comment is probably still visible in the web ui by clicking “Show Outdated”. But you cannot see the code that was requested to be changed. That comment has a link associated with it, and if you click that link you just get “the commit does not exist”
In other words, the context is gone. The state that the reviewer saw is no longer necessarily retrievable (via GitHub).
Perhaps that is mitigated by the e-mail archive, which could make it possible to reconstruct that context.
[Keane, Erich] Well, you get a preview of the line commented on/surrounding lines, but yes, the history disappears. That said, we explicitly discourage force-push on my downstream for exactly this reason. I don’t know if there is a way to, but it would be wonderful if we could just prevent force-pushes from happening on reviews.

nhaehnle · June 25, 2020, 3:10pm

Disagree. I usually want to be able to review the actual commits that
will be landed, not some imagined combination that may or may not line
up with what the author will end up squashing things to.

Remember, the goal is for authors to be able to have a review of
multiple commits in a patch series, i.e.

Patch 1
Patch 2

... and then make changes to both of those patches based on reviews.
If you just push delta commits, then in the best case, if people
properly use `git commit --squash`, you get:

   Patch 1
   Patch 2
   squash! Patch 1
   squash! Patch 2

or something along those lines, and things just become really messy to
keep track of. The Phabricator Stacks UI isn't ideal for this, but it
tends to work quite a bit better than what GitHub has.

Cheers,
Nicolai

jdoerfert · June 25, 2020, 3:23pm

When I looked into this possibility (GitHub PR) when we merged MLIR in the monorepo: just Herald rules was already a blocker and I couldn't find a replacement readily available on GitHub. The handling of stack of revisions on GitHub was another problem mentioned.

Fair enough - I haven’t investigated the technical issues with adopting GitHub PRs for LLVM. My experience with the Swift community has been extremely positive though, I’d be surprised if there is something particularly crazy that only LLVM needs that GitHub hasn’t encountered. That said, again, I don’t have any practical experience with this part of the stack,

I am very much relying on Herald rules to filter patches (=PRs).

We need some replacement, IMHO, as we have many more patches/commits than swift and the mono-repo is more diverse (I would imagine).

~ Johannes

mehdi_amini · June 25, 2020, 4:28pm

Mehdi, Fangrui: are you willing to take on maintenance?

Sure, let’s work out a transition plan offline!

Otherwise, Shoaib, the cost is currently:
~$300-350 / mo for sendgrid (300-350k emails / month)
~$2k / mo for cloud (we currently run on 1 machine O.O, plus storage & backup)

If y’all care about the workflow but don’t care about diffs in emails, we can switch to stock phab.
I can ask Phacility for a quote; stock phab comes with significantly different emails though.

I think it is a flat $12k/year above 50 users? (see: https://www.phacility.com/pricing/ )

Re: patch chains - perhaps we can ask github on support for that?
I think what would help is somebody providing a doc with the features that phab provides that github PR doesn’t, so we can get some consensus on what the diff is.

Indeed, I’ll try to start a doc on this.

jyknight · June 25, 2020, 4:33pm

When I looked into this possibility (GitHub PR) when we merged MLIR in the monorepo: just Herald rules was already a blocker and I couldn't find a replacement readily available on GitHub. The handling of stack of revisions on GitHub was another problem mentioned.

Fair enough - I haven’t investigated the technical issues with adopting GitHub PRs for LLVM.  My experience with the Swift community has been extremely positive though, I’d be surprised if there is something particularly crazy that only LLVM needs that GitHub hasn’t encountered.  That said, again, I don’t have any practical experience with this part of the stack,

I am very much relying on Herald rules to filter patches (=PRs).

We need some replacement, IMHO, as we have many more patches/commits than swift and the mono-repo is more diverse (I would imagine).

Yes, but this part of the problem is trivial to solve. We can add a github workflow which triggers on pull-requests being uploaded or modified, which will @ people who ask to be, based on a configuration file matching commit paths/descriptions. There may even be something existing which does this, but even if not, it should be no more than a week to fully implement.

If resolving this would actually let us migrate to GitHub, I’d volunteer to implement it myself, but it seems like there’s other issues which are less trivial to solve that might also block it.

jdoerfert · June 25, 2020, 4:33pm

When I looked into this possibility (GitHub PR) when we merged MLIR in the monorepo: just Herald rules was already a blocker and I couldn't find a replacement readily available on GitHub. The handling of stack of revisions on GitHub was another problem mentioned.

Fair enough - I haven’t investigated the technical issues with adopting GitHub PRs for LLVM. My experience with the Swift community has been extremely positive though, I’d be surprised if there is something particularly crazy that only LLVM needs that GitHub hasn’t encountered. That said, again, I don’t have any practical experience with this part of the stack,

I am very much relying on Herald rules to filter patches (=PRs).

We need some replacement, IMHO, as we have many more patches/commits than
swift and the mono-repo is more diverse (I would imagine).

Yes, but this part of the problem is trivial to solve. We can add a github
workflow which triggers on pull-requests being uploaded or modified, which
will @ people who ask to be, based on a configuration file matching commit
paths/descriptions. There may even be something existing which does this,
but even if not, it should be no more than a week to fully implement.

That would be great

reinterpretcast · June 25, 2020, 9:38pm

Thanks. Some things that are high on my list:
Phabricator has persistent per-user inline comment collapsing.

Phabricator can show all inline comments over the lifetime of the patch in-line (even if the placement is not perfect) in the full diff view.

jh7370 · June 26, 2020, 7:53am

Relatedly, Phabricator doesn’t stop you continuing a comment chain for reasons I have yet to follow, which Github sometimes does.

Some others:

I believe Github also doesn’t have an easy way to respond to multiple comments simultaneously, if you are not in “review” mode, (which is always the case if you are replying to out-of-line comments).
Typically in our Phabricator, the description and summary of a patch is considered to be roughly what goes in the final commit message (same with Github). However, updating said commit message isn’t possible to my knowledge without force-pushing in Github (which as discussed elsewhere in the thread causes other problems).
Marking comments as “Done” in Phabricator does not auto-hide them, whereas in Github marking a comment as “Resolved” does. Spoken from experience, this leads to situations in Github where a developer thinks they’ve resolved a reviewers comments, marks them as Resolved, but actually they weren’t. The reviewer in turn then has to browse the comments in the summary page, to see if they have any unaddressed comments that were supposedly resolved, which given the limited context available there is a non-trivial task sometimes.

There are probably others, but those are the ones not already mentioned that I can come up with so far.

P.S. Thanks Mehdi/Fangrui for stepping up! Whilst I’m not opposed to working with Github towards getting the feature parity with Phabricator sorted, I don’t want to switch until the two are on a par with each other, which in my opinion is not yet, so being forced to do so prematurely due to lack of maintainers would have made me very sad (P.P.S thank you Manuel for Phabricator in the first place!).

mjacob · June 26, 2020, 8:53am

Relatedly, Phabricator doesn't stop you continuing a comment chain for
reasons I have yet to follow, which Github sometimes does.

Some others:
1) I believe Github also doesn't have an easy way to respond to multiple
comments simultaneously, if you are not in "review" mode, (which is always
the case if you are replying to out-of-line comments).
2) Typically in our Phabricator, the description and summary of a patch is
considered to be roughly what goes in the final commit message (same with
Github). However, updating said commit message isn't possible to my
knowledge without force-pushing in Github (which as discussed elsewhere in
the thread causes other problems).
3) Marking comments as "Done" in Phabricator does not auto-hide them,
whereas in Github marking a comment as "Resolved" does. Spoken from
experience, this leads to situations in Github where a developer thinks
they've resolved a reviewers comments, marks them as Resolved, but actually
they weren't. The reviewer in turn then has to browse the comments in the
summary page, to see if they have any unaddressed comments that were
supposedly resolved, which given the limited context available there is a
non-trivial task sometimes.

In practice, I’ve found that these two functionalities are used in different ways. In Phabricator, the implementer marks the comments as done. In GitHub, the reviewer marks comments are resolved. Actually, I would like to have both, but I’m not aware of a review software that has both.

jh7370 · June 26, 2020, 11:36am

Relatedly, Phabricator doesn’t stop you continuing a comment chain for
reasons I have yet to follow, which Github sometimes does.

Some others:

I believe Github also doesn’t have an easy way to respond to
multiple
comments simultaneously, if you are not in “review” mode, (which is
always
the case if you are replying to out-of-line comments).

Typically in our Phabricator, the description and summary of a patch
is
considered to be roughly what goes in the final commit message (same
with
Github). However, updating said commit message isn’t possible to my
knowledge without force-pushing in Github (which as discussed elsewhere
in
the thread causes other problems).

Marking comments as “Done” in Phabricator does not auto-hide them,
whereas in Github marking a comment as “Resolved” does. Spoken from
experience, this leads to situations in Github where a developer thinks
they’ve resolved a reviewers comments, marks them as Resolved, but
actually
they weren’t. The reviewer in turn then has to browse the comments in
the
summary page, to see if they have any unaddressed comments that were
supposedly resolved, which given the limited context available there is
a
non-trivial task sometimes.

In practice, I’ve found that these two functionalities are used in
different ways. In Phabricator, the implementer marks the comments as
done. In GitHub, the reviewer marks comments are resolved. Actually, I
would like to have both, but I’m not aware of a review software that has
both.

That might be a cultural thing then, and would need documenting under some sort of “Best Practices when Reviewing” doc, since I ran into this exact problem in our downstream Github repo, where we use PRs for reviews.

Zachary_Turner2 · June 26, 2020, 2:25pm

Typically in our Phabricator, the description and summary of a patch is considered to be roughly what goes in the final commit message (same with Github). However, updating said commit message isn’t possible to my knowledge without force-pushing in Github (which as discussed elsewhere in the thread causes other problems).

you can definitely edit your PR description in GH. The thing is that your commit message and PR description are mostly unrelated.

Your PR description is initialized from the commit message of the first patch you push. But you can edit it via the GH UI by clicking the three dots.

When you go to finally merge your PR, you then have another chance to edit it (and you usually have to since it joins all commit messages together into a single string, whereas I almost always just want the original PR description)

Nathan_Froyd · June 26, 2020, 3:11pm

It is not quite a full replacement for Herald rules, but Github supports labeling PRs depending on what files they modify:

https://github.com/actions/labeler

You can see it in action in the wasmtime repo, with its labels:

https://github.com/bytecodealliance/wasmtime/blob/main/.github/labeler.yml

and then using the subscribe-to-label action:

https://github.com/bytecodealliance/subscribe-to-label-action

people can get automatically CC’d to whatever areas they like. The only downside is that information lives in the repo itself, so people who want to be notified have to submit “fixes” to the repo.

I know Herald lets you automatically add reviewers for given paths; it seems like that functionality ought to be supportable through Github actions as well.

-Nathan

Zachary_Turner2 · June 26, 2020, 6:25pm

It is not quite a full replacement for Herald rules, but Github supports labeling PRs depending on what files they modify:

https://github.com/actions/labeler

You can see it in action in the wasmtime repo, with its labels:

https://github.com/bytecodealliance/wasmtime/blob/main/.github/labeler.yml

and then using the subscribe-to-label action:

https://github.com/bytecodealliance/subscribe-to-label-action

people can get automatically CC’d to whatever areas they like. The only downside is that information lives in the repo itself, so people who want to be notified have to submit “fixes” to the repo.

I know Herald lets you automatically add reviewers for given paths; it seems like that functionality ought to be supportable through Github actions as well.

From the documentation:

Danila_Malyutin · June 26, 2020, 7:41pm

For the record,
Phabricator is even worse for that purpose as it doesn't really guarantee that what's been reviewed is actually what will be committed.
In fact, I've got bit by that before svn->git transition (error in reapplying the patch on svn for submission).
GitHub PRs solve this problem. In fact, there wouldn't be any need for the "getting commit rights" process and all those pings from the new contributors, since it'd be easy to make PR mergeable by anyone as long once they are approved.
Although some of the friction disappeared after the move to git and after I discovered moz-phab.

mehdi_amini · June 27, 2020, 12:20am

For the record,
Phabricator is even worse for that purpose as it doesn’t really guarantee that what’s been reviewed is actually what will be committed.
In fact, I’ve got bit by that before svn->git transition (error in reapplying the patch on svn for submission).
GitHub PRs solve this problem. In fact, there wouldn’t be any need for the “getting commit rights” process and all those pings from the new contributors, since it’d be easy to make PR mergeable by anyone as long once they are approved.

I think merging a PR on GitHub requires permission on the repo: a new contributor would require someone with “commit rights” to click on the merge button for them.

Topic		Replies	Views
Phabricator Creator Pulling the Plug LLVM Dev List Archives	24	580	October 1, 2021
Phabricator -> GitHub PRs? LLVM Dev List Archives	162	2018	September 16, 2020
Update on GitHub pull requests Code Review	188	18658	October 4, 2024
Is LLVM going to use GitHub PRs as the main review tool? Beginners	16	1065	May 5, 2022
Code Review Process Update LLVM Project	131	9393	February 3, 2024

Outdated Phabricator version on reviews.llvm.org breaks Google authentication since today

Related topics