Code Review Process Update

Can I just say thank you for this. I’ve been using GitHub reviews for years and never noticed the gear icon and that the most useful thing was hidden behind it.

5 Likes

Hi all, I was pointed here (first time contributor) so please feel free to point me elsewhere if this isn’t the best place for a reply. I am one of the developers on Graphite (https://graphite.dev), which is a open-source CLI / web dashboard that connects to your Github account and makes it easier to stack PRs (The Graphite workflow - Graphite Documentation), along with giving you a bunch of the niceties of Phab (incl. the review queue, macros, etc.)

I’m not sure what next steps are here, but I know that stacking on GH is a huge pain (we originally built Graphite b/c we moved from Phabricator to Github ourselves and felt this exact pain), and so I wanted to offer to be helpful - either by setting up individuals on Graphite or just offering to talk about what I’ve learned. If anyone is interested, either reply here or email me at first name at graphite.dev

6 Likes

Can you make some kind of early-access for LLVM devs to test the workflow? Ideally someone would rather not sign-up explicitly but just be a member of LLVM GitHub organization to try the features?

Absolutely - is there an email domain I can approve or a list of emails?

Not quite. It’s more about granting access to anyone from committers list (for example).

Hmm, we are updating our waitlist to support this, but won’t be done for a couple of weeks. Happy to circle back when we get there, and let anyone who wants have access now if they message me?

The bot which adds a comment describing the stack is one of the possibilities I had in mind, so it’s nice to see I’m not alone in that :slight_smile: Does Graphite’s stack support work when developer’s branches are in their own personal forks instead of the main repository?

2 Likes

More arcanist issues are piling up: arc diff fails with php 8.1 on Ubuntu 22.04 · Issue #57271 · llvm/llvm-project · GitHub.

I don’t want to be responsible for maintaining more and more patches to keep the lights on.

3 Likes

Arcanist could be deprecated (in a sense that we highly recommend people not to use it at all, and scrub every mention of it from our documentation, perhaps even the running Phabricator instance) without deprecating Phabricator itself. Git can generate the raw diffs which can be set up with an alias so you do not forget the -U99999999999, patch files can be manually downloaded (either through the browser, or via wget/curl).

(Personally, never have I ever used Arcanist. Having the PHP interpreter installed on my system always seemed wrong.)

4 Likes

The protocol used by arc isn’t equivalent to uploading a patch though: there are metadata about the git parent for example, and we’ve seen in practice arc patch succeeds when uploaded with arc and not otherwise (useful for pre merge amongst other). It also records the author better which is useful when landing a patch for someone else.

4 Likes

Just a reminder for folks who don’t otherwise know, you can use --author="Name MoreName <email@address.com>" with a git commit command to manually specify who the author is when landing a patch for someone else.

(FWIW, I’m also someone who doesn’t use arcanist, I generate patches from the command line and interact with Phab through the web UI.)

5 Likes

For me I consider arc pretty essential for the workflow I am using. It gives me a much nicer workflow without having to juggle files from remote machines, uploading etc. I was very hesitant to install PHP on my machine as well - but now I can’t even consider working without it - so it’s very annoying when it breaks.

I have heard some people having luck with moz-phab but I haven’t checked it out yet.

4 Likes

How do I find the email to associate if I am landing a patch from a non committer from phab?

1 Like

I usually ask the submitter, “What name and email address would you like me to use for patch attribution?” so I’m not guessing. After a few reviews with them (basically, at the point I stop needing to ask how to commit for them), I usually recommend they get their own commit access.

1 Like

As an infrequent contributor of relatively small patches:

I think this change is long overdue. And yet – I have a great deal of sympathy for those who review lots of patches today and have frustrations with the GH reviews. I have several gripes about the way GH does reviews myself. I would not be surprised if this terribly clever community could find technical workarounds for some of the gaps.

I think it’s a slight bit weird that the LLVM Foundation weighs in on this kind of developer process issue. But if there were a strong-willed BDFL in this community who made the same decision unilaterally I think it would be less controversial. It’s perhaps due to LLVM’s flatter hierarchy (or hierarchy-less) nature that there’s an expectation among some that the community at large would arrive at decisions like this one.

Whatever the case, I think it will be a net benefit to the LLVM community to make this change. Though you can weight my opinion appropriately: by my small and infrequent contributions.

To this (very salient) point, I’ve proposed an RFC: [RFC] Proposed Changes to Clang's Code Ownership

FWIW, I think that having more code owners will largely address my nontechnical concerns with code review velocity if switching to GitHub PRs. This discussion helped to clarify that the reason review velocity was so important to me was mostly because of the code owner situation. I think getting a larger group of folks doing more active code reviews in smaller parts of the frontend will take a lot of that velocity pressure off.

At this point, the only concerns I have are actionable technical ones that I would hope GitHub would be willing to address because they’re pretty egregious usability concerns for a code review tool, but they’ve been covered by myself and others in the thread so I won’t belabor those concerns. I can live with a tooling switch even if those concerns are not addressed, but I would also expect some minor regressions in code review output (we’ll miss some changes because they’re hidden by default, that sort of thing) as a result rather than anything earth shattering.

1 Like

Thanks Aaron, and thank you again for your work to improve the code owner situation in clang. That’s a huge step forward for the scalability of the project, and I hope other subprojects take that as inspiration!

1 Like

Specific to this point: ✩ Going Public – Phorge has recently made an announcement regarding maintenance and support, so this is a viable possibility for us to consider in order to retain our current workflow.

7 Likes

Here’s my 2 cents on reviewable.io and git spr:

We’ve been using reviewable.io for a few months on the project GitHub - llsoftsec/llsoftsecbook: Low-Level Software Security for Compiler Developers. We have not done a huge number of reviews with the tool yet, but all of us using it prefer it very much over using the native github review interface.
I haven’t explicitly checked if it fixes all of the shortcomings that have been reported on the github review interface, but we haven’t found any issues with it. It seems it is very good at making sure no review comment gets lost and it’s easy to keep track of which feedback still needs to be addressed.
All review comments made in reviewable.io also appear well on the native github PR interface.
It is recommended though to not mix writing comments in the native github UI and the reviewable.io UI on a given pull request.

I’m not sure if it’s very useful, but here is a link that shows PRs that have used reviewable and had at least a bit of forth-and-back review: Pull requests · llsoftsec/llsoftsecbook · GitHub. To see a review in the reviewable interface for a specific pull request, click on the button in the text saying “This change is Reviewable”.

Overall, without having done a detailed investigation, it seems reviewable.io solves most issues that have been reported with github PR reviews, apart from support for stacked pull requests.

I did try out the git spr project once to create a stacked pull request. I had made a small mistake in preparing the stacked pull request in my downstream repo. When I used the tool to push the stack of PRs, it ended up creating a PR for every commit in the history of the project! (Luckily I could interrupt the tool quickly when I realized something wasn’t right).
That experience makes me believe that a tool like git spr isn’t suitable for a project the scale of LLVM: it’s too easy to make a mistake using it, resulting in huge pollution in branches or pull requests in the upstream repo.

The best I can think of for doing the equivalent of “stacked” pull requests using github is to simply maintain the full stack of commits in a github fork of the LLVM repo, to let reviewers see the full stack. Then, to get commits into the main LLVM repo, start by only creating PRs on the main repo for the commits in the stack that are independent of other commits. As these PRs get merged, other commits from the fork get “unblocked” to create a PR for. It is something that some authors already do today when they have a huge stack of commits that they would like to integrate into mainline LLVM.

1 Like

The problem here is the policy that we have for new targets: We only allow merging the first commits when all reviews are approved in the initial stack.

This policy minimises the hindsight problem and avoid big reverts, big refactoring with code in tree, and heated discussions about alternatives, etc.

It also avoids the “prior art”, where people may believe some pattern is “the right way” to solve a problem just because it’s in-tree, and proliferate the heated discussions to other domains.

What we need is a tool that can create stacked PRs in tree, so we can give just the tool rights to create new branches in the main repo, and make sure the tool cleans up after they’re merged or abandoned. That’d work with Github PRs without the need of another review tool (and probably also with other tools).

Of course, if Github could do that behind the scenes (which wouldn’t be too complicated), it would be much better. But expecting anything from Github seems to be folly nowadays.

What we don’t want, really, is every user to be responsible for creating, maintaining and cleaning up branches in the main repo, especially if tools make that kind of mistake.

3 Likes