RFC: Encouraging Reviewers to Merge on Behalf of PR Authors without commit access

This thread is to discuss issues related to, but not part of RFC: Commit Access Criteria.

The current state of things is that you do not need write permissions, push permissions or commit access (some of which aren’t quite the same thing but are for this discussion’s purposes) to contribute code to LLVM.

If you do not have the ability to merge your own Pull Request, someone with that ability can merge it on your behalf.

See the very end of Contributing to LLVM — LLVM 20.0.0git documentation, which itself should be made more obvious IMO.

So one aspect of this is how do authors know that they can ask for a merge on behalf?

We can break out that statement in the contribution guide and make it harder to miss.

The other problem is, how do reviewers know whether the author needs the PR to be merged on their behalf?

We have a stand off here, exacerbated by the fact that, at least in my experience, the LLVM way has not been to merge someone’s PR for them, in case they still had work they wanted to do on it. This worked fine when we all had to have commit access to contribute, but now less so.

(also this was never documented, it’s just a habit I picked up, rapidly becoming a bad one)

I have noticed in other GitHub hosted projects that once reviewers have approved and CI passes, maintainers generally merge things without asking any more questions. So perhaps we would move toward that model more explicitly.

There is a risk there of landing PRs that blow up builds, while the author is away from their desk. If we stick to reverting to green, this is less of a problem. They return to a land and a revert, no great harm done. I would understand reviewers not merging on behalf when they are about to stop work though, as someone has to push that revert.

There is also the issue of pushing someone’s PR and the buildbot emails going to them not you, but anyone can revert a PR easily from the GitHub web interface, which mitigates that as long as everyone feels comfortable doing so.

I have tried to address the fact that reviewers don’t know they can, or should, merge on behalf in two ways:

  1. In a comment on the PR after a review is approved but the author doesn’t have commit access. This turned out not to work because the GitHub event that this approval creates, cannot write to the repository, see Revert "[GitHub][workflows] Ask reviewers to merge PRs when author cannot" by DavidSpickett · Pull Request #81722 · llvm/llvm-project · GitHub.

If anyone knows GitHub better than I and can prove me wrong on that, that would be wonderful. Or if anyone has contacts there who can tell us whether we’re barking up the wrong tree entirely.

See Elevante GITHUB_TOKEN permissions in pull_request_review workflow · community · Discussion #26651 · GitHub and `pull_request_review` event doesn't allow setting token permissions · community · Discussion #55940 · GitHub.

  1. Adding a label to all new PRs from authors without commit access - [GitHub] Add workflow to check author's commit access on new PRs by DavidSpickett · Pull Request #123593 · llvm/llvm-project · GitHub. I’m 99% sure this can work, because we already label and comment on new PRs.

This has the advantage of always being visible in the labels list, and you can do searches based on it.

It is however, not done at the time of approval, so someone will have to see the label later and know what it means.

So, do people think there are other issues to tackle here, do you have other ideas for how automation should work for this, if at all?

@tstellar you suggested a new thread for this, so feel free to reply with your own agenda.

5 Likes

@MacDue This answers what you asked on the other thread. If it were possible I agree that it would be the better way to do this.

I was thinking maybe that an agent external to GitHub could do this, given that it would act like any other user, and users can write comments. But I have no experience with this style of automation.

I think the Rust community’s bot might work this way, @nikic do you recall if they do anything like this and how they implemented it?

1 Like

Thanks for starting this discussion, David!

Personally, I’d prefer to continue merging my own patches :slightly_smiling_face:.

Timing is one concern, as you mentioned:

In addition to that, some of us might want to integrate our LLVM patches into downstream projects before landing to double-check things.

Another point for me is GitHub’s UI for merging patches - it defaults to using the PR summary as the commit message, which I find less than ideal (can this be re-configured?). I almost always manually edit my commit messages before landing for a few reasons:

  • After the review process, the PR summary is often no longer accurate.
  • During review, I sometimes realize that the summary needs improvement or clarification.

When merging my own patches, I can ensure the commit message is precise. But when merging patches for others, I usually stick with GitHub’s default to preserve what the author created, which means the author loses control over the final commit message.

That said, +1 for better advertising the option to have others merge patches for you - it’s definitely a useful flexibility to have.

-Andrzej

2 Likes

If I don’t recognize the name, I’ve been checking if the committer is a member of the llvm organization. If they are, I haven’t been pushing and trying to remember to push if they are not.

One irritation I do have is with respect to waiting for the testing to complete before merging. Once I’ve completed my review, I want to get the PR out of my inbox. However if I’m expected to merge it and the tests haven’t completed, I still need to manually intervene. There is no merge-when-tests-pass option, and no GitHub notification once the prechecks have completed

2 Likes

Thank you for looking for ways to improve this – I’ve run into this problem numerous times because GitHub does not make it easy to know whether someone can merge their own PRs. I usually resort to hovering the mouse over the author’s avatar to see if they’re in the llvm-committers role, but it is super easy to forget to do that.

That’s unfortunate as this is the workflow that I think would be the easiest on reviewers and patch authors.

Would it be possible to set up an automated job that looks for this label and then adds a comment (and removes the label)?

I have clarified the title, I’m thinking of those who don’t have the option. But no harm in reiterating this, and it should be clear in any potential policies.

And this applies regardless of the author’s access. So we could say that the comment and/or label tells reviewers to ask the author whether it should be merged, not that it should immediately be merged.

It can but I think the reason this was chosen was because the squashed messages of N commits rarely makes any sense. Though it does make it clear that some editing needs to be done.

I don’t have a strong opinion on that.

I have seen people ask the author to update the PR description before they do the merge so perhaps that’s what we add to this. If the PR looks ready to go, ask the author to update the PR description and confirm whether there’s any reason to delay.

I would think there is an event we can hook to do something on tests finishing. Perhaps it’s more tricky with Buildkite but GitHub actions is the future and I’d hope that would be better.

I suppose other projects handle this with a merge queue, which is a larger discussion I think, there has some been some talk of gatekeeper bots and such elsewhere.

I think it will be easier to address things like this once we put all the annoyances together, maybe it will form some logical workflow :unicorn:

I had no idea! Great tip.

It could look for the label and approved and any other properties we want. You would have a delay between the review approval and the “please merge on behalf if ready” comment", but it wouldn’t be days like it can be now.

And to @arsenm 's point, this comment would be a notification to reviewers.

Maybe there are reasons to keep the label, tracking PRs that have been stagnant but approved perhaps, but removing it is simpler overall. Keep it simple and see how it goes.

Aside from the best but not possible approach, this seems like the new best approach.

Please don’t change the current default behaviour. It was chosen with good reason. Apart from anything else, the commit message (i.e. the PR description) should be reviewed as part of the review process, to ensure it is useful for posterity. Of course, there’s also the problem outlined elsewhere about squashed commits etc.

Personally, I think reviewers need to get into the habit of checking if the user has commit access when approving and then should ask if the PR author would like the reviewer to merge. This ensures that the PR is actually ready (description updated, any local testing done etc), because it gives the author the opportunity to do all this before responding. I do think that where user’s have the ability to merge their own PRs, they should be left to do so at their own pace.

6 Likes

Smaller projects can handle the noise more easily. In LLVM we often have the grace period to wait a day after the first approval and we also don’t have a good pre-commit coverage, so that is not as trivial.

There’s more. One of our patches was reverted and the author was not informed and we only realized when we merged the new LLVM and didn’t see his code. This is not common, but landing and reverting while the author is away has the same effect.

I think that’s fine for now.

Indeed. Currently, people ask for others to merge after approval on their own speed. We should not impose a time frame to get the PR merged faster than the author is able to handle. Special cases where the PR is stale after approval can be handled separately.

Agree.

There is, I just can’t remember how to enable it, but I’ve used it in previous Github repositories. Maybe be a side effect of Github Actions, but there’s also the automation and webhooks that can have the permission to merge/close PRs.

Not sure how trivial, but it’s possible.

BTW, a common problem I met when I did so is, a lot first-time contributors may use github autogenrated mail address, which is not wanted.

Sometimes. I’ve seen people not do anything for a while, or assume there’s some automated process that happens after approval, which of course we don’t have. Some reviewers ask the author whether they need someone to push for them. It seems this RFC is looking for a more automatic mechanism.

1 Like

Not that this is a solution, but couldn’t we ask the authors to be explicit on the instructions on how to open a PR? Or the default text that shows up? At least it would cover the gap until we find a better way.

There is automation to detect this but it does not seem to run consistently. We can include this detail in the documentation and automated comments for extra awareness.

In fact, assuming we have this convert label to comment job, it can check again whether the author has added their email address.

The “please merge on my behalf” part of the docs is buried right now, so I will address this part first.

If you mean the default text in the PR message, that has all sorts of problems where it ends up in the commit message if you forget to remove it.

If you mean the opening comments like the ones we add when labeling, yes, that’s definitely possible.

It’s liable to get ignored by the time the review concludes, so I think the idea of a label on open, then converting that into a comment on review approval, is better, if I can get it to work.

It won’t take me long to test that, so that’s what I’ll try first.

So I intend to:

  1. Make the documentation more clear for PR authors and reviewers, with reference to the current way things are expected to happen.
  2. Work on workflows to first add a comment on open, then, on a timer, find labelled and approved PRs and add comments to them with instructions for PR author and reviewers.

Correct; the LLVM model is extremely unusual for GitHub so you should expect PR authors to be surprised by this. :slight_smile:

I think I am in the org due to the bugzilla migration, but I don’t have commit access. So this is an imperfect heuristic.

Github merge queues provide such an option (assuming the CI status is reported to Github and shows up in PRs). They might make other assumptions though that are not suited for LLVM, I don’t know your development model well enough to foresee possible issues here.

In Rust, the PR author never controls the time the PR lands. The reviewer approves the PR (by writing a special message recognized by the bot, we do not use the github approval UI for this), which makes it enter the queue of ready-to-land PRs. A bot picks up a PR from the queue, integrates the latest master, runs our full test suite, and then if that succeeds fast-forwards master to this merge – or, if it fails, posts a message in the PR. This ensures that “master is always green”; we very rarely have to back out PRs (if it happens it’s usually due to unexpected performance issues). CI takes around 2h so this means we can only land 12 PRs a day; that’s not enough, so we have people creating rollups that bundle a bunch of low-risk PRs into a single PR that enters the queue. (There’s tooling for creating the rollup but it is triggered manually.) See Not Rocket Science for more details on the underlying philosophy.

This is quite similar to Github Merge Queues, except that merge queues do some form of rollup automatically. Rust transitioned to merge queues for all its projects except the main repo; for that repo we need more control over the merge and rollup process than merge queues can provide.

It is worth noting that “full test suite” here includes the entire release process – we produce the full set of distributable artifacts that make up a Rust release, across all architectures, for each PR that lands on master. The actual process of doing a nightly/beta/stable Rust release consists mainly of “just” taking those artifacts and putting them in a different location. Only a small part of that test suite (and none of the “prepare distributable artifacts” part) runs in the PR itself – more specifically, we run all the tests, but only on one architecture and configuration. There’s no automatic integration of PR CI results and the aforementioned queue, but reviewers won’t add a PR to the queue until PR CI is green.

LLVM is a lot bigger than Rust so this exact model seems unlikely to be workable for you, but maybe there’s some inspiration to be drawn from it.

Thanks for the explanation!

To move everything at once, yes, but we do have the sub-project libcxx that has all its testing in pre-commit already. The other side of that though is that each sub-project has its own development “style”, over in LLDB we aren’t so strict, for better or worse, it depends on the use case of that project.

Small moves towards a “standard” contribution flow will hopefully give us short term benefit, and let us consider a change of approach in the future, without it being such a big upheaval.

My first attempt at the 2 part workflow is now in review: [GitHub] Add workflows to manage merging of PRs from authors without commit access by DavidSpickett · Pull Request #124910 · llvm/llvm-project · GitHub

I noticed a while ago that the automation checks the commits on the PR, but the commits on the PR can use a real email address, but when you click the “squash merge” button the email that will be used is a no-reply GitHub one.

I think this depends on user settings that might not be detectable ahead of time.

Ok that makes sense, GitHub can’t control what your local git settings might be.

My current design replies to the PR author with a set of things to do before confirming the PR can be merged, and email address will be included there. If they know they’ve set it up before, they can skip it.