RFC / FYI: Pull Request Greetings For New Contributors

As of this change, users who are new contributors to LLVM and/or new to GitHub itself will recieve this comment on their first PR to LLVM:

Thank you for submitting a Pull Request (PR) to the LLVM Project!

This PR will be automatically labeled and the relevant teams will be notified.

If you wish to, you can add reviewers by using the “Reviewers” section on this page.
If this is not working for you, it is probably because you do not have write
permissions for the repository. In which case you can instead tag reviewers by
name in a comment by using @ followed by their GitHub username.

If you have received no comments on your PR for a week, you can request a review
by "ping"ing the PR by adding a comment “Ping”. The common courtesy “ping” rate
is once a week. Please remember that you are asking for valuable time from other developers.

If you have further questions, they may be answered by the LLVM GitHub User Guide.

You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums.

This message is posted before labelling is done and therefore before anyone else is subscribed to the PR. If you are not a first time user, the greeting is skipped and labelling runs as usual.

The content is based on my experience in other projects, suggestions from new users and common questions myself and others have been answering.

This RFC is to make you all aware that:

  • This exists and if you see it not working, let me know.
    (no new user PRs so far today, otherwise I’d link an example)
  • If you want to edit the comment content, it is in this file.

If this gives you ideas for a workflow of your own, I can report that although a bit tedious, it is quite simple to test workflows on your own fork of llvm. You do not need any special tokens or keys from the llvm organisation itself.

A few tips:

  • When you change the target of a PR, the UI doesn’t update immediately. Be patient or risk opening a bunch of rubbish PRs to llvm itself (like I did).
  • The workflows will have a github.repository == 'llvm/llvm-project' line. Edit this to be your fork, change it back when you PR it back to llvm itself.
  • If you need to test a rare property, like being a first time user, pick another property you can more easily test and replace it when you’ve finally got the syntax right.
  • Use a YAML formatter, it’ll save you a lot of frustration.

One suggestion has been to also give a message like this to folks who have been away for a while, maybe more than a year. I think that would be best done by querying a GitHub API instead of crawling the repo ourselves, but haven’t tried either.

If you’ve done that sort of stuff before, I’d love to know what the possibilities are.

11 Likes

…and it’s being skipped for reasons unknown - Implement acos operator in MLIR Math Dialect · llvm/llvm-project@4025dac · GitHub

Bear with me while I figure this out.

I wrote up the problem here: What is the correct way to check for author_association == FIRST_TIME_CONTRIBUTOR in a workflow? · community · Discussion #78038 · GitHub

If anyone has experience with GitHub workflows and is able to spot the issue from that, it would be much appreciated.

The PR that you mentioned was mine and I clearly remember that I got a message like yours during the process of creating this PR. Does this action run before the creation of the commit has been completed? It is possible that I had the unfinished PR open for a while, closed the tab, and re-opened the PR. Perhaps that’s why it did not run on the PR that was actually created?

I should have thought to ask you if you’d seen anything, thanks for replying here!

Do you remember any specifics of it, was it specifically as a comment?

The greeting part of the workflow only runs on opened events of Pull Requests. Which I would assume means the PR has finished being opened, and is ready for anyone to interact with.

I don’t see more than 1 run of the actions for your PR, however that doesn’t mean it didn’t happen. Perhaps GitHub only shows 1 run per commit.

If we assume that this did happen, then I could imagine on the first run you being a FIRST_TIME_CONTRIBUTOR, then on the next one not being that because it’s counting the same PR. As if you’ve made a PR before, but it’s invisible.

However, if that was the case the comment would still be there from the previous run. Unless the whole PR really was recreated but why would that PR then have the “First Time Contributor” badge on it as well?

I’m not sure this is possible, but I’m learning not to make assumptions with workflows so thanks for the ideas :slight_smile:

I’ve just landed a possible workaround [GitHub] Try a workaround to get the new contributor greeting to work by DavidSpickett · Pull Request #75036 · llvm/llvm-project · GitHub, which is testing a different theory. That the new contributor status values are never propagated to actions.

Unfortunately, I don’t remember it. I could open a test PR with a different user that has not contributed to LLVM yet and take screenshots if that helps. I guess the that’s within reason.

Thanks for the offer, but I might be endorsing breaking GitHub’s terms of service if we start down that road (iirc it’s limited to 1 free account per user). It shouldn’t be necessary.

According to all the docs there shouldn’t be any difference between checking for an established user and checking for a new user, logic wise. So I’ll escalate it with GitHub, somehow, if I don’t find a workaround.

Well, my silly workaround worked. The association is not being set after all, despite being visible in the UI!

Run echo "$AUTHOR_ASSOCIATION"
NONE

[RISCV][MC] Add support for experimental Zimop extension by JivanH · Pull Request #75182 · llvm/llvm-project · GitHub got the official first welcome comment.

Side notes for anyone stumbling on this thread who is trying to write a workflow in future:

Thanks for this! LLVM works quite different than the default Github project, so giving contributors some guidance is very welcome.

I think using the “new contributor” status will unfortunately mean that I would never have seen these messages, not even on my first PR: Github considers me to be a “member” of the LLVM org ever since the Bugzilla→Github issue transition. Looks like you want the bot to be stateless so I am not sure how you could do better, but this is something to be aware of. (If you are curious what other communities are doing: Rust used to use some HTTP JSON queries to check if the PR author is new; nowadays the bot uses GraphQL.)

Another point: what about getting the PR merged? At least shortly after the transition to PRs, it was the case that the PR author was expected to find someone to press the merge button for them. That is extremely unusual on Gitub (usually whoever approves the PR also does the merge), so if this is still the process, it would be good to give contributors some guidance. Ideally this can be avoided entirely.

I wonder if this means anyone opening an issue is in the same situation then.

It’s not that I wanted it to be stateless specifically, it’s that I saw other ways of doing this that iterate PRs or checked out the repo. Which wasn’t going to be viable for something as large as LLVM without some kind of caching somewhere.

I wondered if this was possible, I’ll certainly look into this. I figured GitHub themselves must have a way to generate the author associations.

It is mentioned in MyFirstTypoFix — LLVM 18.0.0git documentation but not in the GitHub guide for some reason. I’ll at least find a place to move it to where it can be found easily.

I’m sure there’s an event for approvals so we could hook that too, I’ll see what I can do.

In theory we could start a norm that reviewers merge the PR, however we have some infrastructure issues that prevent that (which ideally contributors wouldn’t have to consider, but it’s what we have at the moment).

One being that our Buildbot reports go to the committer not the author (I think). So landing someone else’s change basically means I have to be at my desk for the next hour or so to revert it if needed. Also, for folks who land their own changes, they don’t want me landing them in the middle of the night their time when they aren’t able to handle the fallout as they’d wish.

There are initiatives ongoing that will address this (e.g. pre-commit CI) but in the meantime I’m trying to get us closer to “default” GitHub practices where possible.

I don’t think so. The bot that did the transition added people as project members so that we could be assigned to issues. In bugzilla anyone can be assigned, in Github it’s only project members.

So for new contributors I think this is not a problem. But for anyone who ever reported a bug on Bugzilla, it will be.

If I look at my most recent commit, it seems to only have me and not the person that pressed the button in the web UI. Not sure what that means for buildbot though.

There has been suggestions, which probably would be least deviance from our current practices, that we could have a bot which places a comment, saying “I see that you approved this PR from author X, who doesn’t have write access - please consider merging it for them.” That keeps the current workflow for the majority of contributors with write access themselves, while avoiding the awkward phase of having to ask for it to be merged for new contributors.

AFAIK it’s the other way around - the committer’s email doesn’t get recorded in the commit anywhere. So when landing someone else’s patch (who themselves doesn’t have write access) you won’t get notified about the fallout, so you won’t know if you should do an emergency revert for example.

Yeah this would be great, just need to confirm whether we can detect the permissions (or we can start with only new contributor PRs where we already know they won’t have write access).

Thanks for the correction. This was brought up elsewhere, I wonder if it’s happening on the GitHub side, if the info survives maybe we can patch buildbot. Anyway, beside the point here.

Good job I’m looking at the bots frequently anyway then! (but ideally let’s not make people do that :slight_smile: )

Landed the change to do this today - [GitHub][workflows] Add buildbot information comment to first merged PR from a new contributor by DavidSpickett · Pull Request #78292 · llvm/llvm-project · GitHub

3 Likes

One of the first instances: [mlir] Use `create` instead of `createOrFold` for ConstantOp as folding has no effect (NFC) by nujaa · Pull Request #80129 · llvm/llvm-project · GitHub

2 Likes

I think this is simply awesome. Thank you very much for driving and implementing this change!

2 Likes

Thanks!

Which reminds me, you had asked about doing this for infrequent contributors and I think the way forward for that (and any other more nuanced workflows) is GraphQL as mentioned earlier in the thread.

That’s next on my list to try out.

2 Likes

Only applies to new contributors right now but we can generalise it later.

1 Like

The PR above was abandoned in favour of [GitHub][workflows] Ask reviewers to merge PRs when author cannot by DavidSpickett · Pull Request #81142 · llvm/llvm-project · GitHub which does this for any PR.

I got some notification of a workflow failure: Prompt reviewers to merge PRs on behalf of authors · llvm/llvm-project@80c9a4e · GitHub

Not sure why individuals seem to be originated as submitted this?