Pull requests in GitHub

Hi I know that runtimes / libc++ were allowed to use pull requests in GitHub for a while, what are your impressions?
@ldionne

I am iterating here: [libc++] Test GitHub PRs for libc++ patches by ldionne · Pull Request #63273 · llvm/llvm-project · GitHub

Honestly they were enabled and then the ramp up towards the LLVM 17 release happened so I had no time to really look into it. I am resuming that now.

What I envision is that we’ll use Github Actions to trigger the current BuildKite pipeline for a few months. Then, once we eventually drop support for Phabricator, we should be able to trigger the jobs directly from Github Actions without going through BuildKite, if we want to do that.

That’s nice, could you please share CI code for that?! As you might have guessed I am thinking about “moving” (reimplementing really) checks for pull requests to [preliminary] buildkite.

So you think that we should try to minimally depend on Buildkite for this setup, right? Is that due to general idea of having less dependencies or there are some specific concerns?

Few days ago I have tried to setup a very simple build for PR / every commit on my fork to trigger a buildkite build and it worked immediately sample PR. We will need something better than a fixed set of tests I’ve set but that should be fine for the starts. So I was thinking about writing buildkite piplines to llvm-project/utils/ci, like we discussed briefly in [RFC][premerge] Moving premerge CI scripts to https://github.com/llvm. I am not sure that leaving every project to figure out their setup is a way to go as some project will simply do nothing / keep relying on post-builds.

Another question: do you think it’s important to have a main build for the same configurations to run on every commit so that authors can understand if failures are coming from main rather than their change? As I understand, in libc++ you expect everything to pass and if something as a miss then author should figure out what and probably revert / fix themself.

I will look into GitHub actions to see if we can easily have something similar to buildkite setup and utilize the same VMs - if there is no way to set up that without a big investment then I will likely try to go with triggering things through Buildkite.

Of course! All the code is in [libc++] Test GitHub PRs for libc++ patches by ldionne · Pull Request #63273 · llvm/llvm-project · GitHub really.

I believe that we should trigger the various BuildKite pipelines we already have from GitHub Pull Requests for the time being. That way, we can keep the exact same pre-commit CI setup with both GitHub PRs and Phabricator, and avoid rocking the boat during the transition period. Then, once we make Phabricator read-only (and only after that), we can decide whether we want to keep going through BuildKite or simplify the setup and use GitHub Actions.

Setting up the bridge to trigger BuildKite pipelines from GitHub PRs is quite simple, I did that yesterday and it seems to work fine in my PR. It doesn’t provide the best experience cause you have to click through 1-2 pages from the GitHub PR UI to see the CI results instead of seeing them directly on GitHub, but it’s really not that bad.

I actually disagree a bit here. I believe that the only way to get relevant tests that stay green and that are taken seriously is if each project starts to own their pre-commit CI. It doesn’t make sense for you and I to decide what tests LLDB (for example) should be running – we don’t really know. There’s probably a lot more to test than just running check-lldb: there’s probably back-deployment tests and various other things they’d want to run, and we can’t set that up for them. I also believe that the mitigated success of the current pre-commit CI system for some projects is due to the fact that it was centralized. They didn’t really know what was being run (so they didn’t really understand the value of it) and didn’t really understand what was going on when it failed (so it became a source of frustration).

However, I believe that we should definitely provide the infrastructure and give all the subprojects an extremely simple and clear path towards adding the tests that are relevant to them. This is actually not hard to do and we do it for Clang/libc++. We simply have CI pipelines defined in libcxx/utils/ci/buildkite-pipeline.yml and clang/utils/ci/buildkite-pipeline.yml, and the infrastructure loads them based on what files were modified in the monorepo (see .ci/generate-buildkite-pipeline-premerge). We can bikeshed the names and locations of everything, but the important point is that the CI configuration is now under the control of the individual projects, and it can be changed with a normal patch (in which the CI will actually test the proposed changes!). That way, there’s is virtually no barrier for the project to maintain, improve and expand on these tests.

In the PR I posted above, I am also trying to move the original x64 debian and x64 windows jobs (which basically build everything in the monorepo and do check-all) to this system to decouple them from the code in premerge_checks.py. That would remove that barrier for all the other non-Clang non-libc++ projects. Once/if that is done, we should disable those steps in premerge_checks.py to avoid running them redundantly.

Having a scheduled build that runs every X hours (I think X=4) has proven useful to us so far since it allows immediately catching build breaks and fixing them. Build breaks can happen when someone merges something and they thought it was green but it wasn’t, or maybe there was some rebase conflict, or something else. To catch those, I peek at the scheduled builds from time to time and as long as those are green, I know everything’s good. When a scheduled build is red, that’s when I know there’s some CI instability and I (or someone else) needs to take a look.

I would retain these scheduled builds throughout the GitHub PR transition to avoid rocking the boat. In the long run, we might want to do things differently. I could envision a world where

  1. Direct pushes are not allowed.
  2. PRs all have CI requirements (i.e. the PR can’t be merged without some CI passing).
  3. We use merge queues to make the CI requirements as transparent as possible.

If we did that, we could probably ensure an almost-always-green trunk and that might influence whether we still want scheduled builds. But this is all up in the air and I expect everyone to have very different and strong opinions about the workflow we should adopt for the future. So personally, I am focusing on trying to nail the migration to a “basic” PR workflow for the time being.

I also wanted to take a look at that. I think we could use self-hosted runners to achieve this and we’d be able to basically cut the middleman (BuildKite) and trigger everything via GitHub Actions directly. I think that would be quite nice, however we still need to support BuildKite until Phabricator is shut down, so until that happens I don’t think it really makes sense to try to move our runners away from BuildKite.

This is all just my opinion and how I personally think we should do things, but I’d love to know what you think.

Quick update: I have managed to run basic “llvm; ninja check-all” + sccache + self hosted runner (on my machine) Pull Requests · metaflow/llvm-project@7f745f7 · GitHub. The workflow’s code turned out to be quite simple https://github.com/metaflow/llvm-project/blob/7f745f792533754390056e8e3c95df57dd088b15/.github/workflows/pull-request.yml
as @tstellar has already implemented basic actions to run. We can definitely make overall CI much simpler after disabling Phabricator by moving to GitHub actions + give release builds bigger machines to work with.

I agree that for now it would be easier to keep using Buildkite to support both modes while we are migrating to Pull Requests. Will ask github owners to give us permissions to wire Buildkite.

and yes, moving towards merge queues is a great goal! :sunglasses:

1 Like

This is already done, I worked with @tstellar on this yesterday! I think we’re almost all set for the pre-commit CI part of things with GitHub PRs now. Obviously that’s only for the “transitional” setup, I haven’t worked on anything “post-transition” yet.

oh, so buildkite is now allowed to post statuses to llvm project?! Then I will proceed to moving CI scripts to repo