Best practices for rebasing nascent backend?

johnwbyrd · March 12, 2021, 11:50pm

Dear llvm,

A few of us are working on a novel LLVM backend in a separate repository, and our new branch uses a lot of the fancy new MLIR stuff. We’d like the group’s opinions as to best practices for keeping in sync with llvm’s main branch.

Some of us opine that we should periodically rebase our backend onto the tip of main. This has the advantage that we benefit from new main features, but the disadvantage that main usually seems to be broken in many of the test suites. That makes it hard to find a stable commit on main, one that passes all the tests on all the buildbots, that we can rebase onto.

And some of us opine that we should be merging our work with main. This has the advantage that we never rewrite history, but it also means that it will be painful to squash or rebase our commits, if we ever decide to submit our work upstream.

We’ve considered doing our work based on one of the release branches, but until recently the development docs recommended against this.

Wisdom would be appreciated; thank you.

Neil_Nelson · March 13, 2021, 3:02am

I’m not aware that main is broken in many of the test suites. It would be helpful to report the bugs you find. Or, if you can provide a command sequence, I can try it here on Ubuntu.

Perhaps if you give specific detail about the issues you are facing it will allow the other contributors to make suggestions.

Neil Nelson

androm3da · March 13, 2021, 3:27am

If the “pre-merge checks” presentation from a recent dev meeting [1] is to be believed, many revisions proposed for review fail the build or the tests. But now that we have those checks, hopefully plucking an arbitrary commit from main will give John better results than in years past.

Another approach might be to review the buildbots to find a commit that is green across the board.
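As a rough illustration of what "green across the board" could mean in practice, here is a minimal Python sketch. It assumes you have already collected per-commit, per-builder pass/fail results into a plain dictionary; the builder names, commit labels, and the `results` layout are all hypothetical, and nothing here calls a real buildbot API.

```python
from typing import Dict, List, Optional


def newest_green_commit(
    commits: List[str],
    results: Dict[str, Dict[str, bool]],
) -> Optional[str]:
    """Return the newest commit that passed on every known builder.

    commits: commit identifiers ordered newest first.
    results: hypothetical mapping commit -> {builder name -> passed?}.
    A builder with no recorded result for a commit counts as not green.
    """
    # The set of builders we care about is every builder seen anywhere.
    builders = {name for statuses in results.values() for name in statuses}
    if not builders:
        return None

    for commit in commits:
        statuses = results.get(commit, {})
        if all(statuses.get(name) is True for name in builders):
            return commit
    return None


if __name__ == "__main__":
    # Hypothetical data: c3 is the newest commit, c1 the oldest.
    commits = ["c3", "c2", "c1"]
    results = {
        "c3": {"linux": True, "windows": False},
        "c2": {"linux": True, "windows": True},
        "c1": {"linux": True},  # windows never built this one
    }
    print(newest_green_commit(commits, results))
```

Treating a missing result as "not green" is a deliberately conservative choice; since not every builder builds every commit, a real script might instead accept the nearest earlier result for that builder.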

[1] https://llvm.org/devmtg/2020-09/slides/Goncharov-Pre-merge_checks.pdf

pogo59 · March 15, 2021, 2:34pm

The “Living Downstream Without Drowning” talk from the 2015 US Dev Meeting might be helpful.

At Sony, we used to merge from upstream once every 6 months; this would cost roughly 3 person-months of effort, due to the extent of our changes (and for many of those merges, I was the person). We evolved to a continuous-integration model, allowing us to apply automation to the problem; last time I added up the numbers, we were averaging roughly one merge conflict or test issue per day, most of which are very low effort to resolve.

Living at the tip of ‘main’ has advantages, the main ones being (a) you get the latest-and-greatest to work with, (b) if your downstream testing finds issues, it’s much easier to get them resolved right away than it would be six months later. The main disadvantage, of course, is (c) you get the latest-and-greatest bugs too.

We merge from upstream, rather than rebase. This was, initially, because we didn’t know any better; however, once we had users outside the immediate team and multiple releases to maintain, it made sense to merge rather than rebase because having the combined history on our master branch was consistent with the combined history in our releases. Bisecting to find the origin of a problem makes it pretty straightforward to determine which releases are potentially affected (or not).
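To make the point about bisecting concrete: once a culprit commit is known, asking which releases are affected amounts to asking which release tips have that commit as an ancestor. In a real repository, `git tag --contains <commit>` answers this directly. The Python sketch below shows the same idea on a toy history; the commit names, the `parents` map, and the release labels are hypothetical and only illustrate the reasoning.

```python
from typing import Dict, List, Set


def ancestors(parents: Dict[str, List[str]], tip: str) -> Set[str]:
    """Return every commit reachable from tip, including tip itself."""
    seen: Set[str] = set()
    stack = [tip]
    while stack:
        commit = stack.pop()
        if commit in seen:
            continue
        seen.add(commit)
        stack.extend(parents.get(commit, []))
    return seen


def affected_releases(
    parents: Dict[str, List[str]],
    release_tips: Dict[str, str],
    bad_commit: str,
) -> List[str]:
    """List the releases whose history contains bad_commit."""
    return sorted(
        name
        for name, tip in release_tips.items()
        if bad_commit in ancestors(parents, tip)
    )


if __name__ == "__main__":
    # Toy merged history: u* are upstream commits, d1 is a downstream
    # change, and m1/m2 are merge commits that pull upstream in.
    parents = {
        "u1": [],
        "u2": ["u1"],
        "u3": ["u2"],
        "d1": ["u1"],
        "m1": ["d1", "u2"],
        "m2": ["m1", "u3"],
    }
    release_tips = {"v1.0": "m1", "v2.0": "m2"}

    print(affected_releases(parents, release_tips, "u3"))
    print(affected_releases(parents, release_tips, "u2"))
```

Because merging keeps the original commit identities, the same commit appears in both the development branch and the release histories, which is what makes this lookup possible. After a rebase the downstream commits get new identities, so the lookup would have to be done per rewritten commit.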

It does make upstreaming patches more, shall we say, interesting. As it happens, just last week I did a complete diff of our tree with upstream, and we have a couple dozen patches that could be upstreamed; anywhere from typos to significant features. Because we have merged rather than rebased, the need to untangle patch history increases the effort of upstreaming, and so we don’t do it as much as we really ought to. However, reducing the “surface area” of downstream patches means lower chance of merge conflicts, which simplifies merges and makes them less costly; I recommend it highly. The talk mentioned above has a couple other tips and tricks to try in this area.

So, if you have existing releases/users to support, I think the case is stronger for merging; otherwise, the conventional wisdom is to prefer rebasing. Certainly rebasing makes it simpler to upstream a patch, and I think it can also make it easier to track down bugs in your own patches versus upstream behavior (something that we often struggle with).

Good luck with your efforts!

–paulr
