[Github] RFC: linear history vs merge commits

tstellar · January 29, 2019, 10:33pm

Hi,

As part of the migration of LLVM's source code to github, we need to update
our developer policy with instructions about how to interact with the new git
repository. There are a lot of different topics we will need to discuss, but
I would like to start by initiating a discussion about our merge commit
policy. Should we:

1. Disallow merge commits and enforce a linear history by requiring a
rebase before push.

2. Allow merge commits.

3. Require merge commits and disallow rebase before push.

I'm going to propose that if we cannot reach a consensus that we
adopt policy #1, because this is essentially what we have now
with SVN.

What does everyone think?

Thanks,
Tom

adrian.prantl · January 29, 2019, 10:47pm

A linear history makes it much easier to git-bisect, since each commit can be rolled back individually. Easier cross-project bisection is one of the advantages of having the monorepo, and I wouldn't want to loose that.

-- adrian

David_A_Greene · January 29, 2019, 10:48pm

Tom Stellard via Openmp-dev <openmp-dev@lists.llvm.org> writes:

As part of the migration of LLVM's source code to github, we need to update
our developer policy with instructions about how to interact with the new git
repository. There are a lot of different topics we will need to discuss, but
I would like to start by initiating a discussion about our merge commit
policy. Should we:

1. Disallow merge commits and enforce a linear history by requiring a
rebase before push.

2. Allow merge commits.

3. Require merge commits and disallow rebase before push.

I'm going to propose that if we cannot reach a consensus that we
adopt policy #1, because this is essentially what we have now
with SVN.

I agree with proposing #1 in general. It results in a nice clean
history and will be familiar to everyone working on the project.

However, there is a place for merge commits. If there's a bug in a
release and we want to make a point release, it might make sense to make
a commit on the release branch and then merge the release branch to
master in order to ensure the fix lands there as well. Another option
is to commit to master first and then cherry-pick to release. A third
option is to use the "hotfix branch" method of git flow [1], which would
result in a merge commit to the release branch and another merge commit
to master.

I've seen projects that commit to release first and then merge release
to master and also projects that commit to master and cherry-pick to
release. I personally haven't seen projects employ the git flow hotfix
branch method rigorously.

But GitHub is read-only for almost a year, right? So we really don't
have to decide this for a while. I have not tried using the monorepo
and committing to SVN with it. How does that work? What would it do
with merge commits?

-David

[1] A successful Git branching model » nvie.com

reinterpretcast · January 30, 2019, 12:07am

As part of the migration of LLVM’s source code to github, we need to update
our developer policy with instructions about how to interact with the new git
repository. There are a lot of different topics we will need to discuss, but
I would like to start by initiating a discussion about our merge commit
policy. Should we:

Disallow merge commits and enforce a linear history by requiring a
rebase before push.

I agree that this should be considered the status quo. The other options need compelling benefits to justify a change.

Allow merge commits.

For the purposes of the canonical version of the project, I believe that history of side-branches (the main value of merge commits for me) is not of great interest.

Require merge commits and disallow rebase before push.

I do not believe #3 is enforceable (the disallow rebase part) unless if you meant to just remove the option from the GitHub UI interface.

Jeremy_Lakeman1 · January 30, 2019, 12:55am

[Armchair observer…]

If any merges are allowed, they should be rare, have recent parent commits, or the history graph becomes useless.

Each reviewed bug / feature must be rebased onto the current “known good” commit, merged into a “probably good” commit, tested by build bots, and only then pushed to trunk. Keeping trunk’s history more usable, with most bad patches reworked and resubmitted instead of reverted.
When a new feature is broken up into a patch series, the series should be rebased then immediately merged to help identify the individual patches in the history graph.

Bruce_Hoult3 · January 30, 2019, 3:31am

How about:

Require a rebase, followed by git merge --no-ff

This creates a linear history, but with extra null merge commits
delimiting each related series of patches.

I believe it is compatible with bisect.

reinterpretcast · January 30, 2019, 4:08am

How about:

Require a rebase, followed by git merge --no-ff

For personal use, I like this the most. Unfortunately, I think this is more of a command-line workflow that the GitHub UI is not friendly with.
In particular, the UI has roughly: merge (creates yet another merge commit), squash, and rebase. All of these prevent the intended result.

Hans_Wennborg · January 30, 2019, 4:17pm

I'm strongly in favour of #1: no merges allowed, enforced by the
upstream repository, and that for release branches we use cherry-picks
like we do currently.

Thanks,
Hans

adrian.prantl · January 30, 2019, 4:50pm

Typically the LLVM development model discourages merging big features in one go and instead gravitates towards breaking patches up into small, easy to reason about patches that gradually morph the code towards the goal. This makes it easier for reviewers and to track down regressions after each patch landed. For this reason we rarely have series of connected patches coming in at once.

-- adrian

preames · January 30, 2019, 7:08pm

Strongly in favor of #1.

If we decide to move away from #1, I strongly believe it should be done well after the github migration. (i.e. lets not change everything at once!)

Philip

David_A_Greene · January 30, 2019, 8:42pm

Bruce Hoult via llvm-dev <llvm-dev@lists.llvm.org> writes:

How about:

Require a rebase, followed by git merge --no-ff

This creates a linear history, but with extra null merge commits
delimiting each related series of patches.

I believe it is compatible with bisect.

Git Merge –no-ff Option

We've done both and I personally prefer the strict linear history by a
lot. It's just much easier to understand a linear history.

-David

David_A_Greene · January 30, 2019, 8:52pm

Jeremy Lakeman via llvm-dev <llvm-dev@lists.llvm.org> writes:

4. Each reviewed bug / feature must be rebased onto the current "known
good" commit, merged into a "probably good" commit, tested by build
bots, and only then pushed to trunk. Keeping trunk's history more
usable, with most bad patches reworked and resubmitted instead of
reverted.

If you mean having a submitted PR trigger builds and only allow merging
if all builds pass, that may be doable. Of course by the time it's
merged it will be against some later commit (so it should be rebased)
and there's no guarantee it will build against *that* commit. But it
will tend to filter out most problems.

Trying to keep a branch buildable at all times is a hard problem, but
the above is probably a 90% solution.

-David

echristo · January 30, 2019, 9:18pm

Agreed. Let's go with option #1.

-eric

pogo59 · January 30, 2019, 10:16pm

Systems that I've seen will funnel all submitted PRs into a single queue
which *does* guarantee that the trial builds are against HEAD and there
are no "later commits" that can screw it up. If the trial build passes,
the PR goes in and becomes the new HEAD.
By the time a PR reaches the front of the build queue, it might not apply
cleanly, in which case it gets bounced just like a failed build would.

But this would be a radical redesign of the LLVM bot system, and would
have to wait until we're done with the GitHub migration and can spend
a couple of years debating the use of PRs.
--paulr

mehdi_amini · January 31, 2019, 6:53am

What is the practical plan to enforce the lack of merges? When we looked into this GitHub would not support this unless also forcing every change to go through a pull request (i.e. no pre-receive hooks on direct push to master were possible). Did this change? Are we hoping to get support from GitHub on this?

We may write this rule in the developer guide, but I fear it’ll be hard to enforce in practice.

Krzysztof_Parzyszek · January 31, 2019, 3:24pm

I'm very much in favor of this option.

-Krzysztof

David_A_Greene · January 31, 2019, 5:29pm

Mehdi AMINI <joker.eph@gmail.com> writes:

What is the practical plan to enforce the lack of merges? When we
looked into this GitHub would not support this unless also forcing
every change to go through a pull request (i.e. no pre-receive hooks
on direct push to master were possible). Did this change? Are we
hoping to get support from GitHub on this?

We may write this rule in the developer guide, but I fear it'll be
hard to enforce in practice.

Again, changes aren't going through git yet, right? Not until SVN is
decommissioned late this year (or early next). SVN requires a strict
linear history so it handles the enforcement for now.

My personal opinion is that when SVN is decomissioned we should use pull
requests, simply because that's what's familiar to the very large
development community outside LLVM. It will lower the bar to entry for
new contributors. This has all sorts of implications we need to discuss
of course, but we have some time to do that.

If we don't use PRs, then we're pretty much on the honor system to
disallow merges AFAIK.

-David

David_A_Greene · January 31, 2019, 5:32pm

<paul.robinson@sony.com> writes:

Systems that I've seen will funnel all submitted PRs into a single queue
which *does* guarantee that the trial builds are against HEAD and there
are no "later commits" that can screw it up. If the trial build passes,
the PR goes in and becomes the new HEAD.

The downside of a system like this is that when changes are going in
rapidly, the queue grows very large and it takes forever to get your
change merged. PR builds are serialized and if a "build" means "make
sure it builds on all the Buildbots" then it takes a very long time
indeed.

There are ways to parallelize builds by speculating that PRs will build
cleanly but it gets pretty complicated quickly.

But this would be a radical redesign of the LLVM bot system, and would
have to wait until we're done with the GitHub migration and can spend
a couple of years debating the use of PRs.

Heh. Fully guaranteeing buildability of a branch is not a trivial task
and will take a LOT of thought and discussion.

-David

LebedevRI · January 31, 2019, 5:51pm

Mehdi AMINI <joker.eph@gmail.com> writes:

> What is the practical plan to enforce the lack of merges? When we
> looked into this GitHub would not support this unless also forcing
> every change to go through a pull request (i.e. no pre-receive hooks
> on direct push to master were possible). Did this change? Are we
> hoping to get support from GitHub on this?
>
> We may write this rule in the developer guide, but I fear it'll be
> hard to enforce in practice.

Again, changes aren't going through git yet, right? Not until SVN is
decommissioned late this year (or early next). SVN requires a strict
linear history so it handles the enforcement for now.

My personal opinion is that when SVN is decomissioned we should use pull
requests, simply because that's what's familiar to the very large
development community outside LLVM. It will lower the bar to entry for
new contributors. This has all sorts of implications we need to discuss
of course, but we have some time to do that.

My personal opinion is an opposite of that one.

*Does* LLVM want to switch from phabricator to github pr's?
I personally don't recall previous discussions.
Personally, i hope not, i hope phabricator should/will stay.

Low bar for entry is good, but not at the cost of raising the bar
for the existing development community.
In particular, i'm saying that github's ui/workflow/feature set
is inferior to that one of phabricator.

Is the low entry bar the only benefit?
While it is good, it should not be the only factor.
The contributors will already need to know how to build LLVM,
write tests, make meaningful changes to the code.
Additionally having to know how to work with phabricator
isn't that much to ask for, considering the other prerequisites..

If we don't use PRs, then we're pretty much on the honor system to
disallow merges AFAIK.

-David

Roman.

reinterpretcast · January 31, 2019, 6:26pm

Mehdi AMINI <joker.eph@gmail.com> writes:

What is the practical plan to enforce the lack of merges? When we
looked into this GitHub would not support this unless also forcing
every change to go through a pull request (i.e. no pre-receive hooks
on direct push to master were possible). Did this change? Are we
hoping to get support from GitHub on this?

We may write this rule in the developer guide, but I fear it’ll be
hard to enforce in practice.

Again, changes aren’t going through git yet, right? Not until SVN is
decommissioned late this year (or early next). SVN requires a strict
linear history so it handles the enforcement for now.

My personal opinion is that when SVN is decomissioned we should use pull
requests, simply because that’s what’s familiar to the very large
development community outside LLVM. It will lower the bar to entry for
new contributors. This has all sorts of implications we need to discuss
of course, but we have some time to do that.
My personal opinion is an opposite of that one.

Does LLVM want to switch from phabricator to github pr’s?
I personally don’t recall previous discussions.
Personally, i hope not, i hope phabricator should/will stay.

Low bar for entry is good, but not at the cost of raising the bar
for the existing development community.
In particular, i’m saying that github’s ui/workflow/feature set
is inferior to that one of phabricator.

+1

Topic		Replies	Views
[GitHub] RFC: Enforcing no merge commit policy OpenMP	41	208	May 10, 2019
GitHub can now enforce linear history LLVM Dev List Archives	1	98	December 12, 2019
How soon after the GitHub migration should committing with git-llvm become optional? OpenMP	14	102	October 18, 2019
[RFC] Revisiting linear history vs merge commits LLVM Project	28	1689	August 27, 2022
Llvm-test-suite accepts merge commits GitHub	1	186	January 27, 2023

[Github] RFC: linear history vs merge commits

Related topics