Workflow to commit changes using git alone (?)

Hi all,

I have recently been given commit access to LLVM and successfully pushed a test commit from my local master branch.

However, I can’t find the recommended workflow for committing more substantial changes using git alone. I have read the docs, but everything still seems to require svn before bridging to GitHub. I want to use git alone to commit a patch that has been reviewed.

I currently have a local 'master' branch that I keep identical to the upstream branch. I also have another local branch (let's call it 'patchbranch' for the purposes of this question) where I committed the changes for the patch I want to push. I created the diff file by running git diff to compare my local 'master' and 'patchbranch' branches, and uploaded the file to Phabricator. The patch has been reviewed and I now want to commit it to the upstream master. I make sure my 'patchbranch' picks up all the upstream changes by pulling into 'master', merging 'master' into 'patchbranch', and running the relevant tests.

I want to push my local 'patchbranch' to the upstream ‘master’ in GitHub without affecting my local master branch. I also need to make sure that my patch is pushed as a single commit. I do not want to merge my local 'patchbranch' into my local 'master' because I want to keep my local 'master' clean and always identical to, or only slightly behind, the upstream branch.

I have read the documentation, but all the described workflows seem to imply the use of svn at some point, which I neither want nor know how to use. I understand this is a basic question, but I have only used git with small teams before, so a detailed workflow for LLVM commits using git alone would be appreciated.

Thanks,

John

Here’s roughly what I currently would do to commit a patch in a local branch (which I don’t think involves svn):

$ git branch --set-upstream-to=origin/master patchbranch # Make the branch track upstream master
$ git pull --rebase # Rebase patchbranch against upstream master

$ git llvm push # Commit

Hi Melanie,

Thanks for your reply, but if I understand it correctly, this implies making changes to the local ‘main’ branch and pushing from it, which is what I want to avoid. And even if I push from ‘main’, how do I fold a number of local commits into a single one, with a single commit message, as appropriate for LLVM?

My workflow consists of creating different local branches to avoid changes on the ‘main’ branch. This gives me a couple of things. First, I always keep my local ‘main’ branch in sync with the remote one, so it’s very easy to spot differences with my working branches by just running diff between them. Second, I can undo things, or test them in my working branches as many times as I want, and commit often. I can even start from scratch again by just creating a new branch from ‘main’, without ever messing with the ‘main’ branch itself. Also, separating work into branches lets me implement another patch while a previous one is waiting for review (while still never touching ‘main’).

In the past, I worked using a similar environment on a small team. Everyone's local changes were pushed to the remote repo at the end of the day by first merging into our local ‘main’ and then pushing to the remote ‘master’. This works on a small team because it doesn’t matter if a number of local commits get pushed together. Also everybody is happy to fix conflicts created by others if they happen, as there’s no ‘reviews’ to begin with.

In the case of LLVM it’s desirable that every reviewed patch is pushed as a single commit with the appropriate commit message. Ideally, I would want to commit and push the difference between a local working branch and the ‘main’ branch, which is what I can’t figure out how to do. I would be surprised if there’s not a simple solution for that.

Thanks.

John

Hi Hiroshi,

Thanks for that. I find “rebase” difficult to use. Maybe I don’t understand it, but in my experience it always causes a lot of ‘conflicts’ that are very hard to fix. I have another question though. LLVM requires that reviewed patches are pushed as a /single/ commit with a standardised message, in particular one that includes the Differential Revision URL. How is that done in your example?

Thanks,

John

Hi Melanie,

I would have hoped for a more automatic way, but I will give “--amend” a try.

Thanks for that!

John
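
For anyone following along, here is a minimal sketch of what amending a commit message looks like in practice. It runs in a throwaway repository, and both the commit subject and the revision number D00000 are placeholders invented for illustration, not a real review. It assumes the reviewed change is already the most recent commit on the current branch.

```shell
set -eu
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

# Throwaway repository so nothing real is touched.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master

echo "int x;" > a.c
git add a.c
git commit -q -m "wip"

# Rewrite the message of the most recent commit; the files are unchanged.
# Each -m becomes its own paragraph. D00000 is a placeholder revision.
git commit -q --amend \
    -m "[Demo] Add a.c" \
    -m "Differential Revision: https://reviews.llvm.org/D00000"

subject=$(git log -1 --format=%s)
body=$(git log -1 --format=%b)
commit_count=$(git rev-list --count HEAD)
echo "$subject"
echo "$body"
```

Amending replaces the last commit rather than adding a new one, so the history still contains a single commit afterwards.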

Hi All,

Ok, just to provide feedback that may be useful for others, I figured out one way to do it based on the setup that I described earlier. It can be something like this:

git checkout patchbranch # switch to the patch branch, the one containing the code for the differential patch
git checkout -b tmp # create and switch to a new tmp branch
git reset --soft master # move the tmp head to the master head without touching any files, so the next commit will contain the delta from master
git commit # commit the delta from master; this is where to add the required commit message and the 'Differential Revision' URL
git push origin tmp:master # push the tmp branch to the remote 'master' branch
(the tmp branch can be deleted now, as it is no longer needed)

Now, if the master to patchbranch diff has been properly submitted to Phabricator, all the steps above except the last one can be replaced by executing this:

git checkout master
arc patch D<revision>

According to the docs, "this will create a new branch called arcpatch-D<Revision> based on the current master and will create a commit corresponding to D<Revision> with a commit message derived from information in the Phabricator review"

This also sets the current branch to arcpatch-D<Revision>, so the only remaining thing to do is pushing the changes to the remote master:

git push origin arcpatch-D<Revision>:master

That’s all. Git will report if another commit was pushed while this one was being prepared; in that case, start over with a fresh pull from master.

John

As a person who uses git almost exclusively, I find your workflow needlessly confusing.

You should almost never push to your GitHub master unless you are updating from upstream, or it’s your own project and you are developing it.

I am excluding commands relating to Arcanist since I don’t understand it and can’t figure out how to use Phabricator at all.
I will use upstream to represent the llvm-project remote (https://github.com/llvm/llvm-project.git), and origin to represent my personal fork.

git fetch --all -pf
# Make sure you have updated all of remote branches

git checkout patchbranch
git checkout -b tmp
git rebase -i $(git merge-base HEAD upstream/master)
# Runs a rebase starting from the commit where you branched off llvm's master. This should be clean, as it only reapplies the changes you have made since you last branched from master
# An editor should open and you can replace all of the `pick`s after the first commit with either `fixup` if you don't care about what message you had, `squash` if you just want to combine the commits but keep the message
# Or `reword` if you want to edit the message. You must have a `pick` at the top.
# In your case, I would just recommend to stick with either squashing or `fixup` on all of the commits except the first one

git rebase -i upstream/master
# This time rebase onto llvm-project master to make sure it will cleanly apply. Do the same here if necessary
# If you get any conflicts, you will need to resolve them. Go to the file reported as conflicted and look for the <<<<<<< and >>>>>>> lines.
# The <<<<<<< to ======= lines are the lines currently in llvm master. The ======= to >>>>>>> lines are the ones you have modified in your commit.
# Remove the <<<<<<<, =======, and >>>>>>> marker lines, keeping whichever content you want:
# remove from ======= to >>>>>>> to discard your changes, from <<<<<<< to ======= to keep yours, or just the three marker lines to keep both.

# Run the following if you had conflicts
git add <conflicting file(s)>
git rebase --continue

# push to your fork
git push -u origin tmp # add -f if you already had a tmp branch in your online fork

# Later on, if you want to delete the branch
git checkout master
git branch -D tmp
git push --delete origin tmp
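
The squashing step above can also be tried without an editor. The following is only a sketch in a throwaway repository: a local `master` stands in for `upstream/master`, the commit messages are made up, and `GIT_SEQUENCE_EDITOR` is pointed at `sed` to make the same todo-list edit one would make by hand (keep the first `pick`, turn the rest into `fixup`).

```shell
set -eu
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

# Throwaway repository; local master stands in for upstream/master.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
echo base > file.txt
git add file.txt
git commit -q -m "base"

# Three small work-in-progress commits on a scratch branch.
git checkout -q -b tmp
for n in 1 2 3; do
    echo "change $n" >> file.txt
    git commit -q -am "wip $n"
done

# Keep the first `pick`, turn every later `pick` into `fixup`.
# The \$ keeps the shell from expanding $s before sed sees it.
GIT_SEQUENCE_EDITOR="sed -i.bak -e '2,\$s/^pick/fixup/'" \
    git rebase -q -i "$(git merge-base HEAD master)"

commits_on_tmp=$(git rev-list --count master..tmp)
subject=$(git log -1 --format=%s)
lines=$(wc -l < file.txt | tr -d ' ')
echo "$commits_on_tmp commit(s) on tmp: $subject"
```

With `fixup` the message of the first commit is kept and the others are dropped, so a `git commit --amend` afterwards is still needed to write the final message.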

For others

# If you need to generate a patch file that can be accepted by `git am`. You can also apply -U9999 if you really want to.
git format-patch upstream/master

# If you want a simple diff file without commit information etc. Unless you
git diff upstream/master > changes.diff

# Both formats of .diff and .patch are accepted by `git apply` or `patch -p1 ...` and are accepted by differential.
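
To make the difference between the two formats concrete, here is a small sketch in a throwaway repository. A local `master` stands in for `upstream/master`, and the branch and file names are made up for the example.

```shell
set -eu
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

# Throwaway repository; local master stands in for upstream/master.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
echo base > file.txt
git add file.txt
git commit -q -m "base"

git checkout -q -b patchbranch
echo one >> file.txt
git commit -q -am "first change"
echo two >> file.txt
git commit -q -am "second change"

# format-patch: one mail-style .patch file per commit, with author and message.
out=$(mktemp -d)
git format-patch -q -o "$out" master
patch_files=$(ls "$out" | wc -l | tr -d ' ')

# diff: one plain diff of the whole branch, no commit metadata.
git diff master > "$out/changes.diff"

# Check that the plain diff would apply cleanly on top of master.
git checkout -q master
if git apply --check "$out/changes.diff"; then
    applies=yes
else
    applies=no
fi
echo "$patch_files patch files, plain diff applies: $applies"
```

So a branch with several commits yields several `.patch` files but a single `.diff`, which is one reason to squash first if a single patch file is wanted.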

After review

arc patch D#####
git branch # Confirm you are on arcpatch-D#####
git push upstream HEAD:master # or wherever.

Note: I’m using the github markdown format of ``` to represent a code block and ` to represent a code or single line to be considered separate, so mentally strip those when reading them

This recipe is not correct in general: taking the delta from master does not guarantee that the commit contains exactly what you want. You seem to assume that master didn’t evolve after “patchbranch” was created.

Hi Mehdi,

I’m doing it this way to make sure that master /actually/ contains “exactly what I want” (!). Of course the remote master could have evolved slightly during such steps, but that’s a cat-and-mouse game, that I don’t think that can be easily avoided (at least to my knowledge): We use Phabricator to get patches reviewed and this implies that new commits should be based on what we post there. If master has evolved significantly since the review, then a patch update should be sent to Phabricator anyway for further review. Also tests must be run again while master continues evolving. Ultimately, the commit that gets sent to Github must be strictly based on what we got reviewed on Phabricator. My procedure only attempts to enforce that the ‘delta’ commit that I push is predictable, and exactly the one that I previously posted on Phabricator.

The procedure described in the docs https://llvm.org/docs/Phabricator.html#committing-a-change suggests using ‘arcanist’ to create a ‘differential’ branch based on master and the Phabricator revision number. The ‘arc patch’ command is used for that. To my understanding, the problem is the same: creating that branch may take a few seconds, and while doing so, master can evolve too. The docs even suggest re-running the tests before pushing, which gives even more time for master to have changed.

(That said, maybe I’m not fully getting ‘git’, because I have only used it in the past with small teams and no need for cross-reviews, so everything was pretty straightforward: pulling what others did, merging working branches with master, resolving any conflicts, and then pushing)

Since you suggest that my “recipe” is not correct, I would appreciate it if you could elaborate on the correct one, which is why I opened this subject to begin with.

Thank you very much!

John.

recipe is not correct in the absolute: the delta from master does not mean it contains exactly what you want, you seem to assume that master didn’t evolve between the time “patchbranch” was created.

Hi Mehdi,

I’m doing it this way to make sure that master /actually/ contains “exactly what I want” (!). Of course the remote master could have evolved slightly during such steps

I meant: your list of instructions assumes that master didn’t move locally since you branched “patchbranch”.
If you’re juggling multiple patch branches at the same time, you need to make sure they are all rebased correctly on the current local master.
Your workflow may work very well for you, but I was pointing this out because someone trying to adapt it can mess up their commit fairly easily if they just follow this recipe.
Interactive rebase is much safer from this point of view.

, but that’s a cat-and-mouse game, that I don’t think that can be easily avoided (at least to my knowledge): We use Phabricator to get patches reviewed and this implies that new commits should be based on what we post there. If master has evolved significantly since the review, then a patch update should be sent to Phabricator anyway for further review. Also tests must be run again while master continues evolving. Ultimately, the commit that gets sent to Github must be strictly based on what we got reviewed on Phabricator. My procedure only attempts to enforce that the ‘delta’ commit that I push is predictable, and exactly the one that I previously posted on Phabricator.

The procedure described in the docs https://llvm.org/docs/Phabricator.html#committing-a-change suggests using ‘arcanist’ to create a ‘differential’ branch based on master and the Phabricator revision number. The ‘arc patch’ command is used for that. To my understanding, the problem is the same: creating that branch may take a few seconds, and while doing so, master can evolve too. The docs even suggest re-running the tests before pushing, which gives even more time for master to have changed.

No: the arcanist command does not suffer from the problem I was raising.
The issue I was referring to is that your reset command will lead to undoing changes from master (unrelated to your branch) when you commit in the end (all the changes that are in master but not in “patchbranch”).
(just try to add git checkout master && git pull && git checkout tmp before your git reset , and then look at the resulting commit).
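
The effect described here can be reproduced in a throwaway repository. This is only a sketch: the file names and commit messages are made up, and a second local commit on master stands in for what `git pull` would bring in. The patch branch is deliberately not merged or rebased before the reset.

```shell
set -eu
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
echo base > base.txt
git add base.txt
git commit -q -m "base"

# The patch lives on its own branch.
git checkout -q -b patchbranch
echo feature > feature.txt
git add feature.txt
git commit -q -m "my patch"

# Meanwhile master moves on (stand-in for `git pull` bringing in
# someone else's commit). patchbranch is NOT updated.
git checkout -q master
echo other > other.txt
git add other.txt
git commit -q -m "unrelated upstream change"

# The recipe: branch tmp from patchbranch, soft-reset to master, commit.
git checkout -q -b tmp patchbranch
git reset -q --soft master
git commit -q -m "squashed patch"

# What the new commit changes relative to master.
changes=$(git diff --name-status master tmp)
echo "$changes"
```

The resulting commit adds `feature.txt` as intended, but it also deletes `other.txt`, because the soft reset keeps the old tree from `patchbranch` while moving the parent to the newer master. Merging master into the patch branch first avoids this, which is why that step matters.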

(That said, maybe I’m not fully getting ‘git’, because I have only used it in the past with small teams and no need for cross-reviews, so everything was pretty straightforward: pulling what others did, merging working branches with master, resolving any conflicts, and then pushing)

Since you suggest that my “recipe” is not correct, I would appreciate it if you could elaborate on the correct one, which is why I opened this subject to begin with.

Avoid git reset unless you are really sure that it is what you need at a given time: I would not advise git reset to any beginner for a “normal” workflow.
Christopher mentioned a more robust set of steps, for instance, and in general git rebase -i is the tool: you likely want to invest in learning it (you mentioned you had to address conflicts, but such a situation will require resolving conflicts regardless of the workflow, yours included).

Best,

Hi Mehdi,

No: the arcanist command does not suffer from the problem I was raising.
The issue I was referring to is that your reset command will lead to undoing changes from master (unrelated to your branch) when you commit in the end (all the changes that are in master but not in “patchbranch”).
(just try to add git checkout master && git pull && git checkout tmp before your git reset , and then look at the resulting commit).

But I would never do that! The commands "git checkout master && git pull" are only run before the whole procedure, never in the middle.

If you have a local patch branch for your work, and you guarantee that your local master branch never diverges on its own, then why do you need another branch in the first place? I may be missing some key part of your workflow.

And the patchbranch is always merged with master before starting the procedure.

If master didn’t move then why do you need to merge it?
(always merging master into patchbranch before git reset is addressing the problem I mentioned though, it just wasn’t part of your steps)

The Phabricator diff is created by comparing both branches after they have been merged, and the point of resetting ‘tmp’ to ‘master’ is to obtain a fresh commit containing exactly the same diff.

In any case, I would like to understand why the arcanist command does not suffer from an “evolving” master, which is the problem that you raised first. The “arc patch” command creates a new branch from the local master, so the remote repo can still have changed even before the command completes (!) or at any time after that. So what makes executing that command different? Or maybe I should ask: what is the arcanist command actually doing in terms of plain git commands?

The important part is that there is no git reset involved: after running arcanist you have a branch based on your local master, but you’ll still need to rebase it if the remote master has moved in the meantime before being able to push.
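
As a rough illustration of that last point, the sketch below uses a local bare repository as a stand-in for the shared remote and a second clone as "someone else". The branch name `arcpatch-demo`, the directory names, and the commit messages are all made up for the example; no Arcanist is involved.

```shell
set -eu
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
cd "$work"

# Seed a bare repository that plays the role of the shared remote.
git init -q seed
cd seed
git symbolic-ref HEAD refs/heads/master
echo base > base.txt
git add base.txt
git commit -q -m "base"
cd "$work"
git clone -q --bare seed upstream.git
git clone -q upstream.git mine
git clone -q upstream.git other

# Someone else lands a commit first.
cd "$work/other"
echo theirs > theirs.txt
git add theirs.txt
git commit -q -m "their change"
git push -q origin HEAD:master

# My reviewed patch sits on a branch cut from the old master.
cd "$work/mine"
git checkout -q -b arcpatch-demo
echo mine > mine.txt
git add mine.txt
git commit -q -m "my change"

# The first push is refused: it is not a fast-forward of remote master.
if git push -q origin arcpatch-demo:master 2>/dev/null; then
    first_push=accepted
else
    first_push=rejected
fi

# Pick up the new commits, replay the patch on top, and push again.
git fetch -q origin
git rebase -q origin/master
git push -q origin arcpatch-demo:master

remote_log=$(git --git-dir="$work/upstream.git" log --format=%s master)
echo "first push: $first_push"
echo "$remote_log"
```

Because the two changes touch different files, the rebase here is conflict-free; with overlapping edits the conflicts would have to be resolved before `git rebase --continue`.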

Hi Mehdi,

See my answer below

If you have a local patch branch for your work, and you guarantee that your local master branch never diverges on its own, then why do you need another branch in the first place? I may be missing some key part of your workflow.

Precisely because I do not want master to “diverge” due to my own changes! The remote branch keeps changing all the time, particularly while I am working on my patch. I have a neurological disability affecting my hands, which means that I am slow at making changes, so the problem you mentioned in the first place, about ‘master’ evolving, is a real one. I minimise it by not touching master. Only after I have finished my patch do I pull the most recent master and merge it into my patch branch, and if everything is ok (conflicts fixed and tests run again) I send the diff to Phabricator. After the patch is reviewed and committed, I can simply delete my local patchbranch (because it’s no longer necessary) and pull master with my changes already incorporated. At this point I can create the next patchbranch if I plan to submit another patch. Or I can maintain several patchbranches and work on the latest one while a previous one is waiting for review (and possibly requiring changes). All this without touching master. Maybe this demands more discipline than just “rebasing”, but I have more control and I avoid a ton of rebase conflicts.

I hope this makes sense.

John

I use ‘arc diff’ to upload a patch to Phabricator, which I think adds the Differential Revision url to the commit message.

Hi Hiroshi

I use ‘arc diff’ to upload a patch to Phabricator, which I think adds the Differential Revision url to the commit message.

This is what I actually use, as I posted several messages earlier.

John

Sigh... okay now it's my turn to wonder what obvious mistake I made.

git push -v

Pushing to https://github.com/llvm/llvm-project.git
remote: Invalid username or password.
fatal: Authentication failed for 'https://github.com/llvm/llvm-project.git'

I was kind of expecting it to prompt me... here are some possibly
relevant git-config settings:

user.name=Paul Robinson
user.email=paul.robinson@sony.com
remote.origin.url=https://github.com/llvm/llvm-project.git

I haven't put my github account name in anywhere, am I supposed to?
I do have a key set on the account, but I might have messed up when
trying to put it in a right place on my Windows system.
FTR the website instructions still talk about using "git llvm push"
which IIUC is Not A Thing anymore.

Thanks,
--paulr

If you want to use your key to authenticate, you need to set your remote URL to the SSH one:

git remote set-url --push origin git@github.com:llvm/llvm-project.git

You can get the SSH URL by going to https://github.com/llvm/llvm-project and clicking on Clone or download to get the appropriate URL ... it should give you the option to Clone with SSH.

I've not had a ton of luck with HTTPS authentication, but SSH has worked pretty well. I don't know how true that holds for Windows though.
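
To check the result without pushing anything, the following sketch sets up a scratch repository with the same two URLs and prints what git will use for each direction. Only the scratch directory is an assumption; the URLs are the ones from this thread.

```shell
set -eu

# Scratch repository; nothing is fetched or pushed.
repo=$(mktemp -d)
cd "$repo"
git init -q
git remote add origin https://github.com/llvm/llvm-project.git

# Only the push URL changes; fetches keep using HTTPS.
git remote set-url --push origin git@github.com:llvm/llvm-project.git

fetch_url=$(git remote get-url origin)
push_url=$(git remote get-url --push origin)
echo "fetch: $fetch_url"
echo "push:  $push_url"
```

`git remote -v` shows the same information in one listing, with `(fetch)` and `(push)` next to each URL.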

Also, running with something like GIT_SSH_COMMAND="ssh -v" can be useful to diagnose potential SSH issues. Again though, not sure how well that works for Windows.

Thanks! I had to generate a new key pair--not sure what happened but
it probably got messed up when I moved to a new PC a few weeks ago.
But with that and the URL trick, the push worked this time.
--paulr