How to deal with multiple patches to the same file

I have submitted a patch to Phabricator that includes the TableGen file TGparser.cpp. Now I want to fix a bug in that file. What is the proper procedure so that the two patches don't get screwed up, either in my repository or in the master repository? Please answer as if I'm a git/Phabricator idiot, because, well, I am.

I should note that all I did in my repository for the first patch was stage the files and then do a diff --staged. Those files remain staged because I'm not sure what to do with them given that we use Phabricator and not pull requests.

From: llvm-dev <llvm-dev-bounces@lists.llvm.org> On Behalf Of Paul C.
Anagnostopoulos via llvm-dev
Sent: Monday, August 10, 2020 9:30 AM
To: llvm-dev@lists.llvm.org
Subject: [llvm-dev] How to deal with multiple patches to the same file

I have submitted a patch to Phabricator that includes the TableGen file
TGparser.cpp. Now I want to fix a bug in that file. What is the proper
procedure so that the two patches don't get screwed up, either in my
repository or in the master repository? Please answer as if I'm a
git/Phabricator idiot, because, well, I am.

I should note that all I did in my repository for the first patch was
stage the files and then do a diff --staged. Those files remain staged
because I'm not sure what to do with them given that we use Phabricator
and not pull requests.

(I'm not completely giving you recipes; if you need more explicit steps
then let me know and I'll do a full-on example.)

Even though we don't use pull requests, I still tend to commit my patches
to branches off of master, in my local clone. This allows `git show` to
generate the diff. As master advances, you update your master branch and
then on the patch branch, a simple `git rebase master` updates it.

Assuming the patch is approved, on master you can `git merge --ff-only`
to move the patch to your local master, and then push it from there.

If the bugfix is not near your other changes, and can be made independently,
then you can treat it as a fully independent patch on its own branch.

If the bugfix is near your other changes or is dependent on it, then I'd
create a second patch branch based on the first patch branch. As before,
`git show` generates the diff, and in Phabricator there's a way to mark
the second patch as dependent on the first patch. I haven't done this
personally so I'm unclear about the exact steps, but I see people doing
"patch series" all the time.

HTH and again let me know if you need a more explicit example.
--paulr

I think I understand the concepts, but I sure would appreciate a specific example, and I appreciate the offer. To make your life harder, could you start with what I should do given that I have not created a branch for the first patch? I just have the six files staged.

I have GitHub Desktop installed, if that makes any of the steps easier.

Thanks again, and no rush!

Hi Paul,
Here we go with a full example. I'm using git from the command line,
because that's my comfort zone; you should be able to do equivalent things
from a GUI.

Let's start with a test tweak one of my colleagues made recently.
We'll pretend I'm still experimenting with it.

I owe you a gala dinner at your favorite restaurant. Really.

A few questions:

Why did you 'git pull --rebase' when the branch was up-to-date? Is this just a safety habit?

I don't understand the pushing upstream. Since we use Phabricator, isn't that the job of the person who commits the patch?

Does git keep all my branches up-to-date with the origin? Say I come in tomorrow and start working on a branch. Should I make sure I do a checkout on it again?

Is it sufficient to do a 'git show' and submit with Arcanist, or is there an intermediate step to format the patch file?

From: Paul C. Anagnostopoulos <paul@windfall.com>
Sent: Monday, August 10, 2020 2:13 PM
To: Robinson, Paul <paul.robinson@sony.com>
Cc: 'llvm-dev@lists.llvm.org' <llvm-dev@lists.llvm.org>
Subject: RE: [llvm-dev] How to deal with multiple patches to the same file

I owe you a gala dinner at your favorite restaurant. Really.

A few questions:

Why did you 'git pull --rebase' when the branch was up-to-date? Is this
just a safety habit?

Yes. Frequently I put my patches on my local master branch, rather
than create a separate patch branch, and always rebasing keeps my
commits at the HEAD of the branch. It's harmless when you have no
local commits, so it's a good habit to form.

I don't understand the pushing upstream. Since we use Phabricator, isn't
that the job of the person who commits the patch?

Soon enough (if you keep contributing) you'll get commit access and
be doing that yourself. LLVM is a lot more liberal about commit
privileges than many open-source projects.

Does git keep all my branches up-to-date with the origin? Say I come in
tomorrow and start working on a branch. Should I make sure I do a checkout
on it again?

Git is designed to run "disconnected" and does nothing automatically
to keep your clone up-to-date with its origin. You need to decide for
yourself often to update your local master branch from origin. I
have a nightly cron job on Linux, and the Windows equivalent. Some
people do it weekly, some don't automate it and just pull manually
whenever it's convenient. Partly this depends on how long your build
takes and whether you leave your computers on overnight.

Any branch you create locally will stay until you explicitly delete it.
This project does not want you pushing your branches to origin; that
will change if/when we move to pull requests, but for now, keep them
local.

Is it sufficient to do a 'git show' and submit with Arcanist, or is there
an intermediate step to format the patch file?

I've never used Arcanist so I don't really know, but I expect it's fine.
'git show' works for uploading directly to Phab, as would 'git diff' and
'git format-patch'. There might be an 'arc diff' I don't know.

Note that 'git show' by default only shows you the HEAD commit; if you
add commits to your patch branch, you'll need to specify the range of
commits to diff/show/whatever. Phab doesn't keep track, the revisions
aren't cumulative.

I should say that we want "full context" diffs in Phab, meaning for any
of these commands you should use `-U999999` so people can see all the
context directly in the GUI.

--paulr

Do you work on multiple patches at once this way? If so, how do you tease the files apart when it's time to submit one of the patches?

Just chiming in to mention that there are two ways I know of in Phabricator to say that two patches are dependent on one another.

  1. In the patch description, put “Depends on DXXXXXX” (where XXXXXX is the patch number the new patch depends on).
  2. In the Phabricator UI, there’s an “Edit Related Objects” option on the right of the description, which you can use to enter other revisions that either rely on the patch, or that the patch relies on.

Note that this doesn’t affect the diff at all, so you usually will want a diff of the commit versus the previous commit in your chain, rather than against the head of master, when doing that, to avoid things showing up that were changed earlier in the series.

You can view the patch series/tree by the “Stack” option in the UI, which appears when a patch either has dependent revisions, or it depends on them. The text for this option will appear in red with a number if there are any patches that need to land before that one does, according to the hierarchy - it’s very easy to mix up the order, so it’s good to check out the tab to make sure things look right.

>> Why did you 'git pull --rebase' when the branch was up-to-date? Is this
>> just a safety habit?
>
>Yes. Frequently I put my patches on my local master branch, rather
>than create a separate patch branch, and always rebasing keeps my
>commits at the HEAD of the branch. It's harmless when you have no
>local commits, so it's a good habit to form.

Do you work on multiple patches at once this way? If so, how do you tease
the files apart when it's time to submit one of the patches?

I don't intentionally work on multiple patches at once this way. :0)
But for something simple, I'll work directly on my copy of master,
and then `git pull --rebase` is crucial. Given that --rebase is a
no-op if you have no private commits, it's a worthwhile habit.

I did once find myself working on multiple patches at once this way,
unintentionally. I created separate branches from a point before my
local commits, cherry-picked my commits to the separate branches, and
did `git reset --hard` on master to unwind those commits. Going
forward from there it was mainly a matter of keeping my build trees
organized, one per branch.

>> I don't understand the pushing upstream. Since we use Phabricator,
isn't
>> that the job of the person who commits the patch?
>
>Soon enough (if you keep contributing) you'll get commit access and
>be doing that yourself. LLVM is a lot more liberal about commit
>privileges than many open-source projects.

Ah, I was confused. I thought the final commit of a patch was done from
the patch file in Phabricator, not from the submitting person's
repository.

If I'm committing someone else's patch for them, I retrieve the patch
file from Phab and apply it to my repo, then commit from there. If it's
my own patch I just push from my repo.
Phab doesn't have a way to push; I believe Arcanist does, but again it's
working off of your local repo not what has been posted to Phab.
--paulr

I have a patch being reviewed in Phabricator. A couple of suggestions were made, so I would like to submit a revised patch. I think this patch will take awhile to be accepted, or perhaps never will be.

Meanwhile, I would like to fix an unrelated bug in one of the same C++ files. Here is my plan. Could you tell me if this makes sense?

I will make a branch off master for the bug fix, make the fix, and submit a patch to Phabricator. I will mark that patch as needing to come before the one already submitted.

I will make the fixes to the original patch. I will include the bug fix, since the second bug-fix patch will be committed before the original patch. I will use --amend to merge the suggested changes into the original patch and then submit a revised patch.

Is that right?

From: Paul C. Anagnostopoulos <paul@windfall.com>
Sent: Wednesday, August 12, 2020 10:22 AM
To: Robinson, Paul <paul.robinson@sony.com>
Cc: 'llvm-dev@lists.llvm.org' <llvm-dev@lists.llvm.org>
Subject: RE: [llvm-dev] How to deal with multiple patches to the same file

I have a patch being reviewed in Phabricator. A couple of suggestions were
made, so I would like to submit a revised patch. I think this patch will
take awhile to be accepted, or perhaps never will be.

Meanwhile, I would like to fix an unrelated bug in one of the same C++
files. Here is my plan. Could you tell me if this makes sense?

I will make a branch off master for the bug fix, make the fix, and submit
a patch to Phabricator. I will mark that patch as needing to come before
the one already submitted.

If they are unrelated, you should not need to mark one as dependent on
the other.

I will make the fixes to the original patch. I will include the bug fix,
since the second bug-fix patch will be committed before the original
patch. I will use --amend to merge the suggested changes into the original
patch and then submit a revised patch.

Is that right?

Not really best practice for unrelated fixes, no.

If they are really unrelated, you can develop, post, and push (or get
someone to push for you) those two changes, completely independently,
as if the other patch did not exist. On separate branches, of course.
Whichever one gets into upstream master first, you would then pull
into your local master, and rebase the other patch branch. Poof, the
second branch has both changes.

In effect, ask yourself: How would this work if someone else made the
bugfix? You wouldn't have to incorporate that fix into your first
branch, even if they posted it to Phabricator. If it appears upstream
at some point, then you'd automatically incorporate it into your
first branch the next time you rebased. Just because you are making
two independent changes on independent branches in your local clone
doesn't mean those two changes need to "know" about each other.

Asking questions is all well and good, but it sounds like you could
benefit from a real tutorial on using git. I read a book (Pro Git,
by Scott Chacon), because that's how I learn best, but git-scm.com
has literally the same material online. I'm sure there are plenty
of other tutorials out there. A basic understanding of the project's
SCM system will definitely help your work go more smoothly.
--paulr

I didn't think my plan sounded right. I understand what you are saying, but once the first patch is accepted, I have to remember to rebase the other branch and resubmit the patch. No? Otherwise the line numbers in the second patch will no longer match the source file. Ah, I guess if I were committing my own patch, rebasing and recreating the patch file would be part of my usual flow. But what about when someone else commits it for me?

You are certainly correct about reading a book, which is why yesterday I started REreading Brent Laster's "Professional Git." And I appreciate you putting up with all my questions. This will be the third time I've read a Git book; perhaps this is the time it will "take."

Optional. Phab works off the diff file you give it, not by comparing to
the source in git. (This is why we ask for the full-context diff, so
reviewers can look anywhere in the file you were working from.) If the
patches were dependent, you'd really want to rebase, but for independent
patches it doesn't matter. (Well, if there are a *lot* of changes, then
people might ask for a rebase just to make sure nothing was affected.
And if you're going to make changes in response to comments, my feeling
is you might as well rebase at the same time.)
--paulr