Delete Phabricator metadata tags before committing

Many git commits in the monorepo look like the following:

   [Tag0][Tag1] Title line
      Summary:
   Lorem ipsum dolor sit amet, consectetur adipiscing elit. Quisque mauris neque, porta nec tristique at, sagittis vel nisi. Fusce pharetra nunc et mauris consequat venenatis.
      Reviewers: username0, username1
      Reviewed By: username0
      Subscribers: username2, username3, llvm-commits

   Tags: #llvm
      Differential Revision: https://reviews.llvm.org/Dxxxxx

These Phabricator metadata lines (`Reviewers:`, `Reviewed By:`, etc) are created automatically when the author uploads the patch via `arc diff 'HEAD^'`.
The summary and metadata can be copied from Phabricator to the local commit via `arc amend`.

Among the metadata tags, `Differential Revision:` is the most useful one. It associates a commit with a Phabricator differential (Dxxxxx) and allows the differential to be closed automatically when the commit is pushed to llvm-project master.

The other tags (Subscribers:, Reviewers:, Reviewed By:, Tags:, and the `Summary:` line) are not useful. They just clutter up the commit message. Shall we mention on LLVM Developer Policy — LLVM 16.0.0git documentation that developers are recommended to delete metadata tags before committing? It'd be nice if some pre-receive hooks can be set up to enforce deleting some really unnecessary metadata tags.

I don’t think they’re sufficiently problematic to worry about adding more steps people need to be aware of/follow to submit.

Maybe more advisory “you can remove the unneeded tags” might not hurt (but still adds to the length of new contributor documentation, etc, which isn’t free) - though “Reviewed By” is useful (in addition to “Differential Revision”) if it accurately reflects who reviewed/signed off on the change (makes it easy when I’m reading the mailing list to see who’s already looked a patch over, etc).

Many git commits in the monorepo look like the following:

[Tag0][Tag1] Title line

Summary:
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Quisque mauris neque, porta nec tristique at, sagittis vel nisi. Fusce pharetra nunc et mauris consequat venenatis.

Reviewers: username0, username1

Reviewed By: username0

Subscribers: username2, username3, llvm-commits

Tags: LLVM Project

Differential Revision: https://reviews.llvm.org/Dxxxxx

These Phabricator metadata lines (Reviewers:, Reviewed By:, etc) are created automatically when the author uploads the patch via arc diff 'HEAD^'.
The summary and metadata can be copied from Phabricator to the local commit via arc amend.

Among the metadata tags, Differential Revision: is the most useful one. It associates a commit with a Phabricator differential (Dxxxxx) and allows the differential to be closed automatically when the commit is pushed to llvm-project master.

The other tags (Subscribers:, Reviewers:, Reviewed By:, Tags:, and the Summary: line) are not useful. They just clutter up the commit message. Shall we mention on https://llvm.org/docs/DeveloperPolicy.html that developers are recommended to delete metadata tags before committing?

That’s be great!
I suspect we can provide a bash function to add to one’s .bashrc to make it trivial:

function arcfilter() { git log -1 --pretty=%B | awk ‘/Reviewers: /{p=1; sub(/Reviewers: .*Differential Revision: /, “”)}; /Differential Revision: /{p=0;}; !p’ | git commit --amend -F - ; }

Just running arcfilter before pushing will filter the commit it out automatically!

It’d be nice if some pre-receive hooks can be set up to enforce deleting some really unnecessary metadata tags.

Unfortunately you can’t add pre-receive hook on GitHub as far as I know.

Maybe there’s a way to make an on-save filter that correctly detects that the file being saved is a commit message (bonus if it can detect that it’s a commit message for the llvm repo)?

I also find the “Reviewed by” tag useful (as well as the review link), for the same reasons. In fact, I don’t even use arcanist to push commits, so I do it all by hand, and only include the “Reviewed by” and “Differential Revision” tags.

+1 to keeping “Reviewed by”. Whenever I’m fixing a bug in unfamiliar code, I look at both author and “reviewed by” from the history of that code to help pick reviewers for my change.

-Joseph

I kept `Reviewed By:` before and hestitated whether I should continue
this practice. Nice to see more rationale:)

I am now using a variant of Mehdi's shell function to keep Reviewed By: and Differential Revision:

function arcfilter() { git log -1 --pretty=%B | awk '/Reviewers:|Subscribers:/{p=1} /Reviewed By:|Differential Revision:/{p=0} !p && !/^Summary:/' | git commit --amend -F -; }

The limitation with the “Reviewed by” line is that if you just use arc it indicates who was added as reviewer on the revision, but not who approved it. Because of this, I am wary of relying on this line for anything.

If you want to know who reviewed a change, better click on the Differential Revision link and go to the source of truth.

Yeah, I tend to prune it down myself - and if the list has only one name on it, it’s usually a pretty good indication they actually reviewed it. If it has many names, good chance it was just everyone someone put on the list and then I go check it manually.

I’m sure I’ve seen many commits with both “Reviewed by:” and “Reviewers:” tags, which look to have been done with arc (though I can’t be sure). How were those generated?

My own commits come through that way, with everyone I nominated listed in “reviewers” and then just those that replied in “reviewed by”. I generate the messages using “arc amend”.

If somebody wants to make sure they don’t push commits with extra tags, it’s possible to set up a local pre-push hook that would complain.

In /path/to/llvm-project/.git/hooks/pre-push
paste the following

#!/bin/sh

Disallow pushing commits that contain in their description the Phabricator

tags that are unwelcome in the upstream LLVM repository.

Can we fix this in reviews.llvm.org’s fork of phab?

https://github.com/phacility/phabricator/blob/cac3dc4983c3671ba4ec841aac8efac10744a80c/src/applications/differential/conduit/DifferentialGetCommitMessageConduitAPIMethod.php

Seems straightforward(-ish) to drop the relevant fields there, that way arc land automatically DTRT.

Jon

I was always assuming that the suggested commit is assembled in the
PHP code run by arcanist command run locally. If indeed the arc
command requests the commit message from the server, we could do some
additional things:

* Remove "Summary:" in front of message
* Line break after 72 columns
* Convert summary markdown formatting to plain texts (e.g. remove
backticks; I don't know any git client that renders as markdown)
* Add/check existence of [component] tag

Alternatively, we could make an upstream feature request

Michael

I was always assuming that the suggested commit is assembled in the
PHP code run by arcanist command run locally. If indeed the arc
command requests the commit message from the server,

I assumed so too until I went digging for it. Seems the client-side stuff only deals with the structured data, and calls out to the server to construct the raw text when needed:

https://github.com/phacility/arcanist/blob/33b9728b5f65fd747b7a15c0e25436c63e82f592/src/workflow/ArcanistDiffWorkflow.php#L757

we could do some
additional things:

  • Remove “Summary:” in front of message
  • Line break after 72 columns
  • Convert summary markdown formatting to plain texts (e.g. remove
    backticks; I don’t know any git client that renders as markdown)
  • Add/check existence of [component] tag

Alternatively, we could make an upstream feature request

From what I can tell, upstream doesn’t seem /that/ interested in it being any more than a one-off thing:

https://secure.phabricator.com/Q268

https://secure.phabricator.com/T12276

https://secure.phabricator.com/T11864

Jon

I was always assuming that the suggested commit is assembled in the
PHP code run by arcanist command run locally. If indeed the arc
command requests the commit message from the server, we could do some
additional things:

  • Remove “Summary:” in front of message
  • Line break after 72 columns
  • Convert summary markdown formatting to plain texts (e.g. remove
    backticks; I don’t know any git client that renders as markdown)

-1 on removing backticks. The lowercase in the middle of a sentence can be confusing without the backticks.

>> I was always assuming that the suggested commit is assembled in the
>> PHP code run by arcanist command run locally. If indeed the arc
>> command requests the commit message from the server, we could do some
>> additional things:
>>
>> * Remove "Summary:" in front of message
>> * Line break after 72 columns
>> * Convert summary markdown formatting to plain texts (e.g. remove
>> backticks; I don't know any git client that renders as markdown)
>>
> -1 on removing backticks. The `lowercase` in the middle of a sentence can
> be confusing without the backticks.

TBH, I don't need markdown to be rendered to make sense to me. If I see
backticks I know that code or commands are meant here, which can
disambiguate common words from variables nicely. I also don't see how
`backticks`, *asterisks*, or __underlines__ hurt the text flow. Maybe it
is just me, since I'm used to read/wrie markdown in vim, but I find they
highlight words just as well in plain text.

I do think we could automate the process in general though. I regularly
forget to run my `arcfilter` command (adopted from Fangrui Song I think):

arcfilter () { arc amend; git log -1 --pretty=%B | awk '/Reviewers:|Subscribers:/{p=1} /Reviewed By:|Differential Revision:/{p=0} !p && !/^Summary:/' | git commit --amend -F - }

and for now it also breaks single line commit messages (not subjects).

Cheers,
Johannes

>> * Add/check existence of [component] tag
>>
>> Alternatively, we could make an upstream feature request
>>
>> Michael
>>
>>>
>>> Can we fix this in reviews.llvm.org's fork of phab?
>>>
>> https://github.com/phacility/phabricator/blob/cac3dc4983c3671ba4ec841aac8efac10744a80c/src/applications/differential/conduit/DifferentialGetCommitMessageConduitAPIMethod.php
>>>
>>> Seems straightforward(-ish) to drop the relevant fields there, that way
>> `arc land` automatically DTRT.
>>>
>>> Jon
>>>
>>>>
>>>>>
>>>>> Many git commits in the monorepo look like the following:
>>>>>
>>>>> [Tag0][Tag1] Title line
>>>>>
>>>>> Summary:
>>>>> Lorem ipsum dolor sit amet, consectetur adipiscing elit. Quisque
>> mauris neque, porta nec tristique at, sagittis vel nisi. Fusce pharetra
>> nunc et mauris consequat venenatis.
>>>>>
>>>>> Reviewers: username0, username1
>>>>>
>>>>> Reviewed By: username0
>>>>>
>>>>> Subscribers: username2, username3, llvm-commits
>>>>>
>>>>> Tags: LLVM Project
>>>>>
>>>>> Differential Revision: Login
>>>>>
>>>>> These Phabricator metadata lines (`Reviewers:`, `Reviewed By:`, etc)
>> are created automatically when the author uploads the patch via `arc diff
>> 'HEAD^'`.
>>>>> The summary and metadata can be copied from Phabricator to the local
>> commit via `arc amend`.
>>>>>
>>>>> Among the metadata tags, `Differential Revision:` is the most useful
>> one. It associates a commit with a Phabricator differential (Dxxxxx) and
>> allows the differential to be closed automatically when the commit is
>> pushed to llvm-project master.
>>>>>
>>>>> The other tags (Subscribers:, Reviewers:, Reviewed By:, Tags:, and the
>> `Summary:` line) are not useful. They just clutter up the commit message.
>> Shall we mention on LLVM Developer Policy — LLVM 18.0.0git documentation that
>> developers are recommended to delete metadata tags before committing?
>>>>
>>>> That's be great!
>>>> I suspect we can provide a bash function to add to one's .bashrc to
>> make it trivial:
>>>>
>>>> function arcfilter() { git log -1 --pretty=%B | awk '/Reviewers:
>> /{p=1; sub(/Reviewers: .*Differential Revision: /, "")}; /Differential
>> Revision: /{p=0;}; !p' | git commit --amend -F - ; }
>>>>
>>>> Just running `arcfilter` before pushing will filter the commit it out
>> automatically!
>>>>
>>>>>
>>>>> It'd be nice if some pre-receive hooks can be set up to enforce
>> deleting some really unnecessary metadata tags.
>>>>
>>>> Unfortunately you can't add pre-receive hook on GitHub as far as I know.

FYI: I patched our Phab instance and now arc land will omit “Subscribers”, “Reviewers”, and “Tags”.
It’ll also print the description without prefixing with “Summary:”.

I was always assuming that the suggested commit is assembled in the
PHP code run by arcanist command run locally. If indeed the arc
command requests the commit message from the server,

I assumed so too until I went digging for it. Seems the client-side stuff only deals with the structured data, and calls out to the server to construct the raw text when needed:

https://github.com/phacility/arcanist/blob/33b9728b5f65fd747b7a15c0e25436c63e82f592/src/workflow/ArcanistDiffWorkflow.php#L757

FYI: I patched our Phab instance and now `arc land` will omit "Subscribers", "Reviewers", and "Tags".
It'll also print the description without prefixing with "Summary:".

--
Mehdi

Awesome!

I have just noticed this happening. Thank you, thank you, thank you!

Jay.