!!! 3.2 Release branch patching and the Code Owners

Hello,

Recent code owner activities have led to
what I would call loss of referential integrity
in the CODE_OWNERS.TXT file.

Changes are fine but the information in the
CODE_OWNERS.TXT does not allow to positively
identify code owner of the particular
file or patch.

The problem stems from the usage of the
"description (D)" field which is overloaded
with meaning. Most people put only a textual
description of the code they own.
This approach is fine for casual reader but
does not work for scripting or any automated
way of dealing with the build.

I would like to propose addition of the
"folder/file (F)" field. The format
would be the same as used by Joe,Owen
and Justin

F: (path/relative/to/llvm/src/root/*)

F: (path/relative/to/llvm/src/root/file.name)

Examples:

F: (lib/Bitcode/* include/llvm/Bitcode/*)

F: (lib/CodeGen/SelectionDAG/*)

F: (lib/Target/NVPTX/*)

Situation is particularly bad for the 3.2 branch as
"old" owner is not necessarily the same as new one.
So for the time being I have adopted the policy
of using code owner from the trunk as the one
responsible for approving the patches.

This worked for a while but there are more and
more patches requested with no clear way of
identifying the owner. Situation has got to
the point where AI (lame as it is) embedded
in the 3.2 integration bots simply says:

DOES NOT COMPUTE!

Therefor I have no choice but to suspend
accepting all of the 3.2 patches until the
situation gets resolved.

Pawel

This approach is fine for casual reader but
does not work for scripting or any automated
way of dealing with the build.

Will you please clarify how the automation / scripting helps with the
patch approval process?

I would like to propose addition of the
"folder/file (F)" field. The format
would be the same as used by Joe,Owen
and Justin

This won't cover all the cases, since code in question might be
scattered across the dirs, or single dir can contain several
maintainers depending on the subject.

Therefor I have no choice but to suspend
accepting all of the 3.2 patches until the
situation gets resolved.

I'd expect from release manager to understand the llvm/clang internal
organization and deduce the correct owner in most cases.

Hi Pawel,

That's a bit draconic. The code owners file is a text file and not important to the execution of LLVM. Therefore, it's fine to omit patches for it if there are conflicts.

-bw

That is "omit patches for the code owners file", not all patches. :slight_smile:

-bw

This approach is fine for casual reader but
does not work for scripting or any automated
way of dealing with the build.

Will you please clarify how the automation / scripting helps with the
patch approval process?

Generally release patch process works like this:

- patch gets checked-in on the trunk

- developer sends message to the code owner who
approves the patch and sends the *APPROVED* to the
release manager

- somebody (not specified who in the current flow)
merges the patch on to the release_branch

- release manager creates rc1/rc2/and final branches
from the release_branch verifying that each patch
has been approved by the code owner.

This process looks good on the screen but breaks down
in practice because:

- patches get checked-in onto the release_branch (rare)

- patches get sent to the release manager bypassing code owner

I think the root cause for this is simply the problem in
identifying the code owner by the developer.

We can solve this problem by providing code owner tool that can
map patch to the code owner or owners who should be
notified for approval.

I would like to propose addition of the
"folder/file (F)" field. The format
would be the same as used by Joe,Owen
and Justin

This won't cover all the cases, since code in question might be
scattered across the dirs, or single dir can contain several
maintainers depending on the subject.

In such case it would require the code owner to add multiple
F: fields for all of the dirs and files.

To simplify selecting multiple files we would allow
wildcard matching (globing patterns) at the end of the path

F: (lib/Bitcode/* include/llvm/Bitcode/*)

F: (include/llvm/*.h)

Since code ownership might overlap files and dirs we
should allow F: field to express that.

Code owner tool could easily walk-up the hierarchy
until it reduces the number of owners to 1 or 2.

Therefor I have no choice but to suspend
accepting all of the 3.2 patches until the
situation gets resolved.

I'd expect from release manager to understand the llvm/clang internal
organization and deduce the correct owner in most cases.

Understanding the internal llvm/clang structure is easy,
deducing the correct code owner is not due to the
vague and changing nature of the CODE_OWNERS.TXT

"Exception handling, Windows codegen, ARM EABI"

--
With best regards, Anton Korobeynikov
Faculty of Mathematics and Mechanics, Saint Petersburg State University

Pawel

For this release, please just use the CODE_OWNERS.TXT file from 3.1.

-bw

From: "Bill Wendling" <wendling@apple.com>
To: "Pawel Wodnicki" <root@32bitmicro.com>
Cc: "llvmdev" <llvmdev@cs.uiuc.edu>, "clang-dev Developers" <cfe-dev@cs.uiuc.edu>
Sent: Friday, November 16, 2012 6:19:30 PM
Subject: Re: [LLVMdev] [cfe-dev] !!! 3.2 Release branch patching and the Code Owners

>
>>> This approach is fine for casual reader but
>>> does not work for scripting or any automated
>>> way of dealing with the build.
>> Will you please clarify how the automation / scripting helps with
>> the
>> patch approval process?
>
>
> Generally release patch process works like this:
>
> - patch gets checked-in on the trunk
>
> - developer sends message to the code owner who
> approves the patch and sends the *APPROVED* to the
> release manager
>
> - somebody (not specified who in the current flow)
> merges the patch on to the release_branch
>
> - release manager creates rc1/rc2/and final branches
> from the release_branch verifying that each patch
> has been approved by the code owner.
>
> This process looks good on the screen but breaks down
> in practice because:
>
> - patches get checked-in onto the release_branch (rare)
>
> - patches get sent to the release manager bypassing code owner
>
> I think the root cause for this is simply the problem in
> identifying the code owner by the developer.
>
> We can solve this problem by providing code owner tool that can
> map patch to the code owner or owners who should be
> notified for approval.
>
>
>
>>
>>> I would like to propose addition of the
>>> "folder/file (F)" field. The format
>>> would be the same as used by Joe,Owen
>>> and Justin
>> This won't cover all the cases, since code in question might be
>> scattered across the dirs, or single dir can contain several
>> maintainers depending on the subject.
>
> In such case it would require the code owner to add multiple
> F: fields for all of the dirs and files.
>
> To simplify selecting multiple files we would allow
> wildcard matching (globing patterns) at the end of the path
>
> F: (lib/Bitcode/* include/llvm/Bitcode/*)
>
> F: (include/llvm/*.h)
>
> Since code ownership might overlap files and dirs we
> should allow F: field to express that.
>
> Code owner tool could easily walk-up the hierarchy
> until it reduces the number of owners to 1 or 2.

For this release, please just use the CODE_OWNERS.TXT file from 3.1.

I don't think this helps the situation. Part of the point of declaring new code owners was to help with the 3.2 release process. If these designations are ambiguous, then we should make them more specific, or, make it the responsibility of the requester to identify the appropriate code owner(s) in the request.

-Hal

Understanding the internal llvm/clang structure is easy,
deducing the correct code owner is not due to the
vague and changing nature of the CODE_OWNERS.TXT

Does not seem to me and many people around. If in doubt - ask at ML or IRC.

"Exception handling, Windows codegen, ARM EABI"

Just for your information - this covers some lines in some files in
llvm/CodeGen, some files in lib/Target, some files in lib/Target/X86
and so on. Note that subjects are split across the files, we do not
have one-to-one or one-to-many mapping between subject and file. If
we'd follow your approach we'd need either specify too wide wildcards
(e.g. lib/CodeGen/*) or specify line ranges in the files, which does
not make any sense.

Understanding the internal llvm/clang structure is easy,
deducing the correct code owner is not due to the
vague and changing nature of the CODE_OWNERS.TXT

Does not seem to me and many people around. If in doubt - ask at ML or IRC.

I have been through somewhat similar situations before
and the only solution that works is to have a quick and
unambiguous way of determining the required information.
Indirection through ML or IRC does not work particularly
well for a geographically spread team like llvm.
It might work locally but we are working in so many
different timezones that lag introduced by the indirection
is just another headache.

"Exception handling, Windows codegen, ARM EABI"

Just for your information - this covers some lines in some files in
llvm/CodeGen, some files in lib/Target, some files in lib/Target/X86
and so on. Note that subjects are split across the files, we do not
have one-to-one or one-to-many mapping between subject and file. If
we'd follow your approach we'd need either specify too wide wildcards
(e.g. lib/CodeGen/*) or specify line ranges in the files, which does
not make any sense.

I have checked-in this llvm/utils/wciia.py utility
(aka Whose Code Is It Anyway for lack of a better name).

It works of the CODE_OWNERS.TXT file as it tries to
determine the owner of the particular path or file.
Even this first rough version shows that it can be useful
without burdening the code owner with too much extra work.
It is not perfect and I have listed limitations in the
source code but it deals with overlapping ownership,
by listing all of the owners who have laid claim to the
particular path.

Anyway, it can be improved if needed, patches are welcome!

Here are couple of examples:

$ utils/wciia.py .
The owner(s) of the (.) is(are) : ['Chris Lattner']

$ utils/wciia.py lib/
The owner(s) of the (lib/) is(are) : ['Owen Anderson', 'Joe
Abbey','Justin Holewinski', 'Chris Lattner']

$ utils/wciia.py lib/Target/NVPTX/
The owner(s) of the (lib/Target/NVPTX/) is(are) : ['Justin Holewinski',
'Chris Lattner']

$ utils/wciia.py lib/Target/NAN
path (lib/Target/NAN) does not exist

--
With best regards, Anton Korobeynikov
Faculty of Mathematics and Mechanics, Saint Petersburg State University

Pawel

Hi Pawel, I guess the code owner could be noted in each source file.
Eg: in SelectionDAGBuilder.cpp, there could be a line in the comment
block at the start:

// Code owner: Owen Anderson (resistor@mac.com)

Then anyone working on a bug or with questions about code instantly knows
who to turn to.

Ciao, Duncan.

I think that the code owner process is becoming complicated and I am not sure if it serves Chris's original intent. I don't think that we need to change every file nor do we need an automatic tool to find the owner. I think that a simple text file, or a section in the docs is enough.

^^ this

Joe

From: "Joe Abbey" <joe.abbey@gmail.com>
To: "Nadav Rotem" <nrotem@apple.com>
Cc: llvmdev@cs.uiuc.edu
Sent: Saturday, November 17, 2012 1:25:04 PM
Subject: Re: [LLVMdev] [cfe-dev] !!! 3.2 Release branch patching and the Code Owners

> I think that the code owner process is becoming complicated and I
> am not sure if it serves Chris's original intent. I don't think
> that we need to change every file nor do we need an automatic tool
> to find the owner. I think that a simple text file, or a section
> in the docs is enough.

^^ this

Pawel, please correct me if I'm wrong, but it sounds like your underlying problem is that people are sending you merge requests directly, and you're not sure from whom you need to make sure to get approvals. This being the case, you should stop accepting such requests. Requests should be sent directly to the code owners (on list). Only those code owners should communicate directly with you (either to instruct you to merge in certain patches, or better yet, to merge in approved changes directly). I think this matches the original intent of the system: it partitions the workload among domain experts instead of forcing you to deal explicitly with many of the requests.

-Hal

From: "Joe Abbey" <joe.abbey@gmail.com>
To: "Nadav Rotem" <nrotem@apple.com>
Cc: llvmdev@cs.uiuc.edu
Sent: Saturday, November 17, 2012 1:25:04 PM
Subject: Re: [LLVMdev] [cfe-dev] !!! 3.2 Release branch patching and the Code Owners

I think that the code owner process is becoming complicated and I
am not sure if it serves Chris's original intent. I don't think
that we need to change every file nor do we need an automatic tool
to find the owner. I think that a simple text file, or a section
in the docs is enough.

^^ this

Pawel, please correct me if I'm wrong, but it sounds like your underlying problem is that people are sending you merge requests directly, and you're not sure from whom you need to make sure to get approvals. This being the case, you should stop accepting such requests. Requests should be sent directly to the code owners (on list). Only those code owners should communicate directly with you (either to instruct you to merge in certain patches, or better yet, to merge in approved changes directly). I think this matches the original intent of the system: it partitions the workload among domain experts instead of forcing you to deal explicitly with many of the requests.

Hal, this is exactly what is happening.
The problem for me is compounded by the fact
that we have new code owners and I am spending
a lot of time verifying the ownership and then
I need to determine whether this is *approved*
or not.

As I have mentioned in the initial message I have
stopped processing the requests. I am just queuing
them up until situation gets resolved.

Workflow looks simple:

Developer -> patch -> Code owner -> *approved* -> Release Manager

but currently it breaks for me when this happens:

Developer -> patch -> Release Manager

Release Manager -> is this *approved* ? -> Code owners(?)

Code owner -> *approved* -> Release Manager

From: "Pawel Wodnicki" <pawel@32bitmicro.com>
To: "Hal Finkel" <hfinkel@anl.gov>
Cc: "Joe Abbey" <joe.abbey@gmail.com>, llvmdev@cs.uiuc.edu, "Nadav Rotem" <nrotem@apple.com>
Sent: Saturday, November 17, 2012 2:04:10 PM
Subject: Re: [LLVMdev] [cfe-dev] !!! 3.2 Release branch patching and the Code Owners

>> From: "Joe Abbey" <joe.abbey@gmail.com>
>> To: "Nadav Rotem" <nrotem@apple.com>
>> Cc: llvmdev@cs.uiuc.edu
>> Sent: Saturday, November 17, 2012 1:25:04 PM
>> Subject: Re: [LLVMdev] [cfe-dev] !!! 3.2 Release branch patching
>> and the Code Owners
>>
>>
>>
>>> I think that the code owner process is becoming complicated and I
>>> am not sure if it serves Chris's original intent. I don't think
>>> that we need to change every file nor do we need an automatic
>>> tool
>>> to find the owner. I think that a simple text file, or a section
>>> in the docs is enough.
>>
>> ^^ this
>
> Pawel, please correct me if I'm wrong, but it sounds like your
> underlying problem is that people are sending you merge requests
> directly, and you're not sure from whom you need to make sure to
> get approvals. This being the case, you should stop accepting such
> requests. Requests should be sent directly to the code owners (on
> list). Only those code owners should communicate directly with you
> (either to instruct you to merge in certain patches, or better
> yet, to merge in approved changes directly). I think this matches
> the original intent of the system: it partitions the workload
> among domain experts instead of forcing you to deal explicitly
> with many of the requests.

Hal, this is exactly what is happening.
The problem for me is compounded by the fact
that we have new code owners and I am spending
a lot of time verifying the ownership and then
I need to determine whether this is *approved*
or not.

As I have mentioned in the initial message I have
stopped processing the requests. I am just queuing
them up until situation gets resolved.

Workflow looks simple:

Developer -> patch -> Code owner -> *approved* -> Release Manager

but currently it breaks for me when this happens:

Developer -> patch -> Release Manager

Release Manager -> is this *approved* ? -> Code owners(?)

Code owner -> *approved* -> Release Manager

I think that the solution to your problem is simple:

1. Enforce the intended workflow: If a developer sends a patch to you directly, then send a message back instructing them to send it to the code owner (on list).
2. Trust the code owners: Part of the responsibility of being in the CODE_OWNERS file is knowing for what you're responsible and for what you're not. It is not (or should not be) your job to verify a code owner's approval. All you need to verify is that the person giving the approval is a code owner (of something). I don't think that we have a rogue-code-owner problem, do we?

We all appreciate the work that you're putting into this.

Thanks again,
Hal

I agree. What problem are we trying to solve here? Are people approving patches that they shouldn't?

-Chris

I think that the solution to your problem is simple:

1. Enforce the intended workflow: If a developer sends a patch to you directly, then send a message back instructing them to send it to the code owner (on list).
2. Trust the code owners: Part of the responsibility of being in the CODE_OWNERS file is knowing for what you're responsible and for what you're not. It is not (or should not be) your job to verify a code owner's approval. All you need to verify is that the person giving the approval is a code owner (of something).

This makes perfect sense to me.

We all appreciate the work that you're putting into this.

Definitely!

-Chris

Rather it is the opposite, people are not approving patches they should.

Pawel

>
>
>> I think that the code owner process is becoming complicated and I am
not sure if it serves Chris's original intent. I don't think that we need
to change every file nor do we need an automatic tool to find the owner. I
think that a simple text file, or a section in the docs is enough.
>
> I agree. What problem are we trying to solve here? Are people
approving patches that they shouldn't?
>
> -Chris
>

Rather it is the opposite, people are not approving patches they should.

Can you provide some examples of the problems you are seeing?

Here is what happens.

I get a message "could you please include/add/merge this r16xxxx into
3.2?". And my immediate reaction is sure, no problem this fixes
PR/issue/crash so it is important. But are you the code owner
and do you approve? So I have to go and start checking because
that is the process. In the past few days CODE_OWNERS.TXT
on the trunk has been changing while 3.2 has been stable,
I work on 3.2 branch so I have sent couple of e-mails
to wrong people.

Anyway, it was not my intention to cause message storm and this is
taking way too much bandwidth on the list. As always, change is
causing breakages until we all learn how to do it efficiently.
I have now a way to identify the code owners.

All I am asking is for the code owners to state clearly that
the change they ask for is approved. Couple of examples
from real e-mails:

approved APPROVED *approved*

Pawel

P.S.
I'll even take
"I am the code owner and I approve this message!"