[RFC] Proposed Policy for committing to LLVM-libc

There’s been some discussion about if there should be new requirements for committing to the LLVM repository as a whole (see New requirements for committing to main branch). Regardless of what’s decided for the project as a whole, we should probably decide on what our policy for just the libc project should be.

Based on what I have heard from people via discord and other chats, as well as my own experience, here is my proposal:

  1. All libc patches must be reviewed and approved before landing.
    1. Even changes that seem obvious can and have broken the build, so everything needs review.
    2. If a patch breaks the build, then the default solution should be to revert, or get an emergency review on the fix-forward.
  2. Developers for specific areas (e.g. GPUs, Fuchsia, Floating Point) can claim specific files so that they must approve before those files can be changed.
    1. If a change touches a claimed file without approval, then the owner may revert it
    2. If a change touches a claimed file and a review is requested, then an owner needs to respond within a reasonable time (e.g. 2 business days)
    3. If possible, I would like to enable some sort of automated enforcement (e.g. automatically adding an owner as a reviewer) though I don’t know if github supports that.
  3. When possible, wait for the presubmit checks to succeed before landing a change.
    1. Some presubmit checks are too slow or flakey to rely on. If you are landing while these have not succeeded, then add a comment to the PR explaining that you did it, and why.

Feel free to add comments/suggestions below, once we reach a policy that we can agree on I will add it to the libc website.

2 Likes

I would like to enable some sort of automated enforcement

Github will suggest reviewers if someone recently touched that file. However, that suggestion is only visible to committers.

@nickdesaulniers once mentioned to me that before committing a TODO in the code, it is recommended to add an associated link to an issue (create a new one if needed). This may also be a good practice to add. :smiley:

1 Like

Thanks for starting the discussion with some thoughtful ideas and suggestions / proposed wording.

All libc patches must be reviewed and approved before landing.

What about emergency “revert to green” patches?

What about emergency “revert to green” patches that either:

  1. don’t actually revert to green
  2. trade one (or more) green build for one (or more) red builds?
  1. If a change touches a claimed file without approval, then the owner may revert it

That is very aggressive and I suspect leads to fiefdoms where “don’t touch my code” becomes an utterance. Do we really want that?

e.g. 2 business days

I feel like it’s common for folks to take week long increments off for vacation. If you don’t hear back from a requested reviewer within 1 week, that’s when I tend to either ping the reviewer again, or take other means.

I don’t know if github supports that.

Looks like github supports a top level CODEOWNERS file.

Code owners are automatically requested for review when someone opens a pull request that modifies code that they own. When someone with admin or owner permissions has enabled required reviews, they also can optionally require approval from a code owner before the author can merge a pull request in the repository. For more information, see “About protected branches.”

(That sounds to me like what you want).

Though different parts of LLVM seem to do “code ownership” differently, maybe:

  1. llvm-libc could demonstrate the benefit of using github codeowners files (by being first).
  2. now that llvm-project has completed the move to github, maybe llvm-project as a whole code unify around having one top level code owners file (github’s format).

Developers for specific areas (e.g. GPUs, Fuchsia, Floating Point) can claim specific files so that they must approve

It would be nice to require that we have more than person for any specific area like these. Personally, I find strongly dislike when open source projects have 1 maintainer, who doesn’t think about succession, then leaves a project without appointing a maintainer. That also helps ensuring some level of review if 1 maintainer is on vacation for a week (rarely, all maintainers are on vacation on the same week).

1 Like

A drive-by comment from an avid watcher of LLVM’s libc, but not an active contributor to it: If every LLVM sub-project introduces its own policies, then I think that hurts LLVM as a whole. To the extent these things can be phrased as suggestions and recommended working practices that seems just fine to me, but libc-only rules doesn’t seem the right way to go.

1 and 3 in your list seem like they could be rephrased as suggestions with no real loss, while for 2 it seems more difficult. I think I’d be somewhat against adding the rules described in 2 as a libc-only policy, rather than something agreed project-wide (e.g. more restrictive rules for committing to parts of the codebase deemed as more sensitive, which may include libc).

7 Likes

@SchrodingerZhu

Github will suggest reviewers if someone recently touched that file. However, that suggestion is only visible to committers.

This seems useful, but I don’t think it’s strong enough for the sort of code ownership being discussed.

@nickdesaulniers once mentioned to me that before committing a TODO in the code, it is recommended to add an associated link to an issue (create a new one if needed). This may also be a good practice to add. :smiley:

That’s probably a good policy to put in place, though there are a ton of existing todos in the codebase already so it might be significant effort to make issues for all of those.

@nickdesaulniers

What about emergency “revert to green” patches?

I forgot to add this to the original list, but I was thinking that any “revert” patch could land without review, since you’re just returning the repo to a previous state. Relands should still be reviewed though.

That is very aggressive and I suspect leads to fiefdoms where “don’t touch my code” becomes an utterance. Do we really want that?

If we can get the automated codeowners working, then this might be unnecessary since the owner could just request changes to a bad change before it lands. If that’s not possible though, I’m not sure what else there is to do other than “tell people not to do that” or “revert the change”. Perhaps telling people to not do that will be enough.

I feel like it’s common for folks to take week long increments off for vacation. If you don’t hear back from a requested reviewer within 1 week, that’s when I tend to either ping the reviewer again, or take other means.

I agree that people should be able to take time off, so I would say that if a file has any owners it should have at least two (as you said later). We can also go with a longer time for reviews, 2 days was just a number I pulled out of thin air.

Looks like github supports a top level CODEOWNERS file.

That does sound like what I want, though it seems like this would need coordination with the wider project to actually use it.

@asb

If every LLVM sub-project introduces its own policies, then I think that hurts LLVM as a whole.

I agree to some extent, but I would say that different projects within LLVM may have different types of users. Since the libc project is still changing a lot, our users tend to live at head. This means they get the latest changes faster, but they’re more affected by a bad commit turning the bots red then a user that is only taking major versions. For that reason I think having more things checked before commit as opposed to after makes sense.

To the extent these things can be phrased as suggestions and recommended working practices that seems just fine to me, but libc-only rules doesn’t seem the right way to go.

This is probably getting a bit pedantic, but I’m not sure where the difference between “suggestion” and “policy” is. If it’s a suggestion and we enforce it, then it’s a rule in everything but name.

As for having some sort of code owners, ideally the strict ownership would be very limited in scope. I don’t want the whole of the libc directory to require my approval for any change, just small and specific things like FuchsiaTest.h requiring approval from a Fuchsia developer or the GPU RPC mechanism requiring approval from a GPU developer.

Well for 1) I think the difference between having it as a “rule” and a recommendation/practice is probably slim. The majority of patches for a given part of the LLVM codebase normally come from a set of regular contributors and norms develop between how reviews are handled, including the size or complexity of a change that needs review. 3) is mostly what is done at the moment, except the comment explaining flakeyness (buildkite is failing the majority of the time right now so it doesn’t seem worthwhile).

I would question whether the advantages of having a stricter code owner review mechanism than the rest of the project outweigh the disadvantages. From what I can see, people seem very cautious about landing unreviewed changes in parts of the codebase they’re not a regular contributor to, outside of the odd spelling fix or trivial API refactor.

Hi @michaelrj-google,

Just to +1 @asb’s comments above, we want LLVM to have consistent policies across subprojects. This reduces barriers to contribute and makes it easier for contributes to cross between subprojects - encouraging collaboration and new contributors and growth.

LLVM’s policies need to improve in general though, so if you think that there are things that can/should be improved, please bring them up as an RFC on the main LLVM Project channel like Tom did.

Thank you!

-Chris

@asb

From what I can see, people seem very cautious about landing unreviewed changes in parts of the codebase they’re not a regular contributor to,

That’s a good point. From the discussion here it seems like having 1 and 3 as suggestions seems like the best fit (e.g. “We prefer all libc patches to be approved before landing.”) Generally our problems have come from existing contributors who missed an unexpected side effect of a change (myself included), so making sure that our policies are welcoming to new contributors makes sense.

@clattner
Thanks for the comment. I agree that having consistent policies across projects is important, so I’m going to see about writing an RFC for the main LLVM project to add support for the CODEOWNERS file. I think that it is a useful tool that would help not just us but also the project as a whole.

3 Likes

As another contributor having caused downstream breakage I’m not opposed to tightening our policy but I think having more tests and better (i.e., faster, more robust) pre-commit build bots would be equally useful, if not more.

2 Likes

Let me see if I can find out who owns the (presubmit) build bots, how they work, and how we can add to the testing they run.

3 Likes

I would like to -1 the point from @asb and @clattner, but not because I think the intent is off-base but because I think it’s not practical given the diversity of projects in the monorepo at this point. Individual projects have individual needs and having our policies forced to be applied to every project in the monorepo is both helpful and harmful. It helps for folks who work across project boundaries because the policies are consistent, but that is a minority situation for some projects. For example, Clang and LLVM do need to occasionally interface with llvm-libc, but nowhere near as much as say LLVM and MLIR need to interface with one another. More often, we get sub-communities and we already acknowledge this by having those communities split out in places like Discourse and Discord.

I do not think it makes sense to disenfranchise communities by forcing policies to apply everywhere – that leads to tension for having projects in the monorepo at all and it makes the RFC process unbearable to participate in. I think the goal should be for policies to apply everywhere when possible but there should be allowance for project-specific needs when they arise. As @michaelrj-google points out:

This is usually the end result when communities are not allowed to manage themselves, and I think this is a worse outcome for everyone. Rather than allowing us to write down our actual policies (which are the things that are actively enforced by the sub-community of contributors to that specific project), they become tribal knowledge that causes tension, especially for new contributors.

I’m okay with allowing llvm-libc to manage their own processes as they see fit so long as their policies are aligned in value with the wider project policies and do not cause undue burden on the community as a whole. (e.g., I think it’s reasonable for a project to say “our revert policy is slightly stricter|more lenient than the overall LLVM policies because ” but it’s not reasonable for a project to say “our committer policy is that only so-and-so may make commits to the project”.)

4 Likes

I think there needs to be a global policy in the sense that it needs to be written out in the LLVM developer policy document. We have a monorepo, so people can easily commit code anywhere, and everyone with commit access has access to everything. Having special rules hidden away in a project-specific policy document means people are very likely to make mistakes. And substantial changes to the global policy document should get project-wide review, to ensure there aren’t any bad interactions with existing policies.

That said, I don’t see a problem with making the global policy say something like “commits to files in directories X, Y and Z require special approvals.”

I think the question to ask is, are these proposed policies good ones we’d like to have? If so, would we like to have them at the LLVM project level?

I see @clattner’s point that it makes sense to propose (and implement) such policy changes at the project level. But the discussion in New requirements for committing to main branch shows how controversial well-meaning changes (e.g. “everything should be reviewed”) can be.

The libc community is still fairly small, and we have the opportunity to adopt practices that we think are good for the libc project, even if they diverge from the LLVM project as a whole. But I think it can be easier to make positive changes to LLVM commit policy by first adopting those changes in smaller sub-projects – i.e., grow the new policy upwards and outwards, rather than all at once from the top down. This also makes it faster and easier to tune the policy before it gets adopted broadly: if there are problems with the policy, it’s better to discover them and fix them in a smaller community like libc than one as large as Clang.

Possibly related to the RFC - is there a set of requirements for introducing a new port to libc? I have been maintaining a branch that adds hexagon architecture support for linux and I wonder what would be the acceptance criteria?

Must it pass all the regression suite before it lands? It doesn’t yet, but IMO might still be useful.

(sorry for digressing on this thread, I can start a new one if appropriate)

@androm3da Yes, better have a separate thread for your Hexagon port. If you don’t hear from people offline what the requirements are, go ahead with RFC, and figure those out along the way.

1 Like