While discussing changes to Clang’s code ownership, @rengolinasked some very good high-level questions about code ownership. I think it would be healthy for us to have a community discussion around code ownership in general to try to get to a shared understanding of what code owner responsibilities are and the policies we have around code ownership. As a reminder, what we document about code ownership can be found on the developer policy page.
The questions raised were:
As a strawman to get the discussion started, I’ve pasted my response from that thread:
What are your thoughts on the topic?
My hope with this discussion is that we can update the documentation to reflect our latest thinking, if needed.
Thank you Aaron for starting this thread. This is an area I have been thinking a lot about. I’ve had many conversations through our Community.o events and at round tables at the LLVM Developers’ Meetings that Code Owners is an area that we need to rethink and improve. I think that there are three main parts to this discussion.
1 - How to determine areas that require code owners (such as you did with Clang)
2 - Defining the responsibilities of a code owner
3 - Defining the process to become a code owner
I think part 1 can be discussed in a separate thread in the specific parts/sub-projects areas on Discourse and I won’t go into it here.
Responsibilities of a Code Owner
Currently, the role of a code owners is mostly focused on code review but I feel its a missed opportunity to provide some real leadership in the LLVM community. Code owners should not being doing all the work, but should be fostering the next generation of contributors and mentorship is key. I would like to see code owners selecting other individuals to review code first and then they can do another review. How they respond to that first reviewer should be positive and encouraging, but also provide suggestions. Often individuals need someone to recognize them and encourage them to feel they have permission to do a review.
I also believe that code owners should be a neutral party that helps mediate difficult design decisions and resolve conflict. This is probably what Aaron means by “final decision”, but I would like to view it as a mediator type role first. This actually isn’t really described in the current documentation except for when it comes to releases (AFAIK).
So at a high level, I believe Code Owners main responsibilities should be: Ensuring reviews/code quality, providing Mediation/Conflict Resolution for design decisions or changes, and providing mentorship to grow the community of reviews/contributors.
I would love to see a Code Owner guide and code owner training as well.
Becoming a code owner
Regarding the process to become a code owner, it is very vague right now. Do you nominate yourself? Does someone have to nominate you? How many “votes” do you need to be “approved”? It’s very intimidating and it feels like you have to be “well known” in order to even think about being a code owner. It feels like this happens behind closed doors if you know the right people. This process needs an overhaul. I think we need a way to provide anonymous feedback and have a higher level body that is overseeing code owners to ensure fairness and an even distribution of representative companies involved and diversity within the community. It is for sure a balance of “who does the work” versus having an open source community just catering to one organization’s needs.
I also feel that we need to encourage more minorities within our community to become Code Owners. We can’t just rely on people to raise their hand as that historical has not worked to improve diversity. We need to be actively thinking about this when we “nominate” folks to the role. We need to accept that we may have never met these individuals or have any relationship with them, but they are working hard to make LLVM and it’s codebase better. It can’t just be people we know that get promoted into these roles. We have to get outside our comfort zones.
Personally, I would love to see more women in the code ownership role. While this may not matter to some, it matters a lot to me (and I suspect many other women) and I believe it has an impact on the next generation of compiler engineers and open source contributors. Women and other minorities have often not volunteered for roles because they don’t feel “expert enough” or that it would positively impact their career. Open source work is often done in someone’s free time and we need to get companies on board with valuing its worth and considering it important enough to impact promotion and career trajectory.
We should also have a clear path for people to step down as code owners or be removed from their position. We often have folks who change jobs and don’t update their information and seem to drop off from contact. It shouldn’t be a big deal to remove them and move forward. While I want to thank and reward folks that have served in Code Owner roles in the past, I’m not sure exactly where we do that. We want the Code Owner documentation to be clear and easy to understand. It should be maintainable for years to come.
Lastly, there have been situations where we need a higher level group of individuals to help mediate disputes between an individual and a code owner. This doesn’t happen often, but if it does, we do need to have an escalation path so that all viewpoints are considered fairly. Otherwise, there is no way for an individual contributor to appeal a code owner’s decision.
That is a bit of a brain dump, but I look forward to discussing this more.
To me, this is mostly “doesn’t rubber stamp reviews” along with “provides actionable feedback, enforces community coding standards, pulls in other reviewer when needing their expertise rather than guessing, thoroughly reviews all parts of the patch (doesn’t ignore test coverage, comments, etc)”.
+1 to all you said about this. I believe code ownership is more about leadership than code reviews.
We had this discussion before and so far the consensus has been that we don’t want to move to a model that only code owners do review and consequently yield power over other developers. This has been a contentious point in other communities and I wouldn’t want it replicated here.
Good list, but I wouldn’t completely remove the technical parts either. Having a healthy and growing community around your code is the best way to improve quality long term, but it’s not the actual goal.
I believe code owners need to own the code. Code quality, design, maintenance and direction are key. The social side of it is a fundamental element, but mostly in relation to achieving the other key goals.
Same thing for diversity: it’s fundamental that we achieve the right balance, and code owners need to be acutely aware of those issues and play an important role, but that’s not the core job of the code owner.
Working with other parts of the community (for example community.o) would allow people to focus on what they do best, without putting too much on the plate of people that just want to help.
There are a lot of people that are really good with code and really bad socially, and forcing those people to be social leaders when all they can do is to be code leaders will mean less code owners, not more. Or worse, more code owners that care more about the social parts than the code parts, and that’s not a good trend.
Intimidating is the right word, but not because of people.
Back a decade ago, I nominated myself upon requests from my parent company, and it was intimidating, even though none of the people I interacted with were, themselves, intimidating.
It’s like asking someone for a date, with only knowing them for a little while. You’re not sure they like you, you fear rejection, ridicule (even if not public), or even being pushed to do things you otherwise wouldn’t. It’s really stressful, even if it’s all in your head.
Having a recipe surely helps!
We have the CREDITS.TXT file that nominates people for what they did. We don’t use as often as we should, but adding everyone would be madness. Perhaps that could be a place for all current and past code owners, and when removed from ownership, they’d remain there.
Reinstating previous code owners shouldn’t also be hard. Pre-approval by default. And it shouldn’t matter how many code owners we have per area, as long as they’re not fighting each other.
This is a good point, and AFAIK it has worked well so far. I wouldn’t want to create policy for no strong pressure, so perhaps we can look at this one once a hard case come by (like most other policies we wrote).
Thank you for all this feedback, I think it’s fantastic and really appreciate the depth of thought you’ve put into the topic!
100% agreed!
I like the idea behind this, but I worry about this being a bottleneck if we try to require this (which it doesn’t sound like you’re proposing?). Speaking from my experience in Clang, there’s often only 1-2 people who really know any given area of the compiler. We absolutely want to grow that knowledge so there’s more than 1-2 experts where we can! But if we do it by code owners waiting or poking others to review, I worry that we will continue to have reviews that take months to complete. Our current model of encouraging anyone to leave review feedback sort of works for us in that I often see non-code owners leaving quality review comments (in fact, up until the current code ownership changes, it’s been predominately non-code owners doing the reviews at least in Clang).
That said, your point that people don’t always feel empowered to perform reviews or provide the final sign off on a review is a real problem and we should find ways to improve that situation, and mentoring others is a fantastic way to improve that. (FWIW, that’s why I really love the community office hours that @kbeyls initiated – I think it’s a great opportunity for mentorship, and I’d highly encourage code owners to have office hours.)
Agreed (and that really was what I had in mind), though “neutral” party is worth clarifying. I would expect code owners to have strong opinions on the area of code they’re responsible for, and I don’t think that’s a bad thing. But I do think they need to mediate difficult design decisions and resolve conflict in a manner that fits with our collaborative culture.
One of the pain points I see in Clang is with determining consensus for a high-level decision, and I think code ownership can really help us nail that down somewhat. We often have people proposing extensions to the compiler, and we ask for an RFC, and then there’s no way to know whether that RFC has been accepted or not. I would love to see a model where people propose an RFC and once community discussion has started to wind down, instead of assuming the RFC is accepted, impacted code owners weigh in on the final consensus determination. As a concrete example, we recently had an RFC to add HLSL support to Clang – once the community showed support for the idea, I would have loved to see the code owners impacted (driver, frontend, templates, code gen, general owner, etc) call consensus on the RFC. This way, the person proposing it knows when the RFC has been accepted, and the code owners know what to expect and are aware of community feedback on the thread.
100% agreed, this is a fantastic idea!
I agree that having a way to provide anonymous feedback would be a useful tool to help code owners improve. I’m less certain how I feel about a higher-level body that oversees code owners. To me, that’s the community. I’m extremely wary of adding a higher-level decision-making body. By definition, consolidates power directly with a smaller group, which is the opposite of what I think you’re aiming for with the suggestion.
FWIW, with the Clang code ownership changes, I was very mindful about keeping the ownership spread between various corporate and non-corporate interests. I totally agree we need to make sure this remains balanced, but I don’t think we need a higher-level body to enforce that. Or, put a different way, I think the members of that hypothetical higher-level body should be sufficiently empowered to point out that nominations are getting heavily weighted in a problematic way should that happen.
Honestly, I think there’s an underlying problem that’s being touched on here but not addressed – we do not have a functioning mechanism for making non-technical decisions in the community. The RFC process works okay-ish for technical decisions where there’s a clear set of community stakeholders who can weigh in (like when adding some kinds of extensions to the compiler), but the process breaks down entirely for non-technical decisions like community infrastructure, role responsibilities, etc. I think this is a problem we should address (separate from code ownership discussions) because it’s been a significant point of pain for quite a while now.
100% agreed, but it’s worth noting that nominating people for code ownership is largely about who is doing the review and architecture work, which narrows the pool of nominees down considerably as we have far more people producing code than we do reviewing code.
I hope that as we improve mentorship in the community, it will become easier and easier to have more diverse nominations over time. As we get more people reviewing more code, we’ll have a wider pool of experts to pull from when nominating. The trick is in getting more people to review more code, and a good chunk of that boils down to corporate investment in code review. I don’t know how to solve that – we have a lot of big companies who write a lot of code for the community, but we really need help (IMO) in areas like code review, bug triage, documentation, build bots, etc.
Very strongly agreed!
100% agreed! I don’t think I know of a time when we’ve needed to ask a code owner to resign while they were still actively doing code ownership duties, but it’s easy to think of times when the code owners file has gotten stale. When I was reaching out to folks for the Clang nominations, the most stressful part was reaching out to owners I was suggesting to “replace” because it just felt genuinely gross to ask someone to step down. Imagine how surprised I was when one person had already been replaced and the code owners file was never updated, and another mentioned that they appreciated my helping them to step down because they weren’t sure how to do it or who to nominate as a replacement!
FWIW, I’m not certain I can think of a situation where we’ve had this need in Clang. I would imagine that if we had a sufficient body of code owners that this would be resolved by other code owners weighing in, though.
I’m skeptical of the idea that a higher-level body of authority than code owners makes sense for helping to reach technical decisions. For example, I would be strongly opposed to the LLVM Foundation Board stepping into such a role as the board’s composition lacks technical expertise in many areas of the project. If the appeal is on technical grounds, the people involved in the decision should be those who have the strongest technical background in that area. If the appeal is on non-technical grounds (e.g., personality conflicts), that almost strikes me as trending into CoC territory.
I think code owners should be strongly encouraged to collaborate amongst one another. After all, we don’t want code fiefdoms where everything is siloed; the spirit of the project has always been around layered components that compose well together, and that only succeeds when everyone is collaborating. So, to me, if there are problems (at least for technical topics) that might be solved by a higher-level group, those can be equally solved by pulling in more code owners/community members and building consensus.
Agree. This feels like jumping the gun. We have a code of conduct and a reporting policy community-wide. Creating one just for code-owners is creating the hierarchy that we tried to avoid since ever.
Love this! It reinforces my earlier point of asking to be a code owner. Socialisation is hard, and there isn’t necessarily an intersection of code and people skills for people who would be good code owners.
Also, I don’t think we should enforce or require all code owners to have all skills. That’s the whole point of having multiple code owners.
The more requirements we add to the list, the less people will find that they have the union of all items. But if all they need is a sub-set, the whole is the union, it becomes a lot easier (and more productive) to accept being a code owner.
I’m reminded again of a lightning talk from @kbeylsCan reviews become less of a bottleneck?
At that time, the average number of reviews per patch was about 2.5; however, 85% of patch authors did fewer than the average of 2.5 reviews per patch. So, yes, we say anyone can review anything, but perhaps we need to be a little more welcoming to new reviewers. I mean, for a brand-new contributor, sure, they’re still learning the ropes; but for people who are past that point, we should be somewhere between encouraging and expecting reviews.
+1. Every time someone asks me about upstream I always find a way to wedge in this key phrase: “The more you review people’s patches, the more they’ll review yours”.
I’m not sure how effective that is community-wide, but on the new back-ends (m68k, CSKY, LoongArch, SPIRV) where I beat that drum a lot, I think there was enough cross-reviews that made me happy.
+1 that this is an issue. Quite often, somebody makes a proposal, there’s good discussion, and then it just sort of… stops. And it’s very difficult for folks to tell whether that means the proposal was accepted or not.
In general, what I would expect primarily from code owners as the guiding value is ensuring responsiveness.
Responsiveness can take many forms. For example, for code reviews, it can mean personally being responsive to code reviews in their area. It can also mean watching code reviews from others and just making sure that there aren’t any patches that “fall through the cracks”. For RFC discussions, it means helping drive towards a clear “Yes” or “No” (or a “Yes/No, if…”).
I see a common recurring pattern of what people want from code owners, but there’s a fundamental mismatch between some of them.
On one hand, we want to have multiple owners per area, so that we can cover each other when not available. On the other hand, we want owners to be more authoritative in decision making. What if two or more owners don’t agree with each other, but the majority of the community backs one and not the other? How can one call their support base numbers a winning argument in a way that is different to the problem that we have today (re. silent consensus)? What if one is on holidays and decisions are taken without their input?
On one hand, we want owners to have more authority, but on the other hand, we don’t want to create a hierarchy. If code owners’ opinions have more weight than another experienced developer (who isn’t a code owner), how is that not a hierarchy based on status rather than experience?
On one hand, we want owners to have more responsibility, ranging from reviewing a lot of code, making hard decisions and driving diversity into the project. On the other hand, we still have to write code to our parent companies, universities or even other pet projects (for those who do it pro-bono). It’s hard to convince companies to allow their engineers to spend this much time on upstream discussions if their authority isn’t publicly recognised and actionable (like other OSS projects).
Many of the arguments I see in this thread are converging to a model similar to GNU projects, where we have recognised leaders and sub-leaders, owners of particular areas that receive code from owners of their sub-areas and so on, being a full time job. If this is the direction that the community wants to go, then there will be a lot more important things to do than just to find some people to help with code review.
Our community will radically change and that will take a toll on all of our works, upstream and downstream. I’m not saying this will be worse, or better. I’m just saying it will change and we need to understand what people are asking for.
For the record, let me try to answer my own questions with LLVM’s history in mind, trying to describe what I see as the status quo, not necessarily my opinions. But, given that I work with LLVM for a long time, my opinions are quite close (and biased towards) the status quo.
My understanding is that a code owner is responsible for a “piece of code” by performing any combination of:
Guiding developers towards quality in their areas, following the multiple documented guidelines.
Being the last line of defence, if no one has reviewed a patch or responded to an RFC, they:
Add people they think will be able to help
Reply with their own reviews when all else fails
In that order, to encourage community participation
Participate in substantial code review in their areas (in addition to above)
Adding substantial code to their areas without affecting other developers’ contributions
This is particularly important for fast-paced sub-projects
Less so for localised parts, like back-end targets, tools
Being on the lookout for changes in other components and how that’ll affect their parts
Being on the lookout for people that are having trouble contributing and guiding them through
If we just add more parts and more people per part, with the role definition above, we can continue to scale our works with more redundancy and less conflicts. If we want to add more roles to ownership, then we’ll have to think hard about how that’s going to affect this growth.
To me, this is a trivial converging exercise. We split, and see what works, then move around, reassess, until we find the right combination. New areas will show up, but once stable, it should be mostly adding things.
The existing idea is to create a code owner for new targets in LLVM, but other areas are more fuzzy. Other sub-projects don’t even have an OWNERS file. I think here, each sub-community should splice their own ownership, like you did with Clang, with LLVM being a bit of everyone’s job.
New targets, projects, dialects, sanitizers, libraries are easy and already encoded in policy: the people proposing the new code must agree to be the code owners.
Existing areas could have an influx of proposals. I wouldn’t add people based on their contributions, without their explicit consent, but I’d encourage high contributors to apply for ownership (by some RFC or other). Having been asked to do it by the community reduce the fear of being rejected, and is much easier to do. Existing code owners should take their time to ask people around to participate in their areas.
I wouldn’t have more than 2-4 owners per area. Some areas will have more people willing and able to be owner than others, so some areas will have more core contributors than owners, others will probably only be contributed to by owners. No problems there.
When there are more active people for a demonstrable amount of time. Ownership isn’t about making decisions (so far), so losing ownership doesn’t mean you’re less important, just means other people are more active than you are.
If you don’t code but still review patches and participates in discussions, I wouldn’t remove your ownership. But if your last interaction (commit, email, review) was years ago, then I’d move you to the CREDITS part and allow new people to help.
We write a policy and we encourage code owners to ask those people to join.
If we have a clear policy on being a code owner, then people will more likely be encouraged to ask. They could ask first the existing code owners, who would give an idea of how many people work on the project, how many of them could be owners and how hard it would be for people to accept a new owners, etc.
If we had owners actively seeking to step down, look for others, and understand that this isn’t a life title, but it’s based on contributions, then we’d see more shuffles and more people contributing to LLVM at that level.
One final point I think I should add to the discussion is the idea of moving around.
I have worked in very different parts of LLVM (back-ends, code-gen, front-end driver, vectorizer, mlir) but my code ownership is “Arm Linux”. That does not reflect what I do in LLVM for quite a few years, but it’s stuck there.
I think there should also be some thought in any policy we write about moving across ownership boundaries, to recognise people’s work for project wide contributions. I’m not sure how to state this, so I never proposed anything, but it would be interesting to know what other people think of that and how would they make that work.
A lot of these situations feel pretty hypothetical to me, but then again, we’ve never had multiple code owners for the same area before either. Personally, I think of these situations as ones where the default answer is “no changes are made because there’s not clear agreement yet” and the community needs to work together to reach a medium we can all live with. They’re less about code ownership and more about effective collaboration, in some ways.
Hmmm. I think a hierarchy is inevitable because we’re naming some people by name and giving them specific responsibilities. Even with the best of efforts to not create a hierarchy, one is already implied by that.
I think there’s some nuance here worth chatting about. I don’t mean to pick on @beanz or put words into his mouth, but I think the recent incoming support for HLSL is a pretty good case study.
I think it would make perfect sense to add @beanz as the code owner for HLSL given that he’s the driving force behind the project and he has significant experience in providing quality reviews in the project. However, one of the big reasons for upstreaming HLSL work is so that the quality is better than what would be produced internally (at least to some extent) and so I think having a non-@beanz code owner would provide significantly more value to Microsoft as the company backing that work. But we still need @beanz to own the decisions around what is and isn’t a correct HLSL implementation (at least to some extent, because there’s not really a language standard for it).
Personally, I would like to see a requirement that these sort of larger scale projects have two code owners: one from the proposing party and one from the community. Having a code owner from the proposing party who understands the responsibilities and expectations of code ownership (whatever we decide those to be) signals that there’s an expectation of effort beyond “dumping” code onto the community, and having a code owner from the community signals that the community is willing to put in the effort to support the endeavor. If we can’t find code owners who make sense from either of those parties, it’s a signal that the community might not be able to take on that effort at this time.
Commenting here specifically as someone that isn’t behind the closed doors:
I fail to understand how creating a committee that would serve as kingmakers would make the process more open and less intimidating. Not only would the potential code owner need to establish themselves within the community that they would be a code owner of (which is already daunting for junior dev), but they would need to start playing politics with a second body of individuals with their internal own agendas and sets of goals that will likely be at odds with the potential code owner’s community.
This is a level of politics that many qualified candidates with suitable technical skills and (potentially) project management skills won’t want to engage in, especially if they aren’t employees of the aforementioned “representative companies” that you mentioned, and they will end up self-selecting themselves out.
@AaronBallman +1! (and always feel free to pick on me )
I would be happy to be a code owner for HLSL and take ownership over things that are exclusive to HLSL and relate to how the language is implemented in clang. I’m not really a definitive expert on HLSL, but I do work with a few so I have good channels to get answers when I need them .
To echo Aaron’s point, it is incredibly valuable to our project (and I suspect to the community as a whole) to have feedback from owners of the architectural parts of clang when it comes down to design and implementation details, adequate test coverage, and more. I certainly don’t think a project of this scope can be successful without the support of code owners across clang, and I’ve been greatly appreciative of the support we’ve received.
I think this points out a need for recognizing the responsibilities of overlapping ownership. While it might be appropriate for me to be a code owner for HLSL, it wouldn’t be appropriate for me to blanket approve changes to the AST, even if they are HLSL-specific.
THIS. THIS. THIS. I cannot agree with this more. Yes please.
Until reading this sentence I hadn’t drawn a connection between becoming a code owner and general career growth, but I think this is a problem our industry has as a whole, where the path from college student to “senior whatever” is very nebulous.
What if we make a path? Define roles and progressions. Maybe we have one or a few “Code owners” for an area, but two or three times as many “code curators”, and “curators” could be a clearly step toward being an “owner”.
Personally I learn a lot through following along and being in the room for a conversation. Having a formalized group of people that are maybe not code owners, but in the path to becoming owners in an area might be a good way of getting those people “in the room”.
What do you think about making part of the responsibility of a code owner nominating other people to be “curators” and helping them grow? We shouldn’t require that someone be a code owner to nominate a curator. Anyone should be able to nominate a “curator” and anyone should be able to be a “curator”. Maybe the only requirement for being a “curator” is a GitHub/Phabricator account.
Me too. Do you think having other defined roles that lead to code ownership would help or be a hurdle here? I strongly agree with your point that we need to assume people won’t nominate themselves. The leaders in the community need to be proactive about nominating and generally about recognizing the contributions of others.
I recently jokingly asked for a review from a code owner (and friend) who hasn’t contributed code to LLVM since 2015… I think acknowledging removing ownership isn’t an insult is a big first step.
I think that we should recognize that to make room for new people in leadership roles sometimes other leaders need to be willing to step aside. I really know what the right solution here is. Maybe an “owner emeritus” designation, but I think defining successful code ownership as passing the torch should be part of our perspective.
I remember a few occasions in the past where a code owner has held parts of the codebase hostage, so I very much feel this concern. I wonder if this is something we could take direction from the proposal process that @clattner brought in a few years back. That process hasn’t really been widely used, but I feel like it was a good way to provide a structured way to address disagreements.
I wasn’t removing anything technical but adding to it. It’s listed as the first thing. “Ensuring reviews/code quality”. I want to spell out the other parts of the job that are directly related to being a leader in our community. If we just have code owner’s focus only on the technical, then we won’t have adequate replacements when they need to move on. While some code owners are already doing this, I think it would be good to really go into details on how to achieve the other parts of the job.
I like this idea.
I believe this is happening more than we know. Right now, individuals have no where to turn to for help if it does happen and they just give up. This effectively drives people away from our community.