[RFC] Directions for modules support in clangd?

The review process for the initial support patch for modules in clangd ([clangd] [C++20] [Modules] Introduce initial support for C++20 Modules by ChuanqiXu9 · Pull Request #66462 · llvm/llvm-project · GitHub) has been suspended for nearly half year. I want to discuss how can we proceed.

The support for modules in clangd is pretty important. On the one hand, I guess we don’t need to prove how important the code intelligence is nowadays. On the other hand, many people think (or thought) the modules as the most important feature in C++20. I used the term “thought” since I find some people are losing confidences in modules due to the premature support from tools. So I want to emphasize that it is pretty important for the users to have some support for modules in clangd.

IMPORTANT NOTE: this is not a push letter to @sam-mccall! It is time-consuming for any one to take a high-quality review for such an initial and relative large patch. And it is not easy to take the sufficient time for members of the open source community. In fact, the suggestion from @sam-mccall has helped me a LOT to improve the quality of the patch.

Another important note is that, the patch linked above is only an initial patch. It is expected to be not efficiency enough for even small size real world projects. We need several following optimizing patches to make it better. So this is not a requirement to merge the patch. But a discussion for the future directions for modules support in clangd.

So based on the assumption that we may not have qualified enough reviewers, what can we do?

My proposal may be:

  1. Encourage more people to do shallow reviews and allow or encourage them to accept the patch.
  2. Encourage more volunteers to provide testing feedbacks and treating these feedbacks as review results.
  3. We can land the patches with the good feedbacks.

The final note here is that the scope of the RFC only limits to the modules support in clangd. I know it is much harder to make a wider policy. While that is pretty valuable, I prefer to make some progress in this direction first.

9 Likes

Thank you for raising awareness of this issue, and thank you also for clearly identifying this as a systemic issue rather than an issue with a specific reviewer. Sam is brilliant, does amazing work, and like many folks in our community, I wish we could clone him so we can reduce review workloads to a more healthy and sustainable level.

I definitely agree we should be encouraging more people to do shallow or drive-by reviews, as well as providing additional testing feedback. Getting any amount of feedback is better than getting silence! And landing the patches once everyone actively involved in the review thinks things are good enough is acceptable (we do not require code owner approval to land patches, generally speaking).

Another thing I think we should consider is adding co-code owners where possible. This has worked out very well in Clang itself in terms of spreading review workload around. Perhaps Sam knows of some folks he thinks could be good candidates?

Also, I think we should remind folks who do review work in Clang that clang-tools-extra (and clangd), while a separate part of the monorepo, has a ton of overlap with Clang work. So Clang reviewers providing feedback within clang-tools-extra is a very welcome and sensible thing! Some of the review work needed is “how do we fit this into clangd” which needs clangd-specific knowledge, but there’s plenty of other review work that doesn’t need clangd-specific domain knowledge (does the code match the coding style? do the changes have corresponding test coverage? is the documentation complete and have good grammar, etc? is there a release note? etc). Folks should feel empowered to leave that kind of feedback on a review even if the review is not in an area of the monorepo you feel like an expert in, as it definitely helps reduce the review workload on folks with domain-specific knowledge and it helps the patch author materially improve their work.

At the end of the day, this issue is not a clangd issue – we feel this pain everywhere in the project because we have an unhealthy level of expectation with how much code we can reasonably add to the monorepo while still producing production-quality tools. Long term, I think we should recognize that we have two different kinds of folks in the community: contributors (people whose primary goal is to add code to the project) and maintainers (people whose primary goal is to ensure the long-term health and viability of the project). Our current model with code owners expects someone to be the person who knows every last detail about their area and helps make final decisions, which makes individuals a bottleneck and contributes to burnout. I would love to see a more flexible model centered around maintainership where they’re not expected to be a sole(ish) point of contact who knows everything, but are instead expected to be a group of people who are willing to do review work, triage issues, answer questions, figure out who else can help solve problems, etc. Then we should strive to build up a large roster of maintainers who can support one another across a broad range of tasks within the project rather than code owners that feel more siloed into particular areas or tasks. The distinction is subtle but the overall goal should be to share the work around to many people rather than creating bottlenecks, and for us as a community to help continually transition as many folks with the interest and time to go from “contributor” to “maintainer” as we can.

Specific to getting you unstuck in clangd, maybe we can get feedback on your patch from @HighCommander4 @kadircet Younan Zhang as folks who contribute to clangd, and @Bigcheese as someone highly familiar with modules and tooling?

2 Likes

I’m the manager of the team at Google that wrote and maintains Clangd. I will speak on behalf of the whole team.

Thanks for the patch and starting this thread and sorry for not getting back to patches.

We know that we have been doing less active work on Clangd lately and it really shows with this patch. We also agree that modules support is important. Moreover, it is very high stakes for us too as we use header modules internally in a unique setting (our build system produces module maps for C++ libraries users define and our #includes are automatically replaced by module imports). We believe that nobody actually deploys it at the same scale in a similar manner.
We also tightly integrate Clangd with our Cloud IDE.

So getting the modules support wrong (C++20 standard modules or header modules) may have significant costs in the long term.

However, as I mentioned before, we don’t actually actively invest in improving Clangd anymore (it mostly works for us). But this one is really an exception as it’s needed for long-term sustainability of our existing integrations.

We will find someone from our team (likely @kadircet or @sam-mccall) to finish the review soon and move forward with this particular patch.

That being said, I do feel this patch is an exception, and we would definitely appreciate if more people would help with general development of Clangd, including reviews.

2 Likes

Hi folks, apologies for letting this slip as far as it has, and thanks @ChuanqiXu for raising it and for your patient and constructive approach overall.

I’m going to make time to work on it this week and prioritize getting you timely reviews on any followup pieces from myself or Kadir.

In general: I haven’t been able to give clangd as much attention as I’d like, and this will probably continue :frowning: . I agree that it would be great to have more people feel empowered to change/review. I think @kadircet and (especially) @HighCommander4 are fulfilling the code-owner role today and should be owners if they’re OK with it. (Can we have multiple code-owners now? IIRC it used to be limited).

This change was a bit of a perfect storm: it tackles hard problems (modules, threading, cache invalidation), it adds significant new design in both the most performance-sensitive and complex parts of clangd, and has far-reaching scalability implications that we have no way to test right now.

Reviews of earlier (phab!) iterations took me most of a week and I couldn’t find that time very often. We absolutely need this feature, a shallower review of the design would have been risky, there are few people familiar enough to review this part (maybe just Kadir). So I’m not sure there was a better way, beyond me being (much) more disciplined about returning to this when possible. Interested if others have experience here.

3 Likes

The Clang experiment with multiple code owners has been highly successful and so I would strongly encourage use of that approach. I’d also happily support the nomination for both @kadircet and @HighCommander4! I would recommend making their nominations in a new thread, however, just for wider visibility within the community.

1 Like

Sadly the review process for the following patch ([clangd] [Modules] Support Reusable Modules Builder by ChuanqiXu9 · Pull Request #106683 · llvm/llvm-project · GitHub) get stucked again. The intention of the patch is to make different TUs can share the BMIs of imported modules. The performance of modules in clangd may be unacceptable without this patch.

I tried to talk with @HighCommander4. But he was too busy and he like to wait for reviews who had experience with the previous reviews. And I believe @kadircet has other important business to do.

I understand that the mechanisms of code owners is one thing. But if every one can be available every time is another thing. I can’t promise I can maintain the modules things in clang forever too. So let’s try to focus on how should we proceed on this particular problems.

For the short term, I like to land the above patch in 2 weeks if no more comments come in, given:

  • In our downstream, we’ve landed this patch for more than a year and it seems running well. And also in the open source world, I tried to send it to GitHub - ChuanqiXu9/clangd-for-modules: A preview for C++20 Named Modules Support in clangd.. Everyone experienced it didn’t complain about it for bugs. So I believe this is well tested.
  • There are other following patches and I really wish we can have something workable in the next release.
  • This patch doesn’t change the shape of modules support in clangd. The major change lives in ModulesBuilder.cpp, which we can think it as an implementation detail instead of some global things.

For the longer term, I’d like to take the ownership of modules support in clangd. So that I can land the patches after I reviewed it and tested it by the LLVM’s policy.

For the previous concern, “we’re afraid when we come back, when we want scalable, high quality and high efficient implementation for C++20 modules in clangd, it will be more expensive to refactor the not-so-well-designed codes than creating something fresh new”. I think:

  • It is not an argument strong enough to ask clangd and modules users to wait for it without a particular time line.
  • It is more or less not political correct to say only certain organizations or particular person can write good codes.
  • I do think it won’t be too bad. Since the number of following changes in my mind is not so many, we won’t fall into irreparable case. And given we have enough users who are waiting for this, I believe we can have enough eyes on this and we won’t be in some regression to deeply and we can receive enough feedbacks to make improvements step by step.

CLARIFICATION: this is not a blaming post for particular organization or particular person. I understand this as a common “hit by a bus” problem especially in open source area. Let’s focus on how to proceed on this particular problem.

1 Like

Hi @ChuanqiXu, sorry for silence and the delay on the patch.

It has definitely been on my radar, and I am hoping to get to it soon. I think we should definitely have it ready by next release. I’ll make sure it’s reviewed on my side at least in a timely manner so that we don’t only land it, but also have enough time to test and address issues before the release.