Ok let me try to address a few points that so far have been missing:
- We have a strong signal for ourselves that the approach is both useful for and desired by a large-ish chunk of internal developers. I’ll share what I can about that below.
- Despite that, we are currently relatively early in development. As good citizens, we’ve sent out the RFC early to gather feedback. As our work progresses we’ll definitely provide additional data and more precise specifications, but for now we’re still working out the details ourselves.
So we’re running into what sounds to me like an LLVM policy conflict: on one hand, we (as a community) encourage avoiding large downstream efforts and sunk cost fallacies; on the other hand, we’re encouraged to have everything figured out before we begin(?) So regardless of this particular effort, I want to arrive at a shared understanding of how we as a community intend to approach such situations. Below I lay out the practical situation we’re in, hoping that it’ll resolve the current concerns and also act as a model for resolving similar situations in the future.
Our (Positive) Experimental Data So Far
Basically, there have been dozens of internal projects that have already been, for several years, voluntarily enforcing the proposed set of warnings upon themselves. We’ve been circulating an unofficial compiler patch that implements a basic version of the warnings we’re proposing, and these projects incorporated it into their workflows. Instead of hardened libc++ they were using a custom class that acted like std::span with bounds checks. They didn’t have fixits, so they needed to adopt spans manually. They’ve already found the performance tradeoff acceptable. Their sole purpose was improving the security of their codebases.
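To make “a custom class that acted like std::span with bounds checks” a bit more concrete, here is a minimal sketch of what such a class could look like. This is not the internal class itself: the name `checked_span`, its interface, and the choice to throw `std::out_of_range` on a failed check are all assumptions made for illustration (a hardened implementation would more likely trap).

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Hypothetical illustration only; the name and interface are assumptions,
// not the internal class described above.
template <typename T>
class checked_span {
public:
    // Wrap a raw pointer and an element count.
    checked_span(T* data, std::size_t size) : data_(data), size_(size) {
        // Debug-only sanity check: a null pointer is only valid for an empty span.
        assert(data != nullptr || size == 0);
    }

    // Wrap a built-in array; the size is deduced, so it cannot be wrong.
    template <std::size_t N>
    checked_span(T (&array)[N]) : data_(array), size_(N) {}

    std::size_t size() const { return size_; }

    // Every element access is checked against the stored size.
    // Assumption: throw on failure; a hardened build might trap instead.
    T& operator[](std::size_t index) const {
        if (index >= size_)
            throw std::out_of_range("checked_span: index out of bounds");
        return data_[index];
    }

    // Sub-ranges are checked as well, written to avoid overflow in offset + count.
    checked_span subspan(std::size_t offset, std::size_t count) const {
        if (offset > size_ || count > size_ - offset)
            throw std::out_of_range("checked_span: subspan out of bounds");
        return checked_span(data_ + offset, count);
    }

private:
    T* data_;
    std::size_t size_;
};
```

Adopting something like this manually means replacing each `(pointer, length)` pair in an interface with one span object, which is the kind of mechanical rewrite that fixits are meant to help with.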
Our goal with this project is to take the experience that we’ve gained from the prototype to inform the development of a warning that is available to all clang users and, in particular, improve the user experience beyond what we had in the basic prototype. Even though this tool is clearly not for everybody and we definitely don’t envision it being applied to all codebases, the large amount of interest we’ve observed so far is alone a good indication that there may be other parties all over the world interested in a similar solution.
So, while definitely experimental, our tool is much less experimental than a lot of other similarly-looking suggestions. Our preliminary investigation goes way deeper than just saying “Hmm these cppcoreguidelines sound cool, let’s build a clang tool to enforce them”.
Given that, and given the overall positive feedback we see in this discourse thread, I am optimistic that despite a significant effort-for-security tradeoff we still have a relatively large audience. I also think that this amount of data is roughly as much as we can expect for any similar proposal; I don’t think it’s realistic to ask for much more data at this early stage of development. But we’ll definitely keep providing more data points as things progress.
Our (Positive) Relationship With Incremental Development
We’ve sent out this RFC early in the spirit of LLVM’s policy on incremental development, trying to avoid long-term downstream development branches.
So far we’ve seen both our company and other companies develop large features downstream and then only publish them when they’re finished enough for day-to-day use. However, we believe that developing them openly from the start would have been a better thing to do in a lot of these cases. Not only does it adhere to the policy, but it also saves effort on merge conflicts, avoids wasted work when the discussion leads to a change of direction, and keeps everyone informed to avoid duplication of effort. We’ve observed that such an open model works really well in other areas, such as my own area, the clang static analyzer, so we want to do more of that.
Of course, LLVM-style incremental development on trunk is impossible without some initial preparation. There needs to be a staging area where features can live temporarily until they mature. There need to be separate criteria for accepting a feature into a staging area versus accepting a feature as stable. For example, the static analyzer has “alpha” checkers that are hidden from the user until they mature, and it has a somewhat specific checklist for “moving out of alpha”. In order to develop clang warnings incrementally, we’ll need a similar staging area for clang warnings. Treating clang-tidy as a staging area for clang warnings, say, doesn’t sound quite right to me. Not only because clang-tidy is a lot more than that and it’s unfair to devalue its other strengths, but also because a not-insignificant amount of effort is necessary to move a warning from clang-tidy to clang proper. Incremental development is supposed to be composed of steps in the right direction, while putting a warning into the wrong tool is a step in the opposite direction. Whatever tool we ultimately choose, I think it shouldn’t be chosen just because it’s “experimental”; we’d much rather be able to focus on the ultimate greater good when making this decision.
So we’re interested in trying to “marry” these two concepts: the usefulness of incremental development and the understandably high bar for compiler features. We’re interested in working out the requirements and criteria for initiating incremental development and managing the early-stage uncertainty. We’re interested in developing the necessary staging areas, if they turn out to be needed; say, we could do -Wexperimental-… or cc1-only warning flags. A staging mechanism would be highly beneficial beyond just this proposal and would likely encourage others in the community to take a more incremental approach to their work as well.