Hi Chris and Daniel,
Hi Jan,
These are good points! I'm cc'ing llvmdev, because this is pertinent to the larger community.
I was happy to be able to talk with both of you during the Dev Meeting. Today I
started to think a bit more about code reviews, LLVM and the future. From my
perspective the major thing that sets LLVM apart from other compiler
infrastructures is the attention put on the design to make things work the right
way, both within a module/library and between them. For example, I don't see
projects like LLDB being possible unless the modularity and code quality is a
major focus.
Given that, there are very few people that actually understand the underlying
design principles and ideas behind the code. Why is something done a certain
way, why were other alternatives not as good or not viable? With the popularity
of LLVM increasing all the time and new developers wanting to contribute more
and more code it seems that code reviews will become more of a bottleneck. In
GCC this is not so much of a problem, because there are a lot of maintainers that
do code reviews. The reviews are basically enforcing the coding standard, but
the design is very rarely considered. This is perhaps why GCC is a "Big Ball of
Mud" (I'm sure you have read it, but if not here's the link
Big Ball of Mud). LLVM will not go down the same path as GCC in this
regard, but the code only exposes the "final" decision on some design,
not the ideas behind it or how it may be used in the future. It seems to me that
it is a lot more difficult for someone from the outside to do a review that
truly reflects the ideas behind the code because they do not have all this
implicit information. In some cases it is not that difficult if the task is
simply to follow an existing pattern, e.g. if a new optimization pass is added,
but often new code makes some subtle change to the design that may or may not be
desirable.
Yeah, I completely agree. Getting people to write documentation in general is difficult, documenting "bad ideas that were rejected" is even harder. I'm hoping that the blog will be a good way to document design decisions and other stuff like this though. We do have one decent example so far:
I'd like to do more similar ones in the future, and encourage other contributors to also write other blog entries in general.
I was wondering what the plan is for new "code reviewers" or "maintainers"? Do
you think they are needed soon, or can it wait some time? It may be that LLVM is
moving forward fast enough today, but eventually new people will have to start
doing code reviews.
Technically, we have something like 150 code reviewers, because anyone with commit access is allowed to review a patch. The trick is to get people to step up more. This is happening a lot more lately than it did even 2-3 years ago, but it certainly could always be better.
Do you think it is viable to just let people work on LLVM
long enough and they will pick up on the designs and ideas and be able to do
reviews, or do you think it is necessary to have a deeper kind of knowledge
transfer for people to do it correctly? My fear is that bad changes that
may be fairly small can easily start to pollute the code base, and if people
don't realize it, it may require big re-factoring jobs to clean it up. At least
a lot of re-factoring happens today, which is good, but as LLVM grows, and if
there is not a consistent view between the people that do the reviews it is more
likely that design bugs/incompatibilities will get into the code. Have you had
any thoughts or concerns about this?
The big issue historically has been abandonware code: the Itanium backend, for example, never had reviewers because no one was actively maintaining it. This made it easy to just let people slam in any patches they wanted, because it couldn't break anything important.
The JIT is another interesting case. It is basically abandonware, suffers from poor design (because it was a very early component that didn't have the newfangled infrastructure to build on), and is very difficult to test (particularly for things like JIT debug or unwind info). Because of that, I personally have stayed away from reviewing patches to it. Your work on MC'izing the JIT is a light at the end of this tunnel, which will hopefully allow us to shoot the old JIT in the head when the new one is available.
I personally think that there are several key aspects of our development process that contribute to ensuring that LLVM maintains its high quality over time:
1) peer review (review after commit is enough if people are willing to revert and if people actually do read the patches and speak up).
2) automated testing
3) components that are not on-by-default can be held to lower standards while they are being implemented. If they work out and it seems like a good idea to turn them on by default, they should be given a really hard once-over.
4) continuous refactoring and improvement of the codebase
5) being willing to throw away old stuff that no one is using or maintaining, or replace things that are broken and terrible.
Of course it remains to be seen how this works out over the years, but we've managed about a decade so far without anything going too horribly wrong yet.
-Chris