First, I appreciate your interest in pushing this code back to the open source tree. I want to start with some high-level issues, as I think there needs to be some significant restructuring before we can really get into the details.
My main concern right now is the factoring of the code. None of this analysis logic should be added to the CFG class itself, but instead built on top of it. For example, the LiveVariables analysis builds on top of the CFG; it doesn’t embed information inside it. Instead, we should see a separate API entry point that, given a CFG, returns the analysis results.
Put another way, I should never expect to see analysis logic, like dominators, in CFG.h and CFG.cpp.
There are a variety of reasons for this. The first is modularity and purity. The CFG is just in the business of modeling control-flow. No more, and no less. That makes it API contract easy (or easier) to understand.
The second is just good software engineering. If we add 10 new analyses, do they need to get embedded into the CFG? Besides cluttering the API, that potentially means that some clients of the CFG need to pay the cost of having that logic their even if they don’t use. For example, the compiler itself won’t use the dominance analysis in the CFG, but if the analysis gets buried in the CFG API then it potentially must get linked in. Moreover, I should be able to extend the set of analyses possibly by providing new libraries or plugins; requiring that they are buried in the CFG doesn’t allow this flexibility.
Before I am really going to review this patch in detail, you will need to split your logic out of the CFG, and provide new classes and entry points for clients to use. That’s absolutely a prerequisite before it can get checked in.
My other concern is that there is no testing logic. We absolutely must have a way to test these analyses. We have very high quality standards in Clang, both in terms of code quality but also in terms of functionality. Without any tests, there is no reason to assume that these analyses really work at all.
What I propose is that you have similar testing to what we do with the LiveVariables analysis. There is a static analysis check, LiveVariablesDumper (lib/StaticAnalyzer/Checkers/DebugCheckers.cpp) that simply dumps out the live variables information for a CFG. That information can then be regression tested using FileCheck (there are many examples in the “test” directory of using FileCheck to examine output from a tool). While I’m not proposing that your analyses be part of the static analyzer proper, we can use the analyzer to test your analyses in this way. What I would then expect would be a set of tests for each of your analyses that regression tested the expected results of the analysis. Providing a way to dump these analysis results is not only good for regression testing, but also important just for debugging the behavior of these analyses in general.
The regression testing is also a prerequisite for this code getting checked into the Clang repository.