Hi!
As a ping, I would like to summarize the measurements I have done since the
original e-mail:
The approach is to first serialize all the translation units to storage,
create an index of the functions, and then load them lazily on demand to
achieve cross translation unit support. This does not modify the
inter-procedural analysis of the Static Analyzer and could be used for
clang-tidy as well. Once a new inter-procedural analysis is introduced for
the Static Analyzer, the cross-TU support would profit from it immediately.
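The two-phase scheme above (serialize, index, load lazily) can be sketched roughly as follows. This is only an illustration of the idea; the names (build_index, lookup) and data shapes are made up and are not the real Clang/CTU APIs:

```python
# Minimal sketch of the CTU scheme described above: build a function index
# over the serialized TU ASTs, then deserialize a TU only on first use.
# All names and data shapes here are illustrative, not real Clang APIs.

def build_index(serialized_tus):
    """Map each externally visible function to the AST file that defines it."""
    index = {}
    for ast_file, functions in serialized_tus.items():
        for fn in functions:
            index.setdefault(fn, ast_file)  # keep the first definition seen
    return index

def lookup(fn, index, ast_cache, load_ast):
    """Return the AST containing fn, loading its TU lazily on first use."""
    ast_file = index.get(fn)
    if ast_file is None:
        return None  # no external definition; fall back to conservative evaluation
    if ast_file not in ast_cache:
        ast_cache[ast_file] = load_ast(ast_file)  # lazy deserialization
    return ast_cache[ast_file]
```

Because the index is built once up front and each analysis process only loads the TUs it actually needs, this layout is also what makes the multi-process setup mentioned below possible.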
Benchmarks:
rAthena, a 150k LOC C project:
The size of the serialized ASTs was: 140MB
The size of the indexes: 4.4MB
The time of the analysis was below 4x
The amount of memory consumed was below 2x
The number of reports was 3 times higher
Xerces, a 150k LOC C++ project:
The size of the serialized ASTs was: 800MB
The size of the indexes: 90MB
The analysis time using CTU was about half of the analysis time without CTU
LLVM + Clang + Clang tools extra:
The size of the serialized ASTs was: 45.4 GB
The size of the indexes: 1.6 GB
Some optimization efforts to reduce the size of the serialized ASTs:
TU ASTs after omitting function bodies from headers: 42.7 GB
TU ASTs after omitting STL: 34.0 GB
TU ASTs after skipping implicit instantiations: 21.5 GB
TU ASTs after omitting STL and implicit instantiations: 16.0 GB
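Expressed relative to the 45.4 GB baseline, the reductions above come out as follows (a quick sanity computation over the listed sizes):

```python
# Relative savings of each AST-size optimization listed above,
# against the 45.4 GB baseline for LLVM + Clang + clang-tools-extra.
baseline = 45.4  # GB
sizes = {
    "omit function bodies from headers": 42.7,
    "omit STL": 34.0,
    "skip implicit instantiations": 21.5,
    "omit STL and implicit instantiations": 16.0,
}
reduction = {name: round(100 * (1 - gb / baseline)) for name, gb in sizes.items()}
for name, pct in reduction.items():
    print(f"{name}: {pct}% smaller")
```

So the combined STL + implicit-instantiation filtering removes roughly two thirds of the serialized data.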
Considering that the build directory of a debug build is also about 40 GB
on my platform, I do not consider the size of the serialized ASTs a
blocking issue. However, in case it is a concern, according to the attached
statistics about the serialized AST dumps, there are some optimization
possibilities in the binary format.
This solution also works in a multi-process build/analysis environment.
Some of the necessary framework, for example the ASTImporter code, is being
accepted into mainline Clang right now.
All in all, this approach:
- Can discover new bug reports as is.
Hi Gabor,
I’d like to get more data on this point. We know that inlining-style
inter-procedural analysis does not scale. That is the main issue with
adding the cross-translation unit support to the clang static analyzer.
I suspect that the only way the suggested approach works is that we time
out more often when the cross-TU support is enabled. Since the static
analyzer performs DFS of the paths, this means that we trade catching bugs
localized within a single TU for bugs that require deep cross-TU analysis.
I suspect those bugs are much harder to understand but, more importantly,
we lose coverage in a lot of cases instead of gaining it.
Hi Anna,
Good point. Using the cross-TU analysis, the analyzer has a very different
coverage pattern. I have not had time to look into the reports yet, but
will do so soon.
Looks like you claim that the number of bugs reported increased for one of
the projects, but there is no data for all the other projects. Also, had
that project ever been analyzed with the clang static analyzer before and
had some of the intra-procedural bugs been fixed before you made the
measurement?
There is no trace of fixed Static Analyzer warnings in the commit logs of
the rAthena project. But even if issues found without cross-TU analysis had
been fixed, it can still be useful to analyze the project with a different
coverage pattern.
I’d like to see the absolute number of bugs reported in both cases. Also,
it would be good to measure coverage and the number of times the analyzer
times out in both cases. (Those statistics should be easy to generate, as
the tracking is already implemented by the analyzer and we used it when
bringing up inter-procedural analysis.)
I did some measurements, and here are the results for the rAthena project:
Coverage results without cross translation unit:
105499 AnalysisConsumer - The # of basic blocks in the analyzed functions.
4167 AnalysisConsumer - The # of functions and blocks analyzed (as top level with inlining turned on).
10084 AnalysisConsumer - The # of functions at top level.
60.0244651798 AnalysisConsumer - The % of reachable basic blocks.
3360 AnalysisConsumer - The maximum number of basic blocks in a function.
960440 CoreEngine - The # of paths explored by the analyzer.
140425349 CoreEngine - The # of steps executed.
646201 ExprEngine - The # of aborted paths due to reaching the maximum block count in a top level function.
24317519 ExprEngine - The # of times RemoveDeadBindings is called.
489956 ExprEngine - The # of times we inlined a call.
20300 ExprEngine - The # of times we re-evaluated a call without inlining.
Coverage results with cross translation unit:
150406 AnalysisConsumer - The # of basic blocks in the analyzed functions.
3308 AnalysisConsumer - The # of functions and blocks analyzed (as top level with inlining turned on).
9829 AnalysisConsumer - The # of functions at top level.
48.9545495971 AnalysisConsumer - The % of reachable basic blocks.
3360 AnalysisConsumer - The maximum number of basic blocks in a function.
2088751 CoreEngine - The # of paths explored by the analyzer.
345875017 CoreEngine - The # of steps executed.
464409 ExprEngine - The # of aborted paths due to reaching the maximum block count in a top level function.
64756893 ExprEngine - The # of times RemoveDeadBindings is called.
2176922 ExprEngine - The # of times we inlined a call.
44215 ExprEngine - The # of times we re-evaluated a call without inlining.
Note that the results are not fully precise for the following reasons:
* I simply collected and merged the results for each translation unit. In
the cross translation unit case a function might be analyzed from multiple
translation units, which may introduce duplicates in the number of analyzed
basic blocks. Inline functions in headers might cause the same effect even
when cross translation unit analysis is turned off.
* The coverage % is also imprecise: basic blocks not reached in one TU
might be reached from another TU during the analysis.
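The duplicate-counting caveat can be illustrated with a toy merge; the per-TU numbers here are made up purely for illustration:

```python
# Toy illustration of the caveat above: naively summing per-TU block counts
# counts a function twice when it is analyzed from more than one TU.
# The numbers are invented; "f" stands for an inline function from a header
# that both a.c and b.c pull in.
per_tu_blocks = {
    "a.c": {"f": 10, "g": 5},
    "b.c": {"f": 10, "h": 7},
}

# Naive merge: add up every per-TU count (what the measurement above did).
naive_total = sum(sum(counts.values()) for counts in per_tu_blocks.values())

# Deduplicated merge: count each function's blocks only once.
merged = {}
for counts in per_tu_blocks.values():
    merged.update(counts)
unique_total = sum(merged.values())

print(naive_total, unique_total)  # the naive merge over-counts f's blocks
```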
Even though these numbers are just hints, they show roughly a 43% increase
in the number of analyzed basic blocks (150406 vs. 105499). This makes me
think that the overall coverage has increased (though, as you pointed out,
the pattern is different). It is also worth noting that the number of
functions analyzed as top level decreased.
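For reference, the ratios implied by the two statistics dumps above can be computed directly:

```python
# With-CTU vs. without-CTU ratios, using the figures quoted above.
blocks = 150406 / 105499        # analyzed basic blocks: ~1.43x
paths  = 2088751 / 960440       # paths explored: ~2.17x
steps  = 345875017 / 140425349  # steps executed: ~2.46x
print(round(blocks, 2), round(paths, 2), round(steps, 2))
```

Note that the analyzer explores over twice as many paths and steps to reach the ~43% gain in analyzed blocks, which fits the observed increase in analysis time.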
In case the issues reported this way are too hard to understand, the
heuristics might be tuned for when this feature is turned on. This is no
silver bullet of course, but once a new inter-procedural analysis is
introduced, this approach would provide an easy way to evaluate it in a
cross-TU setting.
What do you think?
Regards,
Gábor