Query regarding branching in Path Sensitive Analysis

Can someone please explain in detail how Clang Static Analyzer actually handles branching (i.e. if-else or switch-case) while doing Path Sensitive Analysis? From what I have understood till now, it visits ALL paths separately. I tried to experiment with this idea using some in-built checkers: checkBeginFunction(), checkEndFunction(), checkEndAnalysis() and checkBranchCondition() to explicitly print the execution flow of the analysis. For a function with a single if-else block, I expected the checkBeginFunction and checkEndFunction to be called twice each since there are two possible paths. But, both of them were called only once, and even more surprising is that checkEndFunction was called after the first path was completely analysed and checkEndAnalysis was called once at the end.

So, the final print order was:
Begin
Branch
EndFunction
Branch
EndAnalysis

And I cannot understand the reason behind this strange checker call order.

Additionally, it seems that checkEndAnalysis is also confusing, and I checked this using a function call within a function. The print order in that case is:
Begin(outer function)
Begin(inner function)
EndFunction(inner function)
EndFunction(outer function)
EndAnalysis

Note that EndAnalysis is called only once here, but I expected it to be called once after analysis of every function scope. So, it should have been printed twice.

Please explain the above two cases. It would be very helpful.

Every analysis run basically builds a graph, known as “Exploded Graph”, where every node is a (program state, program point) pair. So, every branch in the code (eg. if-statement) corresponds to branching in the graph, but merges in the code (eg. end of if-statement) don’t necessarily correspond to merge in the exploded graph (yes we reach the same program point on both branches, but if program states are different, eg. an assignment happened on one of the branches but not on the other, we won’t merge). Program points are context-sensitive, and when a function is called, we jump into that function without interrupting the analysis, then return to the caller and continue from there, hence only one EndAnalysis.

The graph is built/traversed in pretty much arbitrary order so debugging with prints is a really bad idea. Instead we have ways to dump the entire exploded graph, which is quite educational as it immediately explains everything on small examples, and it even can be used for debugging large real-world examples that are otherwise difficult to reduce.

There’s more info at Checker Developer Manual and I made a conference talk about this recently at 2019 LLVM Developers’ Meeting: A. Dergachev “Developing the Clang Static Analyzer” - YouTube

1 Like

Thanks for the reply! I think I understand what you said. But I still have a doubt about the EndFunction() checker. In all the sources I have checked, including the checker documentation, it is supposed to be called every time the function returns during the analysis, which in case of a single if-else would be two times, or four times for two consecutive if-else blocks and so on. But as in the above examples (as well as some other ones which I have tried), that doesn’t seem to be the case. Is there something else about that checker which I am missing? Any clarifications will be greatly appreciated.

I recommend looking at the exploded graph, it most likely has the answer. At a glance, this may happen if we’ve merged two paths when they reached the same program point with the same program state.

1 Like