Declarative ExplodedGraph matching

gamesh411 · February 2, 2022, 7:32pm

Hi,
I’m aware that there have been attempts to develop a way of detecting bugs in ClangSA using a declarative description of the error condition, defined in terms of the structure of the ExplodedGraph. As far as I remember, the solution was akin to the ASTMatchers library: the graph nodes were matched by node-, narrowing- and traversal matchers (see AST Matcher Reference (llvm.org)).
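To make the idea concrete, here is a minimal sketch of the three matcher kinds applied to a graph instead of an AST. It uses a toy graph, not the real ExplodedGraph, and every name in it (`Node`, `callNode`, `calleeNamed`, `eventuallyFollowedBy`, `allOf`) is hypothetical and made up for illustration; none of them come from Clang or from the earlier proof-of-concept.

```cpp
#include <functional>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical toy stand-in for an exploded-graph node; NOT a Clang type.
struct Node {
  std::string Callee;              // callee name if this node is a call, "" otherwise
  std::vector<const Node *> Succs; // successor nodes along execution paths
};

using Matcher = std::function<bool(const Node &)>;

// Narrowing matcher: restricts a node by one of its properties.
Matcher calleeNamed(std::string Name) {
  return [Name](const Node &N) { return N.Callee == Name; };
}

// Node matcher: matches call nodes that also satisfy Inner.
Matcher callNode(Matcher Inner) {
  return [Inner](const Node &N) { return !N.Callee.empty() && Inner(N); };
}

// Combinator: both sub-matchers must accept the same node.
Matcher allOf(Matcher A, Matcher B) {
  return [A, B](const Node &N) { return A(N) && B(N); };
}

// Traversal matcher: some node strictly after the start node, on some path,
// satisfies Inner. Visited nodes are tracked so that loops terminate.
Matcher eventuallyFollowedBy(Matcher Inner) {
  return [Inner](const Node &Start) {
    std::unordered_set<const Node *> Seen;
    std::vector<const Node *> Work(Start.Succs.begin(), Start.Succs.end());
    while (!Work.empty()) {
      const Node *N = Work.back();
      Work.pop_back();
      if (!Seen.insert(N).second)
        continue; // already visited
      if (Inner(*N))
        return true;
      Work.insert(Work.end(), N->Succs.begin(), N->Succs.end());
    }
    return false;
  };
}
```

With these pieces, a (very naive) double-free condition could be spelled `callNode(allOf(calleeNamed("free"), eventuallyFollowedBy(callNode(calleeNamed("free")))))`. A real matcher would also have to bind the freed symbol and compare it across the two calls, which this sketch leaves out.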

I am interested in taking up this development and would like to hear the community’s opinion on this endeavor.

Shoutout to Artem Dergachev, as I think the initiative was his originally.

Szelethus · February 10, 2022, 8:28am

Hi!

Just for a point of reference, here is the initial discussion:

NoQ · February 11, 2022, 2:44am

Hi, yeah, I think it’s a lovely thing to have and also I think that now that Alexey’s proof-of-concept exists in the wild, we have a chance to explore the design space a bit further. ASTMatchers are great for compiler developers, but they fall short of the dream to provide a way for the users to develop their own checkers. I’m really curious if we can achieve that goal, as we probably have only one chance.

clang-query eliminated the need for the users to compile clang in order to run custom ASTMatchers, and we should probably preserve this achievement. The awesome IgnoreUnlessSpelledInSource feature lifts the requirement for understanding implicit/invisible AST nodes, thus reducing the entry barrier. Another problem with ASTMatchers, though, that remains unaddressed for now is their stability guarantees. Both changes in the AST and changes in the matchers themselves have the potential for breaking user-made matchers.

So I want to think really deeply about that last part. Even though static analyzer checkers can potentially introspect the AST as much as they want, the structure of checker callbacks that we have usually advocates for a much more high-level approach: say, checkLocation and checkBind group all memory reads/writes together regardless of how they’re represented in the AST, checkPreCall/PostCall treat all calls uniformly regardless of whether it’s a plain C function or a C++ virtual method or a temporary destructor or an overloaded operator new() invocation or an Objective-C message, etc.

Maybe if we focus on these basic building blocks, we can make a stable, easy-to-learn, high-level domain-specific language that speaks about the program under analysis in these very simple terms, so that it could be presented as a user-facing feature, despite being derived from the relatively unstable and somewhat quirky foundation of the Clang AST?
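As a rough illustration of what "very simple terms" could mean, here is a toy sketch in which a rule is plain data: an ordered list of high-level events (call, load, store) that must all happen to the same symbol along one path. Everything here (`EventKind`, `Event`, `Step`, `Rule`, `matchesPath`) is a hypothetical name invented for this sketch. It is not an existing analyzer API and says nothing about how the real design should look.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical high-level events, loosely mirroring what callbacks such as
// checkPreCall / checkLocation / checkBind expose; NOT Clang types.
enum class EventKind { Call, Load, Store };

struct Event {
  EventKind Kind;
  std::string Name;   // callee name for Call events, empty otherwise
  std::string Symbol; // the value the event is about (argument or location)
};

// One step of a user-written rule; an empty Name means "any name".
struct Step {
  EventKind Kind;
  std::string Name;
};

using Rule = std::vector<Step>;

static bool stepMatches(const Step &S, const Event &E) {
  return S.Kind == E.Kind && (S.Name.empty() || S.Name == E.Name);
}

// True if the rule's steps occur in order (not necessarily adjacent) along
// Path, with every matched event referring to the same symbol.
bool matchesPath(const Rule &R, const std::vector<Event> &Path) {
  if (R.empty())
    return true;
  for (std::size_t I = 0; I < Path.size(); ++I) {
    if (!stepMatches(R[0], Path[I]))
      continue;
    const std::string &Sym = Path[I].Symbol;
    std::size_t Next = 1; // index of the next rule step to satisfy
    for (std::size_t J = I + 1; J < Path.size() && Next < R.size(); ++J)
      if (Path[J].Symbol == Sym && stepMatches(R[Next], Path[J]))
        ++Next;
    if (Next == R.size())
      return true;
  }
  return false;
}
```

A use-after-free rule would then just be `Rule{{EventKind::Call, "free"}, {EventKind::Load, ""}}`, with no mention of the AST. Whether a free is a plain call, a destructor, or an Objective-C message would be hidden behind the `Call` event.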
