Creating an ASTMatcher from code?

Hello,

I have been playing a bit with LibASTMatchers and it is very tedious to use.
My workflow is to dump the AST for the relevant code and translate it into matchers, using clang-query to make sure they work.

I was wondering if there is any tool (or proposal, work in progress...) for automating this work.
The end goal would be to give it some C++ code as input and get the corresponding matcher as output.

Thanks,
David

Unfortunately there is no such tool. Would be awesome to have, though :slight_smile:

Sure

Sorry about earlier message - premature "send"... :frowning:

So, surely this is rather hard to do.

If you write "match this: `class T; T* ptr = new T;`", what should it match?
Only the exact class called `T`? Or, at the opposite extreme, any assignment
of `new` of any type, e.g. `int *ptr = new int;`? What about
`const X* p = new Y;`? Or `std::vector<int*> a = { new int, new int };`?
What do you want to do with `T* ptr{new T};` - it does essentially the same
thing, but syntactically and semantically it's not identical, so I'd expect
the AST is not identical either. And if the AST is identical when you didn't
want to match that form, it's a problem [although not a solvable one, since
the AST is the same].

Of course these examples are mostly nonsense and can be solved in other
ways; I'm just trying to point out that "matching something" often means
distinguishing "this is what I want to match" from "this I don't care
about". So, do you care if the type is `const` or not? Do you care if the
type is a `class` or `struct` vs. `int` etc.? Do you care if it's exactly an
assignment, or anything that "kind of assigns"?

What I'm trying to say is that it would probably produce something to
start from, but I'm not convinced it is THAT much help, since you still
have to apply some fuzz and modify the code to do what you ACTUALLY want to
do.

One way to address this would be to allow multiple expressions and generate
the most specific matcher that matches all those expressions. Still
wouldn't catch all cases, but might help make the first pass a little
closer to what you want.
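
To make that a bit more concrete, here is a rough sketch of the "most specific matcher that covers all examples" idea on a toy tree. Everything in it is an assumption for illustration: `Node`, `leaf`, `wrap`, `generalize` and `render` are made-up names, the trees are written by hand rather than derived from a real Clang AST, and the printed strings only resemble matcher syntax (they are not checked against LibASTMatchers). Wherever the two examples disagree, the sketch gives up and emits a wildcard.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical toy node: a matcher-like name, an optional string argument,
// and nested children. This is NOT a Clang type.
struct Node {
    std::string kind;
    std::string arg;             // "" means "no argument"
    std::vector<Node> children;
};

Node leaf(std::string kind, std::string arg = "") {
    return Node{std::move(kind), std::move(arg), {}};
}

Node wrap(std::string kind, std::vector<Node> children) {
    return Node{std::move(kind), "", std::move(children)};
}

// Keep what both trees agree on; replace any disagreement (different kind,
// argument, or number of children) with a wildcard node.
Node generalize(const Node& a, const Node& b) {
    if (a.kind != b.kind || a.arg != b.arg ||
        a.children.size() != b.children.size()) {
        return leaf("anything");
    }
    Node out{a.kind, a.arg, {}};
    for (std::size_t i = 0; i < a.children.size(); ++i) {
        out.children.push_back(generalize(a.children[i], b.children[i]));
    }
    return out;
}

// Print the tree in a matcher-like notation.
std::string render(const Node& n) {
    std::string out = n.kind + "(";
    if (!n.arg.empty()) out += "\"" + n.arg + "\"";
    for (std::size_t i = 0; i < n.children.size(); ++i) {
        if (i > 0 || !n.arg.empty()) out += ", ";
        out += render(n.children[i]);
    }
    return out + ")";
}

// Hand-written trees standing in for `T* ptr = new T;` and
// `int* ptr = new int;`.
std::string generalized_example() {
    Node fromClass = wrap("varDecl", {
        wrap("hasType", {wrap("pointsTo",
            {wrap("cxxRecordDecl", {leaf("hasName", "T")})})}),
        wrap("hasInitializer", {leaf("cxxNewExpr")})});
    Node fromInt = wrap("varDecl", {
        wrap("hasType", {wrap("pointsTo", {leaf("builtinType")})}),
        wrap("hasInitializer", {leaf("cxxNewExpr")})});
    return render(generalize(fromClass, fromInt));
}
```

With these two inputs, `generalized_example()` returns `varDecl(hasType(pointsTo(anything())), hasInitializer(cxxNewExpr()))`: the pointee type is the only place the examples differ, so only that part is loosened. A real tool would need something smarter than position-by-position comparison, e.g. to treat `T* ptr{new T};` as the same shape.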

And I have an existence proof that this approach works well for simpler ASTs (Java), but it's hard to say whether it would work equally well for C++.