I've found clang-query to be useful to query single files. Matching the Clang
AST is easy enough and fast to learn. I don't see how to use clang-query on a
whole project, though, which would be super useful.
A simple example where it's effectively useless: Assume some header "global.h"
containing "struct Global {};", every file in the project includes this header.
Now the idiomatic way to run clang-query over all the files in the project:
    clang-query -p $BUILD $(find . -name "*.cpp")
Let's try to find record decls named "Global":
    match recordDecl(hasName("Global"), isDefinition())
=> You get one match per file / translation unit. But in fact, this is not
really what you want. You want one match here.
So of course this is difficult: clang-query is TU-centric, while for a whole
project you'd "somehow" want a global view over the source code and not have
duplicate results for a query like above.
Questions:
- Is it possible to get clang-query to behave like that?
- Any research/pointers in that regard?
- Is it possible to filter duplicates in the results without external tools?
Thanks a lot for a great tool so far!
Well, usually at that point you’ll want to start writing a go/clangmr.
It might also be cool to make clang-query work as clangmr; that sounds like a nice 20% project.
In theory it should be easy to modify clang-query to check the location
of a declaration and only return declarations for unique locations.
Well, the problem is that a clang-query process is started for each TU, so you’d need to somehow tell clang-query which results were in a completely different process of clang-query. I don’t think this is going to be easy or worth it…
This isn't entirely accurate. Each TU is stored in a separate AST, but all
of the ASTs live in a single process. When a match query is run, we simply
iterate over a vector of TU ASTs (see MatchQuery::run in Query.cpp). If you
wanted to deduplicate results, you could probably do it in MatchQuery::run
by pretty printing each result's SourceLocation and using that as a string key.
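To make the deduplication idea concrete, here is a minimal standalone sketch. It does not use the real clang-query code: `Match`, `locationKey`, and `uniqueMatches` are hypothetical stand-ins, and the per-TU result vectors only model the loop over TU ASTs described above. In an actual patch the key would presumably come from pretty-printing the result's SourceLocation (something like `SourceLocation::printToString`), which is an assumption here rather than a tested change.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for one match result. In clang-query this would be
// a bound AST node together with its SourceLocation.
struct Match {
  std::string file;
  unsigned line;
  unsigned column;
};

// Build a "file:line:column" string key, similar in spirit to a
// pretty-printed SourceLocation.
std::string locationKey(const Match &m) {
  return m.file + ":" + std::to_string(m.line) + ":" +
         std::to_string(m.column);
}

// Each inner vector holds the matches found in one translation unit.
// Returns the matches in first-seen order, keeping one per location.
std::vector<Match>
uniqueMatches(const std::vector<std::vector<Match>> &perTU) {
  std::set<std::string> seen;
  std::vector<Match> unique;
  for (const auto &tuMatches : perTU)
    for (const auto &m : tuMatches)
      if (seen.insert(locationKey(m)).second) // true only for a new key
        unique.push_back(m);
  return unique;
}
```

With three TUs that all include global.h, the three matches for "struct Global" share one key and collapse to a single result. The set of keys grows with the number of distinct match locations across all TUs.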
That doesn’t really scale, though. But yeah, for small projects it might be enough…
Thanks for the pointers! This looks useful indeed.