advice with development on clang-tidy matchers

I’m finding it very hard to find information and examples on using the matchers and API to do a number of things. I wondered if anyone knew of a good source of introductory info that goes into some depth? Eli Bendersky’s blogs are very good but I need more detail on specific things: I’m keen to use matchers to match standard library types by name but that seems hard.

I have work in progress here:

I’m stuck on both:

  • MissingNameSpaceStd can’t cope with pointers, references or template types like std::vector<std::pair<int,int>>

  • PropagateConst can’t match smart pointers. Ideally it would do so by duck-typing but I’d settle for name matching for now.

I’ve posted messages on the clang IRC channel but am met by stony silence. I’ve not used IRC before so perhaps my setup is wrong.

I set up a review account on http://llvm.org/docs/Phabricator.html and have patches for the git repositories above but need a repository name for Differential (clang, clang-tools-extra, llvm are not right) to make progress.

Thanks for any help you may be able to offer.

regards,

Jon

> I’m finding it very hard to find information and examples on using the matchers and API to do a number of things. I wondered if anyone knew of a good source of introductory info that goes into some depth? Eli Bendersky’s blogs are very good but I need more detail on specific things: I’m keen to use matchers to match standard library types by name but that seems hard.

Shameless plug for all the docs we wrote and a talk of mine :)
http://clang.llvm.org/docs/IntroductionToTheClangAST.html

http://clang.llvm.org/docs/LibASTMatchers.html

http://clang.llvm.org/docs/LibASTMatchersReference.html

> I have work in progress here:
>
> I’m stuck on both:
>
>   • MissingNameSpaceStd can’t cope with pointers, references or template types like std::vector<std::pair<int,int>>
>
>   • PropagateConst can’t match smart pointers. Ideally it would do so by duck-typing but I’d settle for name matching for now.
>
> I’ve posted messages on the clang IRC channel but am met by stony silence. I’ve not used IRC before so perhaps my setup is wrong.

Feel free to ping me directly there (r4nt) during CEST hours. Note that yesterday was a holiday in the US, so not many people were online.

> I set up a review account on http://llvm.org/docs/Phabricator.html and have patches for the git repositories above but need a repository name for Differential (clang, clang-tools-extra, llvm are not right) to make progress.

You don’t need to fill in the repo name.

In article <CAAbBDD_8q2LS4V5hdBrkCjvL5SagR3Y+btLrwfekqn08k7A29g@mail.gmail.com>,
    Jonathan Coe <jbcoe@me.com> writes:

> I'm finding it very hard to find information and examples on using the
> matchers and API to do a number of things.

It took me a while to wrap my head around the matcher API, but I think
I've got a handle on it now. I've got two reviews up with some new
checks for clang-tidy:

Add readability-simplify-boolean-expr check to clang-tidy
<http://reviews.llvm.org/D7648>

Add readability-remove-void-arg check to clang-tidy
<http://reviews.llvm.org/D7639>

(This is based on an earlier version I had as a tutorial on GitHub:
<https://github.com/LegalizeAdulthood/remove-void-args>. That
tutorial needs updating to be based on clang-tidy and my final version
of the check.)

> I wondered if anyone knew of a
> good source of introductory info that goes into some depth?

Definitely look at the links that Manuel Klimek posted. Additionally,
I find myself coming back to this page all the time:
<http://clang.llvm.org/docs/LibASTMatchersReference.html>

> Eli Bendersky's
> blogs are very good but I need more detail on specific things: I'm keen to
> use matchers to match standard library types by name but that seems hard.

Have you tried using the hasName narrowing matcher?

> I'm stuck on both:
>
>    - MissingNameSpaceStd can't cope with pointers, references or template
>    types like std::vector<std::pair<int,int>>
>
>    - PropagateConst can't match smart pointers. Ideally it would do so by
>    duck-typing but I'd settle for name matching for now.

I find that I go back and forth between the AST matcher reference page,
clang-query, and exploring the API by navigating in CLion. If you
haven't used clang-query yet, build it from clang-tools-extra
trunk, and make sure you have the line edit library installed first.
Without lineedit you don't get tab completion, and tab completion is
what lets you explore the valid narrowing matchers and child matchers
for the match expression you're building. This is VERY useful for
exploring what can be done with the matchers.

However, not every single attribute or node traversal is exposed as a
matcher. For instance, just the other day I was exploring how you
would identify constexpr functions. You can do it by matching a
functionDecl and calling isConstexpr() on the matched node inside your
callback, but it sure would be nice to have this as a matcher so
that your callback isn't invoked at all for the other functions.

So, matchers are a work in progress like most things in software. The
matchers that allow you to identify code expanded in the main file for
the compilation unit appeared in clang trunk but weren't in the 3.5
release. Things are being added based on what's useful when writing
checks in clang-tidy or other libtooling code.

I see that you added a FileCheck test for your clang-tidy check. This
is great! What I do is drive things forward one test case at a time
from the FileCheck source file. First I add a CHECK-MESSAGES to
ensure that the proper diagnostic is being issued. Then I add
CHECK-FIXES to verify that the proper substitution is taking place.
In between a failing test case and a passing test case is heavy use of
the AST matcher reference page and clang-query.

> In article <CAAbBDD_8q2LS4V5hdBrkCjvL5SagR3Y+btLrwfekqn08k7A29g@mail.gmail.com>,
>     Jonathan Coe <jbcoe@me.com> writes:
>
> > I'm finding it very hard to find information and examples on using the
> > matchers and API to do a number of things.
>
> It took me a while to wrap my head around the matcher API, but I think
> I've got a handle on it now.

For me, the insight after staring at the API for a while is that it is
basically a tree-interpreter pattern, except that the "tree" that is
"interpreted" actually ends up open-coded at compile time. E.g.
allOf(foo(), bar()) basically is (in pseudocode pointer-leaky OO style):

class AbstractTreeInterpMatcher {
  virtual bool matches(Node *) = 0;
};

class allOf : public AbstractTreeInterpMatcher {
  vector<AbstractTreeInterpMatcher*> SubMatchers;
  bool matches(Node *N) override {
    for (auto *M : SubMatchers)
      if (!M->matches(N))
        return false;
    return true;
  }
};

class foo : public AbstractTreeInterpMatcher {
  bool matches(Node *N) override { .... };
};
class bar : public AbstractTreeInterpMatcher {
  bool matches(Node *N) override { .... };
};

Then the matcher allOf(foo(), bar()) is basically
`auto *M = new allOf{{new foo(), new bar()}}`, and the root matching
process is just a call to `M->matches(TheNode)`. There's a bit more state
that filters through in reality, but that's the essence of it.

The real implementation is pretty close to what I have above, but less
Java-like, with lots of template and macro machinery; e.g. instead
of AbstractTreeInterpMatcher there is a MatcherInterface<T>. I get the
impression that the pattern is a lot more familiar when phrased in this
textbook OO style (since there is a very clear "object recursion"
happening), while the template/macro-generated version is harder to
see through because it is less clear what is being recursed on, even
though under the hood it is doing the exact same thing.

E.g. looking at http://clang.llvm.org/docs/LibASTMatchers.html,

VariadicDynCastAllOfMatcher<Base, Derived>

is really just allOf together with (in pseudocode, but this could in fact
be written essentially like this):

class dynCastMatcher : public AbstractTreeInterpMatcher {
  int BaseKindNum;
  int DerivedKindNum;
  bool matches(Node *N) override {
    switch (BaseKindNum) {
      case DeclKind:
        return N->getAsDecl() &&
               N->getAsDecl()->getKind() == DerivedKindNum;
      case StmtKind:
        return N->getAsStmt() &&
               N->getAsStmt()->getStmtClass() == DerivedKindNum;
      case TypeKind:
        return N->getAsType() &&
               N->getAsType()->getTypeClass() == DerivedKindNum;
      default: return false;
    }
  }
};

then the code:

namespaceDecl(foo(), bar())

becomes

new allOf{{new dynCastMatcher{DeclKind, NamespaceDeclKind},
           new foo(), new bar()}}

The role of `Node *` in this example is basically what
clang::ast_type_traits::DynTypedNode does (see
http://clang.llvm.org/doxygen/classclang_1_1ast__type__traits_1_1DynTypedNode.html
).

-- Sean Silva