Static analyzer question on how to tell if a Decl is a particular c++ function, or a particular class

Thanks Artem, Aleksei for those precious information.

Yes, I observed exactly the same thing about MaterializeTemporaryExpr and I actually spent quite sometime trace into the clang’s codebase and reached to the exactly same place in the link: createTemporaryRegionIfNeeded. My workaround was inspired by one of the example checkers, that uses a checkPostStmt on MaterializeTemporaryExpr. Thanks for pointing out this.

Aleksei, thanks for pointing out the getQualifiedNameAsString, I didn’t know I could use that . I personally prefers it because getParent()->getName() doesn’t contain the namespace, but only the class name. So to make it safe, I need to go all the way up to traverse all parent decls, and getQualifiedNameAsString already does the thing for me.

Thanks for taking your time to answer my question. I appreciate it.

Now I observed some other weird thing, and I wonder if anyone has seen this before and knows any solution.

As a reminder, the checker I am writing is for a particular class T in my company’s code base and the checker is to ensure T.ok() == true before calling T.value().

The above is actually a over-simplification to my task. The class T actually has a member, Let’s say s_ (in type S) to keep track of the status, so T.ok() actually returns s_.ok(). And type S is used directly by a few other classes and macros which are used to creates T. So to make the check fully working, I also need to model the behaviors of S and a few other classes.

Fortunately, T are template class and therefore all method implementations are visible to current TU. I was thinking, it is a whole lot easier if I can just model the behavior of S, and rely on inline evaluation in the static analyzer. In most cases, it works, except I found the following corner case (suppose T has a conversion constructor from X: T(X&&) and default move constructor T(T&&)):

X f();
T t = f();

This is essentially constructs a temporary T t2{f()} and then doing a move T t{std::move(t2)}. It looks fine if everything is inlined, however, the inline doesn’t work because t2 is a temporary. I dived into the code base and what happened is

  1. At the end of ExprEngine::getRegionForConstructedObject, it is marked as improperly modeled target region:
    CallOpts.IsCtorOrDtorWithImproperlyModeledTargetRegion = true;

  2. In ExprEngine::mayInlineCallKind, it returns false if the flag is set:

      // If we did not find the correct this-region, it would be pointless
      // to inline the constructor. Instead we will simply invalidate
      // the fake temporary target.
      if ([CallOpts](https://cs.corp.google.com/piper///depot/google3/third_party/llvm/llvm/tools/clang/lib/StaticAnalyzer/Core/ExprEngineCallAndReturn.cpp?l=618&ct=xref_jump_to_def&gsn=CallOpts&rcl=186553626).[IsCtorOrDtorWithImproperlyModeledTargetRegion](https://cs.corp.google.com/piper///depot/google3/third_party/llvm/llvm/tools/clang/include/clang/StaticAnalyzer/Core/PathSensitive/ExprEngine.h?l=62&ct=xref_jump_to_def&gsn=IsCtorOrDtorWithImproperlyModeledTargetRegion&rcl=186553626))
        return [CIP_DisallowedOnce](https://cs.corp.google.com/piper///depot/google3/third_party/llvm/llvm/tools/clang/include/clang/StaticAnalyzer/Core/PathSensitive/ExprEngine.h?l=623&ct=xref_jump_to_def&gsn=CIP_DisallowedOnce&rcl=186553626);

I am not an expert in C++ compiler. I just wonder why it can’t allocates a temporary region for temporary variables? Having this special case means I need to model all T’s constructors, which is really annoying. Does anyone knows some workarounds? Thanks.

You're essentially in an area that's largely under construction at the moment, and i'm likely to address your issue in like nearest future, you can probably follow my progress on phabricator or in cfe-commits, or i can try to ping you when this case is supported.

Being source-based, the analyzer needs to model every possible language construct explicitly, and in C++ it means *a ton of stuff*. We could analyze, like, llvm IR instead, and it'd be much easier to implement correctly for us, but it'd be much harder to produce user-friendly warnings, because the user does think in terms of source code. And it'd be harder to make checkers for the same reason. Which is why "symbolic execution" we're doing here is to compilation like algebra to arithmetic: we need to do the same thing - support all language constructs - but with a lot of "concrete" values replaced with algebraic unknown values. Luckily, it is also easier because we don't have to support everything perfectly: we can always say that we don't know something, behave "conservatively", and the analysis would be slightly less precise but this is fine if there's not too much of these.

Modeling temporaries is one of such cases - this is one of the difficult parts of the language that we never did properly until now. One does not simply construct into a temporary region because in this case we're not even sure that it is indeed a temporary region in this part of the analyzer, it cannot be understood by looking at the construct-expression in the AST, which is something i'm working around now by introducing "construction contexts". And even if we were sure it's a temporary, we need to not only construct the object, but also know how to properly extend its lifetime and destroy it when lifetime ends, otherwise we'd encounter massive false positives (because it'd not be "we don't know something", it'd be "we know something but it's wrong"), which should also be part of the "context" in which construction takes place. So for now this part simply behaves conservatively.

Your particular case will be relatively tricky when f() is inlined during analysis. In this case the temporary will be constructed within the function, but in order to properly work with this temporary we'd need what the caller function is going to do with it - at least according to the AST that we currently have this is what we have to do. More info in http://lists.llvm.org/pipermail/cfe-dev/2018-February/056898.html - "== Return by value ==" part. CodeGen does it differently, treating a lot of AST rvalues as lvalues under the hood, but i don't think we can afford breaking this invariant for now, because, well, we're more complicated and fragile.

You're essentially in an area that's largely under construction at the moment, and i'm likely to address your issue in like nearest future, you can probably follow my progress on phabricator or in cfe-commits, or i can try to ping you when this case is supported.

It's https://reviews.llvm.org/D44124 and https://reviews.llvm.org/D44131 - after that return values should work. The former is about functions that can be "inlined" during analysis (i.e. we see what object is being returned), the latter is about functions that aren't inlined (but it should make it easier to track them).