C++ analysis priorities

Slightly off topic, but something I've been wondering for a while.
Without actually investigating this, but instead just reading the list, it
seems that most of the checkers that are written are specific to known
functions. For example, the malloc checker works for known allocation
functions rather than arbitrary ones. Is there a reason why there isn't
an effort to make this more extensible so that any function that is
annotated to be an allocation function can benefit from the malloc
checker (other than the obvious issue of resources and prioritization)?
And similar for other checkers. I'm mostly interested in why this
possibility is never explicitly talked about.

Having annotations is something we are definitely interested in.

I think one reason we still do not have them is that designing sufficiently expressive annotations is not a trivial task. Writing an effective checker for the most popular functions first would provide feedback on what the annotations should be.

Do you have examples of specific scenarios you'd like to annotate (or are you mostly talking about malloc/free-like pairs)?

Thanks,
Anna.


AFAIK, if this is for malloc/free pairs, you can use the ownership attributes (ownership_holds, ownership_returns, ownership_takes).

void __attribute__((ownership_returns(malloc))) *malloc(size_t);
void __attribute__((ownership_takes(malloc, 1))) free(void *);

If you have your own allocator that is not malloc-based, you can use another identifier.

void __attribute__((ownership_returns(my_pool))) *my_malloc(size_t);
void __attribute__((ownership_takes(my_pool, 1))) my_free(void *);
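
As a rough sketch of how these annotations might sit on a custom allocator, the snippet below wraps the two attributes in macros so that compilers without them still build the code. The pool itself (my_malloc, my_free, live_blocks, balanced_use) and the macro names are hypothetical, made up for illustration; only the attribute spellings come from the declarations above. Whether a given analyzer build acts on them depends on which checkers are enabled.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Expand to the ownership attributes when the compiler knows them,
// otherwise to nothing (macro names are an assumption of this sketch).
#if defined(__has_attribute)
#  if __has_attribute(ownership_returns) && __has_attribute(ownership_takes)
#    define OWNERSHIP_RETURNS(pool) __attribute__((ownership_returns(pool)))
#    define OWNERSHIP_TAKES(pool, idx) __attribute__((ownership_takes(pool, idx)))
#  endif
#endif
#ifndef OWNERSHIP_RETURNS
#  define OWNERSHIP_RETURNS(pool)
#  define OWNERSHIP_TAKES(pool, idx)
#endif

// Hypothetical pool allocator; the identifier my_pool is arbitrary.
OWNERSHIP_RETURNS(my_pool) void *my_malloc(std::size_t size);
OWNERSHIP_TAKES(my_pool, 1) void my_free(void *ptr);

int live_blocks = 0;  // blocks handed out and not yet returned

void *my_malloc(std::size_t size) {
  void *p = std::malloc(size);
  if (p) ++live_blocks;
  return p;
}

void my_free(void *ptr) {
  if (!ptr) return;
  --live_blocks;
  std::free(ptr);
}

// Allocates and releases one block; returns how many blocks are still live.
int balanced_use() {
  void *p = my_malloc(16);
  assert(p != nullptr);
  my_free(p);  // leaving this call out is the kind of leak a checker could report
  return live_blocks;
}
```

The declarations carry the attributes and the definitions do not, which keeps the annotated interface in one place (typically a header).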


From: Ted Kremenek
Sent: 1/11/2012 9:08 PM
To: Tom Care
Cc: cfe-dev@cs.uiuc.edu
Subject: Re: [cfe-dev] C++ analysis priorities

Hi all,

I’m looking at possibly contributing to C++ analysis support over the next few months as part of a master’s project. I have a rough idea of things that need to be implemented, but I am not sure how to prioritise them. I am hoping that the community can assist me here - what is currently stopping your programs from being analyzed?

My general goal is to implement features that will assist in analyzing the LLVM/Clang codebase; however, looking at the current code, it seems that existing support for some language features (e.g., ctors/dtors) will have to be improved as well.

Thanks,
Tom

Hi Tom,

I see that C++ support can grow in largely two directions:

(1) Core infrastructure, with interprocedural support for inlining C++
constructors/destructors to support RAII. This entails a bunch of
intermediate infrastructure work to get there.

(2) Checkers. Having C++-specific checkers will make the analyzer
more useful for C++ developers. This could be as simple as catching
mismatches between new[] and delete, or new and delete[], and many
others, including providing checkers for correct usage of widely used
C++ APIs (e.g., Boost).

I think both are worth making progress on, and to do (2) some progress
will likely need to be made on (1).

As far as infrastructure work, here are some areas that need work:

(a) Better representation of C++ constructor and destructor calls in
the CFG. There is a bunch already there, but as it has been observed
on the list lately there are serious deficiencies and outright bugs.
Ideally we should be able to represent the complete initialization
logic of a constructor in the CFG, from calling the constructor of a
parent class to correctly sequencing the initializers.

Along this trajectory, there are various optimizations we can do to
the CFG representation itself to make it easier to represent
destructor calls. What we do now is a bit complicated, IMO.

(b) ExprEngine “inlining” support for C++ constructors/destructors.
Interprocedural analysis is one area we would like to grow the
analyzer, and one technique to do that is to simply “inline” function
calls for function bodies that are available. Some of this has been
prototyped in the analyzer already, and there is currently work on
making it more solid, at least for inlining simple functions. Being
able to do this well for simple C++ objects that are used for RAII,
for example, will be really critical for making some checkers really
shine for C++.

(c) Support for additional C++ expressions. In ExprEngine::Visit(),
you can see a whole bunch of C++ AST expression kinds that are simply
not handled, and halt static analysis altogether:

  case Stmt::CXXBindTemporaryExprClass:
  case Stmt::CXXCatchStmtClass:
  case Stmt::CXXDependentScopeMemberExprClass:
  case Stmt::CXXPseudoDestructorExprClass:
  case Stmt::CXXThrowExprClass:
  case Stmt::CXXTryStmtClass:
  case Stmt::CXXTypeidExprClass:
  case Stmt::CXXUuidofExprClass:
  case Stmt::CXXUnresolvedConstructExprClass:
  case Stmt::CXXScalarValueInitExprClass:
  case Stmt::DependentScopeDeclRefExprClass:
  case Stmt::UnaryTypeTraitExprClass:
  case Stmt::BinaryTypeTraitExprClass:
  case Stmt::ArrayTypeTraitExprClass:
  case Stmt::ExpressionTraitExprClass:
  case Stmt::UnresolvedLookupExprClass:
  case Stmt::UnresolvedMemberExprClass:
  case Stmt::CXXNoexceptExprClass:
  case Stmt::PackExpansionExprClass:
  case Stmt::SubstNonTypeTemplateParmPackExprClass:
  case Stmt::SEHTryStmtClass:
  case Stmt::SEHExceptStmtClass:
  case Stmt::SEHFinallyStmtClass:

Further, there are some AST expressions we handle, but don’t do a good job:

  // We don’t handle default arguments either yet, but we can fake it
  // for now by just skipping them.
  case Stmt::SubstNonTypeTemplateParmExprClass:
  case Stmt::CXXDefaultArgExprClass:

and support for C++ lambdas as they become real in Clang.
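
To make items (a) through (c) a little more concrete, here is a small sketch of the sequencing a CFG would have to represent: base-class constructor, members in declaration order, constructor body, then the reverse on destruction, here triggered by a throw (so CXXThrowExpr and CXXTryStmt are involved as well). The class names, event_log, and run_scope_with_throw are invented for illustration and do not come from the analyzer.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Records construction and destruction events in the order they happen.
std::vector<std::string> event_log;

struct Base {
  Base() { event_log.push_back("Base()"); }
  ~Base() { event_log.push_back("~Base()"); }
};

struct Member {
  explicit Member(std::string n) : name(std::move(n)) {
    event_log.push_back("Member(" + name + ")");
  }
  ~Member() { event_log.push_back("~Member(" + name + ")"); }
  std::string name;
};

struct Derived : Base {
  // Base is constructed first, then members in declaration order, then the body.
  Derived() : first("first"), second("second") { event_log.push_back("Derived()"); }
  ~Derived() { event_log.push_back("~Derived()"); }
  Member first;
  Member second;
};

// Builds a Derived inside a try block and throws, so the destructor chain
// runs during stack unwinding, before the handler body executes.
std::vector<std::string> run_scope_with_throw() {
  event_log.clear();
  try {
    Derived d;
    throw std::runtime_error("leaving scope early");
  } catch (const std::runtime_error &) {
    event_log.push_back("caught");
  }
  assert(event_log.back() == "caught");
  return event_log;
}
```

None of the destructor calls appear in the source text of run_scope_with_throw, which is why they have to be made explicit in the CFG (or inlined) before a path-sensitive checker can reason about RAII objects.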

Infrastructure is only part of the story; ultimately people want to
find bugs. Some possible checkers include:

(1) mismatched new/delete[], new[]/delete, or malloc() and delete, etc.

(2) productizing the invalid iterator checker

(3) making sure a destructor blows away everything a constructor
creates/initializes. This is a hard one, but could be REALLY useful
if done well. This could easily take up a good portion of your thesis
work, and would be interesting work to write about.

(4) Various checks for “Effective C++” rules.

(5) securely using std::string, i.e.
http://www.cert.org/archive/pdf/sd-bestpractices-strings060914.pdf

(6) CERT’s C++ secure coding standard,
https://www.securecoding.cert.org/confluence/pages/viewpage.action?pageId=637,
lots of potential checks here, not all of them specific to C++, but
general goodness.
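
As an illustration of checker idea (1), here is a toy runtime model of the bookkeeping such a checker needs: remember which allocator family produced a pointer and compare it with the family used to release it. This is not how the analyzer is implemented (it reasons symbolically over paths and has no such class); AllocKind, ReleaseResult, AllocTracker, and the function names are all invented for this sketch.

```cpp
#include <cassert>
#include <map>

enum class AllocKind { New, NewArray, Malloc };
enum class ReleaseResult { Ok, Mismatch, UnknownPointer };

class AllocTracker {
 public:
  // Remember which allocator family produced ptr.
  void record_alloc(const void *ptr, AllocKind kind) {
    assert(ptr != nullptr);
    live_[ptr] = kind;
  }

  // Check a release (delete, delete[], free) against the recorded family.
  ReleaseResult check_release(const void *ptr, AllocKind release_family) {
    auto it = live_.find(ptr);
    if (it == live_.end()) {
      return ReleaseResult::UnknownPointer;  // double release or foreign pointer
    }
    ReleaseResult result = (it->second == release_family)
                               ? ReleaseResult::Ok
                               : ReleaseResult::Mismatch;
    live_.erase(it);  // the pointer is no longer live either way
    return result;
  }

 private:
  std::map<const void *, AllocKind> live_;
};

// Models "int *a = new int[4]; delete a;" without executing the bad delete.
ReleaseResult model_array_new_scalar_delete() {
  AllocTracker tracker;
  int storage[4] = {0, 0, 0, 0};  // stand-in address; nothing is really allocated
  tracker.record_alloc(storage, AllocKind::NewArray);
  return tracker.check_release(storage, AllocKind::New);
}
```

A static checker would keep the same kind of per-pointer state, but attached to symbolic values along each explored path rather than to real addresses.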

Cheers,
Ted


cfe-dev mailing list
cfe-dev@cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev



– Jean-Daniel


Yes, indeed. The ownership attributes are currently supported by the experimental malloc checker.
However, we definitely could use annotations on a larger scale.

Anna.

Hi Ahmed,

As others pointed out, there is interest in this. The analyzer already recognizes attributes that extend its functionality, so when you speak about “annotations” it really depends on the scope of the work.

Designing a general annotation system is essentially language design. It’s a potentially worthy endeavor if it is well focused, but it is a big effort. Instead, the work on attributes has largely been demand-driven, focusing on simple but well-defined objectives. The downside of that approach is that it can be myopic, but it does result in progress. A reasonable trajectory is to provide such focused annotations and, when a more general solution arises, phase the old syntax out and replace it with the newer syntax. There are various rollout strategies for that approach.

Did you have a particular problem you were hoping to solve with annotations? On one extreme, one could have an annotation system powerful enough to define checkers, and on the other extreme you can have annotations to handle spot issues. There is plenty of interesting territory between those extremes.

Cheers,
Ted

One area where clang could use some work is in general infrastructure
for supporting more expressive annotations. The thread-safety
annotations, for example, can contain expressions that refer to
variables within the current lexical scope, e.g.

  void foo(MyList *list) EXCLUSIVE_LOCKS_REQUIRED(list->mu_) { }

I've been working on making sure that attributes which contain
expressions are handled properly by the rest of clang, e.g. that they
are late-parsed in classes, instantiated with templates, etc.

The thread safety annotations use GNU attributes, and are only applied
to variables and functions. I have not yet added such support for
C++11 attributes, which can appear in many more locations.
Comprehensive support for more expressive C++11 attributes would,
IMHO, be a good place to start for people interested in
attribute-based static analysis.

  -DeLesley
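
For readers unfamiliar with the thread-safety annotations, the following is a minimal sketch of how the EXCLUSIVE_LOCKS_REQUIRED(list->mu_) example above could be fleshed out. The macros expand to GNU-style attributes only under Clang and to nothing elsewhere. The Mutex wrapper, the layout of MyList, and the functions push_locked and push_twice are assumptions made for illustration, not code from clang or from the message above. Diagnostics would only be expected when building with Clang and -Wthread-safety.

```cpp
#include <cassert>
#include <mutex>

// Expand to thread-safety attributes under Clang, otherwise to nothing.
#if defined(__clang__)
#  define TS_ATTR(x) __attribute__((x))
#else
#  define TS_ATTR(x)
#endif

#define LOCKABLE TS_ATTR(lockable)
#define GUARDED_BY(x) TS_ATTR(guarded_by(x))
#define EXCLUSIVE_LOCK_FUNCTION(...) TS_ATTR(exclusive_lock_function(__VA_ARGS__))
#define UNLOCK_FUNCTION(...) TS_ATTR(unlock_function(__VA_ARGS__))
#define EXCLUSIVE_LOCKS_REQUIRED(...) TS_ATTR(exclusive_locks_required(__VA_ARGS__))
#define NO_THREAD_SAFETY_ANALYSIS TS_ATTR(no_thread_safety_analysis)

// Hypothetical annotated wrapper around std::mutex.
class LOCKABLE Mutex {
 public:
  void Lock() EXCLUSIVE_LOCK_FUNCTION() NO_THREAD_SAFETY_ANALYSIS;
  void Unlock() UNLOCK_FUNCTION() NO_THREAD_SAFETY_ANALYSIS;

 private:
  std::mutex mu_;
};

inline void Mutex::Lock() { mu_.lock(); }
inline void Mutex::Unlock() { mu_.unlock(); }

struct MyList {
  Mutex mu_;
  int size_ GUARDED_BY(mu_) = 0;  // may only be touched while mu_ is held
};

// The attribute argument is an expression over the parameter "list".
void push_locked(MyList *list) EXCLUSIVE_LOCKS_REQUIRED(list->mu_);
void push_locked(MyList *list) { ++list->size_; }

// Takes the lock, calls the annotated function twice, returns the new size.
int push_twice(MyList *list) {
  assert(list != nullptr);
  list->mu_.Lock();
  push_locked(list);
  push_locked(list);
  int n = list->size_;
  list->mu_.Unlock();
  return n;
}
```

Calling push_locked without first taking list->mu_ is the kind of mistake the annotation lets the compiler flag, and it is the expression argument (list->mu_) that needs the late-parsing support described above.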