[patch] Add an attribute to reduce spurious memory leak warnings

The static analyser has no concept that ownership of memory may pass elsewhere when a pointer is passed to certain functions. So, I added an attribute, provisionally called analyzer_free, to tell the analyzer to treat a function argument as ‘free’ for the purposes of malloc analysis. Effectively this makes the memory somebody else’s problem.


clang-attribute_free.patch (11.4 KB)

Hi Andrew,

I think this is a promising start, but I don't think it's a general solution on it's own for malloc/free checking, so I don't want it going in as is. I'll explain!

The problem with malloc/free checking is that code frequently creates wrappers around these functions. malloc/free are also so pervasive that I am doubtful that users would adopt this attribute extensively without a little more care in its design. I'm not opposed to such annotations, but we should carefully consider how they would be used and assess their potential adoption. Adding an attribute is also a language change that we shouldn't take lightly.

My general feeling is that without inter-procedural analysis or better summaries of the called functions, if we really want the malloc/free checker to be turned on by default we probably should always assume that a pointer could escape and possibly might be freed. This means that we may miss many potential leaks, but we won't flag spurious leaks or use-after-free errors. Thus the default behavior should be that we assume the program is *correct* unless we have strong evidence otherwise.

Annotations, however, could be used to educate the checker about deviating from such default assumptions. For example, an annotation could indicate that a function acts like a call to "free", that it holds onto the memory and frees it later (meaning that it can still be used after the call, but it won't be leaked), or an annotation indicating that pointer value is not held on in any way (making the call essentially a no-op from the perspective of checking malloc/free). Such annotations would allow the checker to make strong assumptions, but that should only be to override the conservative defaults.

In my graduate studies I actually investigated adding "ownership" annotations that could be applied to the return values of functions as well as their arguments. One set of annotations I looked at were the following:

  returns ownership
  not-returns ownership

  claims ownership
  not-claims ownership

and then possibly a fifth category for "claims but still can be used". For example:

  p = malloc();

In this case "malloc()" essentially has the "returns ownership" annotation and "free()" has the "claims ownership" annotation. With these annotations, we actually can check malloc/free (and similar properties) pretty well in a local manner (particularly for C code). The problem with these annotations is that users shouldn't be burdened with coming up with these annotations. It turns out that we can often infer them using statistical inference (in the absence of having precise function summaries) but that's somewhat beyond the scope of the analyzer's current workflow.

I think a set of annotations could be useful here for doing the malloc/free analysis well. We'd like to tie them together with either similar naming or grammar, and ideally allow them to be "tagged" attributes so we don't have to embed in the compiler knowledge of all the malloc-like APIs out there. E.g, something like:

  __attribute__((checker_ownership_claims_owned, "malloc"))__

Which is basically your idea of 'analyzer_free' except that it is more general, and ties it specifically to a "malloc" analysis. Conceptually if we also wanted to model 'fopen'/'fclose', etc. we could have:

  __attribute__((checker_ownership_returns_owned, "FILE*"))__

Note the string is just a "tag" that the checkers could cue from. In this way, the malloc/free and fopen/fclose checker could actually be largely unified, and then cue off the appropriate tagged attributes to make smarter decisions.

This is probably a longer response than you expected. My high-level point is that if we want to add attributes to enhance this and similar checkers, we should design it well with a clear and cohesive design of how it impacts the language as well as how it would impact the checker, especially since I think there is something more general here to think about (especially since we're possibly thinking of a set of annotations). On its own, I don't think the change in this patch should go in until its evolved with that process. Does that make sense?