bug with NonNullParamChecker?

Hi,

As I was testing NonNullParamChecker this afternoon, I ran into this troubling example:

// --------- example 1 ------------
void *getNull() {
  return 0;
}

void check(void *p) __attribute__(( nonnull ));
void check(void *p) {
}

int main(int argc, char **argv) {
  void *p = getNull();
  check(p);
  return 0;
}
// --------------------------------

This code gives no warning on the versions of clang that I could test:
- Apple LLVM version 4.2 (clang-425.0.28) (based on LLVM 3.2svn)
- clang version 3.3 (trunk 180768)
- clang version 3.3 (trunk 180907) (llvm/trunk 180768)

To get an error, I have to replace p = getNull() with p = 0.

First I was tempted to think it was just a limitation of the core analyzer, but
1) I obtain an error with a similar example where the nonnull attribute is replaced by a division by zero (see example 2 at the end)

2) I stepped through NonNullParamChecker.cpp in a debugger. I am very new to this codebase, but it seems that a report is actually emitted (lines 119-139). It then never shows up, for some reason.

Is this a bug? If not, how can we improve this checker?

Thanks!

Hi, Mathieu. You’ve run into one of the analyzer’s haziest features: false positive suppression based on return values. To put it simply, if there’s a null pointer dereference, and it turns out the null pointer came from an inlined function, we suppress the warning.

Why? Well, for better or for worse the analyzer does not have perfect information. Consider an example like this:

targetMap[targetName]->process(input);

Seems harmless, right? Well, what does the map’s operator[] look like?

Target *getItem(StringRef name) {
  if (map contains a target for name)
    return it;
  return NULL;
}

Most likely, the analyzer doesn’t have perfect knowledge about the contents of the map, so it assumes both branches can be taken—which is totally reasonable! However, the caller could very well have some extra knowledge: the targetName is checked at the beginning of the program, and so the map lookup will never fail at this point.

Usually, the response to this sort of bug is to add an assertion, but in this case you wouldn’t be able to do that without introducing a temporary variable and breaking apart the original expression.
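To make the assertion workaround concrete, here is a minimal C sketch of the same situation. Everything in it (the fixed table standing in for the map, and the names get_item, process, run_unchecked, run_asserted) is hypothetical and only illustrates the shape of the problem; it is not code from LLVM.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct target {
  const char *name;
  int inputs_seen;
};

/* Hypothetical stand-in for the map: a small fixed table. */
static struct target target_table[] = {{"alpha", 0}, {"beta", 0}};

/* Like getItem above: returns NULL when no entry matches. */
struct target *get_item(const char *name) {
  size_t count = sizeof target_table / sizeof target_table[0];
  for (size_t i = 0; i < count; i++) {
    if (strcmp(target_table[i].name, name) == 0)
      return &target_table[i];
  }
  return NULL;
}

void process(struct target *t, int input) {
  t->inputs_seen += input; /* dereferences t */
}

/* One-expression form: there is no place to state that the lookup
   cannot fail for this caller. */
void run_unchecked(const char *name, int input) {
  process(get_item(name), input);
}

/* Asserted form: the expression has to be split around a temporary
   so that the caller's extra knowledge can be written down. */
void run_asserted(const char *name, int input) {
  struct target *t = get_item(name);
  assert(t != NULL && "name was validated earlier");
  process(t, input);
}
```

The asserted form is what the analyzer would like to see, but it costs an extra variable and an extra statement at every such call site.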

When we first turned on C++ inlining, this sort of example resulted in hundreds of false positives on the LLVM code base. It did find some true positives, such as unchecked calls to dyn_cast, but in general it was just not compatible with the way people write code. A lot of times, generic functions have to handle error cases that the caller knows won’t happen, but would find annoying or difficult to actually assert() about. We’ve found that a lot of these cases correspond to a null value being returned, and suppressing on that basis is entirely a heuristic.

The false positive suppression heuristics can definitely be improved, but in general false negatives are better for the analyzer than false positives. The former reduce the number of bugs caught and confidence in the tool, but the latter result in people turning off a checker wholesale or not running the analyzer at all.

Anyway, thanks for the report, and if you come up with any “counter-suppression” ideas, please send them to the list!

Jordan

Hi Jordan,

Thanks for your detailed answer. I finally found the configuration option: clang -cc1 -analyze -analyzer-checker=core -analyzer-config suppress-null-return-paths=false …

On a slightly different topic, has anyone ever considered using the core of clang-analyzer to bring more fine-grained annotations to life, in the spirit of Findbugs for Java?

http://findbugs.sourceforge.net/manual/annotations.html

From this page
http://clang-analyzer.llvm.org/annotations.html
I understand that we can already attach attributes to various syntactic elements (not sure exactly which ones yet, though).

Is there any reason to believe that this project would require more than just writing a new checker?

– Mathieu

Hi list,

Is anyone in the position to offer some insight on the question (below) about possibly using Clang analyzer for more “contract-based” null-pointer analyses?

Hi, Mathieu…sorry for the delay. There’s no reason why a new checker wouldn’t be able to check new kinds of annotations, but the annotations it supports today are the same attributes provided by the Clang compiler. Now, we can always add attributes to the compiler, but that’s not something we usually want to do lightly.

Anna, Ted, and I have kicked around the idea of a more generic ‘analyzer_annotate’ attribute, which could make one-off analyzer language extensions cheaper, but we didn’t quite finish designing it. Does this sound like the sort of thing you’re asking about? This would require modifying Clang’s attribute parser to support this generic attribute and provide convenient access to its possible fields for any checkers.

(Meanwhile, we’ve hacked some things together using the existing string-only ‘annotate’ attribute; see IvarInvalidationChecker. You could use this to prototype your checker if you didn’t want to work on the general attribute problem just yet.)
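As a rough illustration of the declaration side of such a prototype, the sketch below tags a function and a parameter with the string-only annotate attribute. The marker strings ("proto_nullable", "proto_nonnull") and the macro and function names are made up for this example; a checker would still have to be written to look for those strings, and that part is not shown. The macros expand to nothing outside Clang, on the assumption that other compilers may not accept annotate.

```c
#include <stddef.h>

/* Hypothetical marker macros built on the existing annotate attribute. */
#if defined(__clang__)
#define PROTO_NULLABLE __attribute__((annotate("proto_nullable")))
#define PROTO_NONNULL  __attribute__((annotate("proto_nonnull")))
#else
#define PROTO_NULLABLE
#define PROTO_NONNULL
#endif

struct node {
  int key;
  struct node *next;
};

/* The function is tagged: its result may be NULL.
   The out-parameter is tagged: callers must pass a valid pointer. */
PROTO_NULLABLE struct node *find_node(struct node *head, int key,
                                      int *found PROTO_NONNULL);

struct node *find_node(struct node *head, int key, int *found) {
  for (struct node *n = head; n != NULL; n = n->next) {
    if (n->key == key) {
      *found = 1;
      return n;
    }
  }
  *found = 0;
  return NULL;
}
```

Note that this attaches the markers to declarations only; it does not by itself give a way to attach them to types.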

In this particular case, something else we’ve thought of is an attribute that stipulates that a function return value is always non-null, and another requiring that the result is checked. These also haven’t been fully designed yet (are they inherited? do they piggyback on existing attributes like ‘nonnull’, or do they get their own syntax?).

Jordan

Hi Jordan,

Thanks for your answer. My comments below.

– Mathieu

> Hi, Mathieu…sorry for the delay. There’s no reason why a new checker wouldn’t be able to check new kinds of annotations, but the annotations it supports today are the same attributes provided by the Clang compiler. Now, we can always add attributes to the compiler, but that’s not something we usually want to do lightly.

Excellent. Sure!

> Anna, Ted, and I have kicked around the idea of a more generic ‘analyzer_annotate’ attribute, which could make one-off analyzer language extensions cheaper, but we didn’t quite finish designing it. Does this sound like the sort of thing you’re asking about? This would require modifying Clang’s attribute parser to support this generic attribute and provide convenient access to its possible fields for any checkers.

Yes, this is totally what I am interested in.

Ideally (some of) these custom attributes should be attachable to types so that we can annotate structures in depth and share annotations between function declarations.

See example below.

> (Meanwhile, we’ve hacked some things together using the existing string-only ‘annotate’ attribute; see IvarInvalidationChecker. You could use this to prototype your checker if you didn’t want to work on the general attribute problem just yet.)
>
> In this particular case, something else we’ve thought of is an attribute that stipulates that a function return value is always non-null, and another requiring that the result is checked. These also haven’t been fully designed yet (are they inherited? do they piggyback on existing attributes like ‘nonnull’, or do they get their own syntax?).

As far as semantics are concerned, a solution that comes to my mind is to hack the usual 4-value lattice with a 5th “magic” value duplicating the “absurd” state. This 5th value would propagate, but otherwise would not be produced by unification.

        nullable (check required)
         /    \
        /      \
     null      nonnull
        \      /
         \    /
        absurd (unreachable)
        magic (don’t care) ← actually the default in the absence of annotation

// merging values from possible paths
union(null, nonnull) = nullable // one of the paths explicitly nulled the value, so we can’t feed it to a function requiring a nonnull value anymore
union(magic, null) = null // idem

union(magic, nonnull) = nonnull // ignore the magic value and pretend we know it is nonnull (strange but I could not find any problematic example)

// unifying values
intersection(null, nonnull) = absurd // cut the path
intersection(magic, nonnull) = magic // continue to ignore this runtime value
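The rules above can be sketched as two small C functions. Only the five cases listed above come from the proposal; the bit encoding, the names, and every other case are assumptions made to get a total function. In particular, this sketch assumes that union(magic, x) = x for any reachable x, that union(magic, absurd) = magic, that absurd wins any intersection, and that magic otherwise propagates through intersection.

```c
/* Encoding is an assumption: the four ordinary values are bit sets,
   so union is bitwise OR and intersection is bitwise AND. */
enum nullness {
  NV_ABSURD   = 0, /* unreachable */
  NV_NULL     = 1,
  NV_NONNULL  = 2,
  NV_NULLABLE = 3, /* NV_NULL | NV_NONNULL: check required */
  NV_MAGIC    = 4  /* don't care: default without annotation */
};

/* Merging values from possible paths. */
enum nullness nullness_union(enum nullness a, enum nullness b) {
  /* magic is ignored whenever the other side carries information
     (assumed: it survives only against absurd or magic). */
  if (a == NV_MAGIC)
    return b == NV_ABSURD ? NV_MAGIC : b;
  if (b == NV_MAGIC)
    return a == NV_ABSURD ? NV_MAGIC : a;
  return (enum nullness)(a | b);
}

/* Unifying values (e.g. applying a constraint on a path). */
enum nullness nullness_intersection(enum nullness a, enum nullness b) {
  /* Assumed: an unreachable path stays unreachable. */
  if (a == NV_ABSURD || b == NV_ABSURD)
    return NV_ABSURD;
  /* magic propagates: keep ignoring this runtime value. */
  if (a == NV_MAGIC || b == NV_MAGIC)
    return NV_MAGIC;
  return (enum nullness)(a & b);
}
```

With this encoding, magic is never produced by intersection unless one of the inputs is already magic, which matches the requirement stated above.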

Here is one motivating example using this vocabulary as type annotations.

struct list;
typedef struct list *nullable list_t; // activate null-value checking

struct list {
  int key;
  struct blob value;
  list_t next; // here as well
};

int buggy_length(list_t l) {
  int x = 1;
  while (l->next) { // boom
    x++;
    l = l->next;
  }
  return x;
}

list_t buggy_next(struct list st) {
  return st.next; // re-boom
}

int find(list_t l, int value, struct blob *nonnull result) {
  // some code to find a node in the list and write it to *result
}
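For comparison, here is a sketch in plain C (without the proposed nullable/nonnull qualifiers, which do not exist in this form) of the length function written the way such a checker would presumably want: the pointer is tested before each dereference. The name checked_length and the trimmed-down struct are illustrative only.

```c
#include <stddef.h>

struct list {
  int key;
  struct list *next; /* may be NULL: marks the end of the list */
};

/* Counts the nodes; an empty list (NULL) has length 0. */
int checked_length(const struct list *l) {
  int x = 0;
  while (l != NULL) { /* the check a nullable annotation would require */
    x++;
    l = l->next;
  }
  return x;
}
```

Unlike buggy_length, this version never evaluates l->next on a pointer that has not been tested on the current path.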