Clang Analyzer false positives with relations between variables

Erik_Cederstrand · October 4, 2012, 10:17am

I'm working my way through a full Clang Analyzer report of FreeBSD source code: http://scan.freebsd.your.org/freebsd-head/, fixing bugs in FreeBSD and reporting false positives.

First of all, someone at LLVM just shaved 1000 reports off the list since Sunday. Thanks!

A large group of the false positives falls into a single category: the analyzer doesn't sufficiently account for relations between variables. Here's an example from http://llvm.org/bugs/show_bug.cgi?id=13426:

int foo(int y, int z) {
    int x;
    if (y == z) {
        x = 0;
    } else {
        if (y != z) {
            x = 1;
        }
    }
    return x;
}

which warns that x may be uninitialized. Here's a more real-world example in FreeBSD: http://scan.freebsd.your.org/freebsd-head/usr.sbin.mtree/2012-09-30-amd64/report-KuXNHJ.html#EndPath

But according to the implementation of parsekey() (called at line 177) at http://svn.freebsd.org/base/head/usr.sbin/mtree/misc.c, "value" is always 1 when "type" is F_FLAGS. So val is always initialized at line 178, and the reported situation can never occur.

My question is how hard this would be to implement, at least starting with the simple example? Where would the code go in the LLVM tree?

Kind regards,
Erik

AnnaZaks · October 5, 2012, 11:51pm

The first step toward fixing the simple example would be to add alpha-renaming support to the analyzer's constraint manager. While performing symbolic execution of the program, we currently cannot record the fact that two symbols are equal (here, y == z), so even this simplified example will not work:

int foo(int y, int z, int *p) {
  int *x;
  if (y == z)
    x = 0;
  if (y == z)
    x = p;
  return *x; // False positive: null pointer dereference reported.
}
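To make the idea of "recording that two symbols are equal" a bit more concrete, here is a toy sketch in C. It is not how the analyzer's constraint manager is implemented; the names EqClasses, eq_init, eq_find, eq_assume_equal and eq_known_equal are made up for illustration. Symbols are plain integer ids, and a union-find structure groups the ones assumed equal on the current path, so that a second test of the same condition can be answered from the recorded state.

```c
#include <assert.h>

/* Toy model only (hypothetical names): symbols are small integer ids,
 * and a union-find structure groups symbols known to be equal on the
 * current path. */
enum { MAX_SYMS = 16 };

typedef struct {
    int parent[MAX_SYMS];
} EqClasses;

/* Start with every symbol alone in its own class. */
void eq_init(EqClasses *ec) {
    for (int i = 0; i < MAX_SYMS; i++)
        ec->parent[i] = i;
}

/* Return the representative of the class containing sym. */
int eq_find(EqClasses *ec, int sym) {
    while (ec->parent[sym] != sym) {
        ec->parent[sym] = ec->parent[ec->parent[sym]]; /* path halving */
        sym = ec->parent[sym];
    }
    return sym;
}

/* Record the assumption "a == b", e.g. on the true branch of if (a == b). */
void eq_assume_equal(EqClasses *ec, int a, int b) {
    int ra = eq_find(ec, a);
    int rb = eq_find(ec, b);
    ec->parent[ra] = rb;
}

/* Return 1 if "a == b" is already recorded on this path, 0 if unknown. */
int eq_known_equal(EqClasses *ec, int a, int b) {
    return eq_find(ec, a) == eq_find(ec, b);
}
```

With y and z as symbols 0 and 1, calling eq_assume_equal(&ec, 0, 1) on the first true branch lets eq_known_equal(&ec, 0, 1) answer the second "y == z" without splitting the path again. The sketch only tracks equalities; the original example would also need the negated fact (y != z) on the else branch.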

Alpha-renaming support alone would not guarantee that the second example is solved. For example, it looks like the 'parsekey()' function is defined in a separate translation unit, and the analyzer is not yet capable of reasoning across translation unit boundaries.

One could argue that the dependency between parsekey()'s return values has to be recorded by the programmer. Without a better mechanism, an assert could be helpful.
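As a rough sketch of the assert approach, assuming a simplified stand-in for the real code: lookup_key(), handle_key(), KEY_MODE and KEY_FLAGS below are hypothetical names, not the mtree sources. The assert states the relation between the two results of the lookup at the call site. Because a failing assert does not return, the analyzer should stop exploring the path on which the key is KEY_FLAGS but no value was read.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical key types; the real mtree code defines its own flags. */
enum { KEY_MODE = 1, KEY_FLAGS = 2 };

/* Hypothetical stand-in for a lookup that would normally live in another
 * translation unit. Contract: *needvalue is 1 whenever KEY_FLAGS is
 * returned. */
int lookup_key(const char *name, int *needvalue) {
    if (name[0] == 'f') {
        *needvalue = 1;
        return KEY_FLAGS;
    }
    *needvalue = 0;
    return KEY_MODE;
}

/* Returns the length of the value for KEY_FLAGS, 0 for other keys,
 * and -1 if a required value is missing. */
int handle_key(const char *name, const char *arg) {
    int needvalue;
    const char *val; /* assigned only when needvalue is set */
    int type = lookup_key(name, &needvalue);

    if (needvalue) {
        if (arg == NULL)
            return -1;
        val = arg;
    }

    /* Record the relation the analyzer cannot see across files:
     * a KEY_FLAGS result always comes with needvalue set. */
    assert(type != KEY_FLAGS || needvalue);

    if (type == KEY_FLAGS)
        return (int)strlen(val); /* val was assigned on this path */
    return 0;
}
```

Note that the assert only helps while it is visible to the analyzer, so the analysis has to run without NDEBUG defined.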

Cheers,
Anna.

Erik_Cederstrand · October 9, 2012, 6:28pm

Thanks for the explanation. It's a bit over my head to implement but nice to know what's going on.

I'll have a look at it again.

Erik
