Clang Analysis of several open source projects.

Hi.

In case anyone is interested, I ran the clang analyzer on several open
source projects (gcc, gdb, glib, ntp, openldap, openssl, postfix).

However, there are many issues found in most of those projects, which
are reasonably well known and widely used pieces of software. That
makes me wonder if there aren't just a lot of false positives here.

The resulting reports can be found here:

http://lbalbalba.freezoka.net/ccc-analyzer/

Regards,

John Smith

Interesting to look at - though no doubt it’ll take a while to work through them all. I just started having a glance at the results for GCC (where it lists the “null passed as a nonnull argument” first), and the first one doesn’t entirely make sense to me; the second one is passing null to strncmp, but with a length of 0. So perhaps the annotation is incorrect, and it should be nonnull only when the length is non-zero. I don’t know what annotations are used to mark up these properties, or whether they are sufficiently expressive to handle such a feature.
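
To illustrate the pattern (a minimal sketch, not the actual GCC code; my_strncmp is a hypothetical stand-in for a strncmp declaration annotated this way):

#include <stddef.h>

/* Hypothetical stand-in for strncmp, with both pointer arguments
 * marked unconditionally nonnull. */
extern int my_strncmp(const char *s1, const char *s2, size_t n)
    __attribute__((nonnull(1, 2)));

void example(const char *buf)
{
    /* With n == 0 neither pointer is ever dereferenced, yet the
     * unconditional annotation still yields a "null passed as a
     * nonnull argument" report here. */
    my_strncmp(buf, NULL, 0);
}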

Hi.

In case anyone is interested, I ran the clang analyzer on several open
source projects (gcc, gdb, glib, ntp, openldap, openssl, postfix).

However, there are many issues found in most of those projects, which
are reasonably well known and widely used pieces of software. That
makes me wonder if there aren't just a lot of false positives here.

The resulting reports can be found here:

http://lbalbalba.freezoka.net/ccc-analyzer/

Experience with static analysis says that almost all the issues will be false positives (at least in openssl).

Interesting to look at -

Thanks. I'm thinking about running ccc-analyzer on some more reasonably
widely used projects, but I don't really know which one to take on
next. I tried apache-httpd, but the results were so few I really didn't
think it was worthwhile to post them. :wink: I also tried samba, but
couldn't even get ./configure to run properly on my system; and neither
could the people on the mailing list and IRC channel that I contacted.

though no doubt it'll take a while to work through them all.

Yeah, that would take an immense amount of time. I think it may be
most worthwhile if people who are interested take a look at the
project they are most familiar with or have the most affinity
with, or at a bug class they have the most knowledge of. For example, the
issue 'Dereference of null pointer' seems to score pretty high on all
projects so far, so either this is *the* most common mistake made by C
developers, or this is an area where a lot of false positives are
generated.
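
For what it's worth, here is a minimal sketch of one common way such reports arise (made-up names, not from any of the scanned projects): a defensive check teaches the analyzer that NULL is possible, while the developer knows it is not:

#include <stddef.h>

extern int *lookup(int key);
extern void log_error(const char *msg);

/* The developer knows lookup() never returns NULL here, so the
 * logging branch is dead in practice; but the p == NULL test tells
 * the analyzer that NULL is possible, and it then reports a null
 * dereference at the assignment below. */
void use(int key)
{
    int *p = lookup(key);
    if (p == NULL)
        log_error("cannot happen");
    *p = 0;   /* "Dereference of null pointer" on the NULL path */
}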

I just started having a glance at the results for GCC (where it
lists the "null passed as a nonnull argument" first), and the first one doesn't
entirely make sense to me; the second one is passing null to strncmp, but
with a length of 0. So perhaps the annotation is incorrect, and it should be
nonnull only when the length is non-zero. I don't know what annotations are
used to mark up these properties, or whether they are sufficiently expressive
to handle such a feature.

Well, if you put it like that, it does indeed sound a little weird. It
makes sense that it should be nonnull only when the length is
non-zero... But then again, I'm not an expert on this subject.
:wink:

Regards,

John Smith.

Hi.

In case anyone is interested, I ran the clang analyzer on several open
source projects (gcc, gdb, glib, ntp, openldap, openssl, postfix).

However, there are many issues found in most of those projects, which
are reasonably well known and widely used pieces of software. That
makes me wonder if there aren't just a lot of false positives here.

The resulting reports can be found here:

http://lbalbalba.freezoka.net/ccc-analyzer/

Experience with static analysis says that almost all the issues will be false positives (at least in openssl).

e.g. http://lbalbalba.freezoka.net/ccc-analyzer/scan-build-openssl-1.0.0d/report-x3HkoT.html#EndPath is bad analysis (the branch stuff needs to understand bitmaps to fix it - hmm, that could be a fun project).

This is indeed the argument against static analysis that I hear from
developers. But if this is universally known to be true, then why
bother with static analysis in the first place? Isn't this part of the
project just a waste of time then?

Regards,

John Smith.

Thanks for examining that one. Part of the point of posting this is
that hopefully it will result in a better analyzer in the end, by
eliminating as many false positives as possible.

Perhaps a bug report could/should be filed for this one?...
:wink:

Regards,

John Smith.

e.g. http://lbalbalba.freezoka.net/ccc-analyzer/scan-build-openssl-1.0.0d/report-x3HkoT.html#EndPath is bad analysis (the branch stuff needs to understand bitmaps to fix it - hmm, that could be a fun project).

From only a cursory glance at the code, it looks like it’s parsing network traffic. Is it possible that the data is not in the correct format, and malicious/erroneous packets could be null? Or is it that some up-front validation was done, but the data wasn’t permanently converted at that time and was still used as a raw (though now verified correct) byte buffer?
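
As a guess at the general shape of the bitmask problem Ben mentions (a minimal sketch with made-up names, not the actual OpenSSL code): the two tests of the same flag bit below necessarily agree, but an engine that cannot constrain bitwise expressions may explore the impossible combination (first test false, second test true) and report a null dereference:

#include <stddef.h>

#define HAVE_DATA 0x01

extern unsigned char *get_buffer(void);

void process(unsigned flags)
{
    unsigned char *data = NULL;

    if (flags & HAVE_DATA)
        data = get_buffer();

    /* ... unrelated work ... */

    if (flags & HAVE_DATA)
        *data = 0;   /* only reachable when data was assigned above */
}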

  • David

Sometimes it finds a few real bugs. I think it has found 5 or 10 bugs in
ClamAV over the past few years.
The signal-to-noise ratio is quite low, though, and some reports require
careful analysis just to determine whether clang's annotated execution
path is possible at all.

The bugs it found were a few NULL dereferences, one division by zero, and a
few uses of uninitialized values.

Best regards,
--Edwin

Experience with static analysis says that almost all the issues will be
false positives (at least in openssl).

This is indeed the argument against static analysis that I hear from
developers. But if this is universally known to be true, then why
bother with static analysis in the first place? Isn't this part of the
project just a waste of time then?

Static analysis should be used during development, not after debugging is complete. That’s where the real value is.

Experience with static analysis says that almost all the issues will be
false positives (at least in openssl).

This is indeed the argument against static analysis that I hear from
developers. But if this is universally known to be true, then why
bother with static analysis in the first place? Isn't this part of the
project just a waste of time then?

We have used Coverity on RTEMS, and it found a few places
where we could have written clearer, easier-to-analyse code,
and a couple of real bugs.

Other places it flags are questionable. Telling you that strn*() is
better than the version without the 'n' is not so helpful.

I tried to run the clang analyzer on RTEMS as well, but the
cross-compiled nature of RTEMS got in the way too much and I had to give up.

I am interested. Any bug found by a program is better
than a bug found by a user.

Sorting out 50 real bugs from a few hundred analyzer results is vastly easier than finding them in 200,000 lines of code. The static analyzer is a tool (and a very useful one!), not a miracle. False positives can also point out code that's difficult to reason about and might be good to refactor.

  David

Experience with static analysis says that almost all the issues will be
false positives (at least in openssl).

This is indeed the argument against static analysis that I hear from
developers. But if this is universally known to be true, then why
bother with static analysis in the first place? Isn't this part of the
project just a waste of time then?

Regards,

John Smith.

Sorting out 50 real bugs from a few hundred analyzer results is vastly easier than finding them in 200,000 lines of code. The static analyzer is a tool (and a very useful one!), not a miracle. False positives can also point out code that's difficult to reason about and might be good to refactor.

Agreed. If an analyser has trouble reasoning
about a piece of code, you have to wonder
about that code. RTEMS reworked a lot of code between
doing static analysis and performing instruction-level
test coverage.

Hello,

Interesting to look at -

Thanks. I'm thinking about running ccc-analyzer on some more reasonably
widely used projects, but I don't really know which one to take on
next.

I would have a few suggestions:

- Lua
- libpng
- zlib
- freetype

And any of the other widely used open source libraries. Sure, they are fairly small,
but as they are widely used (I use them, at least ;-), finding bugs in them would be
very useful.

Oh, and while I'm at it: How about clang's sister, libc++?

Jonathan

Thanks for the suggestions, I'll look into those.

But my main point wasn't really finding bugs in the projects
themselves, but finding & fixing bugs in the analyzer (by decreasing
the potential for false positives).

Thanks and regards,

John Smith

Scanned them and added the results to: http://lbalbalba.freezoka.net/ccc-analyzer/
But they show very few issues, so I guess they're not all that
relevant to ccc-analyzer then?

Thanks John. That’s what I am hopeful for as well.

To make this exercise the most constructive, we need actual bug reports against the analyzer. Diagnosing a sea of reports, and complaining that there are too many false positives just really isn’t constructive or helpful on its own.

Typically the bug reports have the following characteristics:

a) a concise but precise diagnosis of what the analyzer isn’t reasoning about correctly

b) a test case, as a preprocessed file, that can be used later to reproduce the issue (also include the platform/arch you are on when filing the report)

The scan-build results are useful, but they ultimately lack the ability to be replayed in a debugger session, which is useful when debugging the analyzer. Typically, I have found three kinds of analyzer false positives:

  1. The analyzer doesn’t know about some higher-level program invariant that the developer knows about and is implicitly relying upon. The discussion there should be how to help the analyzer become more educated about such invariants. Sometimes the answer is interprocedural analysis, sometimes it’s annotations, etc. There are actual design tradeoffs to be made here in “fixing” these kinds of issues. In some cases, restructuring the original code to make it easier to reason about is the best answer (but that depends on who is voicing the opinion and on what codebase).

  2. The analyzer has an outright bug in handling specific edge cases. Typically these require a modest amount of change to the analyzer, but having a test case is really key to diagnosing these issues. These are honestly the easiest issues to fix.

  3. The analyzer has an algorithmic problem with reasoning about some code. For example, the analyzer doesn’t currently reason about bit fields. It also lacks the ability to reason about linear constraints (e.g., a + b > c). Some of these are known issues, others are not. Having concrete examples really helps (a sketch of the linear-constraint case follows this list).
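
For instance, a minimal sketch of the linear-constraint case (illustrative names, not taken from any of the scanned projects): the two checks together imply a + b < len, but a solver that cannot combine constraints over a, b, and len may still flag the access:

void fill(char *buf, unsigned a, unsigned b, unsigned len)
{
    if (a < len && b < len - a)
        buf[a + b] = 0;   /* in bounds: a + b < len follows from
                             the two checks above */
}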

Beyond filing static analyzer bug reports, it would also be great if anyone wanted to help with any of the following projects:

  1. Update or overhaul http://clang-analyzer.llvm.org to have more information about extracting maximum value from the analyzer.

  2. Making scan-build more awesome by making it more turnkey, or having a much better way of presenting analysis results. There’s a ton of stuff we could do here. I’m not a web developer, so scan-build’s HTML reports are what they are because I don’t have the expertise to make them better.

  3. Integrating the analyzer into other IDEs, such as Eclipse.

  4. Working on helping to make the analyzer’s precision better, or working on new checkers.

I really should document all of this on the clang-analyzer website.

Anyhow, thanks again everyone for running the analyzer on these projects. I do appreciate your level of enthusiasm; my main concern is channeling that enthusiasm in a way that has maximum value.

Cheers,
Ted

But my main point wasn't really finding bugs in the projects
themselves, but finding & fixing bugs in the analyzer (by decreasing
the potential for false positives).

Thanks John. That’s what I am hopeful for as well.

To make this exercise the most constructive, we need actual bug reports against the analyzer. Diagnosing a sea of reports, and complaining that there are too many false positives just really isn’t constructive or helpful on its own.

Typically the bug reports have the following characteristics:

a) a concise but precise diagnosis of what the analyzer isn’t reasoning about correctly

b) a test case, as a preprocessed file, that can be used later to reproduce the issue (also include the platform/arch you are on when filing the report)

One thing that could be very usefully added to the output of scan-build is exactly this preprocessed file, which is otherwise painful to prepare…

The scan-build results are useful, but they ultimately lack the ability to be replayed in a debugger session, which is useful when debugging the analyzer. Typically, I have found three kinds of analyzer false positives:

  1. The analyzer doesn’t know about some higher-level program invariant that the developer knows about and is implicitly relying upon. The discussion there should be how to help the analyzer become more educated about such invariants. Sometimes the answer is interprocedural analysis, sometimes it’s annotations, etc. There are actual design tradeoffs to be made here in “fixing” these kinds of issues. In some cases, restructuring the original code to make it easier to reason about is the best answer (but that depends on who is voicing the opinion and on what codebase).

I think restructuring the code is a perfectly valid thing to ask for - it would help if there were some body of knowledge about what kind of restructuring might help. Certainly in OpenSSL I often add unneeded initialisers simply to silence compiler warnings, for example.
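
A minimal sketch of that pattern (made-up names, not actual OpenSSL code): the caller guarantees at least one match in [0, n), so rc is always assigned, but the analyzer cannot see that invariant:

extern int is_match(int i);

int find_first(int n)
{
    int rc = -1;   /* unneeded initialiser, kept only to silence
                      the "possibly uninitialized" warning */
    int i;

    for (i = 0; i < n; i++) {
        if (is_match(i)) {
            rc = i;
            break;
        }
    }
    return rc;
}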

  2. The analyzer has an outright bug in handling specific edge cases. Typically these require a modest amount of change to the analyzer, but having a test case is really key to diagnosing these issues. These are honestly the easiest issues to fix.

  3. The analyzer has an algorithmic problem with reasoning about some code. For example, the analyzer doesn’t currently reason about bit fields. It also lacks the ability to reason about linear constraints (e.g., a + b > c). Some of these are known issues, others are not. Having concrete examples really helps.

Beyond filing static analyzer bug reports, it would also be great if anyone wanted to help with any of the following projects:

  1. Update or overhaul http://clang-analyzer.llvm.org to have more information about extracting maximum value from the analyzer.

  2. Making scan-build more awesome by making it more turnkey, or having a much better way of presenting analysis results. There’s a ton of stuff we could do here. I’m not a web developer, so scan-build’s HTML reports are what they are because I don’t have the expertise to make them better.

FWIW I think they’re pretty reasonable, but ways to make bug reporting cheaper would definitely be nice.

  3. Integrating the analyzer into other IDEs, such as Eclipse.

  4. Working on helping to make the analyzer’s precision better, or working on new checkers.

For anyone contemplating this: clang’s static analysis code is a pleasure to work with; I recommend the experience (and wish I had more spare time myself).

We can certainly provide this as a scan-build option. It’s probably a bit of perl script hackery, but it could be done. We wouldn’t want to do it all the time, as those preprocessed files can get big.

Hi Ted,

But my main point wasn't really finding bugs in the projects
themselves, but finding & fixing bugs in the analyzer (by decreasing
the potential for false positives).

Thanks John. That's what I am hopeful for as well.

To make this exercise the most constructive, we need actual bug reports against the analyzer. Diagnosing a sea of reports, and complaining that there are too many false positives just really isn't constructive or helpful on its own.

FreeBSD is continually running the analyzer on our code, which has already uncovered a lot of bugs. I have previously sifted through a lot of the bug reports, and a very large part of the false positives fall into the category of the analyzer not detecting that a function never returns, i.e. the interprocedural analysis (IPA) not being smart enough.

I created a bug report some time ago (http://llvm.org/bugs/show_bug.cgi?id=8914). I realize that fixing this is non-trivial, but it would be nice if the analyzer could at least handle the following:

int x;
if (foo())
    x = 5;
else
    exit(1);
bar(x);

without complaining that x might be uninitialized.
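
For project-local exit wrappers, an explicit noreturn annotation is one way to hand the analyzer that information today (a minimal sketch; my_exit is a made-up name, and exit() itself is already declared noreturn in <stdlib.h>):

extern int foo(void);
extern void bar(int);

/* Without the attribute, the analyzer assumes my_exit() can
 * return and then reports x as possibly uninitialized at bar(x). */
extern void my_exit(int code) __attribute__((noreturn));

void example(void)
{
    int x;

    if (foo())
        x = 5;
    else
        my_exit(1);

    bar(x);   /* with the annotation, every path reaching this
                 call has x initialized */
}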

Kind regards,
Erik