Automatic scan-build on the LLVM toolchain

Hello,

After setting an automatic code coverage tool [1], I just plugged an
automatic scan-build on the LLVM toolchain:

http://buildd-clang.debian.net/scan-build/

Note that:
* polly is currently not analyzed. I had to disable the build until the
new version of cloog is available.
* compiler-rt is not analyzed because it does not respect the CC/CXX
argument to use the clang built locally [3]. Should I report a bug here?

This work is done through the Debian/LLVM jenkins instance [2].
The report is updated twice a day.
The scan-build/clang used to produce the report is the one published on
http://llvm.org/apt/. That means that a new feature or fix in
scan-build will show up in the report only about a day later. As a side
effect, it automatically tests the packages distributed on llvm.org/apt/.

Sylvestre

[1] http://buildd-clang.debian.net/coverage/

[2] http://llvm-jenkins.debian.net/

[3] Example:
/tmp/buildd/llvm-toolchain-snapshot-3.4~svn185728/build-llvm/Release/bin/clang
-fno-exceptions -fPIC -funwind-tables
-I/tmp/buildd/llvm-toolchain-snapshot-3.4~svn185728/projects/compiler-rt/lib
-I/tmp/buildd/llvm-toolchain-snapshot-3.4~svn185728/projects/compiler-rt/include
-Wall -Werror -O3 -fomit-frame-pointer -m64 -fPIE -fno-builtin
-gline-tables-only -fno-rtti -DASAN_FLEXIBLE_MAPPING_AND_OFFSET=1 -c -o
/tmp/buildd/llvm-toolchain-snapshot-3.4~svn185728/build-llvm/tools/clang/runtime/compiler-rt/clang_linux/asan-x86_64/x86_64/SubDir.lib__sanitizer_common/sanitizer_symbolizer_itanium.o
/tmp/buildd/llvm-toolchain-snapshot-3.4~svn185728/projects/compiler-rt/lib/sanitizer_common/sanitizer_symbolizer_itanium.cc

Hello,

After setting an automatic code coverage tool [1], I just plugged an
automatic scan-build on the LLVM toolchain:

http://buildd-clang.debian.net/scan-build/

Note that:

  • polly is currently not analyzed. I had to disable the build until the
    new version of cloog is available.
  • compiler-rt is not analyzed because it does not respect the CC/CXX
    argument to use the clang built locally [3]. Should I report a bug here?

I’m not sure what bug to report here, if any: compiler-rt deliberately uses the just-built clang because it has to be in lock-step with it.

This would be one of those times a compilation database, emitted by clang itself or the build system, would be a better solution than the scan-build style interception.
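To make the compilation database idea concrete, here is a minimal sketch in Python. It assumes the build can emit a compile_commands.json in the usual JSON compilation database layout ("directory", "file", and either "arguments" or "command"). The helper names `analyzer_commands` and `load_database` are made up for illustration, and only the separate `-o <file>` spelling is handled. Each recorded compile is turned into a `clang --analyze` invocation, so nothing depends on intercepting CC/CXX.

```python
import json
import shlex


def analyzer_commands(entries, analyzer="clang"):
    """Turn compilation-database entries into `clang --analyze` invocations.

    Returns a list of (directory, argv) pairs. The original compiler
    (argv[0]) is replaced by `analyzer`, and "-c" / "-o <file>" are dropped
    because no object file is wanted.
    """
    jobs = []
    for entry in entries:
        argv = entry.get("arguments") or shlex.split(entry["command"])
        kept = []
        skip_next = False
        for arg in argv[1:]:  # argv[0] is the compiler used by the build
            if skip_next:
                skip_next = False
                continue
            if arg == "-o":  # drop the flag and the output path after it
                skip_next = True
                continue
            if arg == "-c":
                continue
            kept.append(arg)
        jobs.append((entry["directory"], [analyzer, "--analyze"] + kept))
    return jobs


def load_database(path):
    """Read a compile_commands.json file and return its list of entries."""
    with open(path) as handle:
        return json.load(handle)
```

Each (directory, argv) pair could then be run with something like subprocess.run(argv, cwd=directory). The analyzer binary can be any clang, including the just-built one, which is the point for compiler-rt.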

This work is done through the Debian/LLVM jenkins instance [2].
The report is updated twice a day.
The scan-build/clang used to produce the report is the one published on
http://llvm.org/apt/. That means that a new feature or fix in
scan-build will show up in the report only about a day later. As a side
effect, it automatically tests the packages distributed on llvm.org/apt/.

Is there any reason this can’t be a two-stage process that uses the previous build to analyze the next?

Hello,

After setting an automatic code coverage tool [1], I just plugged an
automatic scan-build on the LLVM toolchain:

http://buildd-clang.debian.net/scan-build/

Note that:
* polly is currently not analyzed. I had to disable the build until the
new version of cloog is available.
* compiler-rt is not analyzed because it does not respect the CC/CXX
argument to use the clang built locally [3]. Should I report a bug here?

I'm not sure what bug to report here, if any: compiler-rt deliberately
uses the just-built clang because it has to be in lock-step with it.

Or a wish to have compiler-rt checked with scan-build (while still using the
just-built clang).

This work is done through the Debian/LLVM jenkins instance [2].
The report is updated twice a day.
The scan-build/clang used to produce the report is the one published on
http://llvm.org/apt/. That means that a new feature or fix in
scan-build will show up in the report only about a day later. As a side
effect, it automatically tests the packages distributed on llvm.org/apt/.

Is there any reason this can't be a two-stage process that uses the
previous build to analyze the next?

For two main reasons:
* it is faster (I don't have to build clang during this process)
* it is easier in the current process of building the package.

However, I am planning to do it at some point.

Sylvestre

Hello,

After setting an automatic code coverage tool [1], I just plugged an
automatic scan-build on the LLVM toolchain:

http://buildd-clang.debian.net/scan-build/

Note that:

  • polly is currently not analyzed. I had to disable the build until the
    new version of cloog is available.
  • compiler-rt is not analyzed because it does not respect the CC/CXX
    argument to use the clang built locally [3]. Should I report a bug here?

I’m not sure what bug to report here, if any: compiler-rt deliberately
uses the just-built clang because it has to be in lock-step with it.
Or a wish to have compiler-rt checked with scan-build (while still using the
just-built clang).

I’m not entirely clear on how scan-build works, but I assumed it redefines CC and CXX to run the analyzer and then run the old values of CC and CXX, so I’m not sure how the build system for compiler-rt would gracefully support that. The build wants to set CC and CXX to point to the just-built clang, which overrides the values set by scan-build, losing the interception.
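A toy model of that, in Python, in case it helps the discussion. This is only a sketch of my understanding: scan-build points CC/CXX at its wrappers (ccc-analyzer and c++-analyzer), and the wrappers look up the real compiler separately (the CCC_CC/CCC_CXX names here should be treated as an assumption). The function names are hypothetical, and environments are modelled as plain dicts.

```python
def scan_build_env(env, real_cc="cc", real_cxx="c++"):
    """Model scan-build: CC/CXX become analyzer wrappers.

    The real compilers are remembered in CCC_CC/CCC_CXX (assumed names)
    so the wrapper can still produce object files.
    """
    wrapped = dict(env)
    wrapped["CCC_CC"] = real_cc
    wrapped["CCC_CXX"] = real_cxx
    wrapped["CC"] = "ccc-analyzer"
    wrapped["CXX"] = "c++-analyzer"
    return wrapped


def compiler_rt_step_env(env, just_built_clang):
    """Model a build step that forces the just-built clang (hypothetical)."""
    forced = dict(env)
    forced["CC"] = just_built_clang
    forced["CXX"] = just_built_clang + "++"
    return forced


def is_intercepted(env):
    """True while compiles still go through the analyzer wrappers."""
    return env.get("CC") == "ccc-analyzer" and env.get("CXX") == "c++-analyzer"
```

In this model the outer build is intercepted, but the environment returned by compiler_rt_step_env is not: the wrapper settings are simply overwritten, so those files compile fine and are never analyzed.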

Sylvestre Ledru wrote

Hello,

After setting an automatic code coverage tool [1], I just plugged an
automatic scan-build on the LLVM toolchain:

http://buildd-clang.debian.net/scan-build/

Good job. I see my work here is done. ;)

- John Smith.

Thanks for setting this up, Sylvestre! And John, your work running scan-build on all sorts of projects (not just LLVM) has been very helpful. :)

Jordan

Sylvestre Ledru wrote

Hello,

After setting an automatic code coverage tool [1], I just plugged an
automatic scan-build on the LLVM toolchain:

http://buildd-clang.debian.net/scan-build/

So... this seems like something I could contribute. Small, isolated
fix with an automated way of knowing when I've fixed it.

However, I don't want to duplicate work being done by someone else as
I've just subscribed to the list. Are these being turned
automatically into bug tickets? Do I just "claim" some of these and
then submit a patch?

Thanks.

By the way, would it be possible for me to have a copy of the script
that compiles and generates the report? I have a shell script that
basically does that, but if this is a buildbot module or something
similar, I'm interested in using it for another project.

Regards,

John Smith.

I see no one has answered this one yet, so I'll have a go here:

I doubt the report is 'automagically' turned into bug reports, or that
it would even be desirable (if it is possible at all, it would certainly
require more skills than I have). Part of the process is to determine if you
are dealing with a genuine bug or a false positive. And as icing on
the cake, if it is a false positive, maybe even a modification of the
checker to prevent it from generating similar false positives in the
future.

I guess the best way to go would be announcing on this list that you're
looking into a certain class of bugs in the report and asking if anyone
else is doing that already. If no one answers, I assume it would be
safe to claim the bug(s) and start working.

Regards,

John Smith.

Sylvestre Ledru wrote

Hello,

After setting an automatic code coverage tool [1], I just plugged an
automatic scan-build on the LLVM toolchain:

http://buildd-clang.debian.net/scan-build/

So... this seems like something I could contribute. Small, isolated
fix with an automated way of knowing when I've fixed it.

However, I don't want to duplicate work being done by someone else as
I've just subscribed to the list. Are these being turned
automatically into bug tickets? Do I just "claim" some of these and
then submit a patch?

I see no one has answered this one yet, so I'll have a go here:

I doubt the report is 'automagically' turned into bug reports, or that
it would even be desirable (if it is possible at all, it would certainly
require more skills than I have). Part of the process is to determine if you
are dealing with a genuine bug or a false positive. And as icing on
the cake, if it is a false positive, maybe even a modification of the
checker to prevent it from generating similar false positives in the
future.

The nice part about bugs in the LLVM database is that they would
either be bugs in the code (true positives) or bugs in the analyzer
(false positives) - so either way the bug could be used to track the
resolution. (assuming that there's no "acceptable false positive"
without some kind of suppression mechanism, which seems like a
reasonable goal - if a project can't hold itself at/towards "clean"
that seems like a problem)

So I think it might be worth considering an auto-filing system, if
anyone wanted to spend the time to do so. One issue would be figuring
out how to avoid filing the same issues again on future runs, and maybe
even how to detect when they've been fixed. That would be hard: the
code might change so that it no longer triggers the warning, but for a
false positive that doesn't mean the bug in the analyzer has been
fixed. Still, if no one has looked at or resolved the bug, maybe it's
worth just resolving it as no-repro and moving on.
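As a rough illustration of the "don't file the same issue twice" part, here is a small Python sketch. The issue fields ("checker", "file", "function", "message", "line") are hypothetical and do not reflect the actual scan-build report format. The idea is just to fingerprint each issue without its line number, so that unrelated edits that shift code around do not make an old issue look new.

```python
import hashlib


def fingerprint(issue):
    """Hash the parts of an issue that tend to survive unrelated edits.

    The line number is deliberately left out of the key.
    """
    key = "|".join(
        [issue["checker"], issue["file"], issue["function"], issue["message"]]
    )
    return hashlib.sha1(key.encode("utf-8")).hexdigest()


def issues_to_file(issues, already_filed):
    """Return the issues whose fingerprint is not in `already_filed`.

    Duplicates within the same run are also collapsed to one entry.
    """
    seen = set(already_filed)
    fresh = []
    for issue in issues:
        digest = fingerprint(issue)
        if digest not in seen:
            seen.add(digest)
            fresh.append(issue)
    return fresh
```

The set of filed fingerprints would have to be stored somewhere between runs. This does nothing for the harder case above, where the code changes enough that a false positive disappears without the analyzer being fixed.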

Sylvestre Ledru wrote

Hello,

After setting an automatic code coverage tool [1], I just plugged an
automatic scan-build on the LLVM toolchain:

http://buildd-clang.debian.net/scan-build/

So… this seems like something I could contribute. Small, isolated
fix with an automated way of knowing when I’ve fixed it.

However, I don’t want to duplicate work being done by someone else as
I’ve just subscribed to the list. Are these being turned
automatically into bug tickets? Do I just “claim” some of these and
then submit a patch?

I see no one has answered this one yet, so I’ll have a go here:

I doubt the report is ‘automagically’ turned into bug reports, or that
it would even be desirable (if it is possible at all, it would certainly
require more skills than I have). Part of the process is to determine if you
are dealing with a genuine bug or a false positive. And as icing on
the cake, if it is a false positive, maybe even a modification of the
checker to prevent it from generating similar false positives in the
future.

The nice part about bugs in the LLVM database is that they would
either be bugs in the code (true positives) or bugs in the analyzer
(false positives) - so either way the bug could be used to track the
resolution. (assuming that there’s no “acceptable false positive”
without some kind of suppression mechanism, which seems like a
reasonable goal - if a project can’t hold itself at/towards “clean”
that seems like a problem)

I agree that it would be great if we could get to a state where the LLVM+clang+… codebase is free of static analyzer warnings.
I am not sure that automated bug filing for each reported issue is the best approach. Many false-positive (“analyzer is wrong”) issues will be duplicates of one another. Also, once we resolve most of the reported issues, it will be much easier to keep the project free of analyzer warnings going forward.

Richard, here are some suggestions on how one could deal with the reported issues:

  1. If an issue looks like a real bug in the codebase, prepare a patch to fix it and get it reviewed by someone familiar with the code.

  2. Some of the reports might not be bugs you could trigger, but point at code that might benefit from restructuring, which would also suppress the analyzer report. If that is an acceptable solution, we should go for it.

  3. The last bucket is the real false positives that cannot be fixed with code restructuring. Some of these might be easier to fix in the analyzer, some might be very difficult to fix. The static analyzer is not a zero-false-positives tool. Our false positive suppression mechanism is currently very primitive, but it exists. It’s the __clang_analyzer__ macro, which we could use if we are going for an analyzer-warning-free codebase. Another approach would be to build on the CmpRuns.py script, which allows us to compare issues from different runs and report the diff, so after you fix all the real bugs, you’d expect the diff to not include new issues. The main benefit is that you don’t need to change the codebase. However, bringing that up would require much more effort. One challenge, for example, is to compare issues on an evolving codebase; the solution we use now is very primitive. (Other people from the community might also have an opinion on what is an acceptable approach here.)
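To give a feel for the run-comparison approach in the last item, here is a very small Python sketch. It is not how the existing comparison script works internally, and the issue fields ("file", "checker", "message") are hypothetical. It only shows the basic shape: reduce each issue to a key, then report what was added, removed, or unchanged between a baseline run and the current run.

```python
def issue_key(issue):
    """Reduce an issue to a comparable key (field names are hypothetical)."""
    return (issue["file"], issue["checker"], issue["message"])


def diff_runs(baseline, current):
    """Compare two analyzer runs, each given as a list of issue dicts.

    Returns sorted lists of keys that were added, removed, or unchanged.
    """
    old = {issue_key(issue) for issue in baseline}
    new = {issue_key(issue) for issue in current}
    return {
        "added": sorted(new - old),
        "removed": sorted(old - new),
        "unchanged": sorted(old & new),
    }
```

Once the real bugs are fixed, a clean commit would be expected to produce an empty "added" list. The weak point is the key: on an evolving codebase, a renamed file or a reworded message makes an old issue look like a new one.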

You can find a bit more info on how to deal with #2 and #3 in:
http://clang-analyzer.llvm.org/faq.html

Cheers,
Anna.

So I think it might be worth considering an auto-filing system, if
anyone wanted to spend the time to do so. (one issue might be figuring
out how to not file the same issues again on future runs - maybe even
detect when they’ve been fixed

This auto-detection could be built on top of the result comparison script (see above). You’d probably need a database somewhere to store these, possibly a UI for people to mark the resolution of these.