Issue #1 from RTEMS run of clang-analyzer

Hi,

I guess one email with multiple issues didn't
get attention. :frowning: So let's try it again with
one issue at a time.

The first is this, which looks like a quoted string on the command
line getting mangled:

/usr/lib/clang-analyzer/scan-build/ccc-analyzer
-DPACKAGE_NAME=\"rtems-c-src\" -DPACKAGE_TARNAME=\"rtems-c-src\"
-DPACKAGE_VERSION=\"4.10.99.0\" -DPACKAGE_STRING=\"rtems-c-src\
4.10.99.0\" -DPACKAGE_BUGREPORT=\"http://www.rtems.org/bugzilla\\"
-DPACKAGE_URL=\"\" -DCPU_U32_FIX=1 -I.
-I/home/joel/rtems-4.11-work/build/rtems/c/src/optman
-I../../../sis/lib/include -mcpu=cypress -O2 -g -Wall
-Wimplicit-function-declaration -Wstrict-prototypes -Wnested-externs -MT
rtems/no_timer_rel-no-timer.o -MD -MP -MF
rtems/.deps/no_timer_rel-no-timer.Tpo -c -o
rtems/no_timer_rel-no-timer.o `test -f 'rtems/no-timer.c' || echo
'/home/joel/rtems-4.11-work/build/rtems/c/src/optman/'`rtems/no-timer.c
error: error reading '4.10.99.0"'
In file included from <built-in>:109:
<command line>:5:24: error: missing terminating '"' character
#define PACKAGE_STRING "rtems-c-src
                         ^

This is from this -D option...

-DPACKAGE_STRING=\"rtems-c-src\ 4.10.99.0\"

Notice that there is an embedded space in the right hand side of the
-DPACKAGE_STRING which is causing the parsing of the command line to break.

I can provide any output, scripts, help, etc to reproduce and
track down the issues I encountered. I would like to get to
the point where we ran the analyser as part of our automated
build process on RTEMS.

Thanks.

--joel sherrill
RTEMS

Hi Joel,

First, sorry for not replying sooner to your earlier email. I appreciate your efforts in trying to get scan-build to work on RTEMS. Unfortunately, scan-build is something I can't allocate many cycles to these days, but I'll try to address some of your meta-level concerns, which will hopefully let you know whether investing any more energy into getting it to work with RTEMS is worthwhile. Note that scan-build is different from the static analyzer itself; scan-build is just the driver to run the static analyzer from the command line for non-Xcode projects.

As you probably inferred, scan-build is meant to have a similar interface to cov-build, which comes with Coverity Prevent. That said, it is nowhere near that level of functionality as far as being able to transparently fit into your build. First, cov-build does actual process interposition to intercept uses of the compiler in the build, while scan-build simply overrides the environment variable CC so that your build uses ccc-analyzer (which then in turn uses 'clang' for analysis and gcc for compilation) instead of the default compiler. It's a total hack and is not a general solution, but it was good enough to get things basically working a couple years ago when I was just bringing up the analyzer.
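The CC override can be sketched in a few lines of shell. Everything here is a stand-in, not scan-build's real code: `wrapper_cc` is a hypothetical function that only logs its arguments (the real ccc-analyzer is an executable that goes on to run clang and gcc), and `build_step` imitates a Makefile rule that invokes `$(CC)`.

```shell
#!/bin/sh
# Hypothetical stand-in for ccc-analyzer: prints one argument per line.
# In a real build the wrapper would be an executable on disk.
wrapper_cc() {
    for arg in "$@"; do
        printf '%s\n' "$arg"
    done
}

# Imitates a Makefile rule of the form: $(CC) -c t.c -o t.o
build_step() {
    ${CC:-cc} -c t.c -o t.o
}

# Overriding CC redirects the compile step to the wrapper;
# the build description itself is unchanged.
CC=wrapper_cc
logged=$(build_step)
echo "$logged"
```

The limitation follows directly from the sketch: any part of a build that hard-codes a compiler name instead of honoring `CC` never reaches the wrapper.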

When I wrote scan-build, my intention was to get some basic analysis functionality working on the Mac. Since scan-build seemed generally useful enough that other people could try it out on other platforms, I was happy to make it publicly available and put some documentation on the clang-analyzer.llvm.org web page. That said, it really is a hack, and has become less important on the Mac once the static analyzer could be run directly within Xcode. Obviously it is still useful on other platforms (and even on the Mac for non-Xcode projects), but I haven't had the bandwidth for a while to do any major work on scan-build. I also don't have the resources (both time and machines) to support getting the analyzer working on other platforms, although I fully encourage others in making that happen.

My hope was that one day, once the static analyzer had enough traction, a community would build around improving scan-build, ideally re-writing it so that it could accomplish the following two goals:

(1) Support transparent integration into almost any build system on most platforms.

(2) Support different workflows with processing analysis results other than generating static HTML reports.

That community hasn't materialized yet; most of the open source contributors on the analyzer are mainly interested in working on the static analysis engine and checkers and not this kind of infrastructure. Making scan-build more of a real alternative to proprietary static analysis tools (which do (1) really well) probably won't happen until someone else decides to drive it and invest some serious engineering time into doing so.

Some of the problems you have been reporting with using the analyzer on RTEMS have to do with (1), although the other issue seems to be that the Clang frontend doesn't really understand your architecture. That means someone needs to implement the necessary driver logic, target triple support, etc., in Clang to get the analyzer working on your code. Most of this is probably pretty minor, but it is stuff that would require modifying Clang. I know for a fact (from my conversations with Coverity engineers) that Coverity spends a ton of resources getting their analysis tool to work on a diversity of platforms and setups, and part of that comes with adding such platform knowledge into the analysis tool itself. So the transparency you see with using Coverity Prevent is the result of a tremendous engineering effort, interacting with customers, etc. While Coverity has done a great job here, in comparison there is nobody driving that kind of work in scan-build at the moment.

That said, if you are interested in diving deeper, it looks like the main issues you are seeing (at least Issues #2 and #3) are that the Clang frontend doesn't understand your headers, doesn't implement some command line options, etc. These are all issues that would be encountered if you tried using Clang as your compiler. The static analyzer uses all the pieces of Clang just up to the part where we would do code generation in the compiler, so if the compiler would reject your code because of syntactic or similar issues then so will the static analyzer. The next step here is really to figure out how Clang can be taught about your platform and setup (Anton gave some pointers), but it's not something anyone else is likely to do unless they are trying to get Clang to work as an RTEMS cross compiler.

As for Issue #1, that's likely a problem in the ccc-analyzer Perl script, where it is screwing up argument processing. I will try to take a look at that soon and see if I can figure out what is going wrong.

Anyway, I know this was a long email that didn't actually address your specific issues, but hopefully the back story will be useful to you when assessing whether or not this is worth any more effort on your part. The bottom line is that if you want to get scan-build working on RTEMS, it will likely require some hacking on Clang (or at least some deep investigation) on your part in order to make this happen in the near future.

Cheers,
Ted

  Hi,

I guess one email with multiple issues didn't
get attention. :frowning: So let's try it again with
one issue at a time.

Hi Joel,

First, sorry for not replying sooner to your earlier email. I appreciate your efforts in trying to get scan-build to work on RTEMS. Unfortunately, scan-build is something I can't allocate many cycles to these days, but I'll try to address some of your meta-level concerns, which will hopefully let you know whether investing any more energy into getting it to work with RTEMS is worthwhile. Note that scan-build is different from the static analyzer itself; scan-build is just the driver to run the static analyzer from the command line for non-Xcode projects.

Thanks. We are autoconf'ed and cross-compiled and it didn't go too badly. So I think
it is closer than you think.

I think a lot of it is making the general programming community aware of tools like this
and convincing them that they are useful.

As you probably inferred, scan-build is meant to have a similar interface to cov-build, which comes with Coverity Prevent. That said, it is nowhere near that level of functionality as far as being able to transparently fit into your build. First, cov-build does actual process interposition to intercept uses of the compiler in the build, while scan-build simply overrides the environment variable CC so that your build uses ccc-analyzer (which then in turn uses 'clang' for analysis and gcc for compilation) instead of the default compiler. It's a total hack and is not a general solution, but it was good enough to get things basically working a couple years ago when I was just bringing up the analyzer.

This worked except for the part of our tree where it gets configured dynamically. I think if we broke
the build into stages, it might work. It may even take doing a regular build, then a clean, then a
scan-build make. I really don't care about execution time for the analysis. Machines are fast
and this can run overnight if need be. Results count.

When I wrote scan-build, my intention was to get some basic analysis functionality working on the Mac. Since scan-build seemed generally useful enough that other people could try it out on other platforms, I was happy to make it publicly available and put some documentation on the clang-analyzer.llvm.org web page. That said, it really is a hack, and has become less important on the Mac once the static analyzer could be run directly within Xcode. Obviously it is still useful on other platforms (and even on the Mac for non-Xcode projects), but I haven't had the bandwidth for a while to do any major work on scan-build. I also don't have the resources (both time and machines) to support getting the analyzer working on other platforms, although I fully encourage others in making that happen.

Compiler internals is not my area of expertise but I have been hacking on GCC for a long
time -- mostly adding rtems target variants, porting run-time libraries, etc. I am more of the
OS target maintainer type than a code generation type.

But I am willing to try to get a complete scan of RTEMS. So teach me to fish. :smiley:

My hope was that one day, once the static analyzer had enough traction, a community would build around improving scan-build, ideally re-writing it so that it could accomplish the following two goals:

(1) Support transparent integration into almost any build system on most platforms.

It seems to work pretty well with autoconf'ed packages. RTEMS pushes a number of
buttons on autoconf/automake.

(2) Support different workflows with processing analysis results other than generating static HTML reports.

The HTML output is nice but once I get past "making it work", I am happy to hear what you
have in mind.

That community hasn't materialized yet; most of the open source contributors on the analyzer are mainly interested in working on the static analysis engine and checkers and not this kind of infrastructure.

That makes sense. The people interested in scan-build are like me -- project folks who
may or may not have the skill or interest in making it work for their project.

  Making scan-build more of a real alternative to proprietary static analysis tools (which do (1) really well) probably won't happen until someone else decides to drive it and invest some serious engineering time into doing so.

I am interested enough in having a free static analyser available as part of the open
RTEMS development process. So if I can get some guidance and nibble on the problem,
maybe I can help.

I am becoming a strong believer in applying a variety of tools and testing strategies
to a code base to improve its quality. We compile with -Wall plus some extra warnings
and actually do coverage analysis of our test suite using simulators for 6 processor
architectures. My recent focus is on finding a suite of open source tools that can be
applied at different stages in a development process to improve the overall quality
of a software package.

Some of the problems you have been reporting with using the analyzer on RTEMS have to do with (1), although the other issue seems to be that the Clang frontend doesn't really understand your architecture. That means someone needs to implement the necessary driver logic, target triple support, etc., in Clang to get the analyzer working on your code. Most of this is probably pretty minor, but it is stuff that would require modifying Clang. I know for a fact (from my conversations with Coverity engineers) that Coverity spends a ton of resources getting their analysis tool to work on a diversity of platforms and setups, and part of that comes with adding such platform knowledge into the analysis tool itself. So the transparency you see with using Coverity Prevent is the result of a tremendous engineering effort, interacting with customers, etc. While Coverity has done a great job here, in comparison there is nobody driving that kind of work in scan-build at the moment.

I have heard them comment on that as well. The number of compiler versions,
architectures, OSes, etc is amazing. And small variations require tinkering. We
have configuration explosion with RTEMS and expect it is just as bad for this.
We have host OS, host CPU (usually 32/64 bit x86), target cpu, and number of boards as variables in
our test matrix.

That said, if you are interested in diving deeper, it looks like the main issues you are seeing (at least Issues #2 and #3) are that the Clang frontend doesn't understand your headers, doesn't implement some command line options, etc. These are all issues that would be encountered if you tried using Clang as your compiler. The static analyzer uses all the pieces of Clang just up to the part where we would do code generation in the compiler, so if the compiler would reject your code because of syntactic or similar issues then so will the static analyzer. The next step here is really to figure out how Clang can be taught about your platform and setup (Anton gave some pointers), but it's not something anyone else is likely to do unless they are trying to get Clang to work as an RTEMS cross compiler.

Maybe I was lucky but it worked for what we call "cpukit" which is multilib'ed. It does
not use the -B option.

If given a little guidance, I am willing to take a stab at getting it to work for RTEMS.
I just need some very direct pointers. It sounds like I would be responsible for:

+ Adding -B support
+ Whatever the target triple support is. I am betting that when I know the list
of what is required, I can write a script/C program to probe and generate the
entire set.

I am a bit concerned about something I would sometimes see which is
our configure script invoking ccc

As for Issue #1, that's likely a problem in the ccc-analyzer Perl script, where it is screwing up argument processing. I will try to take a look at that soon and see if I can figure out what is going wrong.

This one looks like it is preventing all the test code from building. Thanks.

Anyway, I know this was a long email that didn't actually address your specific issues, but hopefully the back story will be useful to you when assessing whether or not this is worth any more effort on your part. The bottom line is that if you want to get scan-build working on RTEMS, it will likely require some hacking on Clang (or at least some deep investigation) on your part in order to make this happen in the near future.

No. It was just the email I needed. Please give me pointers on what is required to teach it
a triple and where to look for adding recognition of a -B option. :smiley:

Hello, Joel

No. It was just the email I needed. Please give me pointers on what is
required to teach it a triple

1. You should probably start with looking into LLVM's triple support -
include/llvm/ADT/Triple.h and around (Triple.cpp file)
2. Then proceed to clang/lib/Basic/TargetInfo.cpp and Targets.cpp
describing what RTEMS will need, e.g. builtin defines, ABI etc.

You will need to take the lead on these, particularly the target triple support (which defines the architecture-specific details of your platform), but I think others on the list can give you plenty of guidance. My advice is to take the analyzer itself out of the equation, and just focus on what you would need to get the source through the clang frontend (then revisit the analyzer). There's enough traffic on this list that not everyone pays attention to static analyzer issues. For example, I'd recommend splitting off discussion of -B into another thread, and focusing on that issue outside the context of the static analyzer (e.g., a thread "How to implement '-B' preprocessor option in Clang"). As I mentioned in my other email, I think that issue in particular is largely a driver issue.

Hi Joel,

I looked into this issue. The error is coming from the compiler, which is directly forwarded the arguments (with no extra interpretation) from ccc-analyzer. I’m not certain where the issue is, because I see this problem even when just using the compiler directly (and not using scan-build). For example:

$ touch t.c
$ cat Makefile

t.o:
	$(CC) -DFOO=\"Foo Bar\" -c t.c

clean:
	rm -f t.o

all: t.o

$ CC=gcc make

gcc -DFOO=\"Foo Bar\" -c t.c
i686-apple-darwin11-gcc-4.2.1: Bar": No such file or directory
<command-line>: warning: missing terminating " character
make: *** [t.o] Error 1

Note that if I change the relevant line in the Makefile to:

t.o:
	$(CC) -DFOO='"Foo Bar"' -c t.c

the problem goes away.
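One way to see why the second form works is to count the words the shell hands to a command. The sketch below uses a hypothetical helper, `count_args`, which is not part of gcc, make, or scan-build; it only reports how many arguments it received after the shell has processed the quoting.

```shell
#!/bin/sh
# count_args is a hypothetical helper: it prints the number of
# arguments the shell passed to it after quote removal.
count_args() {
    echo "$#"
}

# Escaped double quotes, unescaped space: the shell splits at the space.
unescaped=$(count_args -DFOO=\"Foo Bar\")

# Single quotes protect both the double quotes and the space.
single_quoted=$(count_args -DFOO='"Foo Bar"')

# Backslash-escaping the space also keeps everything in one word.
escaped_space=$(count_args -DFOO=\"Foo\ Bar\")

echo "unescaped=$unescaped single_quoted=$single_quoted escaped_space=$escaped_space"
```

This prints `unescaped=2 single_quoted=1 escaped_space=1`. In the first case the compiler receives `-DFOO="Foo` and `Bar"` as separate arguments, which matches the `Bar": No such file or directory` message.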

The first is this, which looks like a quoted string on the command
line getting mangled:

/usr/lib/clang-analyzer/scan-build/ccc-analyzer
-DPACKAGE_NAME=\"rtems-c-src\" -DPACKAGE_TARNAME=\"rtems-c-src\"
-DPACKAGE_VERSION=\"4.10.99.0\" -DPACKAGE_STRING=\"rtems-c-src\
4.10.99.0\" -DPACKAGE_BUGREPORT=\"http://www.rtems.org/bugzilla\\"
-DPACKAGE_URL=\"\" -DCPU_U32_FIX=1 -I.
-I/home/joel/rtems-4.11-work/build/rtems/c/src/optman
-I../../../sis/lib/include -mcpu=cypress -O2 -g -Wall
-Wimplicit-function-declaration -Wstrict-prototypes -Wnested-externs -MT
rtems/no_timer_rel-no-timer.o -MD -MP -MF
rtems/.deps/no_timer_rel-no-timer.Tpo -c -o
rtems/no_timer_rel-no-timer.o `test -f 'rtems/no-timer.c' || echo
'/home/joel/rtems-4.11-work/build/rtems/c/src/optman/'`rtems/no-timer.c
error: error reading '4.10.99.0"'
In file included from <built-in>:109:
<command line>:5:24: error: missing terminating '"' character
#define PACKAGE_STRING "rtems-c-src
                        ^

This is from this -D option...

-DPACKAGE_STRING=\"rtems-c-src\ 4.10.99.0\"

Notice that there is an embedded space in the right hand side of the
-DPACKAGE_STRING which is causing the parsing of the command line to break.

Hi Joel,

I looked into this issue. The error is coming from the compiler, which is directly forwarded the arguments (with no extra interpretation) from ccc-analyzer. I'm not certain where the issue is, because I see this problem even when just using the compiler directly (and not using scan-build). For example:

$ touch t.c
$ cat Makefile
t.o:
	$(CC) -DFOO=\"Foo Bar\" -c t.c

clean:
	rm -f t.o

all: t.o

$ CC=gcc make
gcc -DFOO=\"Foo Bar\" -c t.c
i686-apple-darwin11-gcc-4.2.1: Bar": No such file or directory
<command-line>: warning: missing terminating " character
make: *** [t.o] Error 1

Try "scan-build make"

$ scan-build make
scan-build: 'clang' executable not found in '/usr/lib/clang-analyzer/scan-build/bin'.
scan-build: Using 'clang' from path: /usr/bin/clang
/usr/lib/clang-analyzer/scan-build/ccc-analyzer -DFOO=\"Foo Bar\" -c t.c
gcc: Bar": No such file or directory
<command-line>: warning: missing terminating " character
make: *** [t.o] Error 1
scan-build: Removing directory '/tmp/scan-build-2010-09-29-1' because it contains no reports.

Hi Joel,

My point was that this error occurs regardless of whether you use scan-build or not. If I do:

$ CC=gcc make

or

$ scan-build make

I get the same problem. All ccc-analyzer is doing is literally forwarding the argv arguments to gcc. It's not doing any manipulation of them.
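Forwarding argv unchanged is what keeps an argument with an embedded space intact. A minimal shell sketch of the difference follows, assuming two hypothetical wrapper functions, `forward_exact` and `forward_resplit`, and a `show_args` stand-in for the compiler. This is not ccc-analyzer's code (which is Perl); it only illustrates the two behaviors.

```shell
#!/bin/sh
# show_args stands in for the real compiler: it prints each argument
# it receives inside brackets.
show_args() {
    for arg in "$@"; do
        printf '[%s]' "$arg"
    done
    printf '\n'
}

# Forwards every argument exactly as received.
forward_exact() {
    show_args "$@"
}

# Expands the arguments unquoted, so the shell splits them again and
# embedded spaces become word boundaries.
forward_resplit() {
    show_args $*
}

exact=$(forward_exact -DFOO='"Foo Bar"' -c t.c)
resplit=$(forward_resplit -DFOO='"Foo Bar"' -c t.c)
echo "exact:   $exact"
echo "resplit: $resplit"
```

With the default `IFS`, `exact` is `[-DFOO="Foo Bar"][-c][t.c]` and `resplit` is `[-DFOO="Foo][Bar"][-c][t.c]`. A wrapper behaving like the second form would produce a stray `Bar"` argument even from a correctly quoted Makefile, so printing the arguments at each hop is a cheap way to find where a split happens.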

Indeed, this example is a bug in the Makefile. In the first form the
shell will see two words, not one. You need to quote or escape the
space as well as the double quotes, such as is done in the revised
version above. This would work too:

t.o:
        $(CC) -DFOO=\"Foo\ Bar\" -c t.c

John Bytheway

Hi Joel,

I looked into this issue. The error is coming from the compiler, which
is directly forwarded the arguments (with no extra interpretation) from
ccc-analyzer. I'm not certain where the issue is, because I see this
problem even when just using the compiler directly (and not using
scan-build). For example:

$ touch t.c
$ cat Makefile
t.o:
	$(CC) -DFOO=\"Foo Bar\" -c t.c

clean:
	rm -f t.o

all: t.o

$ CC=gcc make
gcc -DFOO=\"Foo Bar\" -c t.c
i686-apple-darwin11-gcc-4.2.1: Bar": No such file or directory
<command-line>: warning: missing terminating " character
make: *** [t.o] Error 1

Note that if I change the relevant line in the Makefile to:

t.o:
         $(CC) -DFOO='"Foo Bar"' -c t.c

the problem goes away.

Indeed, this example is a bug in the Makefile. In the first form the
shell will see two words, not one. You need to quote or escape the
space as well as the double quotes, such as is done in the revised
version above. This would work too:

t.o:
         $(CC) -DFOO=\"Foo\ Bar\" -c t.c

I have attached my version of the Makefile with the "DEFS"
line from the Makefile generated by autoconf. It appears
to be properly escaped. The following commands work
fine for it.

make clean all
CC=gcc make clean all

But "scan-build make clean all" fails.

Makefile (330 Bytes)