I'm using the Clang Static Analyzer (r70876) and I'm unsure if the
following is a false positive (in the sense that I shouldn't fill a
1: int size;
2: socklen_t size_len = sizeof(size);
3: if (getsockopt(s, SOL_SOCKET, SO_SNDBUF, (char *)&size, &size_len) < 0)
4: return SOCKET_ERROR;
5: return size;
It says "Uninitialized or undefined return value returned to caller"
on line 5. I can understand why this is happening because clang
doesn't realise that if getsockopt returns successfully (by returning
a value >= 0) then size will have been set. I easily fixed the problem
by initialising size to SOCKET_ERROR.
Please file a Bugzilla report with a self-contained example that reproduces what you are seeing (i.e., this code wrapped in a function and the appropriate headers included). I suspect this has to do with the 'char*' cast.
Now is this a false positive, or should clang actually know that
getsockopt will change the size variable?
The analyzer shouldn't report a bug here. Without any additional information about getsockopt, the analyzer should assume it was initialized to some value. This is a conservative approach that reduces false positives.
I can understand how it
wouldn't know since getsockopt is in an external library, but I was
wondering if clang perhaps kept some kind of metadata about how
different functions were meant to work?
It does for some, but not all. When it doesn't know what a function does, we try to make conservative assumptions that cause the analyzer to assume that you are doing the right thing (and thus emit less bogus errors).
For example, I haven't tested it but if I did
1: char src = "X";
2: char dest;
3: strcpy(dest, src, 1);
4: return dest;
Would clang know that the first byte of dest would have been set? I
guess in this case it might since strcpy is most likely a builtin, or
declared in the header file.
Not yet. We probably need to (eventually) explicitly model strcpy in the analyzer for a variety of reasons. Also, the default "memory model" used by the analyzer doesn't reason about array values, but an experimental new memory model (that Zhongxing has been working on) does.
The analyzer is actually built out of a set of modules that handle different parts of the analysis process, e.g., modeling memory bindings, modeling constraints between values, etc. This allows us to gradually improve pieces of the analyzer in isolation over time.
P.S I am really impressed with this static analyser!. I have tried
many other static code analysers in the past and they have always
generated far too many false positives, or displayed the output in a
hard to read/understand way. Your simple HTML output is great! I can't
wait until all this also works for C++
Thank you very much for these compliments! They are greatly appreciated.
As for the future of the analyzer, C++ support will require some amount of inter-procedural analysis to handle well (something we hope to eventually do). We're certainly looking for new contributors for anyone who wants to get involved in any way (new checks, improving the analysis core, improving the HTML output, etc.).