PR17558 - Question about Uninitialized Variables

Thanks for looking into the PR, Michael.

Michael Bao wrote:

I'm looking at this bug report here:
http://llvm.org/bugs/show_bug.cgi?id=17558

[...]

I've narrowed down the issue ClassifyRefs::VisitCastExpr.

I don't know the internals of clang, but I don't think the cast is essential.
For example, here is a test case without casting:

#include <iostream>

int main()
{
    int x;
    bool b = (&x == 0);

    std::cout << x << std::endl; // Missing "uninitialized" warning
    std::cout << b << std::endl;
}

P.S.
Richard, I'd like to say thank you for fixing PR16054 (Missing
"uninitialized" warning). Spotting such uninitialized bugs without
running static analyzer is very helpful in a C primer course.

Regards,
Michel

Thanks for looking into the PR, Michael.

Michael Bao wrote:

I’m looking at this bug report here:
http://llvm.org/bugs/show_bug.cgi?id=17558
[…]
I’ve narrowed down the issue ClassifyRefs::VisitCastExpr.

I don’t know the internals of clang, but I don’t think the cast is essential.
For example, here is a test case without casting:

#include

int main()
{
int x;
bool b = (&x == 0);

std::cout << x << std::endl; // Missing “uninitialized” warning
std::cout << b << std::endl;
}

The more general issue here is that if we see a local variable having its address taken, we assume the variable escapes the analysis, and that we can no longer reason about whether it is initialized. This is tricky to fix in general; Michael’s patch addresses one particular corner of this, but if possible it’d be better to introduce a more holistic fix (addressing your case, his, and lots of others).

To get a feeling for why this is tricky, we would like to diagnose:

int x; bool b = (&x == 0); f(b); return x;

… and …

int x; bool b = &x; f(b); return x;

… but not …

int x; int *b = &x; f(b); return x;

… because ‘f’ might initialize ‘x’. In general, we would like to look at the context in which ‘&x’ appears, and determine if that context could possibly store through the pointer or save it somewhere.

We’d probably want to keep the analysis simple and local, however, so we’d likely still miss the bug in this code:

int x; int *b = &x; if (cond) return x;

P.S.
Richard, I’d like to say thank you for fixing PR16054 (Missing
“uninitialized” warning). Spotting such uninitialized bugs without
running static analyzer is very helpful in a C primer course.

=) You’re welcome!

Richard Smith wrote:

This is tricky to fix in general;
Michael's patch addresses one particular corner of this,

Ah, now I understand what Michael tried to do.

In general, we would like to look at the context in which '&x' appears,
and determine if that context could possibly store through the pointer
or save it somewhere.

Such contextual analysis would be nice :wink:

FWIW, gcc can warn about uninitialized variables whose address are taken
only when optimization is enabled.
(Without optimization, gcc behaves similarly to clang.)

Regards,
Michel

Got an implementation here and it passes all the tests! :slight_smile: (Actually
had to change one test since it was relying on (void)&x marking 'x' as
being initialized).

So cases like:

  int x; (void)&x; return x;
  int x; int* y = &x; return x;
  int x; bool b = (&x == 0); return x;
  int x; bool b; b = &x; foo(b); return x;

Will give an "uninitialized" warning on the return.

While cases like
  int x; int* y = &x; foo(y); return x;
  int x; int *y = &x; int *z = y; return x;

Will not show an uninitialized warning.

More technical details:

1) When I encounter a "Address Of" operator, I mark the child
DeclRefExpr as being in a "Pending" state. If it remains in the
pending state, then we treat it the same way as we would an "Ignore"
and Clang will identify the variable as being uninitialized (provided
nothing else is done with the variable).

2) If the "Address Of" is assigned to a non-pointer variable, then we
mark the child DeclRefExpr as being in the "Pending" state.

3) If the "Address Of" is assigned to a pointer variable, then we mark
the child DeclRefExpr as being in the "WaitForRef" state. In this
case, we will store it for later processing and store which variable
it was assigned to. Along the way, if the pointer is used, we will
make sure to remember that it was used so we can later mark the
"Address Of" variable as being potentially initialized.

4) After the initial visits using ClassifyRefs, I do one last pass
through the stored "Address Of" expressions that were assigned to
pointers and check to see if those pointers were used and change
classifications as necessary.

Thoughts?
If this general idea/direction/functionality looks good, I'll write up
some tests and send it over to cfe-commits.

uninitialized-warning.patch (9.39 KB)