[analyzer] UninitializedObjectChecker evaluation

Hi!

As of https://reviews.llvm.org/D58573, UninitializedObjectChecker is out of alpha. Since that patch landed, I’ve received some bug reports, which I’m super grateful for, both for finding and reporting the errors, and also for the reason that it’s being used.

Since it is being used, however, I’m very interested in the user experience. I personally found it to be almost unbearably noisy for LLVM, as this checker finds partially/fully uninitialized objects, and as a performance critical project, it takes advantage of not initializing certain fields.

My experience for any “regular” projects however was very different, analyzing cppcheck, bitcoin, rtags, cap n proto, xerces resulted gathering some very meaningful reports, often pointing out errors such as

struct MyStruct {
MyStruct() {} // should be = default;
int a, b, c, d;
};

struct DidntHaveTheTimeToWriteAProperConstructor {
DidntHaveTheTimeToWriteAProperConstructor() : a(0), b(0) {}

int a, b, hopeNobodyForgetsAboutThisOne;
};

Could you please share some of your experiences using this checker? Would you like to see some improvements being made? Would you configure this checker for your project, if you were exposed to some checker options?

Cheers!
Kristóf

  • Dmitri Gribenko

Hi,

The results of the UninitializedObject checker significantly overlap
with the clang-tidy checker cppcoreguidelines-pro-type-member-init.
When both checkers trigger, messages from
cppcoreguidelines-pro-type-member-init are significantly more clear
and less noisy.

$ cat /tmp/t.cc
struct Foo {
  Foo() : y(0) {}
  int x;
  int y;
};
void entry1() {
  Foo x;
}

$ ./bin/clang-tidy -checks="-*,clang-analyzer*" /tmp/t.cc
/tmp/t.cc:2:17: warning: 1 uninitialized field at the end of the
constructor call [clang-analyzer-optin.cplusplus.UninitializedObject]
  Foo() : y(0) {}
                ^
/tmp/t.cc:3:7: note: uninitialized field 'this->x'
  int x;
      ^
/tmp/t.cc:7:7: note: Calling default constructor for 'Foo'
  Foo x;
      ^
/tmp/t.cc:2:17: note: 1 uninitialized field at the end of the constructor call
  Foo() : y(0) {}
                ^

$ ./bin/clang-tidy -checks="-*,cppcoreguidelines-pro-type-member-init" /tmp/t.cc
/tmp/t.cc:2:3: warning: constructor does not initialize these fields:
x [cppcoreguidelines-pro-type-member-init]
  Foo() : y(0) {}
  ^

Note how cppcoreguidelines-pro-type-member-init packs the same
information into one message. The primary message from
UninitializedObject checker says there's "1 uninitialized field". As a
user, my first question is "which?" -- and to figure it out I have to
parse the rest of the output, where two notes out of three have
completely irrelevant information. There's no way to call this
constructor in a way that will initialize `Foo::x`, so why print all
that? Also, the phrasing is passive, I like the phrasing from
cppcoreguidelines-pro-type-member-init much better.

cppcoreguidelines-pro-type-member-init does not trigger on templates,
while UninitializedObject checker does. For example:

template<typename T>
struct Foo {
  Foo() : y(0) {}
  T x;
  T y;
};
void entry1() {
  Foo<int> x;
}

It is unclear why cppcoreguidelines-pro-type-member-init decided to
not look into templates.

One source of false positives for the UninitializedObject checker are
`Init()` methods, for example:

bool ReadIntFromFile(int *x);
struct Foo {
  Foo() : y(0) {}
  int x;
  int y;
  bool Init() {
    return ReadIntFromFile(&x);
  }
};
bool entry1() {
  Foo x;
  return x.Init();
}

It is also unclear how to annotate member variables that are
intentionally left uninitialized:

struct Foo {
  Foo() : has_cached_value(false) {}
  bool has_cached_value;
  int cached_value;
};
bool entry1() {
  Foo x;
}

In many cases it is possible to rewrite the code using std::optional,
but not everywhere.

Another source of false positives is the `is_valid` pattern:

bool ReadIntFromFile(int *x);
struct Foo {
  Foo() {
    int temp;
    if (!ReadIntFromFile(&temp)) {
      is_valid = false;
      return;
    }
    x = temp;
    is_valid = true;
  }
  int x;
  bool is_valid;
};
void entry1() {
  Foo x;
}

Inlining in the Clang Static Analyzer helps the UninitializedObject
checker to see through abstractions. In the following example
UninitializedObject checker is silent (as it should be), while
cppcoreguidelines-pro-type-member-init complains:

struct Foo {
  Foo() {
    Unset();
  }
  void Unset() {
    has_cached_value = false;
    x = 0;
  }
  bool has_cached_value;
  int x;
};
void entry1() {
  Foo x;
}

To summarize, I have two findings.

First, I haven't found a case where taking the call site context into
account helps this checker. Fundamentally, this check is about
assuming nothing at the entry to the constructor, and checking a
strong postcondition (all fields are initialized) at the exit.

Second, the `Init()` and `is_valid` patterns are fundamentally at odds
with this checker, because it does not try to detect an uninitialized
memory read, it tries to detect a postcondition violation at the
constructor exit.

Hope this helps.

Dmitri

Hi!

Sorry for my late response, I’m currently too busy to start tinkering with this checker, but the feedback is very much appreciated!

Hi,

The results of the UninitializedObject checker significantly overlap
with the clang-tidy checker cppcoreguidelines-pro-type-member-init.
When both checkers trigger, messages from
cppcoreguidelines-pro-type-member-init are significantly more clear
and less noisy.

$ cat /tmp/t.cc
struct Foo {
Foo() : y(0) {}
int x;
int y;
};
void entry1() {
Foo x;
}

$ ./bin/clang-tidy -checks=“-,clang-analyzer” /tmp/t.cc
/tmp/t.cc:2:17: warning: 1 uninitialized field at the end of the
constructor call [clang-analyzer-optin.cplusplus.UninitializedObject]
Foo() : y(0) {}
^
/tmp/t.cc:3:7: note: uninitialized field ‘this->x’
int x;
^
/tmp/t.cc:7:7: note: Calling default constructor for ‘Foo’
Foo x;
^
/tmp/t.cc:2:17: note: 1 uninitialized field at the end of the constructor call
Foo() : y(0) {}
^

$ ./bin/clang-tidy -checks=“-*,cppcoreguidelines-pro-type-member-init” /tmp/t.cc
/tmp/t.cc:2:3: warning: constructor does not initialize these fields:
x [cppcoreguidelines-pro-type-member-init]
Foo() : y(0) {}
^

Note how cppcoreguidelines-pro-type-member-init packs the same
information into one message. The primary message from
UninitializedObject checker says there’s “1 uninitialized field”. As a
user, my first question is “which?” – and to figure it out I have to
parse the rest of the output, where two notes out of three have
completely irrelevant information. There’s no way to call this
constructor in a way that will initialize Foo::x, so why print all
that? Also, the phrasing is passive, I like the phrasing from
cppcoreguidelines-pro-type-member-init much better.

When I use CodeChecker to look at the results, I prefer the notes. However, having different report styles for different outputs (text, plist, etc) would be great!

cppcoreguidelines-pro-type-member-init does not trigger on templates,
while UninitializedObject checker does. For example:

template
struct Foo {
Foo() : y(0) {}
T x;
T y;
};
void entry1() {
Foo x;
}

It is unclear why cppcoreguidelines-pro-type-member-init decided to
not look into templates.

One source of false positives for the UninitializedObject checker are
Init() methods, for example:

bool ReadIntFromFile(int *x);
struct Foo {
Foo() : y(0) {}
int x;
int y;
bool Init() {
return ReadIntFromFile(&x);
}
};
bool entry1() {
Foo x;
return x.Init();
}

Having to remember that you have to call an init function is a good example IMO why we really should emit a report in this case. What kind of heuristic would work well here? “If the class has a method named Init(), temporarily suppress the report, and emit it when the variable goes out of scope without the Init() function being called”? Can the clang-tidy checker deal with this?

It is also unclear how to annotate member variables that are
intentionally left uninitialized:

struct Foo {
Foo() : has_cached_value(false) {}
bool has_cached_value;
int cached_value;
};
bool entry1() {
Foo x;
}

That’s a very valid concern.

In many cases it is possible to rewrite the code using std::optional,
but not everywhere.

Another source of false positives is the is_valid pattern:

bool ReadIntFromFile(int *x);
struct Foo {
Foo() {
int temp;
if (!ReadIntFromFile(&temp)) {
is_valid = false;
return;
}
x = temp;
is_valid = true;
}
int x;
bool is_valid;
};
void entry1() {
Foo x;
}

Hmmm, interesting. A better guarded field analysis (one of the options for the checker) might be able to deal with this. In this specific case, x is public, so the danger is real, but if it wasn’t, and was guarded with assert(is_valid);, I see why we shouldn’t report it.

Inlining in the Clang Static Analyzer helps the UninitializedObject
checker to see through abstractions. In the following example
UninitializedObject checker is silent (as it should be), while
cppcoreguidelines-pro-type-member-init complains:

struct Foo {
Foo() {
Unset();
}
void Unset() {
has_cached_value = false;
x = 0;
}
bool has_cached_value;
int x;
};
void entry1() {
Foo x;
}

To summarize, I have two findings.

First, I haven’t found a case where taking the call site context into
account helps this checker. Fundamentally, this check is about
assuming nothing at the entry to the constructor, and checking a
strong postcondition (all fields are initialized) at the exit.

Second, the Init() and is_valid patterns are fundamentally at odds
with this checker, because it does not try to detect an uninitialized
memory read, it tries to detect a postcondition violation at the
constructor exit.

Hope this helps.

Yes it does, thank you for everything! :slight_smile: I’ll definitely make some improvements when I’ll have some spare time.