Unsound assumptions about exhaustiveness of enum switch cases?

For a test.cc input of

enum E1 { E1_0, E1_1 };
enum E2 { E2_0, E2_1, E2_2 };
enum class E3 { E3_0, E3_1 };
int x;
int f(E1 e1, E2 e2, E3 e3) {
    int n1 = x;
    switch (e1) {
    case E1_0: n1 = 0; break;
    case E1_1: n1 = 1; break;
    }
    int n2 = x;
    switch (e2) {
    case E2_0: n2 = 0; break;
    case E2_1: n2 = 1; break;
    case E2_2: n2 = 2; break;
    }
    int n3 = x;
    switch (e3) {
    case E3::E3_0: n3 = 0; break;
    case E3::E3_1: n3 = 1; break;
    }
    return n1 + n2 + n3;
}

(where x is needed to work around heuristics in clang-tidy when not to warn about redundant initializations), calling (at least for a recent Clang trunk build)

clang-tidy -checks=clang-analyzer-deadcode.DeadStores test.cc -- -std=c++11

causes clang-analyzer-deadcode.DeadStores warnings about the initializations of all three of n1, n2, n3. Apparently on the grounds that each switch statement is exhaustive.

But from my understanding of the C++ standard, the ranges of enumeration values for E2, E3 are such that f can legally be called with values for e2, e3 that are not covered by the case branches of the respective switches, e.g. as

f(E1_0, E2(3), E3(std::numeric_limits<int>::max()));

Is this analysis deliberately unsound, on the assumption that objects of enumeration type take only enumerator values? If yes, is it deliberate to do so for both scoped and unscoped enumerations?

(Jordan - just cc’ing you because I don’t know who’s most active on the Static Analyzer these days)

Yep - the Clang CFG used to make only one decision about edges, and mostly it made them conservatively to avoid one kind of false positive (eg: avoid diagnosing an uninitialized variable when a switch had a case for each enum value and each of those initialized the variable). But this conservative assumption has problems in the other direction as you noticed.

After “goto fail; goto fail;” happened, Ted Kremenek improved the CFG somewhat to have optional/conditional extra edges in some way, and covered a few cases of this disparity, but the work is far from complete as demonstrated by cases like this.

Not sure if anyone’s looking/interested at the moment.

David has it correct. A good heuristic might be to say that if the switch has a default case, then we should assume values outside the enum cases are possible, but we don’t currently do this.

Different codebases have different opinions on this, which is why Clang has both -Wswitch and -Wswitch-enum. The analyzer currently does not let you make such a decision.

Jordan

If you want to be explicitly about the switch not being covering, cast
the variable to the underlaying integer type.

Joerg

David has it correct. A good heuristic might be to say that if the switch
has a default case, then we should assume values outside the enum cases are
possible, but we don't currently do this.

Not sure whether you're suggesting a heuristic instead of having both
optimistic and pessimistic edges on the CFG. I don't think a heuristic
would be sufficient in any case - there are several places (uninit
variables, dead code, etc) where pessimistic edges are desired and the
rest/ones we already have that prefer optimistic. (or whichever way around
that goes)

- Dave