False positive for -Wunreachable-code

$ cat p.c
#include <stdio.h>

enum e { a, b = 4 } x = 3;

void g(int v) {
  printf("%d\n", v);
}

int main(int argc, char **argv) {
  switch (x) {
  case a:
    g(0);
    break;
  case b:
    g(1);
    break;
  default:
    g(2);
    break;
  }
}
$ _clang -Wunreachable-code -Wcovered-switch-default -O2 p.c
p.c:17:3: warning: default label in switch which covers all enumeration
values
      [-Wcovered-switch-default]
  default:
  ^
p.c:18:7: warning: will never be executed [-Wunreachable-code]
    g(2);
      ^
$ ./a.out
2

Of course -Wcovered-switch-default warning is a perfectly true positive.

My reading of the standard is that nothing prevent an enum to have a
value different from listed enum constants if this value is compatible
with enum range (and code generation seems to agree on that).

Are there any objection to file a bug report?

Well, I would object on the basis that for all “regular” uses of the switch/enum combination: ie, the enum values are only the enumerators and the switch does cover all enumerators; then this warning is perfectly valid.

Therefore I would rather say that people who use enums for bit-masks and thus have variables of the enum type with values other than the enumerators (which is fine) should either not be using a switch on this enum OR simply disable this warning. Maybe casting x to int such as switch(int(x)) would also work.

How do you propose to distinguish the two widely different usages of enum ?

I personally would be rather disappointed to see -Wcovered-switch-default disappear since at both the company I work on and in my pet programs we only ever use enum for its enumerators and thus this warning is very useful.

– Matthieu

There is a misunderstanding here:

- I think that -Wcovered-switch-default is very useful and the warning
show above is a *TRUE* positive, so there is nothing to fix there

- my point is that -Wunreachable-code is showing a false negative for
the example above and should be fixed

I am afraid I did not expressed myself clearly either: if we accept that the switch is perfectly covered (and thus warn that the default is not needed), then it falls out that the code in the default branch is unreachable.

I am fine to remove the -Wunreachable warning in the case were the -Wcovered-switch-default one has already been emitted, they are redundant.

The real question is: when Clang does not warn about the switch already being covered, should it warn about the code being unreachable ?

What do you think of a note suggesting a cast to the underlying integral type of x such as switch(int(x)) added to -Wunreachable to show how to silence it ? (and making it so that it does silence it)

– Matthieu

    >
    >
    >
    > $ cat p.c
    > #include <stdio.h>
    >
    > enum e { a, b = 4 } x = 3;
    >
    > void g(int v) {
    > printf("%d\n", v);
    > }
    >
    > int main(int argc, char **argv) {
    > switch (x) {
    > case a:
    > g(0);
    > break;
    > case b:
    > g(1);
    > break;
    > default:
    > g(2);
    > break;
    > }
    > }
    > $ _clang -Wunreachable-code -Wcovered-switch-default -O2 p.c
    > p.c:17:3: warning: default label in switch which covers all
    enumeration
    > values
    > [-Wcovered-switch-default]
    > default:
    > ^
    > p.c:18:7: warning: will never be executed [-Wunreachable-code]
    > g(2);
    > ^
    > $ ./a.out
    > 2
    >
    > Of course -Wcovered-switch-default warning is a perfectly true
    positive.
    >
    > My reading of the standard is that nothing prevent an enum to
    have a
    > value different from listed enum constants if this value is
    compatible
    > with enum range (and code generation seems to agree on that).
    >
    > Are there any objection to file a bug report?
    >
    >
    > Well, I would object on the basis that for all "regular" uses of the
    > switch/enum combination: ie, the enum values are only the enumerators
    > and the switch does cover all enumerators; then this warning is
    > perfectly valid.
    >
    > Therefore I would rather say that people who use enums for
    bit-masks and
    > thus have variables of the enum type with values other than the
    > enumerators (which is fine) should either not be using a switch on
    this
    > enum OR simply disable this warning. Maybe casting `x` to `int`
    such as
    > `switch(int(x))` would also work.
    >
    > How do you propose to distinguish the two widely different usages
    of enum ?
    >
    > I personally would be rather disappointed to see
    > -Wcovered-switch-default disappear since at both the company I work on
    > and in my pet programs we only ever use enum for its enumerators and
    > thus this warning is very useful.

    There is a misunderstanding here:

    - I think that -Wcovered-switch-default is very useful and the warning
    show above is a *TRUE* positive, so there is nothing to fix there

    - my point is that -Wunreachable-code is showing a false negative for
    the example above and should be fixed

I am afraid I did not expressed myself clearly either: if we accept that
the switch is perfectly covered (and thus warn that the default is not
needed), then it falls out that the code in the default branch is
unreachable.

The fact that the switch is "perfectly" covered means only that every
enum constant listed is handled in case statements. This says nothing
about the fact that the default might be needed for valid values in the
enum range but not present in enum constant list.

Note that the default branch in the example above is perfectly reachable
and is subject to code generation (it cannot be optimized away and it is
not optimized away with clang or gcc or other compilers).

I am fine to remove the `-Wunreachable` warning in the case were the
`-Wcovered-switch-default` one has already been emitted, they are redundant.

I'm inclined to disagree, these two warning show very different infos:

-Wunreachable: this code is never executeed and might be removed from
code generation

-Wcovered-switch-default: the default might be needless if the enum
typed switch guard is expected to not have any value different from enum
constants listed in the enum definition

Of course there are marginal cases where both condition are true (e.g.
an enum with unsigned char underlying type and all enum constants with
value from 0 to 255 are represented in case statements, but this is not
really relevant.

The real question is: when Clang does not warn about the switch already
being covered, should it warn about the code being unreachable ?

What do you think of a note suggesting a cast to the underlying integral
type of `x` such as `switch(int(x))` added to `-Wunreachable` to show
how to silence it ? (and making it so that it does silence it)

This would not help, the default is reachable and subject to code
generation with and without int cast.

    >
    >
    >
    > $ cat p.c
    > #include <stdio.h>
    >
    > enum e { a, b = 4 } x = 3;
    >
    > void g(int v) {
    > printf("%d\n", v);
    > }
    >
    > int main(int argc, char **argv) {
    > switch (x) {
    > case a:
    > g(0);
    > break;
    > case b:
    > g(1);
    > break;
    > default:
    > g(2);
    > break;
    > }
    > }
    > $ _clang -Wunreachable-code -Wcovered-switch-default -O2 p.c
    > p.c:17:3: warning: default label in switch which covers all
    enumeration
    > values
    > [-Wcovered-switch-default]
    > default:
    > ^
    > p.c:18:7: warning: will never be executed [-Wunreachable-code]
    > g(2);
    > ^
    > $ ./a.out
    > 2
    >
    > Of course -Wcovered-switch-default warning is a perfectly true
    positive.
    >
    > My reading of the standard is that nothing prevent an enum to
    have a
    > value different from listed enum constants if this value is
    compatible
    > with enum range (and code generation seems to agree on that).
    >
    > Are there any objection to file a bug report?

This is, if I understand you correctly, already the subject of PR13815.

Essentially there's a huge class of -Wunreachable-code false positives
(if you think of "positive" in the case of "warns" rather than
"doesn't warn") because our CFG is used for both "warn about these
constructs only if they're reachable" and "warn about unreachable
code". The problem is that this buckets all blocks into one of those
two categories, when in fact there's a 3rd middle ground of "code that
we're not sure is reachable or unreachable".

for example code like this:

if (sizeof(int) == 4) {
  ...
} else if (sizeof(int) == 8) {
  ...
} ...

warns always on one of those two blocks, but we really don't want to
warn on either, yet we only want to emit the warnings related to
reachable code constructs (like dereferencing null, for example - but
see any other uses of DiagnoseRuntimeBehavior in Clang) on code we're
fairly sure is reachable.

The same goes for the default case in a covered switch. To reduce
false positives for runtime-behavior warnings we probably want to
consider it unreachable. But to reduce false positives in the
unreachable-code warning, we probably want to consider it reachable
(or at least "maybe reachable").

It requires modifying the CFG to represent this 3rd intermediate state
& I haven't got around to doing that yet. (once we have that mechanism
we'll then have to start doing more things to detect what kind of edge
classification each CFG edge will get (to detect things like "depends
on sizeof" and "depends on enums having only enumerate values", etc)
to remove all those -Wunreachable-code false positives)

- David

    >
    >
    >
    > $ cat p.c
    > #include <stdio.h>
    >
    > enum e { a, b = 4 } x = 3;
    >
    > void g(int v) {
    > printf("%d\n", v);
    > }
    >
    > int main(int argc, char **argv) {
    > switch (x) {
    > case a:
    > g(0);
    > break;
    > case b:
    > g(1);
    > break;
    > default:
    > g(2);
    > break;
    > }
    > }
    > $ _clang -Wunreachable-code -Wcovered-switch-default -O2 p.c
    > p.c:17:3: warning: default label in switch which covers all
    enumeration
    > values
    > [-Wcovered-switch-default]
    > default:
    > ^
    > p.c:18:7: warning: will never be executed [-Wunreachable-code]
    > g(2);
    > ^
    > $ ./a.out
    > 2
    >
    > Of course -Wcovered-switch-default warning is a perfectly true
    positive.
    >
    > My reading of the standard is that nothing prevent an enum to
    have a
    > value different from listed enum constants if this value is
    compatible
    > with enum range (and code generation seems to agree on that).
    >
    > Are there any objection to file a bug report?

This is, if I understand you correctly, already the subject of PR13815.

Yes, I confirm that.

Essentially there's a huge class of -Wunreachable-code false positives
(if you think of "positive" in the case of "warns" rather than
"doesn't warn") because our CFG is used for both "warn about these
constructs only if they're reachable" and "warn about unreachable
code". The problem is that this buckets all blocks into one of those
two categories, when in fact there's a 3rd middle ground of "code that
we're not sure is reachable or unreachable".

for example code like this:

if (sizeof(int) == 4) {
  ...
} else if (sizeof(int) == 8) {
  ...
} ...

warns always on one of those two blocks, but we really don't want to
warn on either, yet we only want to emit the warnings related to
reachable code constructs (like dereferencing null, for example - but
see any other uses of DiagnoseRuntimeBehavior in Clang) on code we're
fairly sure is reachable.

In my mind things are very simple: unreachable code is code that cannot
be executed and can be omitted from code generation.

So the default statement in my testcase above is not unreachable code
(as it is code reachable and actually executed).

To emit a warning about unreachable code for that is grossly misleading.

Now let consider your testcase about sizeof: there is no doubt that at
least one conditional branch is unreachable code.

There is the possibility that despite the fact it is definitely true
that this code is unreachable the programmers get annoyed to see such
warnings when he knows that the code is unreachable by design.

Then under a quality of implementation perspective we might choose to
have a system to silent that, but this has nothing to do with
"unreachable code" concept and it is on an orthogonal plane.

The same goes for the default case in a covered switch. To reduce
false positives for runtime-behavior warnings we probably want to
consider it unreachable. But to reduce false positives in the

This phrase amazes me, can you make an example where considering a
reachable default as unreachable reduces the false positives of some
other warnings?

unreachable-code warning, we probably want to consider it reachable
(or at least "maybe reachable").

It requires modifying the CFG to represent this 3rd intermediate state
& I haven't got around to doing that yet. (once we have that mechanism
we'll then have to start doing more things to detect what kind of edge
classification each CFG edge will get (to detect things like "depends
on sizeof" and "depends on enums having only enumerate values", etc)
to remove all those -Wunreachable-code false positives)

I don't think this is needed, IMHO we should only have a CFG concept of
unreachable code congruent with the one used for code generation.

I've attached the patch for review.

The fixed testcase shows well why to hide warnings about undefined
behaviour in code actually generated is a bad thing.

ReachableDefault.patch (1.83 KB)

If we do this, we're going to want this under a CFG option at the very least. The static analyzer should continue assuming that an enum input to a switch is always a valid enum constant, in order to keep our false positive rate down.

Jordan

Yeah, I doubt this'll be any better in the compiler proper, really.
The heuristic exists to, as you rightly point out, reduce false
positives & that rationale exists in the compiler as well.

While, yes, it means we lose some true positives as well, that
probably isn't worth the increase in false positives.

I can see Abramo's point, however, that in a more defensive coding style the current -Wunreachable warning can easily be considered a false positive. We don't optimize out the default case in an enum-covered switch.

Flagging this as unreachable code is a bug & should be fixed - but
probably in the way I described. Treating it purely as reachable code
& emitting our 'runtime' diagnostics for code in such situations will
(I believe - though I haven't run numbers) increase false positive
rates substantially.

A trivial example that GCC often warns about & Clang deliberately does not:

int func(enum X v) {
  switch (v) {
  case A: return 1;
  case B: return 2;
  ... // fully covered
  case Z: return 26;
  }
  // GCC warns that the function may exit without a return value here,
Clang does not
  // the LLVM/Clang codebase has a lot of llvm_unreachables after
fully covered/exiting
  // switches like this to silence GCC's warning. Each of those is,
essentially, a GCC
  // false positive (in the sense that the code is not buggy).
}

We need the two-state CFG to represent maximally and minimally
reachable code. Maximal for -Wunreachable and minimal for 'runtime'
diagnostics.

- David

:-o

Unless I'm missing something, the code is definitely buggy and leads to
undefined behaviour in C++ entering with v = Z + 1. Note that entering
with v = Z + 1, is not per se an undefined behavior: only the missing
return causes that.

Can we at least agree on that?

If we'd agree on that we will easily discover that my proposed change
does not lead to any false positive diagnostics, that GCC is right and
that llvm_unreachable in llvm/clang codebase is a needed and correct
defensive measure.

Can you show an example of those false positives?

$ cat p.c
#include <stdio.h>

enum e { a, b = 4 } x = 3;

void g(int v) {
printf("%d\n", v);
}

int main(int argc, char **argv) {
switch (x) {
case a:
  g(0);
  break;
case b:
  g(1);
  break;
default:
  g(2);
  break;
}
}
$ _clang -Wunreachable-code -Wcovered-switch-default -O2 p.c
p.c:17:3: warning: default label in switch which covers all enumeration
values
    [-Wcovered-switch-default]
default:
^
p.c:18:7: warning: will never be executed [-Wunreachable-code]
  g(2);
    ^
$ ./a.out
2

Of course -Wcovered-switch-default warning is a perfectly true positive.

My reading of the standard is that nothing prevent an enum to have a
value different from listed enum constants if this value is compatible
with enum range (and code generation seems to agree on that).

I've attached the patch for review.

The fixed testcase shows well why to hide warnings about undefined
behaviour in code actually generated is a bad thing.

If we do this, we're going to want this under a CFG option at the very least. The static analyzer should continue assuming that an enum input to a switch is always a valid enum constant, in order to keep our false positive rate down.

Yeah, I doubt this'll be any better in the compiler proper, really.
The heuristic exists to, as you rightly point out, reduce false
positives & that rationale exists in the compiler as well.

While, yes, it means we lose some true positives as well, that
probably isn't worth the increase in false positives.

I can see Abramo's point, however, that in a more defensive coding style the current -Wunreachable warning can easily be considered a false positive. We don't optimize out the default case in an enum-covered switch.

Flagging this as unreachable code is a bug & should be fixed - but
probably in the way I described. Treating it purely as reachable code
& emitting our 'runtime' diagnostics for code in such situations will
(I believe - though I haven't run numbers) increase false positive
rates substantially.

A trivial example that GCC often warns about & Clang deliberately does not:

int func(enum X v) {
  switch (v) {
  case A: return 1;
  case B: return 2;
  ... // fully covered
  case Z: return 26;
  }
  // GCC warns that the function may exit without a return value here,
Clang does not
  // the LLVM/Clang codebase has a lot of llvm_unreachables after
fully covered/exiting
  // switches like this to silence GCC's warning. Each of those is,
essentially, a GCC
  // false positive (in the sense that the code is not buggy).
}

:-o

Unless I'm missing something, the code is definitely buggy and leads to
undefined behaviour in C++ entering with v = Z + 1. Note that entering
with v = Z + 1, is not per se an undefined behavior: only the missing
return causes that.

Can we at least agree on that?

Yes & no. Yes a program exhibits UB if the function is called with v =
Z + 1, no the code is not (necessarily) buggy if the function is never
called with such invalid values.

If code is written in such a way as to not violate that invariant,
then the warning is a false positive (it has not found a bug in the
code). If people often write code with this invariant then the false
positive rate is "high" and the true positive rate is "not so high",
so we try to avoid warning & producing more noise than good advice.
(it's obviously not a 1:1 ratio, and it's certainly a judgement call)

If we'd agree on that we will easily discover that my proposed change
does not lead to any false positive diagnostics, that GCC is right and

Your definition of "false positive" differs from mine/ours. Hopefully
my description above helps describe the motivation here.

that llvm_unreachable in llvm/clang codebase is a needed and correct
defensive measure.

llvm_unreachable isn't a defensive measure - it causes UB if it's
reached in optimized builds. Indeed there's a thread about what Clang
should do by default for "falling off the end" type code. Currently it
traps on non-opt builds and is unreachable (not llvm_unreachable
(which asserts/prints stuff in debug builds) but the unreachable llvm
instruction which is used for optimization) in optimized builds.

- David

Yes, my definition of false positive is different, see:

http://en.wikipedia.org/wiki/Type_I_and_type_II_errors#False_positive_error

"is the default unreachable"? ("is there a wolf near the herd?")

If the message says "warning: will never be executed
[-Wunreachable-code]" ("Wolf, wolf!") then we have a false positive.

The idea that a warning is a false positive if it has not found a bug in
the code is rather weird... do you know any warning that is able to
always find a bug without knowing the programmer intentions?

That apart, of course if we want a compilation switch that deviates from
the standard and says that an enum typed expression cannot have any
value different from enum constants specified we can do that, but really
we want that?

What about enum designed to be used to represent a mask?

$ cat p.c
#include <stdio.h>

enum e { a, b = 4 } x = 3;

void g(int v) {
printf("%d\n", v);
}

int main(int argc, char **argv) {
switch (x) {
case a:
  g(0);
  break;
case b:
  g(1);
  break;
default:
  g(2);
  break;
}
}
$ _clang -Wunreachable-code -Wcovered-switch-default -O2 p.c
p.c:17:3: warning: default label in switch which covers all enumeration
values
    [-Wcovered-switch-default]
default:
^
p.c:18:7: warning: will never be executed [-Wunreachable-code]
  g(2);
    ^
$ ./a.out
2

Of course -Wcovered-switch-default warning is a perfectly true positive.

My reading of the standard is that nothing prevent an enum to have a
value different from listed enum constants if this value is compatible
with enum range (and code generation seems to agree on that).

I've attached the patch for review.

The fixed testcase shows well why to hide warnings about undefined
behaviour in code actually generated is a bad thing.

If we do this, we're going to want this under a CFG option at the very least. The static analyzer should continue assuming that an enum input to a switch is always a valid enum constant, in order to keep our false positive rate down.

Yeah, I doubt this'll be any better in the compiler proper, really.
The heuristic exists to, as you rightly point out, reduce false
positives & that rationale exists in the compiler as well.

While, yes, it means we lose some true positives as well, that
probably isn't worth the increase in false positives.

I can see Abramo's point, however, that in a more defensive coding style the current -Wunreachable warning can easily be considered a false positive. We don't optimize out the default case in an enum-covered switch.

Flagging this as unreachable code is a bug & should be fixed - but
probably in the way I described. Treating it purely as reachable code
& emitting our 'runtime' diagnostics for code in such situations will
(I believe - though I haven't run numbers) increase false positive
rates substantially.

A trivial example that GCC often warns about & Clang deliberately does not:

int func(enum X v) {
  switch (v) {
  case A: return 1;
  case B: return 2;
  ... // fully covered
  case Z: return 26;
  }
  // GCC warns that the function may exit without a return value here,
Clang does not
  // the LLVM/Clang codebase has a lot of llvm_unreachables after
fully covered/exiting
  // switches like this to silence GCC's warning. Each of those is,
essentially, a GCC
  // false positive (in the sense that the code is not buggy).
}

:-o

Unless I'm missing something, the code is definitely buggy and leads to
undefined behaviour in C++ entering with v = Z + 1. Note that entering
with v = Z + 1, is not per se an undefined behavior: only the missing
return causes that.

Can we at least agree on that?

Yes & no. Yes a program exhibits UB if the function is called with v =
Z + 1, no the code is not (necessarily) buggy if the function is never
called with such invalid values.

If code is written in such a way as to not violate that invariant,
then the warning is a false positive (it has not found a bug in the
code). If people often write code with this invariant then the false
positive rate is "high" and the true positive rate is "not so high",
so we try to avoid warning & producing more noise than good advice.
(it's obviously not a 1:1 ratio, and it's certainly a judgement call)

If we'd agree on that we will easily discover that my proposed change
does not lead to any false positive diagnostics, that GCC is right and

Your definition of "false positive" differs from mine/ours. Hopefully
my description above helps describe the motivation here.

Yes, my definition of false positive is different, see:

http://en.wikipedia.org/wiki/Type_I_and_type_II_errors#False_positive_error

"is the default unreachable"? ("is there a wolf near the herd?")

If the message says "warning: will never be executed
[-Wunreachable-code]" ("Wolf, wolf!") then we have a false positive.

The idea that a warning is a false positive if it has not found a bug in
the code is rather weird... do you know any warning that is able to
always find a bug without knowing the programmer intentions?

Generally, no. They make local assumptions so to make a best effort at
finding bugs. Part of that is also to avoid finding non-bugs because
doing so creates a burden on developers that may eventually eclipse
the benefit they gain from the warning.

We're not trying to write warnings to teach people about the semantics
of their code - they can read books about that. They're meant to be
actionable for a good reason, not just to indicate that the developer
read & understood the diagnostic message & then went on their merry
way.

That apart, of course if we want a compilation switch that deviates from
the standard and says that an enum typed expression cannot have any
value different from enum constants specified we can do that, but really
we want that?

Not really, no - the compiler implements the standard. For the purpose
of diagnostics we might be interested in adding an attribute to enum
types or variables that could indicate whether that variable or
variables of that type are intended to contain an/all values in the
representable range (to cause us to do things like diagnose the code
after the switch or in the default of a covered switch as reachable,
etc).

What about enum designed to be used to represent a mask?

They happen, certainly. I'm not at all claiming that people never
deliberately put values outside the enum constants into an enum value.
Simply that the signal/noise in terms of bug finding/actionable/useful
warnings isn't considered (by me right now & by whoever implemented
this particular tweak whenever they implemented it) worthwhile to
assume enums contain all values in their representable range. Other
people might have other opinions (as you do) & I the only real way to
assess the situation is to run the diagnostic on a large codebase &
figure out how many cases would require semantically differing changes
(ie: code fixes) versus simple suppressions (like llvm_unreachable).

From our experience with GCC so far, it seems a fair bit more of the

latter & not much of the former I think.

- David

I'd like to clarify furhter this point.

What I mean is that if I ask to compiler to show me trigraphs uses with
-Wtrigraph I expect that it shows me every use and that it does not show
me the points where I don't use them (as this check can be done without
false positive and without false negative).

Similary if I ask to see portion of functions that are never executed I
expect that it shows me all the points where this is provably true and
that does not show me the points where this is provably false. About the
points where the compiler is not able to prove this property I expect it
will be silent (to reduce the signal/noise ration) or that, if useful,
these points will be shown using a different warning that can be
enabled/disable separately.

Yes, I agree this is the right approach.

$ cat p.c
#include <stdio.h>

enum e { a, b = 4 } x = 3;

void g(int v) {
printf("%d\n", v);
}

int main(int argc, char **argv) {
switch (x) {
case a:
g(0);
break;
case b:
g(1);
break;
default:
g(2);
break;
}
}
$ _clang -Wunreachable-code -Wcovered-switch-default -O2 p.c
p.c:17:3: warning: default label in switch which covers all enumeration
values
[-Wcovered-switch-default]
default:
^
p.c:18:7: warning: will never be executed [-Wunreachable-code]
g(2);
^
$ ./a.out
2

Of course -Wcovered-switch-default warning is a perfectly true positive.

My reading of the standard is that nothing prevent an enum to have a
value different from listed enum constants if this value is compatible
with enum range (and code generation seems to agree on that).

I’ve attached the patch for review.

The fixed testcase shows well why to hide warnings about undefined
behaviour in code actually generated is a bad thing.

If we do this, we’re going to want this under a CFG option at the very least. The static analyzer should continue assuming that an enum input to a switch is always a valid enum constant, in order to keep our false positive rate down.

Yeah, I doubt this’ll be any better in the compiler proper, really.
The heuristic exists to, as you rightly point out, reduce false
positives & that rationale exists in the compiler as well.

While, yes, it means we lose some true positives as well, that
probably isn’t worth the increase in false positives.

I can see Abramo’s point, however, that in a more defensive coding style the current -Wunreachable warning can easily be considered a false positive. We don’t optimize out the default case in an enum-covered switch.

Flagging this as unreachable code is a bug & should be fixed - but
probably in the way I described. Treating it purely as reachable code
& emitting our ‘runtime’ diagnostics for code in such situations will
(I believe - though I haven’t run numbers) increase false positive
rates substantially.

A trivial example that GCC often warns about & Clang deliberately does not:

int func(enum X v) {
switch (v) {
case A: return 1;
case B: return 2;
… // fully covered
case Z: return 26;
}
// GCC warns that the function may exit without a return value here,
Clang does not
// the LLVM/Clang codebase has a lot of llvm_unreachables after
fully covered/exiting
// switches like this to silence GCC’s warning. Each of those is,
essentially, a GCC
// false positive (in the sense that the code is not buggy).
}

:-o

Unless I’m missing something, the code is definitely buggy and leads to
undefined behaviour in C++ entering with v = Z + 1. Note that entering
with v = Z + 1, is not per se an undefined behavior: only the missing
return causes that.

Can we at least agree on that?

Yes & no. Yes a program exhibits UB if the function is called with v =
Z + 1, no the code is not (necessarily) buggy if the function is never
called with such invalid values.

If code is written in such a way as to not violate that invariant,
then the warning is a false positive (it has not found a bug in the
code). If people often write code with this invariant then the false
positive rate is “high” and the true positive rate is “not so high”,
so we try to avoid warning & producing more noise than good advice.
(it’s obviously not a 1:1 ratio, and it’s certainly a judgement call)

If we’d agree on that we will easily discover that my proposed change
does not lead to any false positive diagnostics, that GCC is right and

Your definition of “false positive” differs from mine/ours. Hopefully
my description above helps describe the motivation here.

Yes, my definition of false positive is different, see:

http://en.wikipedia.org/wiki/Type_I_and_type_II_errors#False_positive_error

“is the default unreachable”? (“is there a wolf near the herd?”)

If the message says “warning: will never be executed
[-Wunreachable-code]” (“Wolf, wolf!”) then we have a false positive.

The idea that a warning is a false positive if it has not found a bug in
the code is rather weird… do you know any warning that is able to
always find a bug without knowing the programmer intentions?

Generally, no. They make local assumptions so to make a best effort at
finding bugs. Part of that is also to avoid finding non-bugs because
doing so creates a burden on developers that may eventually eclipse
the benefit they gain from the warning.

We’re not trying to write warnings to teach people about the semantics
of their code - they can read books about that. They’re meant to be
actionable for a good reason, not just to indicate that the developer
read & understood the diagnostic message & then went on their merry
way.

We try to guess people’s intentions all the time. -Wmissing-semi will insert the semicolon and continue compiling as if it had been there. -Wparentheses is a controversial style warning, but it catches real bugs. We don’t warn about tautological comparisons in templates because many times the template itself isn’t tautological, only the instantiation.

When talking about the analyzer I’ve sometimes called behavior like this “pedantically true”—cases where something could happen and we have no way to check it within this translation unit, but it most likely won’t and programmers shouldn’t have to spend time thinking about it. That applies in both directions: we shouldn’t give a -Wreturn-type warning if there’s a fully-covered enum switch, but we shouldn’t call it out as being unreachable either. So I agree with your latest e-mail in the thread.

(The llvm_unreachable does allow for optimization that we wouldn’t get otherwise, so we probably should stop calling it a sop to GCC.)

What about enum designed to be used to represent a mask?

They happen, certainly. I’m not at all claiming that people never
deliberately put values outside the enum constants into an enum value.

This is legal in C++11 and probably in C11 as well, but for what it’s worth, doing a switch on a value whose type is an enum of bit-flags is probably almost always the wrong thing to do.