Consider the attached testcase. When built with Clang from SVN tip, it
produces the following warning:
clang_testcase.c:21:11: warning: comparison of unsigned enum expression < 0 is always false [-Wtautological-compare]
  if (test < 0)
      ~~~~ ^ ~
Could you provide the full source & command line used? My attempts to
reproduce this don't seem to be consistent with your claim. I'm only
getting that warning if I explicitly specify the backing type of the
enum as unsigned:
dblaikie@mediabox:~/Development/scratch$ cat test.cpp
enum foo { X, Y };
int bar(foo f) {
  if (f < 0)
    return 3;
  return 7;
}
int main() {
}
dblaikie@mediabox:~/Development/scratch$ clang++ -Wtautological-compare test.cpp
dblaikie@mediabox:~/Development/scratch$ clang++ -std=c++0x -Wtautological-compare test.cpp
dblaikie@mediabox:~/Development/scratch$ cat test2.cpp
enum foo: unsigned { X, Y };
int bar(foo f) {
  if (f < 0)
    return 3;
  return 7;
}
int main() {
}
dblaikie@mediabox:~/Development/scratch$ clang++ -std=c++0x -Wtautological-compare test2.cpp
test2.cpp:4:9: warning: comparison of unsigned enum expression < 0 is always false [-Wtautological-compare]
  if (f < 0)
      ~ ^ ~
1 warning generated.
What's also true is that compilers may opt to invariably make enums
signed. While I'm not aware that any compiler does so, I'm also not
aware that they all don't, and I sincerely doubt that the optimisation
GCC and Clang perform here to maximize the number of values that can
be represented is all that useful in practice, so I can certainly
imagine compiler vendors not bothering with it.
I don't think this is necessarily an optimization, as such - either
way the compiler would have to be able to flip between different
representations based on the explicit enum values specified (eg: if
you specify -1 as an enum value it's going to have to use a signed
type, yet if you used the max unsigned int value as a value in another
enum it'd have to be able to choose unsigned int (or signed long - in
which case apply the same problem to max unsigned long long, etc). The
'optimization' is then just a question of which type it uses by
default - if your values are all within the range of both unsigned int
and int, either is equally valid & there's no cost (that I know of) to
choosing one over the other. So not really an optimization.)
Anyway, the above test could be perfectly valid in the case where some
compiler's enum representation is signed.
I haven't checked the spec but I'll take your word for this. Seems
possible/reasonable.
That isn't true of Clang or
GCC, but I feel that Clang should aim to both follow the standard to
the letter and follow it in spirit, allowing people to write portable
code.
At which point I sort of question the purpose of the code testing for
enum > 0, really. (details to follow)
If you find this scenario implausible, consider that this
actually happened on Postgres, due to the fact that we checked that a
certain enum variable's value could be safely used as an array
subscript within our client library, libpq.
Could you point to some source - I'm curious about what exactly is
going on here. Are you checking a specific enum value passed to your
API? If so it could be outside the range of actual enum values anyway,
couldn't it? & knowing that the one enum value they passed is within
the range of the array doesn't necessarily tell you anything about
whether the entire range of enums is within the range, does it? So
what do you do with one enum value within the range, but others not?
We weren't about to trust
arbitrary client code to not pass an invalid value.
Shouldn't you be validating this against the range of enum values (eg:
assert(first_enum <= val && val <= last_enum)) ?
I am aware that using the constant literal in place of 0 prevents the
warning from being seen.
By constant literal you mean the first enum value, rather than the
literal 0? That seems like a different check, semantically.
If you want to support this scenario it sounds like there's two steps:
1) validating that a user specified enum value is within the range of
acceptable values (assuming the spec requires that unspecified enum
values are at least contiguous, I haven't checked that detail myself)
- as above
2) validating that the values of an enum without a specified backing
type are usable as array indices - wouldn't this be better achieved
simply by taking "value - first_enum"? That way it would work no
matter what range the compiler decided to use for your enum values.
(if the C standard requires that unspecified enum values must start
from 0 then this step is unnecessary)
Nonetheless, I thought that it was worth
considering if this is a case that we could handle better.
My best guess, based on experiment, is that it's already handling this
better, though I'm not sure it's even necessary. I imagine it must be
handled this way deliberately, perhaps for the reasons you cited,
though I'm not sure I agree that they make it necessary.
Incidentally, Clang now builds Postgres without any warnings, other
than a warning that we get with GCC too and can't really do anything
about. I've blogged about it. Impressively, Clang caught a bug that,
if a certain struct had been used a certain way that we'd generally
consider to be perfectly reasonable, would have resulted in undefined
behaviour. Luckily or unluckily, it actually wasn't being used that
way, but that could have easily changed over time.
Did you blog about this particular issue? I'd be curious to see what
construct it found.
- David