-Warray-bounds seems over-zealous on Clang

Gentlemen, this is Clang; we can differentiate between '1' as a literal
value and '1' resulting from a template- or macro-metaprogrammed
computation. The former case has no legitimate examples and can
be safely assumed to be a pre-C99 flexible array member. The latter
case should still be diagnosed using -Warray-bounds.

John.

+1.

The point of a warning is to be useful, not abusive. If going to extreme measures makes the warning more useful and causes it to be more useful by default, then I think it is worth it.

Clang needs to be tolerant of different programming styles and constraints. While it is never possible to make everyone happy, disabling the warning in a very-likely false positive, or splitting that case out into a sub -W flag is the right way to go.

-Warray-bounds-single-element seems like overkill to me, but is completely feasible.

-Chris

In the past, warnings that warns about a common pattern but also find
bugs have been kept on-by-default. Search this list for
-Wdelete-non-virtual-dtor.

I don't have time to review Clang's exact behavior there right now,
but my guess is that this is a false equivalency. Code that causes
that warning at the class definition (which I think is what you allude
to - no delete is required) is inherently wrong according to the C++
intelligentsia, unless the base dtor is declared protected per an
idiom described by Herb Sutter. No one would ever encourage virtual
functions with a non-virtual public dtor (some styleguides go so far
as to require all dtors be virtual, which even the std lib violates),
whereas what I describe is in various text books.

This bug is in ICU, which is pretty commonly used. Someone (me) posted
an example where this warning found a bug within 2h of Chris asking
for it, so maybe it's not as vanishingly small as you think.

That bug was really obvious, and was a narrow case of someone misusing
a struct that used this idiom. I doubt the developer of that app would
thank you for your attitude - you just saved them this one bug, but
you probably created a whole bunch of spurious warnings at other parts
of their codebase in the process. Why should they even think that it's
valid in this one place anyway, where it is only valid by dumb luck?
They weren't clued-in enough to catch a really obvious bug in the
first place.

A small meta comment:
Your complaints would be less annyoing to read if you'd
stop using cheap rhetorical tricks like displaying your use case
as "sane", "reasonable", representing the "vast majority", etc.;
and others as "paranoid", "burdened with passing additional compiler flags"
(WOW! that's something I have to remember as a nifty excuse for the next time the
delivery date is missed! Sorry, I was so burdened passing all the additional
compiler flags, really, I couldn't make it!
sorry, I'm using cheap rhetorical tricks ;-),
"gross", "unhygienic" etc.

But, alas!, speaking as a "paranoid avionics engineer", being expected
to happily carry the compiler flag burden home with me, I can only recommend being
compliant with latest language standards, in this case C99.
Shouldn't be too bad for postgresql to target a now
already aging standard.
Also the software I'm working on suffered some from some non-standard
extensions that we took as granted, but were marked as an error or were
given a warning by clang.

So, I did not see an argument why using the "unhygienic" suggestion
of C99 variable length arrays isn't viable; if MS cl.exe does not support
this (which I don't know), I'd see this as an insufficient standards
implementation and regard this as the problem of this compiler, not of clang.
In this case, I'd recommend to use the "gross" suggestion of Reid Kleckner.
This is analogous to what e.g. boost does for portability to older non
compliant compilers, and in most cases I wouldn't attribute the boost libs
with any of the fancy negative words you brought up so far.
And I'd regard the fact that this expression of the idiom
is - as Reid pointed out - self-documenting as adhering to very high hygienic
standards.

Regards
Titus

Please stop putting words in my mouth. I never called anything
unhygienic. Any of the adjectives I used were my opinion. I really
don't see how you could consider that a cheap rhetorical trick. If
you'd like me to demonstrate that I represent a large segment of
people by taking issue with the way Clang currently deals with this,
I'd be happy to. I consider that unnecessary though, as it has already
been acknowledged by Chris and Chandler.

It would be nice if we could move to C99; I'd be the first to support
such a move, if it was practical. It is not practical for highly
portable software such as PostgreSQL to move to C99, probably for more
reasons than MSVC compatibility, although we do already use some
widely supported C99 features. Uptake of C99 just hasn't been what I'm
sure we'd all like it to be. Pretending that this isn't the case isn't
going to help C99's cause, or Clang's.

I think someone mentioned the idea of adding a separate warning for this
narrower case and, in the case of compiling for c99, suggesting a fixup of
using a flexible array. In the case of c89 at least it’d be a separate
warning that, if the pattern is used extensively in the codebase, could be
disabled separately from the more general warning (which would be adjusted
to ignore these cases).

This would allow for the bug that was mentioned to have been found, while
allowing you to disable this warning if you don’t think the chance of such
bugs coming up is worth the code hygiene cost of the above construct or the
move to c99 would be worthwhile in your case.

While strictly speaking that would be a valid solution, it would also
be overly-complex considering the benefits IMHO. I’d now have a flag
to pass to Clang, but only to Clang, despite the fact that it’s
supposed to be a drop-in replacement for GCC. I’d now have to
explicitly indicate that I was using C89.

If you are compiling c89, that doesn’t seem onerous - perhaps one day GCC could get smarter & start warning you about this or other superseded constructs if you continue to compile as c99 but avoid c99 features.

So, options that seem to be being discussed include:

  1. suppress this warning in all cases where the array is of length 1 and the last element in a struct
    1.1) refinement: only when the length is specified explicitly and not via macro expansion, etc. (as John suggested)
    1.2) refinement: under c99 recommend a fixup to use flexible arrays
  2. split the warning in two, the second being the cases suppressed by the above option (probably less interesting if 1.1 is implemented)

Once again, the burden would be on application developers who are just
using a demonstrably mainstream idiom,

Mainstream in c89 with better alternatives in c99.

(1) and (2) aren’t mutually exclusive. (2) is still useful when the heuristics implied by 1.1 and 1.2 aren’t good enough.

Hello

So, options that seem to be being discussed include:

  1. suppress this warning in all cases where the array is of length 1 and the last element in a struct
    1.1) refinement: only when the length is specified explicitly and not via macro expansion, etc. (as John suggested)
    1.2) refinement: under c99 recommend a fixup to use flexible arrays
  2. split the warning in two, the second being the cases suppressed by the above option (probably less interesting if 1.1 is implemented)

(1) and (2) aren’t mutually exclusive. (2) is still useful when the heuristics implied by 1.1 and 1.2 aren’t good enough.

I’ve tried to write such a patch, as it seemed simple, but I’m stuck because I don’t know enough
about internal clang’s APIs yet.
Until now, I’ve come up with the simple patch attached, that disables the warning if
the array is declared inside a record type and the size is one.
Questions:

  • From the NamedDecl* object representing the array declaration, how do I know if it’s declared last in the struct?
  • From the NamedDecl*, how do I know if the size of the member declaration comes from a macro expansion?

Thanks,
Nicola

pgsql_warning.patch (698 Bytes)

So, options that seem to be being discussed include:

  1. suppress this warning in all cases where the array is of length 1 and the last element in a struct
    1.1) refinement: only when the length is specified explicitly and not via macro expansion, etc. (as John suggested)
    1.2) refinement: under c99 recommend a fixup to use flexible arrays
  2. split the warning in two, the second being the cases suppressed by the above option (probably less interesting if 1.1 is implemented)

(1) and (2) aren’t mutually exclusive. (2) is still useful when the heuristics implied by 1.1 and 1.2 aren’t good enough.

I’ve tried to write such a patch, as it seemed simple, but I’m stuck because I don’t know enough
about internal clang’s APIs yet.
Until now, I’ve come up with the simple patch attached, that disables the warning if
the array is declared inside a record type and the size is one.
Questions:

  • From the NamedDecl* object representing the array declaration, how do I know if it’s declared last in the struct?

It should be a FieldDecl, and its parent should be a RecordDecl. I don’t think there’s a cleaner solution than just iterating through the fields of the record and complaining if it’s not the last one.

  • From the NamedDecl*, how do I know if the size of the member declaration comes from a macro expansion?

The exception should apply to:

  • a field of a struct, where
  • the field’s TypeLoc is a (possibly parenthesized) ConstantArrayTypeLoc and
  • the size expression in that TypeLoc is a (possibly parenthesized) IntegerLiteral and
  • the source location of that literal is not a macro ID.

John.

Oh, of course there’s getNextDeclInContext(); you could just walk those (until you get a null pointer back) and make sure there aren’t any FieldDecls.

John.

Thanks for your efforts. I wish I was familiar enough with the Clang
code base to help.

The exception should apply to:
  - a field of a struct, where

ok

  - the field's TypeLoc is a (possibly parenthesized) ConstantArrayTypeLoc and

ok, do you mean that getTypeLocClass() == ConstantArray?

  - the size expression in that TypeLoc is a (possibly parenthesized) IntegerLiteral and

where do I get the size expression from the typeloc?

  - the source location of that literal is not a macro ID.

John.

Nicola

- the field's TypeLoc is a (possibly parenthesized) ConstantArrayTypeLoc and

ok, do you mean that getTypeLocClass() == ConstantArray?

isa<ConstantArrayTypeLoc>, please, but that's the idea.

- the size expression in that TypeLoc is a (possibly parenthesized) IntegerLiteral and

where do I get the size expression from the typeloc?

Look at the methods on ArrayTypeLoc.

John.

So, options that seem to be being discussed include:

  1. suppress this warning in all cases where the array is of length 1 and the last element in a struct
    1.1) refinement: only when the length is specified explicitly and not via macro expansion, etc. (as John suggested)
    1.2) refinement: under c99 recommend a fixup to use flexible arrays
  2. split the warning in two, the second being the cases suppressed by the above option (probably less interesting if 1.1 is implemented)

(1) and (2) aren’t mutually exclusive. (2) is still useful when the heuristics implied by 1.1 and 1.2 aren’t good enough.

I’ve tried to write such a patch, as it seemed simple, but I’m stuck because I don’t know enough
about internal clang’s APIs yet.
Until now, I’ve come up with the simple patch attached, that disables the warning if
the array is declared inside a record type and the size is one.
Questions:

  • From the NamedDecl* object representing the array declaration, how do I know if it’s declared last in the struct?

It should be a FieldDecl, and its parent should be a RecordDecl. I don’t think there’s a cleaner solution than just iterating through the fields of the record and complaining if it’s not the last one.

  • From the NamedDecl*, how do I know if the size of the member declaration comes from a macro expansion?

The exception should apply to:

  • a field of a struct, where
  • the field’s TypeLoc is a (possibly parenthesized) ConstantArrayTypeLoc and
  • the size expression in that TypeLoc is a (possibly parenthesized) IntegerLiteral and
  • the source location of that literal is not a macro ID.

Hello.

I’ve attached the patch. It seems to work as intended. The only thing missing is to split up
the warning. How to do it?

Also what do you mean by “possibly parenthesized” ConstantArrayTypeLoc and IntegerLiteral?

John.

Nicola

pgsql_warning.patch (3.65 KB)

While I thank you for your efforts, I don't see why we have to
explicitly use C89 to avoid the warning, since we're only avoiding the
warning under circumstances exactly consistent with the use of the
idiom.

What's valid in C89 is valid in C99, with very few exceptions. Just
because this is arguably bad style in C99 does not mean that we should
see a warning - warnings are supposed to prevent downright errors that
are still valid code. No one would seriously suggest that we should
have warnings when C style casts are used in C++ on non-POD types,
even though they're extremely hazardous there. Precedent matters.

Consider that in PostgreSQL, a large C application with almost 1000 C
files, we only see this warning 4 or 5 times. The sorts of errors we
hope to protect people from with this warning are very unlikely to be
statically detectable.

While I thank you for your efforts, I don’t see why we have to
explicitly use C89 to avoid the warning, since we’re only avoiding the
warning under circumstances exactly consistent with the use of the
idiom.

What’s valid in C89 is valid in C99, with very few exceptions. Just
because this is arguably bad style in C99 does not mean that we should
see a warning - warnings are supposed to prevent downright errors that
are still valid code. No one would seriously suggest that we should
have warnings when C style casts are used in C++ on non-POD types,
even though they’re extremely hazardous there. Precedent matters.

Hm, I just tried to implement what seemed to be the general accepted solution
to your problem.
That said, if I understand your point, you want to develop a huge crossplatform
codebase written in C, but:

  • You can’t fully use C99 because some compilers won’t support it (MSVC).
  • So your code is C89, but you can force C89 compliance because you rely
    on some widely supported C99 feature (I suppose these are trivial things like
    mixing declarations and code or //-style comments, or the scope of the
    for() initializers, what else?)
  • You want to support a lot of platforms and compilers, but you’re reluctant
    to tune your build system for specific compilers
  • You don’t want to deal with compilers differences with macro tricks because
    of “code hygiene”
  • You want your code to compile warning-clean on all the compilers you support,
    but without disabling warnings from the command line.

Clang or not, I don’t think these requirements are generally easy to fulfill,
you need to make a choice.

Anyway, this patch suppress the warning also in -std=gnu89 mode, which
is the default for gcc, so you could simply pass -std=gnu89 to both compilers
and you’ll have true drop-in compatibility.


Peter Geoghegan http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services

Bye,

Nicola

P.S. this was just an exercise to me, so reviews for the patch are welcomed anyway
P.P.S. clang accepts flexibles arrays also with -std=c89 -pedantic. Is it compliant?

So, is there any chance of getting Nicola's patch committed in some form?

I’m planning to look at this tomorrow.

Thanks.

I urge you to take a pragmatic view when considering when a warning
should still be displayed. I really don't think you'll be doing
affected developers any favours by showing a "consider flexible
arrays" warning when C99 is used - in most cases, given the reality of
how slow the adoption of C99 has been, you'll just be hectoring them
rather than telling them something that they didn't know. I'd have a
lot more sympathy for the view that such a warning should be seen if
C99 wasn't Clang's default way of building C, but it is.

The reality is that most C projects are using C89 with various very
common extensions that made it into C99. I think that Clang should do
what is most useful for the largest number of people, even if that
comes at the expense of an ivory tower notion of correctness.

I'm sorry if some people disliked how I expressed my views on this
issue - I felt strongly that Clang should go a certain way, and so
used correspondingly strong language. I really just want the best
outcome for both Clang and PostgreSQL.

Hi Chandler

I'm planning to look at this tomorrow.

Did you get a chance to look at it?

It will interest some of you to know that I've created a blog post on
my experiences with the Clang community, which have been very
positive. The post is also a general Clang advocacy piece. It says
that it was posted on Saturday, although it wasn't syndicated on
planet Postgresql today and I just started blogging, so it was
effectively made public 3 hours ago:

I think that the yet-to-be-published follow-up blog post, which
explains some work I did in co-operation with a colleague of mine,
Greg Smith, will generate a lot of interest. He is probably the
world's foremost authority on PostgreSQL performance, in part because
he wrote an extremely well received book on the subject.