Handling of empty structs in C

Hi,

I have just helped a colleague track down a bug in a project that mixes C and C++ code. The bug came down to the fact that an empty struct has zero size in C and non-zero size in C++. Since a picture is worth a thousand words, a simplified version of the code is as follows:

header.h

When combining C and C++ files, it is recommended to pass the -Wc++-compat option. It enables diagnostics about incompatibilities between C and C++. For your example, the relevant warning appears:

clang -Wc++-compat -c lib.c -o lib.o

In file included from lib.c:1:
./header.h:5:9: warning: empty struct has size 0 in C, size 1 in C++ [-Wc++-compat]
typedef struct empty { } empty;
        ^
1 warning generated.

But -Wall doesn't enable these warnings.

Thanks,
--Serge
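
To make the mismatch named in that warning concrete, here is a minimal sketch. It is not the code from the project in question: `struct record` and the helper functions are hypothetical names chosen for illustration. Note also that an empty struct is not valid ISO C at all; GCC and Clang accept it as an extension and give it size 0, which is the behavior the warning describes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical declarations standing in for a header shared by C and C++. */
struct empty { };          /* GNU C extension: size 0. In C++: size 1. */

struct record {
    struct empty tag;      /* occupies no space in C, one byte in C++ */
    int value;             /* so its offset depends on the language */
};

/* Each helper reports what the compiling language sees. */
size_t empty_size(void)          { return sizeof(struct empty); }
size_t record_size(void)         { return sizeof(struct record); }
size_t record_value_offset(void) { return offsetof(struct record, value); }

/* Holds in both languages: the struct is the int plus whatever precedes it. */
void check_record_layout(void) {
    assert(record_size() == record_value_offset() + sizeof(int));
}
```

Compiled as C with GCC or Clang on a typical platform with a 4-byte int, one would expect `record_value_offset()` to be 0 and `record_size()` to be 4; compiled as C++, the expected values are 4 and 8. Code on either side of the language boundary would then read `value` from a different address.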

Whenever you think "why doesn't the compiler warn here?!", create a test case (as you did) and try compiling it with -Weverything, which enables literally all warnings.

Cheers,

IMO extern "C" is a clear indication that the user wants to interoperate between C and C++. clang should have this warning on by default in extern "C" contexts.

The problem is that the compiler only sees the extern "C" when compiling in C++ mode, but the author of the header most likely compiles predominantly in C mode.

It would be nice to have an annotation that could be added to a header that would let the compiler (or some tool) automatically check that it would work if compiled in either C or C++ mode.

To check properly, you'd need to parse it as both C and C++ and do the full range of checks, ideally also checking that symbols have the same linkage in both versions.

For a simple example of why this is difficult, in FreeBSD libc (and many other libc implementations) we have __BEGIN_DECLS and __END_DECLS macros. These expand to extern "C" { and } respectively in C++ mode, or to nothing in C mode. Just looking for extern "C", even inside an #ifdef, won't help you.
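
As a rough sketch of that pattern (not the actual FreeBSD definitions), the macro pair might look like the following. The `DEMO_` names and `demo_add` are hypothetical, used here to avoid redefining the reserved `__BEGIN_DECLS` and `__END_DECLS` identifiers.

```c
/* Hypothetical stand-ins for the __BEGIN_DECLS / __END_DECLS pair. */
#ifdef __cplusplus
#define DEMO_BEGIN_DECLS extern "C" {
#define DEMO_END_DECLS   }
#else
#define DEMO_BEGIN_DECLS   /* expands to nothing in C */
#define DEMO_END_DECLS
#endif

/* What a header using the macros looks like: no literal extern "C"
   appears anywhere near the declarations themselves. */
DEMO_BEGIN_DECLS
int demo_add(int a, int b);
DEMO_END_DECLS

/* Definition, as it would appear in the C source file. */
int demo_add(int a, int b) { return a + b; }
```

A tool that searches the header text for `extern "C"` finds it only inside the macro definition, which usually lives in a different file from the declarations it wraps.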

I'd like to see such a tool, and ideally one that would let you specify what language dialects you expect a header to be valid for. For example:

#pragma clang valid_languages c99, c89, gnu99, gnu++99

Although ideally it would be something other than a pragma, so that we could have __VALID_C and __VALID_CXX macros that expand to something different depending on the compiler and the language dialects that we care about.

The tool would then check trivial compatibility (does this file parse as valid code in all of these dialects), but also less trivial properties, including:

- Are all C structs declared in it POD types in C++?
- Are all C functions extern "C" linkage in C++?
- ...?

David

>> IMO extern "C" is a clear indication that the user wants to interoperate
>> between C and C++. clang should have this warning on by default in extern
>> "C" contexts.
>
> The problem is that the compiler only sees the extern "C" when compiling
> in C++ mode, but the author of the header most likely compiles
> predominantly in C mode.

Sure, but it'll help the end user to see a warning as soon as they compile
in C++ mode; that seems like a big step up from runtime problems. I'm not
sure how much the added complexity of a pragma, etc., would add over this.

> It would be nice to have an annotation that could be added to a header
> that would let the compiler (or some tool) automatically check that it
> would work if compiled in either C or C++ mode.
>
> To check properly, you'd need to parse it as both C and C++ and do the
> full range of checks, ideally also checking that symbols have the same
> linkage in both versions.
>
> For a simple example of why this is difficult, in FreeBSD libc (and many
> other libc implementations) we have __BEGIN_DECLS and __END_DECLS macros.
> These expand to extern "C" { and } respectively in C++ mode, or to nothing
> in C mode. Just looking for extern "C", even inside an #ifdef, won't help
> you.
>
> I'd like to see such a tool, and ideally one that would let you specify
> what language dialects you expect a header to be valid for. For example:
>
> #pragma clang valid_languages c99, c89, gnu99, gnu++99
>
> Although ideally something other than a pragma, so that we could have
> __VALID_C and __VALID_CXX macros that would expand to something different
> depending on the compiler and the language dialects that we care about.
>
> The tool would then check trivial compatibility (does this file parse as
> valid code in all of these dialects),

Might be a nice tool, sure, but it's still going to be an extra tool, I'd
imagine, not just a flag passed to the compiler. At that point it's
probably just as easy for someone to set up their build system with the few
different configurations they want to ensure correctness for.

>> IMO extern "C" is a clear indication that the user wants to interoperate
>> between C and C++. clang should have this warning on by default in extern
>> "C" contexts.
>
> The problem is that the compiler only sees the extern "C" when compiling
> in C++ mode, but the author of the header most likely compiles
> predominantly in C mode.

I'm honestly not worried about this. If someone is writing a header that
they expect to be usable from language modes X, Y, and Z, and they neither
try building the code in all of those language modes nor tell us they
intend to build the code in those modes (by using -Wc++-compat, for
instance), I think it's reasonable to say that they are beyond our help.

The -Wc++-compat warning we already have is nice, but we should probably
also warn (with an on-by-default warning!) if this occurs in an extern "C"
block in C++.

> It would be nice to have an annotation that could be added to a header
> that would let the compiler (or some tool) automatically check that it
> would work if compiled in either C or C++ mode.
>
> To check properly, you'd need to parse it as both C and C++ and do the
> full range of checks, ideally also checking that symbols have the same
> linkage in both versions.
>
> For a simple example of why this is difficult, in FreeBSD libc (and many
> other libc implementations) we have __BEGIN_DECLS and __END_DECLS macros.
> These expand to extern "C" { and } respectively in C++ mode, or to nothing
> in C mode. Just looking for extern "C", even inside an #ifdef, won't help
> you.
>
> I'd like to see such a tool, and ideally one that would let you specify
> what language dialects you expect a header to be valid for. For example:
>
> #pragma clang valid_languages c99, c89, gnu99, gnu++99
>
> Although ideally something other than a pragma, so that we could have
> __VALID_C and __VALID_CXX macros that would expand to something different
> depending on the compiler and the language dialects that we care about.
>
> The tool would then check trivial compatibility (does this file parse as
> valid code in all of these dialects), but also less trivial properties,
> including:
>
> - Are all C structs declared in it POD types in C++?

We already check that types passed to / returned from extern "C" functions
are C-compatible (POD is not the right thing to check). It would make sense
to check the zero-sized struct case here too.

> - Are all C functions extern "C" linkage in C++?

We can only do this check if we're building in C++, but if we're building
in C++, this check is not appropriate, because cross-language headers
frequently declare extra or different function signatures when built in C++
(for instance, an overload set of 'abs' instead of a single function). I
think this would have too much noise to be really useful.

An extension to this would be to additionally include this warning if the
compiler comes across a use of the __cplusplus macro in C mode, perhaps?