C++ modules imported in extern "C"

With C++ modules now far enough along to be included in -fmodules, I tried enabling -fmodules on a small C++ program I have lying around. I was impressed that almost all of it worked right away.

Here's the issue I ran into with one header:

glcorearb.h:616:1: error: import of C++ module 'Darwin.C.stddef' appears within extern "C" language linkage specification
#include <stddef.h>
^
glcorearb.h:5:1: note: extern "C" language linkage specification begins here
extern "C" {
^
glcorearb.h:1517:1: error: import of C++ module 'Darwin.C.inttypes' appears within extern "C" language linkage specification
#include <inttypes.h>
^
glcorearb.h:5:1: note: extern "C" language linkage specification begins here
extern "C" {
^

Even without modules, it seems people should take care not to #include a header in a context that may not be compatible with the content of that header. Here we are building C++, so stddef.h and inttypes.h are C++ headers that might not work in an extern "C" context. Platform owners, who control both the including file and the included file, can legitimately do this, but I don't think it is good practice when including someone else's header. However, the standards don't clearly disallow this, and I have certainly seen circumstances where people recommended extern "C" { #include <system header> } for various reasons.

In this specific case it's not even deliberate as far as I can tell, but it's worked and so no one's bothered to fix it here. A quick hack to remove these #includes from extern "C" solved the issue for me.
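
For illustration, the rearrangement looks roughly like the sketch below. It is not the actual glcorearb.h; the typedef and function names are hypothetical stand-ins, and it only assumes that the generated header needs stddef.h and inttypes.h for its types. The point is that both #includes sit at global scope, before the extern "C" block opens.

```cpp
// Standard headers are included first, at global scope.
#include <stddef.h>
#include <inttypes.h>

#ifdef __cplusplus
extern "C" {
#endif

// Hypothetical typedefs standing in for the generated GL declarations.
typedef ptrdiff_t example_sizeiptr;
typedef int64_t   example_int64;

// Hypothetical helper, defined inline only to keep the sketch self-contained.
static inline example_sizeiptr example_buffer_bytes(example_sizeiptr count,
                                                    example_sizeiptr stride) {
    return count * stride;
}

#ifdef __cplusplus
}  // extern "C"
#endif
```

With this layout the declarations keep C linkage, while the standard headers are imported outside any language linkage specification.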

Anyway, I checked http://clang.llvm.org/docs/Modules.html#modularizing-a-platform but didn't notice any recommendations that apply here. So what should be recommended? Is this an 'anti-pattern' for modules? An anti-pattern for 'well organized' non-module headers? Should modules be able to coexist with these kinds of shenanigans when the code worked fine pre-modules? Should I contact the developers of this header and recommend a change? (It's an automatically generated header from gl3w.)

Thanks,
Seth

Even without modules it seems like people should be taking care not to
#include a header in a context that may not be compatible with the content
of those headers. Here we are building C++ and stddef.h and inttypes.h are
therefore C++ headers that might not work in an extern "C" context.
Platform owners, who are in control of both the including file and the
included file can legitimately do this, but I don't think this is good
practice when including someone else's header. However the standards don't
clearly disallow this

C++ absolutely does disallow it for its own headers (which include
<stddef.h> and <inttypes.h>):

[using.headers]p3: "A translation unit shall include a header only outside
of any external declaration or definition, and shall include the header
lexically before the first reference in that translation unit to any of the
entities declared in that header. No diagnostic is required."

and I have certainly seen circumstances where people recommended extern "C"
{ #include <system header> } for various reasons.

In this specific case it's not even deliberate as far as I can tell, but
it's worked and so no one's bothered to fix it here. A quick hack to remove
these #includes from extern "C" solved the issue for me.

Anyway, I checked
http://clang.llvm.org/docs/Modules.html#modularizing-a-platform but
didn't notice any recommendations that apply here. So what should be
recommended? Is this an 'anti-pattern' for modules? An anti-pattern for
'well organized' non-module headers? Should modules be able to coexist with
these kinds of shenanigans when the code worked fine pre-modules? Should I
contact the developers of this header and recommend a change? (It's an
automatically generated header from gl3w.)

See http://clang.llvm.org/docs/Modules.html#module-declaration and in
particular the description of the [extern_c] attribute. By default, a
module that can be used in C++ can only be used at global scope (outside of
all namespaces and language linkage specifications). This attribute says
that the module provides a C interface to C++ code, and thus:
1) The module is built surrounded by an implicit extern "C" context, and
2) The module can be imported within a top-level extern "C" linkage
specification.

This attribute should typically be applied to modules whose headers contain
the

#ifdef __cplusplus
extern "C" {
#endif
// ...
#ifdef __cplusplus
}
#endif

pattern. Depending on your platform, it may or may not make sense to use
the [extern_c] attribute on your libc module; Darwin's doesn't (probably because its
module map predates this attribute) but my glibc module map does.
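
As a rough sketch of what that looks like in a module map (the module name ThatLibrary and the header name that_library.h are hypothetical, chosen only for illustration):

```
// Hypothetical module map entry for a C library usable from C++.
// [extern_c] asks Clang to build the module inside an implicit
// extern "C" context and allows importing it from within a
// top-level extern "C" block.
module ThatLibrary [extern_c] {
  header "that_library.h"
  export *
}
```

Whether this is appropriate for a given library depends on its headers; a library that offers a richer interface when included from C++ would presumably not want the attribute.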

Oh, excellent. That means I can report it to the header’s devs as a bug with a nice standards quote. Thanks.

Would it make sense for the compiler, in the interest of backwards compatibility, to just go ahead and import a module ignoring the fact that it’s imported in an extern "C" context? Since it’s a module the correct interface is already known, including whether the declarations are extern "C" or extern "C++". Having this error downgraded to a warning would make adopting modules that much easier (although I’m glad to have another “no diagnostic required” error become easily detectable).

Thanks,
Seth

That breaks things. Suppose you have a C header that does *not* include the
above conditional 'extern "C" {' shenanigans. From C++ code, you'd use that
library like this:

extern "C" {
  #include "that_library.h"
}

If you supply a module map for that library and you add neither 'requires
!cplusplus' nor '[extern_c]', *and* we follow your suggestion, then we'd
build that_library.h as a non-extern-C C++ header file, which will lead to
weird link errors.

And we can't link the implicit "extern "C" { ... }" to the 'requires' line
because there are also libraries that provide a richer interface when
included from C++ (they don't just wrap everything in extern "C"
themselves). The best option here seems to be to require module maps to
explicitly specify this property.
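
A minimal single-file sketch of the linkage issue described above, assuming a hypothetical function that_library_add declared in that_library.h. The extern "C" wrapper on the declaration is what makes the C++ caller refer to the same unmangled symbol that the C library defines.

```cpp
// Declarations as they would appear in the hypothetical that_library.h,
// wrapped by the includer because the header has no __cplusplus guard.
extern "C" {
int that_library_add(int a, int b);
}

// Stand-in for the library's separately compiled C code: the definition
// has C linkage, so its symbol name is not mangled.
extern "C" int that_library_add(int a, int b) { return a + b; }

// C++ caller. If the declaration above had C++ linkage instead, and the
// definition lived in a separate C object file, this call would refer to
// a mangled symbol that the library never defines, and linking would fail.
int call_that_library() { return that_library_add(2, 3); }
```

Building the header as a plain C++ module would correspond to dropping the wrapper from the declaration, which is where the link errors would come from.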

Doesn’t providing such a module map constitute an incorrect assertion that the header works as a C++ module, and so an error there would be okay?

I think a more common situation will be people just flipping on -fmodules without providing module maps for all their headers, so for them extern "C" { #include "that_library.h" } will work just fine. And if they do get around to writing their own module maps they should correctly mark C modules as C.

Of course if my system’s module map used the extern_c attribute there’d be no problem either way, so I agree that’s the real solution. I’ll file a bug with Apple for this as well.

And here’s the bug I filed with Khronos for their header: https://www.khronos.org/bugzilla/show_bug.cgi?id=1236

Doesn't providing such a module map constitute an incorrect assertion that
the header works as a C++ module, and so an error there would be okay?

An error is OK, so long as it's easy to diagnose and understand. We can do
a lot better if we catch this during compilation than during linking (or
worse, during dynamic linking or at dlopen time).

If a module had to list its supported languages somehow, I think that would
be a completely reasonable position. As it stands, we have an opt-out
system, not an opt-in one; a module would need to say "requires !cplusplus,
!cuda, !objc, !opencl, ..." to explicitly opt out of all the language modes
where it doesn't work or hasn't been tested. C++ is arguably a special case
here (because it introduces an ABI difference by default), but I still
don't think it's reasonable to expect every C module author to write
"requires !cplusplus"; the current rule prompts them to write [extern_c] in
those cases where it's necessary to do so.

That said, I think we should extend the diagnostic to point out that the
error can be resolved by adding [extern_c] to the module map.

I think a more common situation will be people just flipping on -fmodules
without providing module maps for all their headers, so for them extern "C"
{ #include "that_library.h" } will work just fine. And if they do get
around to writing their own module maps they should correctly mark C
modules as C.

I think the error has a lot of value in this situation: if they forget to
mark their modules as C, it stops the build from accidentally generating
wrong code. (And [extern_c] is the way to mark a module as
C-but-usable-from-C++.)

Of course if my system's module map used the extern_c attribute there'd be
no problem either way, so I agree that's the real solution. I'll file a bug
with Apple for this as well.

And here's the bug I filed with Khronos for their header:
https://www.khronos.org/bugzilla/show_bug.cgi?id=1236

Thank you for filing both of those bugs!