Modules: redefinition errors from explicit submodules

(+cfe-dev & others for extra input)

Hi Richard,

Unlike C++/ObjC++, we don't currently merge struct (and others)
definitions in C/ObjC. This is fine since we are not even talking
about local submodules visibility. However, consider the following:

- Consider module F, and its submodule F.Private.
- The submodule is marked explicit and should only contain private
headers that won't be visible unless by explicit include or @import
Module.Private.
- A private header NS.h defines 'struct NS {...}'
- A TU x.c also defines 'struct NS {...}' but it never includes NS.h.
The private module is built though, since module.private.modulemap
builds any submodule of what has been previously defined in
module.modulemap
- BOOM -> error: redefinition of 'NS'

Turns out that this example does not error in C++ because we merge
this definition if one of the candidates is hidden, but only in C++.
Shouldn't we behave the same in C/ObjC for the case where: (a) the
definition comes from a module marked 'explicit' and (b) there's no
@import / direct include in the TU at the point of re-definition?

Am I missing something or is it just that we never really implemented
this? What's your general opinion on this?

Here's the reduced testcase:

== F.framework/Headers/F.h
// F.h
== F.framework/Modules/module.modulemap
framework module F [extern_c] [system] {
  umbrella header "F.h"
  module * {
      export *
  }
  export *
}
== F.framework/Modules/module.private.modulemap
explicit module F.Private [system] {
  explicit module NS {
      header "NS.h"
      export *
  }
  export *
}
== F.framework/PrivateHeaders/NS.h
struct NS {
  int a;
  int b;
};
== x.c
#include "F/F.h"
//@import F.Private;
struct NS {
  int a;
  int b;
};

$ clang x.c -c -fmodules -F`pwd`

(+cfe-dev & others for extra input)

Hi Richard,

Unlike C++/ObjC++, we don't currently merge struct (and others)
definitions in C/ObjC. This is fine since we are not even talking
about local submodules visibility. However, consider the following:

- Consider module F, and its submodule F.Private.
- The submodule is marked explicit and should only contain private
headers that won't be visible unless by explicit include or @import
Module.Private.
- A private header NS.h defines 'struct NS {...}'
- A TU x.c also defines 'struct NS {...}' but it never includes NS.h.
The private module is built though, since module.private.modulemap
builds any submodule of what has been previously defined in
module.modulemap
- BOOM -> error: redefinition of 'NS'

Turns out that this example does not error in C++ because we merge
this definition if one of the candidates is hidden, but only in C++.
Shouldn't we behave the same in C/ObjC for the case where: (a) the
definition comes from a module marked 'explicit' and (b) there's no
@import / direct include in the TU at the point of re-definition?

Am I missing something or is it just that we never really implemented
this? What's your general opinion on this?

The testcase should be made to work, one way or another. In C++, for
various reasons we have a strong requirement that there is at most one
definition per CXXRecordDecl, and we achieve that in this case by detecting
that we're about to parse a second definition and instead skipping the
tokens of the new definition and making the existing definition visible.

I didn't implement the same thing in C because it didn't seem in the spirit
of the C linkage rules for types; C does not have a direct analogue of
C++'s ODR, and instead has a form of structural typing for
cross-translation-unit linking of types. My opinion is that the correct way
to handle this is to treat declarations of tag types with the same name in
different modules as being different entities in C, and to extend Clang's
notion of "compatible type" to treat tag types from two different modules
as being compatible types if they have the same name and pass the
structural compatibility check described in C11 6.2.7/1. If we want to
allow the definition of 'struct NS' in F.Private to be different from that
in x.c, I think that's the most sensible way to approach the problem.

However, we could alternatively say that a rule like C++'s ODR applies to C
modules, and use a strategy similar to the one we use in C++ (such as
ignoring the definition of 'struct NS' in x.c and substituting the one from
NS.h). I think you folks have more experience with C and ObjC modules than
anyone else, so if you think something like that is viable, I'm OK with
that too.

Thanks for the feedback Richard. In the light of your comments, we
discussed this a bit more and decide to try a mixed approach: ODR-like
semantics in the sense of not allowing more than one entity to be
present, but allowing the redefinition of tags types only in case they
pass the structural compatibility check described in C11 6.2.7/1. Let
me know yours thoughts; the two patches that accomplishes it can be
found in:

https://reviews.llvm.org/D31777
https://reviews.llvm.org/D31778