[Modules TS] Have the file formats been decided?

Hamza_Sood1 · January 16, 2017, 9:20pm

I’ve been looking into Clang’s implementation of the C++ Modules TS to see if there’s anything I can do to help.

From what I understand, a cppm file is essentially treated as one big header file (with handling for a few extra keywords) which is preprocessed and dumped to disk as a pcm file containing a binary representation of the AST. Consumers of the module will end up importing this pcm file as a precompiled header. (If that summary is incorrect then you can stop reading here...)

There are a few problems I ran into with this, which I think are because of this format:
  - Consumers of the module will import the entire implementation of all of the functions in the cppm, which will lead to a lot of duplicated code between object files (and greatly increased compile times).
  - There’s no way to hide declarations that aren’t exported or that are declared as part of the global module.
  - Library developers will have to ship the entire AST for their project if they want users to be able to import it using Modules.
  - Disk usage for large projects with lots of code will be fairly high (not as big of a problem as the others, but still worth a mention).

Is this format decided on? Or is it just an initial test? If it's not yet concrete, then I'd like to propose a slightly different implementation that could potentially solve these problems. While parsing a cppm file, we could construct two ASTs. One containing the entire file as before, and the other consisting of just exported declarations (without their implementations if they aren't inline or templated). The former AST could be used to generate an object file as usual, while the latter could be dumped to disk as a separate interface file (with some kind of special extension). The interface file would essentially serve as a binary "header", containing only what's needed by consumers of the module.

Has anyone got any thoughts on this?

Richard_Smith · January 17, 2017, 7:30pm

I’ve been looking into Clang’s implementation of the C++ Modules TS to see
if there’s anything I can do to help.

From what I understand, a cppm file is essentially treated as one big
header file (with handling for a few extra keywords) which is preprocessed
and dumped to disk as a pcm file containing a binary representation of the
AST. Consumers of the module will end up importing this pcm file as a
precompiled header. (If that summary is incorrect then you can stop reading
here...)

There are a few problems I ran into with this, which I think are because
of this format:
- Consumers of the module will import the entire implementation of all
of the functions in the cppm, which will lead to a lot of duplicated code
between object files (and greatly increased compile times).

Clang's on-disk AST format is read lazily, so the amount of data in the AST
file is not as important as the amount of that data that is actually used
by a particular compilation. You'll only get duplicated code in AST files
if you have duplicated code in module interfaces. The exception is if
multiple modules instantiate the same template with the same arguments, and
they do not depend on each other.

Right now, we will get duplicated code in object files for functions
defined inside the module interface (at least, for those functions that are
used by the current translation unit). That's simply because the
implementation of the Modules TS is incomplete; we are going to add the
facility to generate code for a module interface at some point, and when we
do we will disable the emission of functions defined inside the interface
when compiling any other translation unit. (As an exception, we may still
emit those function definitions when building with optimizations enabled in
order to support inlining, at least when LTO is disabled.)

- There’s no way to hide declarations that aren’t exported or that are
declared as part of the global module.

Can you be more specific about what kind of hiding you want?

We provide a mechanism to prevent these declarations from being visible to
downstream code (Clang doesn't fully support the Modules TS export
semantics yet, but we have long supported a __module_private keyword for
this).

We could avoid emitting some such declarations to the AST file, but note
this isn't as simple as just not emitting definitions into the precompiled
form: an exported template can, for instance depend on the definition of a
non-exported template or constexpr function, so at least some of the
non-exported definitions within the module must be available. Once we have
separate code generation for module interfaces, we can consider supporting
this as an optimization.
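
To illustrate the dependency described above, here is a minimal sketch in ordinary C++ (no Modules TS syntax). The `detail` namespace stands in for non-exported declarations, and the names `detail::pow2` and `Buffer` are hypothetical, chosen only for this example.

```cpp
// Stand-in for a NON-exported constexpr helper inside a module interface.
namespace detail {
constexpr unsigned pow2(unsigned n) {
    return n == 0 ? 1u : 2u * pow2(n - 1);
}
}  // namespace detail

// Stand-in for an EXPORTED template. Every importer that instantiates
// Buffer<N> needs the body of detail::pow2 to compute the array bound,
// so that definition cannot simply be dropped from the precompiled form.
template <unsigned N>
struct Buffer {
    char data[detail::pow2(N)];
};
```

A consumer that writes `Buffer<3>` forces `detail::pow2(3)` to be evaluated at compile time in the consumer's translation unit, which is why at least some non-exported definitions have to remain available.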

- Library developers will have to ship the entire AST for their project
if they want users to be able to import it using Modules.

Clang's module files are explicitly not a distribution format. You are
expected to ship your module interface files, not a precompiled form of
them.

- Disk usage for large projects with lots of code will be fairly high
(not as big of a problem as the others, but still worth a mention).

Is this format decided on? Or is it just an initial test? If it's not yet
concrete, then I'd like to propose a slightly different implementation that
could potentially solve these problems. While parsing a cppm file, we could
construct two ASTs. One containing the entire file as before, and the other
consisting of just exported declarations (without their implementations if
they aren't inline or templated). The former AST could be used to generate
an object file as usual, while the latter could be dumped to disk as a
separate interface file (with some kind of special extension). The
interface file would essentially serve as a binary "header", containing
only what's needed by consumers of the module.

Rather than producing two ASTs, it would be preferable to simply export
less of the AST into the pcm file. As noted above, this optimization is not
yet implemented (along with some of the semantics of the Modules TS).

Has anyone got any thoughts on this?

Even with modules, large codebases will still want to maintain an interface
/ implementation separation discipline, in order to avoid every change to a
low-level library's implementation triggering unnecessary recompilation of
dependent code. (Keep in mind that a change that affects line numbers in a
low-level library could affect the debug information generated for any
transitive dependency, so we can't necessarily bail out of the compilation
if the abstract interface of the module is unchanged.)

This somewhat reduces the impact of your concerns above, but they are still
real and important considerations.

Hamza_Sood1 · January 17, 2017, 10:45pm

Thanks for clarifying parts of the current implementation. I wasn’t sure what’s incomplete and what’s by design.

Rather than producing two ASTs, it would be preferable to simply export less of the AST into the pcm file. As noted above, this optimization is not yet implemented (along with some of the semantics of the Modules TS).

Since it’s currently possible to generate a complete object file from a pcm, I assumed that such an optimisation wouldn't be possible with the current format. In fact a fully optimised pcm is pretty much what I was trying to describe here, but I wasn’t sure if being able to go from pcm -> obj is an essential part of what a pcm is.
Just writing less of the AST to a file is certainly better than producing two ASTs, and I attempted that with my original tests. However I wasn’t able to find anything in ASTWriter that lets you control which parts of the AST are written; all I could get working is producing a second AST from the original (with modifications of course) and passing that through to ASTWriter. Is there an API that I missed?

Clang's module files are explicitly not a distribution format. You are expected to ship your module interface files, not a precompiled form of them.

Would library developers want to ship their module interfaces considering they could potentially contain a lot of code?
Microsoft for example have come up with a distributable binary format so that library developers don’t have to ship their module interface files.

Even with modules, large codebases will still want to maintain an interface / implementation separation discipline, in order to avoid every change to a low-level library's implementation triggering unnecessary recompilation of dependent code. (Keep in mind that a change that affects line numbers in a low-level library could affect the debug information generated for any transitive dependency, so we can't necessarily bail out of the compilation if the abstract interface of the module is unchanged.)

That brings up the question of how a module based build system would look, which I don’t think I’ve seen mentioned anywhere. Should the compiler be in charge by seeking out imported modules based on search paths and automatically building them if needed? Or should it be more like the dependency file generation that occurs with headers, which leaves a tool such as GNU make in charge?

Richard_Smith · January 17, 2017, 11:10pm

Thanks for clarifying parts of the current implementation. I wasn’t sure
what’s incomplete and what’s by design.

> Rather than producing two ASTs, it would be preferable to simply export
> less of the AST into the pcm file. As noted above, this optimization is not
> yet implemented (along with some of the semantics of the Modules TS).

Since it’s currently possible to generate a complete object file from a
pcm, I assumed that such an optimisation wouldn't be possible with the
current format. In fact a fully optimised pcm is pretty much what I was
trying to describe here, but I wasn’t sure if being able to go from pcm ->
obj is an essential part of what a pcm is.

Our pcm format is not immutable; we are free to make such changes if
necessary. One thing that might not be immediately obvious: in a highly
parallel build, it can be beneficial to avoid blocking downstream compiles
on the step that generates object code from a module interface. That is, we
may want to generate a .pcm file without generating object code, and then
later generate the object code from it, to improve build performance. This
doesn't necessarily mean that the .pcm file must contain all function
definitions -- we could generate the object file for the module interface
by re-parsing the .cppm file -- but there's a tradeoff between parallelism
and total CPU time in doing so.

Just writing less of the AST to a file is certainly better than producing
two ASTs, and I attempted that with my original tests. However I wasn’t
able to find anything in ASTWriter that lets you control which parts of
the AST are written; all I could get working is producing a second AST from
the original (with modifications of course) and passing that through to
ASTWriter. Is there an API that I missed?

We have no real support for this yet, but it doesn't seem especially hard
to add the ability to filter during AST emission. The interesting part will
be determining what can be safely filtered out. Example: an exported
template makes a call to a function with unqualified name 'foo'; can we
still discard any non-exported functions named 'foo' in the module
interface? Those functions might be found by ADL.
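
The ADL scenario above can be sketched in plain C++, with a namespace standing in for the module interface. This is only an illustration under assumptions: `lib`, `describe`, `Widget` and `describe_widget` are hypothetical names, and the "exported" or "non-exported" labels are comments, not real Modules TS syntax.

```cpp
#include <string>

namespace lib {

// Stand-in for an exported template. The call to `foo` is unqualified and
// depends on T, so it is resolved at instantiation time, where
// argument-dependent lookup searches the namespaces associated with T.
template <typename T>
std::string describe(const T& value) {
    return foo(value);
}

// Stand-in for an exported type.
struct Widget {
    int id;
};

// Stand-in for a NON-exported helper. Nothing names it explicitly outside
// this namespace, yet ADL finds it whenever describe<Widget> is instantiated.
inline std::string foo(const Widget& w) {
    return "widget#" + std::to_string(w.id);
}

}  // namespace lib

// Consumer-side code: instantiating the template here pulls in lib::foo.
inline std::string describe_widget(int id) {
    return lib::describe(lib::Widget{id});
}
```

If `lib::foo` were filtered out of the emitted AST because it is not exported, the instantiation in `describe_widget` would no longer find it, which is the hazard the question above points at.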

Also note that this affects linkage: even internal, non-exported functions
in the module interface might be called that way, and if so, we need some
way to link the symbol references in those template instantiations to the
code we emitted for the module interface.

Clang's module files are explicitly not a distribution format. You are
expected to ship your module interface files, not a precompiled form of
them.

Would library developers want to ship their module interfaces considering
they could potentially contain a lot of code?
Microsoft for example have come up with a distributable binary format so
that library developers don’t have to ship their module interface files.

Considering that the module interface can, and often will, contain code
that is in some way conditional on the environment (for instance, on the
size of 'int', or on whether certain headers or functions are provided by
the environment, or on certain details of their standard library
implementation -- and so on), it is not clear that Microsoft's approach is
feasible for a non-single-vendor environment. Even trivial concerns such as
whether assert(X) is enabled in an inline function or template in a module
interface require different precompiled module interfaces for the same .cppm
file. At this
point, the idea of a redistributable binary module interface format seems
misguided, but we'll have to see how usage patterns develop and whether
they ever start to make sense.
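
As a small sketch of the kind of environment-dependent code meant here, consider the following ordinary C++ fragment. The names `int_bits`, `checks_enabled` and `checked_div` are hypothetical and exist only for illustration.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>

// Depends on the target: the width of `int` differs between platforms.
constexpr std::size_t int_bits = sizeof(int) * CHAR_BIT;

// Depends on build flags: NDEBUG decides whether assert() expands to a check.
#ifdef NDEBUG
constexpr bool checks_enabled = false;
#else
constexpr bool checks_enabled = true;
#endif

// An inline function of the sort that might live in a module interface.
// Its compiled form differs between builds with and without NDEBUG.
inline int checked_div(int a, int b) {
    assert(b != 0);
    return a / b;
}
```

A precompiled form of this code has already committed to one value of `int_bits` and one setting of `NDEBUG`, so a consumer building with different settings could not safely reuse it.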

Even with modules, large codebases will still want to maintain an
interface / implementation separation discipline, in order to avoid every
change to a low-level library's implementation triggering unnecessary
recompilation of dependent code. (Keep in mind that a change that affects
line numbers in a low-level library could affect the debug information
generated for any transitive dependency, so we can't necessarily bail out
of the compilation if the abstract interface of the module is unchanged.)

That brings up the question of how a module based build system would look,
which I don’t think I’ve seen mentioned anywhere. Should the compiler be in
charge by seeking out imported modules based on search paths and
automatically building them if needed? Or should it be more like the
dependency file generation that occurs with headers, which leaves a tool
such as GNU make in charge?

Historically, Clang's approach has been to provide a mode that requires no
changes to build systems, in order to make transition to modules and
sharing code between a modules build and a non-modules build
straightforward, but that introduces many problems (particularly with
parallel and distributed builds), and with the Modules TS we are already
making a break with the past, so we should simply treat the act of building
a module as a first-class action performed by a build. The compiler should
not become a build system.

This does mean that build systems will need to track interface dependencies
in a way they didn't before (you need to know which module interfaces
should be built before which other module interfaces), and that information
will either need to be provided or detected by the build system. If a build
system wishes to automate this, it would not be dissimilar to the #include
scanning that some existing build systems already perform.
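
The ordering requirement described above amounts to a topological sort of the import graph. Below is a minimal sketch of how a build tool might compute it, assuming it has already collected which module imports which. `ImportGraph`, `build_order` and the helper in `detail` are hypothetical names, not part of Clang or any existing build system.

```cpp
#include <map>
#include <set>
#include <stdexcept>
#include <string>
#include <vector>

// Maps each module name to the names of the modules it imports.
using ImportGraph = std::map<std::string, std::vector<std::string>>;

namespace detail {
// Depth-first visit: emit every import of `name` before `name` itself.
inline void visit(const std::string& name, const ImportGraph& graph,
                  std::set<std::string>& done,
                  std::set<std::string>& in_progress,
                  std::vector<std::string>& order) {
    if (done.count(name)) return;
    if (!in_progress.insert(name).second)
        throw std::runtime_error("import cycle involving " + name);
    auto it = graph.find(name);
    if (it != graph.end())
        for (const auto& dep : it->second)
            visit(dep, graph, done, in_progress, order);
    in_progress.erase(name);
    done.insert(name);
    order.push_back(name);
}
}  // namespace detail

// Returns the modules in an order where each interface appears after
// all of the interfaces it imports. Throws on a cyclic import.
inline std::vector<std::string> build_order(const ImportGraph& graph) {
    std::set<std::string> done, in_progress;
    std::vector<std::string> order;
    for (const auto& entry : graph)
        detail::visit(entry.first, graph, done, in_progress, order);
    return order;
}
```

For a graph where `app` imports `net` and `core`, and `net` imports `core`, this yields `core`, `net`, `app`. A real build system would additionally want to build independent interfaces in parallel, which this sketch does not attempt.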

Barbara_Geller · January 18, 2017, 12:14am

In my perspective the biggest problem with clang modules (and the reason I
stopped investigating it from a buildsystem perspective) is

https://llvm.org/bugs/show_bug.cgi?id=21593

I tried reaching out several times about that issue, but so far I still
don't know how you will solve it.

After that is reported as solvable or solved (is the situation different
today compared to 2014?), I'd like to investigate other buildsystem related
issues with modules in CMake. Unless I'm missing something, that's currently
a waste of time for me given that I'm on linux and I can't see a future for
the current design there. I hope I'm just missing something.

Thanks,

Steve.

r4nt · January 18, 2017, 11:12am

This does mean that build systems will need to track interface
dependencies in a way they didn’t before (you need to know which module
interfaces should be built before which other module interfaces), and that
information will either need to be provided or detected by the build
system. If a build system wishes to automate this, it would not be
dissimilar to the #include scanning that some existing build systems
already perform.

In my perspective the biggest problem with clang modules (and the reason I
stopped investigating it from a buildsystem perspective) is

https://llvm.org/bugs/show_bug.cgi?id=21593

I tried reaching out several times about that issue, but so far I still
don’t know how you will solve it.

After that is reported as solvable or solved (is the situation different
today compared to 2014?), I’d like to investigate other buildsystem related
issues with modules in CMake. Unless I’m missing something, that’s currently
a waste of time for me given that I’m on linux and I can’t see a future for
the current design there. I hope I’m just missing something.

I believe explicit module maps solve your problem, but your bug doesn’t contain enough information to know what you want.
Generally, a lot has changed since 2014. I believe explicit modules fully solve your problem, but on the other hand, I don’t think we’re ready for modular builds via package distributors for quite some time yet.

Hamza_Sood1 · January 18, 2017, 3:54pm

Our pcm format is not immutable; we are free to make such changes if necessary.

That's good to know.

One thing that might not be immediately obvious: in a highly parallel build, it can be beneficial to avoid blocking downstream compiles on the step that generates object code from a module interface. That is, we may want to generate a .pcm file without generating object code, and then later generate the object code from it, to improve build performance. This doesn't necessarily mean that the .pcm file must contain all function definitions -- we could generate the object file for the module interface by re-parsing the .cppm file -- but there's a tradeoff between parallelism and total CPU time in doing so.

Good point. I suggested producing both the precompiled interface and object file simultaneously to avoid having to parse the cppm multiple times, but being able to defer code generation could be useful in some cases.
If we went down the route of storing a fully optimised pcm, I suppose the CPU time wastage wouldn't be huge if the first parse is also optimised (skipping over non-exported function bodies etc.)

Also note that this affects linkage: even internal, non-exported functions in the module interface might be called that way, and if so, we need some way to link the symbol references in those template instantiations to the code we emitted for the module interface.

That's certainly an interesting scenario. Would that even be possible when linking against an optimised library?
On the topic of linkage, does the __module_private keyword you mentioned affect symbol visibility? I.e. Could it be used for names that need module linkage? I wasn't able to find any documentation on it.

At this point, the idea of a redistributable binary module interface format seems misguided, but we'll have to see how usage patterns develop and whether they ever start to make sense.

This may be an equally misguided idea, but how about if Clang had a tool to strip a cppm file of definitions and implementations that aren't needed by a consumer of the module? The resulting file could then be distributed as a portable module interface without distributing more than what's needed. Could that work?

Historically, Clang's approach has been to provide a mode that requires no changes to build systems, in order to make transition to modules and sharing code between a modules build and a non-modules build straightforward, but that introduces many problems (particularly with parallel and distributed builds), and with the Modules TS we are already making a break with the past, so we should simply treat the act of building a module as a first-class action performed by a build. The compiler should not become a build system.

So to clarify, it should be regarded as an error if you try to compile a source file before pre-compiling a module that it depends on? That's the current behaviour, but I was unsure as to whether that's because it just hasn't been implemented yet.

This does mean that build systems will need to track interface dependencies in a way they didn't before (you need to know which module interfaces should be built before which other module interfaces), and that information will either need to be provided or detected by the build system. If a build system wishes to automate this, it would not be dissimilar to the #include scanning that some existing build systems already perform.

In that case, should Clang have an option for generating a makefile fragment for module dependencies similar to how -MMD works for headers?

Stephen_Kelly · January 18, 2017, 9:20pm

Hi Manuel,

I don’t know what “explicit module maps” means.

The bug report links to a mailing list thread. Does that provide more context? IOW: Why make /usr/include/module.modulemap contended by everything that installs into /usr ? Was a solution of uncontended /usr/include/.modulemap considered and discarded? Why?

Thanks,

Steve.

r4nt · January 19, 2017, 7:08am

This does mean that build systems will need to track interface
dependencies in a way they didn’t before (you need to know which module
interfaces should be built before which other module interfaces), and that
information will either need to be provided or detected by the build
system. If a build system wishes to automate this, it would not be
dissimilar to the #include scanning that some existing build systems
already perform.

In my perspective the biggest problem with clang modules (and the reason I
stopped investigating it from a buildsystem perspective) is

https://llvm.org/bugs/show_bug.cgi?id=21593

I believe explicit module maps solve your problem, but your bug doesn’t contain enough information to know what you want.

Hi Manuel,

I don’t know what “explicit module maps” means.

You can give the module maps necessary to clang at the command line instead of relying on clang finding them somewhere.

The bug report links to a mailing list thread. Does that provide more context? IOW: Why make

/usr/include/module.modulemap

contended by everything that installs into /usr ? Was a solution of uncontended

/usr/include/.modulemap

considered and discarded? Why?

You can have /usr/include/.modulemap if you want. Your build system just will need to support that.

You can even put your module map in /usr/lib or wherever else other parts of your library are going to be installed in, as long as build systems will pick up the right flags to compile your library with.

The current implementation of automatic detection of module maps is completely driven by the original Obj-C use cases.
For C++, I don’t expect that we will use that mechanism, much for the reasons you cite. My hope is that we’ll go into a world where we explicitly specify the module maps on the command line (less magic!).

Cheers,
/Manuel

Stephen_Kelly · January 21, 2017, 12:31pm

So,

* We rely on library authors knowing to do that (how would they find out, given that clang documentation doesn’t recommend it?)
* Library authors won’t automatically name things consistently, so we will also get libraries with <name>lib.modulemap and lib<name>.modulemap and doubtless other variations.
* It would be preferable to have the buildsystem hide that detail, if it must exist. That would be possible with cmake usage requirements, but most libraries don’t install cmake config files, and consumers don’t necessarily use cmake anyway. I guess other buildsystems would have a similar issue.

They won’t just ‘pick up’ the right flags by magic right? Or am I missing something?

That doesn’t seem to be what clang documents as the preferred way to use the feature. Eg, “The module.modulemap file is placed alongside the header files themselves” ie, /usr/include/module.modulemap

There would need to be a change in emphasis there at least to change things to your vision. I’m not convinced it can work. It seems to make the feature not suitable for widespread adoption. Maybe/hopefully I’m missing something though.

Thanks,

Steve.

_sean_silva · January 22, 2017, 4:50am

> This does mean that build systems will need to track interface
> dependencies in a way they didn't before (you need to know which module
> interfaces should be built before which other module interfaces), and that
> information will either need to be provided or detected by the build
> system. If a build system wishes to automate this, it would not be
> dissimilar to the #include scanning that some existing build systems
> already perform.

In my perspective the biggest problem with clang modules (and the reason I
stopped investigating it from a buildsystem perspective) is

21593 – Modules design problems for linux/packagers

I tried reaching out several times about that issue, but so far I still
don't know how you will solve it.

After that is reported as solvable or solved (is the situation different
today compared to 2014?), I'd like to investigate other buildsystem related
issues with modules in CMake. Unless I'm missing something, that's currently
a waste of time for me given that I'm on linux and I can't see a future for
the current design there. I hope I'm just missing something.

I believe explicit module maps solve your problem, but your bug doesn't
contain enough information to know what you want.
Generally, a lot has changed since 2014. I believe explicit modules fully
solve your problem, but on the other hand, I don't think we're ready for
modular builds via package distributors for quite some time yet.

Explicit module maps (and the associated notion of explicit build steps to
build the .pcm files) provide the primitive mechanism which enables robust
integration of modules into build systems and all sorts of flexibility.
However, simply having the mechanism available does not directly solve the
issue linked in that bug or many related issues, which are largely social /
historical issues. One such problem is ensuring that C++ code using
modules can interoperate across different build systems.

-- Sean Silva

r4nt · January 22, 2017, 8:30am

The bug report links to a mailing list thread. Does that provide more context? IOW: Why make

/usr/include/module.modulemap

contended by everything that installs into /usr ? Was a solution of uncontended

/usr/include/.modulemap

considered and discarded? Why?

You can have /usr/include/.modulemap if you want. Your build system just will need to support that.

So,

We rely on library authors knowing to do that (how would they find out, given that clang documentation doesn’t recommend it?)

This is going to become a priority going forward, but it hasn’t been yet. Specifically, we’ll need to discuss this in more detail, and carefully design it, optimally pulling in as many stakeholders as we can.

Library authors won’t automatically name things consistently, so we will also get libraries with <name>lib.modulemap and lib<name>.modulemap and doubtless other variations.

I’d have hoped most would call them name.modulemap

It would be preferable to have the buildsystem hide that detail, if it must exist. That would be possible with cmake usage requirements, but most libraries don’t install cmake config files, and consumers don’t necessarily use cmake anyway. I guess other buildsystems would have a similar issue.

Yep. Modules fundamentally change the C++ build, so I don’t see a way that wouldn’t require putting support into build systems.
Both with autotools and cmake you usually call an installed function to detect a library, which will give you the flags you need to compile it. Those will need to change for a library wanting to be compiled as a module.

You can even put your module map in /usr/lib or wherever else other parts of your library are going to be installed in, as long as build systems will pick up the right flags to compile your library with.

They won’t just ‘pick up’ the right flags by magic right? Or am I missing something?

Build system support is needed. This is not very different from the current state where many libraries you use require specific flags to build.

The current implementation of automatic detection of module maps is completely driven by the original Obj-C use cases.
For C++, I don’t expect that we will use that mechanism, much for the reasons you cite. My hope is that we’ll go into a world where we explicitly specify the module maps on the command line (less magic!).

That doesn’t seem to be what clang documents as the preferred way to use the feature. Eg, “The module.modulemap file is placed alongside the header files themselves” ie, /usr/include/module.modulemap

http://clang.llvm.org/docs/Modules.html

There would need to be a change in emphasis there at least to change things to your vision. I’m not convinced it can work. It seems to make the feature not suitable for widespread adoption. Maybe/hopefully I’m missing something though.

All this is in no way decided. Modules are still in the early stages. I have no idea when they’ll be standardized.

A final note: without build system integration and changes to your code structure, modules are likely to make your incremental builds slower. Apple has some experience here, see Daniel Dunbar’s talk at the llvm dev meeting.

If you think non zero effort will make people not use them, then they’ll not use them. I don’t think that’s bad, it’s an effort vs payout trade-off.

r4nt · January 22, 2017, 8:36am

Correct.

Barbara_Geller · January 22, 2017, 10:25am

Social/historical issues are difficult to solve (how something is
communicated might have to change - not just how the thing is implemented),
and still can have an effect on adoption. I don't know what the 'developer
story' is apart from 'wait for all of your dependencies to update with
modules support and wait for each buildsystem of your dependency to be
compatible with all the others for creating and consuming modules and then
you can start using modules'.

I realize this goes beyond Clang and the Modules TS has buildsystem issues
too...

Thanks,

Steve.

Barbara_Geller · January 22, 2017, 10:54am

* We rely on library authors knowing to do that (how would they find out,
given that clang documentation doesn't recommend it?)

This is going to become a priority going forward, but it hasn't been yet.
Specifically, we'll need to discuss this in more detail, and carefully
design it, optimally pulling in as many stakeholders as we can.

I very much agree with you there!

However another issue is that the Clang modules implementation is
'intermediate' in that it doesn't follow the TS syntax. It's not clear to me
what 'standard modules' in Clang would look like in terms of what files
users write, and what the buildsystem needs to invoke (The Microsoft
implementation seems to require the user to write 'foo.ixx' files and
generate '.ifc' files IIRC).

So there may be two systems to carefully design, or maybe 'standard modules'
in Clang will also rely on modulemaps (so the user will have to create those
and .ixx files). It's all part of the design work you mention, so I'll look
forward to that kickstarting. Please keep me in the loop.

* Library authors won't automatically name things consistently, so we will
also get libraries with <name>lib.modulemap and lib<name>.modulemap and
doubtless other variations.

I'd have hoped most would call them name.modulemap

I see :). Hope! I guess the design you mention above can either rely on that
or not and make it important or not. It's hard to know whether it is
important until the design work is done.

* It would be preferable to have the buildsystem hide that detail, if it
must exist. That would be possible with cmake usage requirements, but
most libraries don't install cmake config files, and consumers don't
necessarily use cmake anyway. I guess other buildsystems would have a
similar issue.

Yep. Modules fundamentally change the C++ build, so I don't see a way that
wouldn't require putting support into build systems.

Indeed, this is true of both 'clang modules' and 'standard modules'. In the
case of the Microsoft implementation, an entire new step is required for
each module (whereas often compile flags for the compile step and link flags
for the link step are enough to build something, now there will be a
'generate modules' step which will need its own flags and need to be ordered
correctly). It is not clear to me how much of that is also the case for the
Clang implementation, and whether that will change if Clang supports
'standard modules' in the future.
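
To make the ordering concern concrete, here is a rough sketch of how a buildsystem might sequence such a 'generate modules' step ahead of the usual compile and link steps. The module names, the step labels and the `plan_build` helper are all hypothetical; this is not how any particular compiler or buildsystem does it, only an illustration of the dependency ordering involved.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def plan_build(module_deps, target="app"):
    """Return a list of (step, name) pairs in a valid execution order.

    module_deps maps a (hypothetical) module name to the set of modules
    it imports. A module's interface must be generated before any module
    that imports it, so the generate steps are topologically sorted.
    """
    # static_order() yields each node only after all of its dependencies.
    order = list(TopologicalSorter(module_deps).static_order())
    steps = [("generate-module", name) for name in order]
    # Compile steps only need the generated interfaces to exist already.
    steps += [("compile", name) for name in order]
    steps.append(("link", target))
    return steps


# Hypothetical project: 'app' imports 'net' and 'core', 'net' imports 'core'.
deps = {"app": {"net", "core"}, "net": {"core"}, "core": set()}
for step in plan_build(deps):
    print(step)
```

The point of the sketch is that the dependency graph has to be known before the first 'generate modules' step runs, which is information a plain list of compile and link flags does not carry.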

Given that such fundamental buildsystem support is required, I find it
interesting that it is not part of the conversation about Modules. That's
critical to adoption, right?

Both with autotools and cmake you usually call an installed function to
detect a library, which will give you the flags you need to compile it.
Those will need to change for a library wanting to be compiled as a
module.

Yes, but as I wrote above, there will be an entirely new step (with the
Microsoft implementation at least - I don't know about Clang) with new
ordering requirements. Supplying different compile flags is not enough.

Build system support is needed. This is not very different from the
current state where many libraries you use require specific flags to
build.

I don't know that that's true. Perhaps that will be obvious when the design
work you mentioned above is done. If additional steps to the build are
required, I might term that 'very different'.

It's not clear to me how names of things will be relevant or will be
determined. Will files have to be parsed by the buildsystem to determine the
name of a module ('module M' syntax)? Is that realistic? From looking at
lack of Fortran support in Ninja (IIUC Fortran has a module system which
requires doing that) that would be problematic to the point of preventing
adoption.
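
As a minimal sketch of what "parsing files to determine the name of a module" could mean, assuming the declarations look like `module M;` and `import N;` at the start of a line: the regular expressions and the `scan_source` helper below are illustrative assumptions, not part of any existing tool.

```python
import re

# Assumed declaration shapes: "module M;" (optionally "export module M;")
# and "import N;", each at the start of a line.
MODULE_RE = re.compile(
    r"^\s*(?:export\s+)?module\s+([A-Za-z_][\w.]*)\s*;", re.MULTILINE)
IMPORT_RE = re.compile(
    r"^\s*import\s+([A-Za-z_][\w.]*)\s*;", re.MULTILINE)


def scan_source(text):
    """Return (module_name_or_None, list_of_imported_module_names)."""
    match = MODULE_RE.search(text)
    name = match.group(1) if match else None
    imports = IMPORT_RE.findall(text)
    return name, imports


example = "module M;\nimport std.core;\nimport N;\nexport int f();\n"
print(scan_source(example))
```

A textual scan like this ignores comments, macros and conditional compilation, so a buildsystem that wanted reliable answers would probably need the preprocessor or the compiler itself to report the names.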

A final note: without build system integration and changes to your code
structure, modules are likely to make your incremental builds slower.
Apple has some experience here, see Daniel Dunbar's talk at the llvm dev
meeting.

Link me please?

If you think non zero effort will make people not use them, then they'll
not use them. I don't think that's bad, it's an effort vs payout
trade-off.

Oh, I don't know about non-zero effort. I'm interested in what the effort is
and who will make it. That's not clear to me because as you say, it has not
been designed yet.

Thanks,

Steve.

_sean_silva · January 22, 2017, 11:29am

> Explicit module maps (and the associated notion of explicit build steps to
> build the .pcm files) provides the primitive mechanism which enables
> robust integration of modules into build systems and all sorts of
> flexibility. However, simply having the mechanism available does not
> directly solve the issue linked in that bug or many related issues, which
> are largely social / historical issues.

Social/historical issues are difficult to solve (how something is
communicated might have to change - not just how the thing is implemented),
and still can have an effect on adoption.

That's why I mentioned it. Just wanted to clarify the distinction between
the mechanism and other barriers.

The social/historical barriers are indeed *very* strong.

I don't know what the 'developer
story' is apart from 'wait for all of your dependencies to update with
modules support and wait for each buildsystem of your dependency to be
compatible with all the others for creating and consuming modules and then
you can start using modules'.

The problem for C++ modules is actually worse than just "wait for your
dependencies". You also may need to wait for your dependents since the
modules syntax is incompatible with pre-modules compilers (though last I
talked with Richard about this he had some pretty clear ideas for keeping
the overhead down during the transition period, which for some projects
will be "forever").

Waiting for your dependencies and dependents is a bit of a catch-22
unfortunately; we just need to hope that intermediate transitionary steps
are available to break the deadlock.

Between different build systems, package managers, etc. the deadlock is
even stronger (nobody wants to take the first step because nobody's first
step adds any value without the others).

I realize this goes beyond Clang and the ModulesTS has buildsystem issues
too...

One interesting point here is that it's clear that some sort of
standardization is going to be needed "outside the ISO C++ Standard" (e.g.
conventions for build systems etc.). It may be too early for that work to
start, but I haven't seen much on that front (though admittedly I'm not
actively following all this modules stuff in detail anymore). If you're in
a position where you can contribute to that effort it would be *hugely*
appreciated I'm sure!

-- Sean Silva

Barbara_Geller · January 22, 2017, 12:00pm

I don't know what the 'developer
story' is apart from 'wait for all of your dependencies to update with
modules support and wait for each buildsystem of your dependency to be
compatible with all the others for creating and consuming modules and
then you can start using modules'.

The problem for C++ modules is actually worse than just "wait for your
dependencies". You also may need to wait for your dependents since the
modules syntax is incompatible with pre-modules compilers (though last I
talked with Richard about this he had some pretty clear ideas for keeping
the overhead down during the transition period, which for some projects
will be "forever").

I would be interested to hear more from Richard about this.

Waiting for your dependencies and dependents is a bit of a catch-22
unfortunately; we just need to hope that intermediate transitionary steps
are available to break the deadlock.

That does sound like a problem.

Between different build systems, package managers, etc. the deadlock is
even stronger (nobody wants to take the first step because nobody's first
step adds any value without the others).

I'm not certain what you're referring to. Perhaps what you have in mind is
some standardization/conventions for buildsystems. I doubt that's possible.
I think it would be better if Modules were designed to not have that impact.

I realize this goes beyond Clang and the ModulesTS has buildsystem issues
too...

One interesting point here is that it's clear that some sort of
standardization is going to be needed "outside the ISO C++ Standard" (e.g.
conventions for build systems etc.).

I can't imagine how anyone would get something like that started.

It may be too early for that work to
start,

I don't agree with that. I think the design of C++ Modules and the impact
the design has on buildsystems are coupled. A brilliantly designed Modules
system which does not consider the buildsystem (or the impact on how people
write code - what files they write and how they relate to each other in the
way that .h and .cpp files do today etc) might not get the deserved
adoption.

but I haven't seen much on that front (though admittedly I'm not
actively following all this modules stuff in detail anymore). If you're in
a position where you can contribute to that effort it would be *hugely*
appreciated I'm sure!

I'm not sure how appreciated it will be, but I have attempted to ask some
starting questions here:

Redirecting to Google Groups

Those are not all the questions I have, but you are correct that the
conversation about the impact of Modules on buildsystems does not currently
exist at all, so starting smaller is better.

Thanks,

Steve.

mehdi_amini · January 22, 2017, 5:13pm

http://llvm.org/devmtg/2016-11/#talk10

And if you’re into build systems, I recommend this one as well:

http://llvm.org/devmtg/2016-11/#talk22 (Toy programming demo of a repository for statically compiled programs)

_sean_silva · January 23, 2017, 5:50am

>> I don't know what the 'developer
>> story' is apart from 'wait for all of your dependencies to update with
>> modules support and wait for each buildsystem of your dependency to be
>> compatible with all the others for creating and consuming modules and
>> then you can start using modules'.
>
> The problem for C++ modules is actually worse than just "wait for your
> dependencies". You also may need to wait for your dependents since the
> modules syntax is incompatible with pre-modules compilers (though last I
> talked with Richard about this he had some pretty clear ideas for keeping
> the overhead down during the transition period, which for some projects
> will be "forever").

I would be interested to hear more from Richard about this.

> Waiting for your dependencies and dependents is a bit of a catch-22
> unfortunately; we just need to hope that intermediate transitionary steps
> are available to break the deadlock.

That does sound like a problem.

> Between different build systems, package managers, etc. the deadlock is
> even stronger (nobody wants to take the first step because nobody's first
> step adds any value without the others).

I'm not certain what you're referring to. Perhaps what you have in mind is
some standardization/conventions for buildsystems. I doubt that's possible.
I think it would be better if Modules were designed to not have that
impact.

>> I realize this goes beyond Clang and the ModulesTS has buildsystem issues
>> too...
>
> One interesting point here is that it's clear that some sort of
> standardization is going to be needed "outside the ISO C++ Standard" (e.g.
> conventions for build systems etc.).

I can't imagine how anyone would get something like that started.

> It may be too early for that work to
> start,

I don't agree with that. I think the design of C++ Modules and the impact
the design has on buildsystems are coupled. A brilliantly designed Modules
system which does not consider the buildsystem (or the impact on how people
write code - what files they write and how they relate to each other in the
way that .h and .cpp files do today etc) might not get the deserved
adoption.

> but I haven't seen much on that front (though admittedly I'm not
> actively following all this modules stuff in detail anymore). If you're in
> a position where you can contribute to that effort it would be *hugely*
> appreciated I'm sure!

I'm not sure how appreciated it will be, but I have attempted to ask some
starting questions here:

Redirecting to Google Groups!
topic/modules/sDIYoU8Uljw

Those are not all the questions I have, but you are correct that the
conversation about the impact of Modules on buildsystems does not currently
exist at all, so starting smaller is better.

Thanks for getting the ball rolling on this. btw, if you weren't aware,
Manuel actually has been involved with rolling out Clang's explicit modules
(i.e. essentially a "-c step for headers" that produces a .pcm file) into
Google's internal build system

-- Sean Silva

_sean_silva · January 23, 2017, 5:53am

Sorry, hit "send" by accident. I meant to link to Manuel's CppCon talk:

(Google's internal build system is actually substantially open-sourced at
http://bazel.build/; I'm not sure how much of Manuel's work has made it
into the open source side though)

-- Sean Silva
