Maybe we could start to play with std modules

Given clang15 is going to be branched, it looks like we could start to play with std modules. I made a toy demo at: https://github.com/ChuanqiXu9/stdmodules.

import std; is now a standard feature of C++23. It requires us to keep the headers available and not to export everything, so the style used in the demo looks like a proper step in this direction.

Note that this is not an RFC to start working on std modules in libcxx. Besides the compiler not being mature, there are the following problems:
(1) The build system is not ready. It is not reasonable to write Makefiles by hand in an industrial project nowadays.
(2) We lack experience in distributing modules.

The intention is to notify people who are interested so that they can get involved. The compiler needs more testing to mature. More usage experience would also help us understand how modules should be built and distributed.

I’m very interested in using modules. I have a toy project using the “old” clang modules. The page Modules — Clang 16.0.0git documentation only seems to contain information about the “old” clang modules. Is there any documentation about how to build a project using C++20 modules? (I don’t mind manually teaching CMake how to do its job.)

Ninja already has partial support for C++20 modules:
https://github.com/ninja-build/ninja/pull/1521
clang-scan-deps supports Clang header modules, but there is no support for C++20 modules.

If we could get them together, then it might work.

Is there any documentation about how to build a project using C++20 modules?

Oh, not yet. I plan to send the document before clang15 is released in September. The intention of the post was that people could actually run and edit std modules, although the demo is only available with a Makefile. But I found that people are really not interested in Makefiles :)

If this is helpful, let me briefly introduce the usage here. You can find the examples in the toy demo repo.
(1) The source files of module interfaces (and module implementation partitions) should end with .cppm (or .ccm, .cxxm).
(2) The *.cppm files can be compiled to *.pcm files with the --precompile (and -std=c++20) option. If the module interface is a partition ‘Part’ of module ‘M’, then the *.pcm file should be named M-Part.pcm. This corresponds to the std-<partitions>.pcm files in the demo. Otherwise, the module interface is a primary module interface ‘M’, and the *.pcm file should be named M.pcm. This corresponds to std.pcm in the demo.
(3) When we import a module, we need to pass the search path for the pcm files with the -fprebuilt-module-path option. The demo contains examples.
(4) The *.pcm files need to be compiled to object files *.o, and these object files should be linked in the final step. Examples are at: https://github.com/ChuanqiXu9/stdmodules/blob/2f709df98110f3b2ff44d18585e2daf3d1b4511f/Makefile#L45-L81.
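
The naming rule in (2) and the commands in (2) to (4) can be sketched as a small C++ helper. This is only an illustration under assumptions: pcm_file_name and build_commands are hypothetical names that do not come from the demo, and the exact command lines in the demo Makefile may differ. Only the --precompile, -std=c++20 and -fprebuilt-module-path options are taken from the description above.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Map a module name to the expected file name of its precompiled module:
// "M" -> "M.pcm" (primary interface), "M:Part" -> "M-Part.pcm" (partition).
std::string pcm_file_name(std::string module_name) {
    assert(!module_name.empty());
    const auto colon = module_name.find(':');
    if (colon != std::string::npos) module_name[colon] = '-';
    return module_name + ".pcm";
}

// Build the two command lines for one module unit (hypothetical helper):
//   1. precompile the *.cppm source into a *.pcm file,
//   2. compile the *.pcm file into an object file that is linked later.
// pcm_dir is the directory searched through -fprebuilt-module-path.
std::vector<std::string> build_commands(const std::string& module_name,
                                        const std::string& source,
                                        const std::string& pcm_dir) {
    const std::string pcm = pcm_dir + "/" + pcm_file_name(module_name);
    const std::string obj = pcm.substr(0, pcm.size() - 4) + ".o";  // drop ".pcm"
    return {
        "clang++ -std=c++20 --precompile " + source +
            " -fprebuilt-module-path=" + pcm_dir + " -o " + pcm,
        "clang++ -std=c++20 -c " + pcm + " -o " + obj,
    };
}
```

For instance, build_commands("M:Part", "part.cppm", ".") would yield a precompile step that writes ./M-Part.pcm and a second step that turns it into ./M-Part.o.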

My bad if this is not helpful. I should post the docs first indeed : (

Ninja already has partial support for C++20 modules.

After a quick look at the pull request, it seems to support the command line options of GCC rather than clang. Yes, sadly the command line options are currently not compatible between GCC and clang.

clang-scan-deps supports Clang header modules, but there is no support for C++20 modules.

If we could get them together, then it might work.

I see. It makes sense.

Ninja doesn’t actually support modules. They just provide the mechanism to run e.g. an include scanner to discover dependencies. Fortran modules are similar to C++ modules. CMake uses this mechanism to build Fortran modules. They have a dependency scanner for Fortran.

If clang-scan-deps or another tool could discover dependencies between C++20 modules, then ninja could run it to discover the dependencies. And then build the code.

Oh, I roughly see. I’ve filed an issue at: https://github.com/llvm/llvm-project/issues/56770. If it is not too hard, I guess I may handle it.

The challenge with C++20 modules and Fortran modules is that if you add another import statement and save the file, the dependency graph changes, but the build system is not aware of it. One solution is to run dependency scanners before the actual build. clang-scan-deps implements that for Clang header modules. The new Swift-driver is also getting support for dependency scanning.
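
To make the idea of a dependency scanner concrete, here is a toy sketch in C++. It is not how clang-scan-deps or any P1689 implementation works: a real scanner has to run the preprocessor, while this one only looks at lines of the form "module X;" and "import Y;". ScanResult and scan_module_deps are hypothetical names used for illustration only.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// What one translation unit provides and requires.
struct ScanResult {
    std::string provides;              // module (or partition) declared here, if any
    std::vector<std::string> imports;  // named modules this unit imports
};

// Strip leading and trailing blanks.
static std::string trim(const std::string& s) {
    const auto first = s.find_first_not_of(" \t\r");
    if (first == std::string::npos) return "";
    const auto last = s.find_last_not_of(" \t\r");
    return s.substr(first, last - first + 1);
}

// For a line like "<keyword> <name>;", return <name>; otherwise return "".
static std::string after_keyword(const std::string& line, const std::string& keyword) {
    assert(!keyword.empty());
    if (line.size() <= keyword.size()) return "";
    if (line.compare(0, keyword.size(), keyword) != 0) return "";
    const char next = line[keyword.size()];
    if (next != ' ' && next != '\t') return "";
    if (line.back() != ';') return "";
    return trim(line.substr(keyword.size(), line.size() - keyword.size() - 1));
}

// Toy scanner: no preprocessing, no comments, one declaration per line.
ScanResult scan_module_deps(const std::string& source) {
    ScanResult result;
    std::istringstream in(source);
    std::string raw;
    while (std::getline(in, raw)) {
        std::string line = trim(raw);
        // A leading "export" does not change which module is named.
        if (line.rfind("export ", 0) == 0) line = trim(line.substr(7));
        std::string name = after_keyword(line, "module");
        if (!name.empty() && name[0] != ':') {  // skip "module :private;"
            result.provides = name;
            continue;
        }
        name = after_keyword(line, "import");
        // Header units (import <x>; / import "x";) are ignored in this sketch.
        if (name.empty() || name[0] == '<' || name[0] == '"') continue;
        // "import :Part;" refers to a partition of the current module.
        if (name[0] == ':')
            name = result.provides.substr(0, result.provides.find(':')) + name;
        result.imports.push_back(name);
    }
    return result;
}
```

A build system would run such a scan on every source file before the real build and regenerate the dependency edges from the results, which is how a newly added import statement gets noticed.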

I had a look at the Makefiles before. They show what you do but not the why. Thanks a lot for the additional information. This helps me to start toying with modules :slight_smile:

Indeed, Ben Boeckel has a starting implementation for module dependency scanning in clang here:
mathstuf/llvm-project at p1689r4 (github.com). I’ve used his GCC implementation of p1689r5 with great success (no header units, though). The dependency scanning is available in CMake 3.25, albeit with some experimental flags that work with MSVC and GCC (with patch, sans header units). Both implementations require Ninja to be used.

With respect to approaches, I see that ChuanqiXu has a toy implementation that uses separate module files to export std, in contrast to the single module interface file used in MS STL. I’m really curious as to the pros / cons of each approach in terms of maintainability / invasiveness and throughput. It’s kind of an open question as to the best approach. Out of curiosity and motivation from success in using modules with MSVC, I have a branch for this operativeF/llvm-project at import-std (github.com).

There are, however, a lot of issues that prevent further work on this:
Linking doesn’t work with header units from the standard library (see here: Issue #57571 · llvm/llvm-project (github.com)).
This also affects the method for implementing import std that I used.
There are other possible bugs, but I can’t say for certain until this issue is resolved. If someone already has an idea on fixing this, I’ll gladly try to fix it.

Yeah, I am looking at it too.

Do you know if the std modules in MS STL are open sourced? Personally I feel the separate style may be better since it is easier to read and maintain.

There are, however, a lot of issues that prevent further work on this:
Linking doesn’t work with header units from the standard library (see here: Issue #57571 · llvm/llvm-project (github.com)).
This also affects the method for implementing import std that I used.
There are other possible bugs, but I can’t say for certain until this issue is resolved. If someone already has an idea on fixing this, I’ll gladly try to fix it.

Yeah, there are bugs, of course. I’ll check the issue you linked later. But we are talking about std modules, which are named modules, and I feel like the bugs from header units might not be a blocking issue. Am I misunderstanding anything?

Yup, Stephan Lavavej is heading that project over here as a branch from the main MS STL:
StephanTLavavej/STL at import-std (github.com)

Yeah, you’re not misunderstanding; it was my misunderstanding at the time. The problem there presented itself as a similar problem I was having (undefined references to std::allocator) when I tried to link.

P1689 seems to get the dependencies for one file. clang-scan-deps gets the dependencies for a set of files and is probably doing some caching and other optimisations.

Yup, Stephan Lavavej is heading that project over here as a branch from the main MS STL:
StephanTLavavej/STL at import-std (github.com)

I’ve taken a quick look. It looks like the project defines the std modules at STL/std.ixx at import-std · StephanTLavavej/STL · GitHub. I guess they apply some compiler magic to std modules, since at the top of std.ixx there is a macro definition _BUILD_STD_MODULE and this macro does not seem to be used in the std headers. So, as far as I can see, the implementation doesn’t export the standard components. This may be OK, since the language standard says that modules starting with std are special, which leaves room for the compiler to do magic.

My toy implementation doesn’t involve any compiler magic. It is neither easy nor responsible to say now which style (the magic way or the standard way) is better. I think what we need to do now is to practice more; then we can reach a conclusion from experience rather than by imagining. Have you run my toy examples? Or have you tried to use the toy std modules in other, bigger projects? For example, I’ve used the toy std module examples in GitHub - alibaba/async_simple at CXX20Modules, where we replaced all the #include of headers with import std;. Hope this helps.

@operativeF I took a look at your implementation at Add std.cppm and (WIP) CMake files. · operativeF/llvm-project@55abca4 · GitHub. I am not familiar with MSVC, but for clang the implementation may not work well. The reason is that the names of declarations in the module purview are mangled differently from the original ones. For example, in my implementation you can see that the examples need to link against libc++.so at the end. If we implemented std modules in your style, there would be linking errors. This is why all the declarations appear in the global module fragment in my example. If we want to support your style, we need another mode on the compiler side, which may not be hard. So far I haven’t seen any drawback in my toy implementation (I am not saying it will turn out to be good in the end). Is there any concern? I remember @philnik raised a similar approach before.

If I read correctly, it looks like there is no such issues since you used was, is it?

Dropping by - the rules_ll project has recently added support for writing C++20 modules with bazel+clang. Bazel has a reputation for being a bit over-engineered but IMO it’s a much better experience than Makefiles :slight_smile:

So maybe that’s something you could use to start building the std module until CMake gets better support for that.

rules_ll requires writing the dependencies manually too. The priority on the compiler side is to support clang-scan-deps. But if you are talking about experimental support on the libcxx side, I honestly have no idea right now. I strongly agree that it would be great to have testing on the libcxx side, but the approach is unclear for now. How do we write the CMake scripts? How do we write the Bazel scripts? How do we support the CI? None of these are simple questions…

Update: I sent a draft implementation in D135507 [Draft] [libcxx] introducing std modules.

Why

Previously I mentioned some reasons not to start the work immediately:
(1) The compiler is not mature enough. See Standard C++ Modules — Clang 16.0.0git documentation for example.
(2) The build system is not ready.
(3) We lack experience for distributing modules.

Then I changed my mind. I feel it is not bad to pursue the std modules in libcxx now. Here are the reasons:
(1) The most important reason is to add in-tree tests. Since we are actively implementing modules in clang now, someone may break the modules functionality without us noticing in time. This is quite likely, since the clang tests are limited in what they can cover: they lack runtime testing. Previously I tested offline, but I feel it would be much better to have in-tree tests, so that every developer can notice errors earlier.
(2) Although the compiler support is not mature enough and has a lot of bugs, it can already compile a lot of things. I now feel it is bad to make the perfect the enemy of the good. Previously I thought libcxx users might be disappointed if they found the std modules not good enough. But I feel it might not be a big deal as long as we announce it as an experimental feature, right?
(3) For the build system, since the dependency relationships of the std modules are relatively simple, we can describe them manually in the build scripts. See D135507 [Draft] [libcxx] introducing std modules for a CMake example. I believe it will be simpler for Bazel.
(4) As for distribution, it is likewise bad to make the perfect the enemy of the good.
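
To illustrate point (3): once the dependencies are written down by hand, deriving a valid build order is mechanical. The following is a minimal C++ sketch under assumptions. DepMap and build_order are hypothetical names, the module names in the usage note are made up, and this is not what the CMake scripts in D135507 do. It relies on the fact that module imports cannot be cyclic.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hand-written dependency table: unit name -> units it imports.
using DepMap = std::map<std::string, std::vector<std::string>>;

// Depth-first visit: emit all imports of a unit before the unit itself.
static void visit(const std::string& unit, const DepMap& deps,
                  std::set<std::string>& done, std::vector<std::string>& order) {
    if (!done.insert(unit).second) return;  // already handled
    const auto it = deps.find(unit);
    if (it != deps.end())
        for (const auto& dep : it->second) visit(dep, deps, done, order);
    order.push_back(unit);
}

// Return the units in an order where every unit comes after its imports.
// Assumes the table has no cycles.
std::vector<std::string> build_order(const DepMap& deps) {
    std::set<std::string> done;
    std::vector<std::string> order;
    for (const auto& entry : deps) visit(entry.first, deps, done, order);
    assert(order.size() == done.size());  // each unit is emitted exactly once
    return order;
}
```

With a made-up table in which a primary interface std imports two partitions std:vector and std:string, build_order places both partitions before std, which is the order in which their *.pcm files would have to be produced.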

The methods

There are other methods to implement std modules, like the one @operativeF mentioned. The only problem with them is that they need further support on the compiler side and it is unclear what that support should be. In other words, they seem to me to lack a clear design.

So I feel my solution is not bad. Although it requires some additional typing, it is merely copying the synopsis from the spec. The style also feels pretty clear to me: we can see which entities we export in a small set of files.

(I am open to other solutions.)