(I know the libc++ folks prefer Discord, but I feel Discourse is a better place for longer text. I’ll post this to Discord later.)
Apologies in advance, since this is frustrating news.
I found that, in small projects, code using the current implementation of the std module compiles more slowly than the equivalent non-modular code. Here is an example:
```cpp
import std;

int main(int, char**) {
  std::vector<int> v;
  v.push_back(5);
  v.push_back(7);
  v.push_back(2);
  v.push_back(7);
  v.push_back(4);
  v.push_back(1);
  int t{3};
  std::optional<int> op{3};
  v.push_back((op <=> t) == std::strong_ordering::equal);
  std::sort(v.begin(), v.end());
  for (int i : v)
    std::cout << i << " ";
  std::cout << "\n";
  return 0;
}
```
It takes 3.395s to compile. Even if we apply the optimization described in "[Modules] Faster compilation speed or better diagnostic messages?", it still takes 2.455s to compile.
And for the non-modular version:
```cpp
// import std;
#include <vector>
#include <optional>
#include <algorithm>
#include <iostream>

int main(int, char**) {
  std::vector<int> v;
  v.push_back(5);
  v.push_back(7);
  v.push_back(2);
  v.push_back(7);
  v.push_back(4);
  v.push_back(1);
  int t{3};
  std::optional<int> op{3};
  v.push_back((op <=> t) == std::strong_ordering::equal);
  std::sort(v.begin(), v.end());
  for (int i : v)
    std::cout << i << " ";
  std::cout << "\n";
  return 0;
}
```
It takes only 1.861s to compile. So the std module is slower, at least in this case: about 1.8x as it stands, and still about 1.3x with the optimization applied.
The root cause is that the compiler has to deal with many redeclarations across different module units in order to keep things correct. The compiler works really hard to get this right, and personally I feel it is hard to fix this issue (the slow handling of redeclarations in different modules) fundamentally.
(I’ll try to document this practice so that other users don’t fall into the same trap.)
Currently we implement the std module by declaring the visibility of each declaration in each partition. The structure is clear, but I didn’t foresee the performance penalty, and that is the problem.
Since the current implementation should be correct, and we’re still busy testing and distributing the std module, I think we can continue in the current direction. That work is orthogonal to how the std module is implemented, so we can come back and refactor the implementation later.
As for alternative implementations of the std module, we have roughly two options.
The first is similar to MSVC’s style:
- Include every standard header in the module purview.
- Wrap everything in a language linkage specification (`extern "C"`/`"C++"`) to avoid breaking the ABI.
- Export the declarations we want.
Roughly, it would look like this:
```cpp
// config
#ifdef IN_MODULES
#define LANGUAGE_LINKAGE_IN_MODULES extern "C++"
#else
#define LANGUAGE_LINKAGE_IN_MODULES
#endif

#ifdef IN_MODULES
#define EXPORT export
#else
#define EXPORT
#endif

// vector
...
// Implementation detail: gets language linkage but is not exported.
LANGUAGE_LINKAGE_IN_MODULES
template<...>
class vector_base { ... };

// Public name: exported, with language linkage.
EXPORT LANGUAGE_LINKAGE_IN_MODULES
template<...>
class vector : public vector_base<...> { ... };

// std.cppm
module;
#define IN_MODULES
// include the headers of the C library here, so that their declarations
// do not end up in the module purview
export module std;
#include <vector>
...
```
Note: this is similar but not identical to MSVC’s style, since I don’t see many `extern "C++"` in MSSTL. I am not sure whether MSVC treats the std module differently; it may be worth asking the MSVC team about this. But at least in clang today, declarations in the module purview without a language linkage get a different mangled name, which is problematic.
The second solution is a simple trick that converts the existing code base to:
```cpp
// std-vector.export.list
export namespace std {
  using std::vector;
  ...
}

// std.cppm
module;
// include all the standard headers
export module std;
#include <std-vector.export.list>
#include <std-list.export.list>
...
```
With this method, we have only one module unit, and all the `export using` declarations are included in the primary module interface. This way we can keep the current structure while having exactly one module unit.