I ported a large third-party header library to C++20 modules. I expected compilation time to decrease, since the library would need to be compiled only once per project instead of once per translation unit. Instead, build times increased. -ftime-report shows that half of the time is spent reading modules. I profiled the build, and most of the time is spent in clang::ASTReader::ReadDeclRecord(unsigned int).
Note that <iostream> is included only to inflate the AST.
Maybe it would be possible to cache parts of clang::ASTWriter::GenerateNameLookupTable and speed up the AST reading. I don’t know anything about clang internals, so I can’t say how to best fix this issue.
The number of calls to ASTWriter::WriteDecl, ASTDeclWriter::Visit, and DeclContext::lookup stayed nearly the same. The most significant difference is that ASTReader::FindExternalVisibleDeclsByName is called often when using import. I also noticed that the call chains while reading the AST are deep. I tried to print the declarations it visits by modifying the AST reader, but I only got crashes.
I created a repo that demonstrates this issue. Note that it is neither minimal nor beautiful, and only a bad port to C++ modules, but it could help to find the root cause. Please let me know if there are problems building it.