As I mentioned on the list last week, I’ve been looking into the build time/memory usage of F18 builds and trying to narrow down where the problem lies and what possible solutions there might be. I thought I’d post an update on here to see if anyone else has any input about my findings so far.
First I’ll talk about things I tried that I don’t think would help:
I looked into using modules to improve build times without significant code changes first. However I quickly hit some stumbling blocks with this with regards to implementation; there does not as yet exist a module implementation that is compliant with the approved C++20 proposal on modules. Clang 9 has a partial implementation, however there are some gaps in the implementation, particularly as regards module header units (the import <header_file.hpp> mechanism) which I could not get to work correctly. Furthermore, cmake doesn’t have support for modules in either the current release or in their source trunk; they have some plans for module support but nothing concrete as yet. As such if we went forward with modules we would be changing our code to introduce a dependency to LLVM on an as-yet-nonexistent compiler and some future version of cmake. So my conclusion here is that this could be a solution 5 or 10 years down the line but doesn’t currently help.
With regards to writing our own llvm::variant type and using that, I did some investigation into the state-of-the-art of variant implementations to see what the build time characteristics of them are. First, I found online an incredibly simple implementation that didn’t have all of variants operations and simply used dynamic_cast/rtti everywhere rather than the template trickery found in std::variant (this was just a proof of concept as rtti is not acceptable for us). I couldn’t get this to compile fully because it didn’t implement all of the operations of variant we actually use, but it did take longer to tell me that it wouldn’t compile than our current implementation takes to compile in total, so this wouldn’t help even if I went further to get it to compile. I also investigated the variant implementation found here https://github.com/mpark/variant. This implementation has had a lot of build time performance work done on it as you can see from the website (https://mpark.github.io/variant/) however this gives very similar performance to the std::variant currently found in libc++, on which I believe it is based. It seems that the number/depth of our template instantiations is the problem rather than anything specific to the implementation.
I also briefly looked at using a custom tagged union implementation rather than std::variant, however I quickly discarded this idea due to the abnormal behaviour of non-trivial types in unions. In short constructors and destructors are not called automatically, and when changing the active field you need to manually call the destructor of the first type and then placement-new the second type into the same memory space, which is incredibly error prone. As such I don’t see this as a valid way forward.
I have also been looking at the parser, since after another look at the trace it seems a large amount of time is spent in the template instantiations of the overloaded operators that are used to generate the parser as well as the parse_tree. This is something I’m currently looking further into but it’s possible that with some changes to the way the parser is written we could still keep the current variant-based parse tree and see acceptable build times. I feel that it is the combination of these two things that is causing the build time behaviour we see, and that changing either one could bring us down to acceptable levels. I remember using Boost::Spirit (which the f18 parser seems to have a similar design to) for my dissertation and seeing very similar build time behaviour to what we see here. In the end I changed to using an external parser generator to avoid this behaviour.
If anyone has any input on this it’d be very welcome so we can look at a way forward.