Here is my plan for improving the performance of loading the global stable function map in non-LTO setups:
- The `Names` are unused during the actual consumption in the Use pass. I believe we should avoid loading them into the `NameToId` map to reduce the memory footprint. I also believe this should apply to all scenarios, regardless of LTO configuration. To make it happen, we have to move the `Names` to the end of the serialized file and adjust the existing assertions accordingly (like skipping them during the Use pass).
- We should add a dedicated mode for non-LTO (traditional one-compiler-process-per-source-file) setups in which the map is loaded lazily. The mode should be toggleable via a new flag. This would also lead to a major structural change to the serialized file:
  - Write `FuncEntries.size()` as before.
  - Write all the (sorted) hashes. They would be the only part that we unconditionally deserialize into memory, for fast binary search.
  - Write the fixed-size fields for each entry, together with a computed offset that points to the entry's variable-sized `IndexOperandHash` vector.
  - Write the `IndexOperandHash` vectors: size first, followed by the entries.
  - Write `Names`.

  To make the changes transparent to the Use pass, we would need a new `StableFunctionMap` subclass that manages the lazy-loading process: when `at()` is called for the first time for a given hash during the lifetime of the map instance, it loads all the matching entries from the file into memory.
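
To make the proposed layout concrete, here is a rough, self-contained sketch of it. Everything in it is a simplified stand-in rather than the existing LLVM code: `Entry`, `OperandHash`, `serialize`, `LazyMap`, and the field widths are hypothetical, the "file" is an in-memory byte buffer, and `Names` is left out. It only illustrates that the sorted hashes are read eagerly while each entry is decoded on the first `at()` call for its hash.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <cstring>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical, simplified stand-ins for the real entry types.
struct OperandHash { uint32_t InstIndex; uint32_t OpndIndex; uint64_t Hash; };
struct Entry {
  uint64_t Hash;
  uint32_t FunctionNameId, ModuleNameId, InstCount;
  std::vector<OperandHash> Operands;
};

using Buffer = std::vector<uint8_t>;
// Fixed-size record: three u32 fields plus a u64 offset to the operand vector.
constexpr size_t RecordSize = 3 * sizeof(uint32_t) + sizeof(uint64_t);
constexpr size_t OperandSize = 2 * sizeof(uint32_t) + sizeof(uint64_t);

template <typename T> void put(Buffer &B, T V) {
  const auto *P = reinterpret_cast<const uint8_t *>(&V);
  B.insert(B.end(), P, P + sizeof(T));
}
template <typename T> T get(const Buffer &B, size_t &Pos) {
  assert(Pos + sizeof(T) <= B.size() && "read past end of buffer");
  T V;
  std::memcpy(&V, B.data() + Pos, sizeof(T));
  Pos += sizeof(T);
  return V;
}

// Layout: count | sorted hashes | fixed records | operand vectors | (Names).
Buffer serialize(std::vector<Entry> Entries) {
  std::stable_sort(Entries.begin(), Entries.end(),
                   [](const Entry &A, const Entry &B) { return A.Hash < B.Hash; });
  Buffer B;
  put<uint32_t>(B, static_cast<uint32_t>(Entries.size()));
  for (const Entry &E : Entries)
    put<uint64_t>(B, E.Hash);
  // The operand section starts right after the count, hashes, and records.
  uint64_t Offset =
      sizeof(uint32_t) + Entries.size() * (sizeof(uint64_t) + RecordSize);
  for (const Entry &E : Entries) {
    put<uint32_t>(B, E.FunctionNameId);
    put<uint32_t>(B, E.ModuleNameId);
    put<uint32_t>(B, E.InstCount);
    put<uint64_t>(B, Offset);
    Offset += sizeof(uint32_t) + E.Operands.size() * OperandSize;
  }
  for (const Entry &E : Entries) {
    put<uint32_t>(B, static_cast<uint32_t>(E.Operands.size()));
    for (const OperandHash &O : E.Operands) {
      put<uint32_t>(B, O.InstIndex);
      put<uint32_t>(B, O.OpndIndex);
      put<uint64_t>(B, O.Hash);
    }
  }
  return B; // Names would be appended here and skipped by the Use pass.
}

class LazyMap {
  Buffer File;
  std::vector<uint64_t> Hashes; // the only eagerly loaded part
  std::unordered_map<uint64_t, std::vector<Entry>> Loaded;

public:
  size_t NumDecoded = 0; // exposed only to observe the laziness

  explicit LazyMap(Buffer B) : File(std::move(B)) {
    size_t Pos = 0;
    uint32_t N = get<uint32_t>(File, Pos);
    Hashes.reserve(N);
    for (uint32_t I = 0; I < N; ++I)
      Hashes.push_back(get<uint64_t>(File, Pos));
  }

  // Decodes all entries matching Hash on first use, then serves the cache.
  const std::vector<Entry> &at(uint64_t Hash) {
    auto It = Loaded.find(Hash);
    if (It != Loaded.end())
      return It->second;
    std::vector<Entry> &Out = Loaded[Hash];
    auto Range = std::equal_range(Hashes.begin(), Hashes.end(), Hash);
    for (auto I = Range.first; I != Range.second; ++I) {
      size_t Idx = static_cast<size_t>(I - Hashes.begin());
      size_t Pos = sizeof(uint32_t) + Hashes.size() * sizeof(uint64_t) +
                   Idx * RecordSize;
      Entry E;
      E.Hash = Hash;
      E.FunctionNameId = get<uint32_t>(File, Pos);
      E.ModuleNameId = get<uint32_t>(File, Pos);
      E.InstCount = get<uint32_t>(File, Pos);
      size_t OpPos = static_cast<size_t>(get<uint64_t>(File, Pos));
      uint32_t NumOps = get<uint32_t>(File, OpPos);
      for (uint32_t J = 0; J < NumOps; ++J) {
        OperandHash O;
        O.InstIndex = get<uint32_t>(File, OpPos);
        O.OpndIndex = get<uint32_t>(File, OpPos);
        O.Hash = get<uint64_t>(File, OpPos);
        E.Operands.push_back(O);
      }
      Out.push_back(std::move(E));
      ++NumDecoded;
    }
    return Out;
  }
};
```

In this sketch a lookup for a hash that is not present caches an empty vector, and the whole buffer stays resident. A real implementation would presumably read from the file (or a memory mapping) on demand and would need to decide how a missing hash should behave.
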
What are your thoughts on this approach? Any feedback is appreciated.