[RFC] Global Function Merging

Here is my plan for improving the performance of loading the global stable function map in non-LTO setups:

  • The Names are unused when the Use pass actually consumes the map. I believe we should avoid loading them into the NameToId map to reduce the memory footprint, and that this should apply to all scenarios regardless of LTO configuration. To make this happen, we have to move the Names to the end of the serialized file and adjust the existing assertions accordingly (e.g. by skipping them during the Use pass).

  • We should add a dedicated mode for non-LTO (traditional one-compiler-process-per-source-file) setups in which the map is loaded lazily. The mode should be toggleable via a new flag. This would also require a major structural change to the serialized file:

    • Write FuncEntries.size() as before.
    • Write all the (sorted) hashes. They would be the only part that we would unconditionally deserialize into memory for fast binary search.
    • Write the fixed-size fields for each entry. Compute an offset that points to the variable-sized IndexOperandHash vector.
    • Write the IndexOperandHash vector: size first, followed by the entries.
    • Write Names.
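    To make the layout above concrete, here is a minimal standalone sketch of a writer. It is not the existing serializer: the `Entry` and `IndexOperandHash` structs, the field widths, the little-endian encoding, the choice to store the offset as an absolute 64-bit file offset inside each fixed-size record, and the length-prefixed Names encoding are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical, simplified stand-ins for the real entry types.
struct IndexOperandHash {
  uint32_t InstIndex;
  uint32_t OpndIndex;
  uint64_t OpndHash;
};
struct Entry {
  uint64_t Hash;
  uint32_t FunctionNameId;
  uint32_t ModuleNameId;
  uint32_t InstCount;
  std::vector<IndexOperandHash> IndexOperandHashes;
};

// Append an unsigned integer in little-endian byte order.
template <typename T> void writeLE(std::vector<uint8_t> &Out, T V) {
  for (size_t I = 0; I < sizeof(T); ++I)
    Out.push_back(static_cast<uint8_t>(V >> (8 * I)));
}

// Assumed sizes: 3 x u32 fields + 1 x u64 offset per fixed-size record,
// and 2 x u32 + 1 x u64 per IndexOperandHash element.
constexpr uint64_t FixedRecordSize = 4 + 4 + 4 + 8;
constexpr uint64_t OperandHashSize = 4 + 4 + 8;

std::vector<uint8_t> serialize(std::vector<Entry> Entries,
                               const std::vector<std::string> &Names) {
  assert(Entries.size() <= UINT32_MAX && "entry count must fit in u32");
  std::stable_sort(Entries.begin(), Entries.end(),
                   [](const Entry &A, const Entry &B) { return A.Hash < B.Hash; });
  std::vector<uint8_t> Out;
  const uint32_t N = static_cast<uint32_t>(Entries.size());

  // 1. Entry count.
  writeLE<uint32_t>(Out, N);
  // 2. Sorted hashes: the only part a lazy reader must load eagerly.
  for (const Entry &E : Entries)
    writeLE<uint64_t>(Out, E.Hash);
  // 3. Fixed-size fields plus the offset of each entry's variable part.
  uint64_t Offset = 4 + uint64_t(N) * (8 + FixedRecordSize);
  for (const Entry &E : Entries) {
    writeLE<uint32_t>(Out, E.FunctionNameId);
    writeLE<uint32_t>(Out, E.ModuleNameId);
    writeLE<uint32_t>(Out, E.InstCount);
    writeLE<uint64_t>(Out, Offset);
    Offset += 4 + OperandHashSize * E.IndexOperandHashes.size();
  }
  // 4. Variable-sized vectors: size first, then the elements.
  for (const Entry &E : Entries) {
    writeLE<uint32_t>(Out, static_cast<uint32_t>(E.IndexOperandHashes.size()));
    for (const IndexOperandHash &H : E.IndexOperandHashes) {
      writeLE<uint32_t>(Out, H.InstIndex);
      writeLE<uint32_t>(Out, H.OpndIndex);
      writeLE<uint64_t>(Out, H.OpndHash);
    }
  }
  // 5. Names last, so a consumer that does not need them can stop early.
  writeLE<uint32_t>(Out, static_cast<uint32_t>(Names.size()));
  for (const std::string &Name : Names) {
    writeLE<uint32_t>(Out, static_cast<uint32_t>(Name.size()));
    Out.insert(Out.end(), Name.begin(), Name.end());
  }
  return Out;
}
```

    Because every record in the third section has the same size, a reader that finds a hash at index I in the second section can seek directly to record I and then follow its offset, without touching any other entry.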

    To make the changes transparent to the Use pass, we would need a new StableFunctionMap subclass that manages the lazy-loading process: the first time at() is called for a given hash during the lifetime of the map instance, it would load all the matching entries from the file into memory.
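    The lazy lookup could work roughly as in the standalone sketch below. It deliberately does not derive from the real StableFunctionMap: `LazyFunctionMap`, `LoadedEntry`, and the `Loader` callback (which stands in for "seek into the file and decode record I") are hypothetical names, and whether at() in the real class can be overridden this way is left open.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for a fully deserialized entry.
struct LoadedEntry {
  uint64_t Hash;
  uint32_t InstCount;
};

class LazyFunctionMap {
public:
  // Materializes the entry stored at position Index in the file.
  using Loader = std::function<LoadedEntry(size_t Index)>;

  LazyFunctionMap(std::vector<uint64_t> SortedHashes, Loader L)
      : Hashes(std::move(SortedHashes)), Load(std::move(L)) {
    assert(std::is_sorted(Hashes.begin(), Hashes.end()) &&
           "hashes must be sorted for binary search");
  }

  // Cheap membership test that only touches the in-memory hash array.
  bool contains(uint64_t Hash) const {
    return std::binary_search(Hashes.begin(), Hashes.end(), Hash);
  }

  // On the first call for a hash, load every matching entry and cache the
  // result; later calls for the same hash are served from the cache.
  const std::vector<LoadedEntry> &at(uint64_t Hash) {
    auto Cached = Cache.find(Hash);
    if (Cached != Cache.end())
      return Cached->second;
    auto Range = std::equal_range(Hashes.begin(), Hashes.end(), Hash);
    std::vector<LoadedEntry> Loaded;
    for (auto It = Range.first; It != Range.second; ++It)
      Loaded.push_back(Load(static_cast<size_t>(It - Hashes.begin())));
    return Cache.emplace(Hash, std::move(Loaded)).first->second;
  }

private:
  std::vector<uint64_t> Hashes; // the only eagerly loaded data
  Loader Load;
  std::unordered_map<uint64_t, std::vector<LoadedEntry>> Cache;
};
```

    In this sketch a miss is cached as an empty vector, so repeated lookups of an absent hash do not repeat the binary search; a real implementation might prefer to report the miss differently.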

What are your thoughts on this approach? Any feedback is appreciated.
