RFC: Introducing CfgTraits and type-erased CfgInterface / CfgBlockRef / CfgValueRef

Hi all,

This is a request for comment on a series of patches which introduce a new way of writing algorithms that are generic over different types of CFG.

What is this?

Hi Nicolai,

Thanks for sharing the proposal and the patches. I looked at most of them and think this is a good general direction.

There’s a lot of heavily templated code in generic DomTee construction/updater, MemSSA updater, and GraphDiff that has become really hard to modify. For the context, Alina (cc’d) was recently looking into making the domtree code work with ‘CFG views’; the basic idea is to take a concrete CFG and a series of updates and create a view over this CFG, as if the updates were applied, add more updates and repeat [1]. The initial attempt was based on GraphDiff and GraphTraits over GraphDiff, but this gets prohibitively verbose very quickly and caused difficult to fix compilation time regressions related to nested fancy iterator types (something like concat<filter<pair<vector<>::iterator, T*>>, …>) [2]. We got to the idea of trying polymorphic iterators, which I think is kind of similar to this RFC, except that your proposal does type erasure at the bottom, while polymorphic iterators from the top. Alina should know this problem space best.

I think that generic DomTree construction/updater could be rewritten to work on CfgTraits directly, while GraphDiff would provide CfgTraits directly usable by the updaters. Same with the generic IDF calculator. The biggest question for me here is how all of that would affect compilation times. What are your thoughts on more refactoring like this? How do you plan to evaluate compilation times impact of your changes?

Thanks,
Jakub

[1] https://github.com/llvm/llvm-project/commit/a90374988e4eb8c50d91e11f4e61cdbd5debb235
[2] http://llvm-compile-time-tracker.com/compare.php?from=37bcf2df01cfa47e4509a5d225a23e2ca95005e6&to=a90374988e4eb8c50d91e11f4e61cdbd5debb235&stat=instructions

Hi Jakub,

There's a lot of heavily templated code in generic DomTee construction/updater, MemSSA updater, and GraphDiff that has become really hard to modify. For the context, Alina (cc'd) was recently looking into making the domtree code work with 'CFG views'; the basic idea is to take a concrete CFG and a series of updates and create a view over this CFG, as if the updates were applied, add more updates and repeat [1]. The initial attempt was based on GraphDiff and GraphTraits over GraphDiff, but this gets prohibitively verbose very quickly and caused difficult to fix compilation time regressions related to nested fancy iterator types (something like concat<filter<pair<vector<>::iterator, T*>>, ...>) [2]. We got to the idea of trying polymorphic iterators, which I think is kind of similar to this RFC, except that your proposal does type erasure at the bottom, while polymorphic iterators from the top. Alina should know this problem space best.

Interesting! I'm not sure what those polymorphic iterators would look like.

I think that generic DomTree construction/updater could be rewritten to work on CfgTraits directly, while GraphDiff would provide CfgTraits directly usable by the updaters. Same with the generic IDF calculator. The biggest question for me here is how all of that would affect compilation times. What are your thoughts on more refactoring like this? How do you plan to evaluate compilation times impact of your changes?

Yes, that would be possible. And the compile time question is a big
reason why I didn't touch DomTree construction, but it's probably
worth the experiment. The main issue is iteration of predecessors /
successors.

Is it possible to run the llvm-compile-time-tracker tests on a branch
somewhere? That would be the most convenient approach as far as I'm
concerned.

Cheers,
Nicolai

Interesting! I’m not sure what those polymorphic iterators would look like.

I don’t have anything concrete yet, but I thought of creating type-erased range and iterator types over base types, i.e., the types you get by dereferencing iterators with operator*. This way you (1) don’t have to worry about begin and end returning different types and (2) can implement different graph traversal logics without having to introduce funny tag types. IIRC the main observation Alina made was that iterators derived from ranges cannot outlive these range objects, so generic ranges would have to provide storage for concrete range objects and generic iterators would have to carry references to their parent ranges. Here, the motivation is purely to work around the current template madness in GraphDiff and the updaters, but it’s unclear to me what the performance impact of the virtual calls and small/dynamic storage would be.

Next, we could implement range adapters (like concat, filter, etc.) to eagerly materialize or cache underlying rangest storing them in some user-provided external storage. For example, for some data structures (e.g., IntervalMap) it may be costly to just advance iterators and we may save some time by caching the underlying references until the underlying data gets modified.

All of this is just a bunch of ideas as of today, and I’d consider it completely orthogonal to this RFC.

Jakub

Hi,

The overall idea to provide a cleaner and unified infrastructure is great. The only caveat already mentioned is keeping an eye on the impact on compile-time.

Just to reiterate what Kuba said, I think what D77341 is looking to cleanup is somewhat orthogonal to this RFC. There are aspects that we played with there that may help, in particular potential solutions for avoiding the added compile-time. Vice-versa, it’s possible CfgTraits could be used in GraphDiff - I don’t have a good picture here yet.

So while the end goals are different, it seems like the scopes of the work can benefit each other :-).

Deep-diving into some of the compile-time regression: I did an analysis on where the added time was spent with the changes in D77341, and the main cause is merging the runtime check for a nullptr BUI into a single path that doesn’t add new children, but still added a concat iterator. This is the context in which we’re considering having GraphDiff return a type-erased iterator.
An option discussed was through virtual calls due to the runtime nullptr check, to avoid concating the empty iterator; another option was re-designing GraphTraits to provide storage and not return a light-weight/copyable range. Another option Dave looked into is improving the concat_iterator for the common case of having 1-2 iterators concatenated. We also looked into re-nesting iterators (the wrapped iterator with the concat/filter iterators) and found incompatibilities with existing by-value iterator implementations.
It looks like the compile-time issue is not as straight-forward for the DT full-cleanup, but we’re actively looking into it.

Alina

The overall idea to provide a cleaner and unified infrastructure is great. The only caveat already mentioned is keeping an eye on the impact on compile-time.

Just to reiterate what Kuba said, I think what D77341 is looking to cleanup is somewhat orthogonal to this RFC. There are aspects that we played with there that may help, in particular potential solutions for avoiding the added compile-time. Vice-versa, it's possible CfgTraits could be used in GraphDiff - I don't have a good picture here yet.
So while the end goals are different, it seems like the scopes of the work can benefit each other :-).

Sounds good.

Deep-diving into some of the compile-time regression: I did an analysis on where the added time was spent with the changes in D77341, and the main cause is merging the runtime check for a nullptr BUI into a single path that doesn't add new children, but still added a concat iterator. This is the context in which we're considering having GraphDiff return a type-erased iterator.
An option discussed was through virtual calls due to the runtime nullptr check, to avoid concating the empty iterator; another option was re-designing GraphTraits to provide storage and *not* return a light-weight/copyable range. Another option Dave looked into is improving the concat_iterator for the common case of having 1-2 iterators concatenated. We also looked into re-nesting iterators (the wrapped iterator with the concat/filter iterators) and found incompatibilities with existing by-value iterator implementations.
It looks like the compile-time issue is not as straight-forward for the DT full-cleanup, but we're actively looking into it.

So I spent some time porting dom tree construction to what I've done
as well. I'm seeing improved # instrs on opt -O3 on Z3, but for the
same mafft/constants.c test that Nikita brought up on D77341, I'm
seeing a 1.3% increase in the number of instructions. Not quite as bad
as those 2%, but still uncomfortably high. I haven't had a chance to
dig into it further yet, but just in case you're interested, the (very
WIP) branch on which I've done the translation is at
GitHub - nhaehnle/llvm-project at controlflow-wip-v5-domtreeconstruction.

Cheers,
Nicolai