Just thinking out loud… I’m really not familiar with the vast majority
of what instcombine does, so please excuse me if my naiveté is too obvious.
I can’t help but notice all of the uses of ‘and’ in Daniel B’s description
of what instcombine is right now:
>
> FWIW, before getting to "how to do it", I think InstCombine needs an
actual goal.
> Right now it's "do a bunch of instruction combination and
canonicalization and random kinds of semi-local optimization with some
weird analysis and other stuff in the middle.
> Iterate this as hard as we can"
> Nobody can really draw any real dividing line for what should be there
or not except based on how easy or expensive it is to shove it in.
> That's a recipe for pass creep.
This makes me wonder… is it sensible to be talking about splitting up
instcombine into multiple separate passes? Would such a thing even be
possible? For example, split by functionality into separate passes that
each do one of:
* instruction combinations
* canonicalization
* semi-local optimizations
* etc…
The problem is that all of these are really the same thing. Almost all
canonicalizations are also simplifications; the common underlying factor
is that they're mostly-local transformations that are likely to enable
other optimizations. For example, canonicalizing a multiply by a power
of two into a shift both simplifies the instruction locally and exposes
it to later shift-based combines.
Or something like that.
As separate passes, each would probably have a natural way to be
implemented effectively, and those implementations might well differ.
One obstacle to that is that currently instcombine has an internal
fixed-point iteration that it does.
So when splitting it into separate passes, we would need either to add
fixed-point iteration to the pass manager running the separate
instcombine passes (extending the pass manager in this way is doable and
has been explored in the past, e.g. https://www.youtube.com/watch?v=c7iP43an5_Q ),
or to demonstrate/measure that we don't regress without fixed-point
iteration across the separate instcombine passes.
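To make that concrete, here is a minimal sketch of such a driver (the
Pass interface and runToFixedPoint below are hypothetical stand-ins,
not LLVM's actual pass-manager API):

#include <memory>
#include <vector>

// Hypothetical stand-in for a transformation pass; LLVM's real pass
// interfaces look different. run() returns true if anything changed.
struct Pass {
  virtual ~Pass() = default;
  virtual bool run() = 0; // would take the IR unit (e.g. a Function) in practice
};

// Re-run the whole sequence of passes until a full sweep makes no
// change, i.e. until a fixed point is reached. The iteration cap
// guards against pass sets that never settle (e.g. two passes that
// undo each other's work).
bool runToFixedPoint(std::vector<std::unique_ptr<Pass>> &Passes,
                     unsigned MaxIterations = 8) {
  bool EverChanged = false;
  for (unsigned I = 0; I < MaxIterations; ++I) {
    bool Changed = false;
    for (auto &P : Passes)
      Changed |= P->run();
    EverChanged |= Changed;
    if (!Changed)
      break; // fixed point: nothing fired in a full sweep
  }
  return EverChanged;
}

The open question is whether the pass manager itself should provide a
driver of this shape, so that the split-out instcombine passes would
still jointly reach the fixed point that the monolithic pass reaches
today.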
I agree, but I'll add that I view the fixed-point iteration here to be an
asset. It increases the robustness and consistency of the compiler, and
allows later passes to depend on the canonicalization (*) without worrying
about phase-ordering effects (**).
FWIW: I completely agree, at least as you are using it here, which is in
the sense of "the system is built to apply a set of transformations, and
fixed-point iteration ensures that all such transformations that can
occur, do, even if they are only exposed as a result of earlier
transformations".
In that sense, the vast majority of our passes do, or should be, using
fixed-point iteration.
The usual concern is cost, and it requires careful engineering (as you
say below) to ensure you are doing this in an effective manner. I.e.,
ad-hoc solutions work but rarely scale; it takes a good understanding of
theory and engineering to get something that will scale and work really
well.
Integrating multiple transformations/capabilities into a single pass is
much trickier than one would expect. You can easily screw things up by
doing something you think can only make them better.
A simple example from value numbering: a Pastebin snippet whose IR
begins "define void @hoge() { bb: br label %bb1 ... bb1: ...".
Note that if we ignore unreachable edges at first and just shove that
into the algorithm, we no longer get optimal answers in certain cases,
even though we changed exactly one initial state, and we were trying to
make things better!
This is fixable, but it requires a lot of thought to see that this will
happen in the first place.
(The reason in this case is that the algorithm relies on the TOP state
to resolve cycles. There is no perfect comes-before iteration ordering
in this case. By changing the initial state that fed the cycle, without
resetting that cycle to TOP, the algorithm can no longer resolve the
cycle to the optimal answer.)
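To illustrate why TOP matters for cycles, here is a tiny self-contained
sketch (an SCCP-style constant lattice, purely illustrative; this is not
NewGVN's code) that evaluates the cycle "x = phi(0, y); y = x" under two
different seed states:

#include <cstdio>

// A three-level lattice, as in SCCP-style analyses: TOP means
// "optimistically unknown, assume the best"; BOTTOM means "overdefined".
struct Lattice {
  enum Kind { Top, Constant, Bottom } K;
  int Val;
};

// Meet of two lattice values: TOP is the identity, and disagreeing
// constants fall to BOTTOM.
static Lattice meet(Lattice A, Lattice B) {
  if (A.K == Lattice::Top) return B;
  if (B.K == Lattice::Top) return A;
  if (A.K == Lattice::Constant && B.K == Lattice::Constant &&
      A.Val == B.Val)
    return A;
  return {Lattice::Bottom, 0};
}

// Evaluate the cycle  x = phi(0, y); y = x  to a fixed point, starting
// every value in the cycle at the given seed state.
static void evalCycle(Lattice Seed, const char *Name) {
  Lattice X = Seed, Y = Seed;
  for (int I = 0; I < 4; ++I) {          // tiny fixed-point loop
    X = meet({Lattice::Constant, 0}, Y); // x = phi(0, y)
    Y = X;                               // y = x
  }
  std::printf("seed %s: x is %s\n", Name,
              X.K == Lattice::Constant ? "constant 0" : "overdefined");
}

int main() {
  evalCycle({Lattice::Top, 0}, "TOP");       // resolves: x == 0
  evalCycle({Lattice::Bottom, 0}, "BOTTOM"); // stuck: x is overdefined
}

Seeded at TOP, the cycle resolves to the optimal answer (x is the
constant 0); seed even one value in the cycle at anything lower, and the
whole cycle collapses to overdefined, which is exactly the failure mode
described above.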
I'd like to see us do more fixed-point iteration in our optimization
pipeline, and our past work has shown this would be practical (i.e., only a
few iterations will be needed in practice), but even that won't remove the
advantages of using a fixed-point iteration within InstCombine.
Completely agree on this.
Regardless, I think that if we had a model for what InstCombine did (i.e.,
as a graph-rewriting system), then it would be clear what could be part of
that system and what couldn't. Otherwise, I think that the real challenge
here is figuring out the underlying structural foundations to make the
process efficient and sound.
Also completely agree.
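To make "a model" concrete, here is a toy sketch of a rewrite system of
roughly that shape (entirely illustrative; not InstCombine's actual
representation, nor a proposal for one). Rules are local
pattern-to-replacement pairs, a canonicalization rule shrinks the
pattern space the other rules must cover, and a driver applies the set
to a fixed point:

#include <memory>
#include <string>

// A toy expression tree: a variable, a constant, or a binary operator.
struct Expr {
  std::string Op;   // "var", "const", "add", "mul"
  int Val = 0;      // payload for "const"
  std::string Name; // payload for "var"
  std::shared_ptr<Expr> L, R;
};
using ExprRef = std::shared_ptr<Expr>;

static bool isConst(const ExprRef &E, int V) {
  return E && E->Op == "const" && E->Val == V;
}

// One local rewrite step: each rule is a pattern -> replacement pair.
// Returns the rewritten node, or null if no rule matches here.
static ExprRef rewriteOnce(const ExprRef &E) {
  if (!E || !E->L || !E->R)
    return nullptr;
  if (E->Op == "add" && isConst(E->R, 0)) return E->L; // x + 0 -> x
  if (E->Op == "mul" && isConst(E->R, 1)) return E->L; // x * 1 -> x
  if (E->Op == "mul" && isConst(E->R, 0)) return E->R; // x * 0 -> 0
  // Canonicalization: move constants to the right-hand side so the
  // simplifications above need to match only one operand order.
  if ((E->Op == "add" || E->Op == "mul") && E->L->Op == "const" &&
      E->R->Op != "const")
    return std::make_shared<Expr>(Expr{E->Op, 0, "", E->R, E->L});
  return nullptr;
}

// Drive the rule set to a fixed point, bottom-up: a rewrite at one node
// can expose further rewrites, so keep going until nothing fires.
static ExprRef rewrite(ExprRef E) {
  if (!E)
    return E;
  E->L = rewrite(E->L);
  E->R = rewrite(E->R);
  while (ExprRef New = rewriteOnce(E))
    E = rewrite(New);
  return E;
}

With a model of this shape, the dividing-line question gets a crisp
answer: a transformation belongs in the system exactly when it can be
stated as such a rule (and keeps the rule set terminating), and anything
that needs non-local analysis falls outside.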