RFC: Move TOSA Layerwise Constant Optimizations to Their Own Pass

We have a number of canonicalizations in TosaOps.cpp. While many of them are narrowly scoped to just rewrite constants of splats, etc (and at a quick glance, seem “simple” and borderline ok as canonicalizations), we also have the ConstantTransposeOptimization which is not that.

This is doing an loopy rewrite of an ElementsAttr into a DenseElementsAttr. In profiles of compiling bert models with constant weights, this pattern is accounting for ~50% of the compilation time (approx ~21s on my machine) for IREE. Looking at perf, the cost is split between iterating to build the new output values and uniquing the result in the context (both of which are non optimal ways to be doing this but would need a more complicated implementation than what can be done in such a canonicalization pattern to correct).

Leaving aside the utility or wisdom of the implementation, having such an expensive pattern as a canonicalization leaves us with no control for applying it. Also, in an attempt to be cost model neutral, it is relatively restricted in when the optimization is allowed.

I suggest that we move this pattern (and any others like it) to a new “TosaLayerwiseConstantOptimizations” pass so that we can control their application more effectively. We need this control for IREE, since IREE already has a general constant partial evaluation facility that is designed to both bypass these overheads and consider a real cost model when determining how such optimizations apply. Having these apply unconditionally works against that.

Objections?

1 Like

Hello @stellaraccident,

Was already in the process of doing this by as you mentioned first moving over the ConstantTransposeOptimization.
Patch is here ⚙ D124685 [mlir][tosa] Moves constant folding operations out of the Canonicalizer and under review for a while now.

Fantastic - thank you, and glad I asked :slight_smile:

Some of those I’m not sure why they aren’t folders rather than canonicalization patterns. In practice that does not make a difference for canonicalization pass today (which I do have mixed feelings on).

Agreed - the ones constrained to splats seem like they should be folders.

I’d like to take a look through more closely at all of the ones with “Optimize” in the name. Some of them feel like folders to me, while others, like the transpose pattern we are discussing seem like optimizations. In general, I’d like to see the op definitions and lowerings not be responsible for actual “optimizations”.

One of the triggers for me is having folders or canonicalizers that are restricted to hasOneUse – since this is actually a hard-coded/primitive cost model which signals an optimization. Interestingly, ConstantTransposeOptimization does not have that restriction, and in a multi-use context, will result in it not being size-neutral. ReshapeConstOptimization has that restriction, but given that it is only operating on splats, this actually feels to me like it should be a folder and does not actually need to be restricted to hasOneUse.

I think that the diff that George references above is a strict improvement over the current state. How about we land that and then come through and audit/refactor the rest of these?

We’re strongly in favor of optionality here as well and I’d helped @GeorgeARM review this internally - we’ve encountered similar issues in backend codegen when a previously expected form changed overnight. Sometimes this is just a layering or sequencing problem, but sometimes the choice of transformation is backend dependent.