Inliner cost model

_sean_silva · March 11, 2021, 3:01am

Are there any plans/directions for the inliner cost model? Looking at the code it seems to just inline all feasible calls.

I have a program where the current inliner cost model results in significant code size bloat (about 8x). (IR before inlining + call count stats). Happy to help implement when this starts to block me. Also wondering if other folks in the community are running into inliner cost model issues.

cc @River707

_sean_silva · March 11, 2021, 11:38pm

River and I had a nice chat offline. Just a quick braindump of some stuff that came up:

cost model for inlining, especially in high level dialects is really hard. It seems to depend transitively on a potentially vast number of later lowering decisions.
code-size based heuristics (like LLVM):
- Consider: high level ops or similar library calls in high level (pytorch/tf/numpy) frameworks such as “add two tensors, which might have any shape and dtype” – which data types? what shapes to support? codegen size might vary wildly based on that, but calling out to a library might be tiny either way!
- Consider: linalg-on-tensors ops. Vastly different code size depending on fusion. E.g. if an “add” is fused, then it is just one “add” machine instruction, but if it isn’t, then it is at least a loop around an “add”, but depending on the runtime and device, it might expand into queuing a GPU grid dispatch, etc. which is way more.
performance-based heuristics are totally dependent on the backend/runtime, and require similar sorts of thinking as code-size
- would inlining enable fusion?
- would it enable intraprocedural shape inference to do a better job?
- does the function have context-dependent shapes and is beneficial to specialize?
inliner can do some amount of abstract interpretation of the callee to find unreachable blocks and adjust cost accordingly (e.g. prove we are only in the “fast” push_back case). LLVM does this, but it is unclear how to extend it to the MLIR level of generality.
At some level we’ll need an interface where we can plug in machine learning to learn good cost models. Maybe that’s the way out.
Some trivial “always beneficial” inlining is possible to do, like identity functions and private functions with only one callsite.

clattner · March 11, 2021, 11:59pm

I think that this has got to be an opinterface sort of thing. You might want to look at llvm::TargetTransformInfo::getInstructionCost as one example existing implementation of this idea.

I’d specifically recommend splitting the “measuring an operation” in code size or performance or whatever, from the policy of “should I inline a function containing a bunch of operations of some size”

River707 · March 12, 2021, 12:21am

From my perspective, and I think Sean touched on this, there can’t feasibly be one MLIR Inliner cost model. I just don’t see how we can define a reasonable model that crosses all possible domains. What I had in mind is that we provide an abstract CostModel class that clients can override to determine when to inline or not. In some abstractions, this can be based on code size using some dialect or target interface. In others, it can be based on some analysis that is performed by the frontend. In other cases it can be an ML model that does some seemingly random computation to compute a value score for the function. The point of all of this being, I agree with you and I don’t think any kind of “cost” notion is going to scale to all MLIR users. We may have several “default” cost models that users can pick from that use some combination of interfaces (and what not), but I don’t think the inliner should embed any of this into its core logic.

– River

_sean_silva · March 12, 2021, 2:23am

Yup, that seems right to me. I don’t think it is precisely an OpInterface though, because in general it is not a property of the Op, but a property of the “target”. (i.e. it is TTI::getInstructionCost not FSub::getInstructionCost). We don’t really have a “target” abstraction (which for MLIR needs to cover so many layers potentially). So we’re probably looking more like a “plugin” or InlinerCostModelBase to inherit rather than OpInterface, which a user will configure based on their understanding of their pipeline setup.

Also, it probably needs to reason at a granularity larger than a single instruction, to enable parameterization sufficient for ML model heuristics to do a great job.

clattner · March 12, 2021, 5:11am

Sure, but that isn’t why LLVM is putting this in its target libraries. This is done because llvm intrinsics are the “logical extension point” for targets, and the mid-level optimizer wants to be able to do queries without enumerating all the targets.

That is the MLIR related problem I was referring to. The “ops can lower to different ways on different targets” problem is going to be intractable in general I think, particularly for something like a tensor op. An accurate cost model is going to be nuts.

-Chris

PeimingLiu · March 21, 2023, 5:09pm

So, what is the correct thing to do to avoid unconditional inlining?

We, the sparse compiler team, are also facing the same issue as the sparse compiler deliberately generates a few complex function calls for operations like sparse tensor insert/sort.

If there is no such a way, can we introduce a filter callback to allow users to filter out call sites that should not be inlined at least? (similar to L102@MLIR: include/mlir/Dialect/Bufferization/IR/BufferizableOpInterface.h Source File)

tschuett · March 21, 2023, 5:29pm

Could I do something like the following:

addCostModel<MyDialectCostModel>();

or

addTarget<MyDialectTarget>();

when I am initialising my dialect?

szakharin · January 31, 2024, 11:51pm

I am also interested in investigating the possibility of adding an inlining cost model in our compiler. Would it make sense to parameterize the existing MLIR inliner pass with an instance of an InlinerCostModelBase that would provide hooks answering an inlining profitability question for a certain call site (probably, given a wider context of the CallGraph as well)? That interface can then be used around shouldInline on top or instead some of the checks that are currently there.

stellaraccident · February 1, 2024, 6:03am

I’d love to see this. I’ve needed it a few times but never enough to do anything about it.

szakharin · February 2, 2024, 6:20am

Thanks for expressing the interest! Let me review the current state of the things in MLIR inliner pass and come up with some proposal around next week.

stellaraccident · February 2, 2024, 6:08pm

Thanks. I’ve often wished that instead of an op/identity local decision on whether to inline, we had an easy mechanism to construct a pass with a custom cost model (where the cost model could assign an “infinite” cost that disables inlining). I think it is mainly a job of prying some of the infra apart and making it a bit more pluggable.

mehdi_amini · February 2, 2024, 8:23pm

This has always been the plan actually: the « inliner library » exposes the interface for legality.
The inliner implements the logic to drive all this, but we punted on exposing the hooks for the profitability on the inliner until the concrete customers impact materialize (user driven development as often) so that requirements are better defined and someone volunteer to design it

The default « inline » pass should probably be renamed a « always-inline » (no cost model) and then folks can just wrap the inliner in their own pass injecting a cost model (we could also build such thing upstream for a generic enough cost model of course, and we’ll need at least a test pass)

szakharin · March 6, 2024, 11:48pm

Based on the recent refactoring of the MLIR inliner pass, I suggest adding a profitability callback to the Inliner implementation in [RFC][mlir] Add profitability callback to the Inliner. by vzakhari · Pull Request #84258 · llvm/llvm-project · GitHub

Please review and comment.

stellaraccident · March 7, 2024, 3:36am

Thank you for the patch - very appreciated. I’ll have a look in a bit, but it also looks like Mehdi had already taken a couple of passes.

Topic		Replies	Views
LLVM Inliner LLVM Dev List Archives	25	235	November 30, 2010
Path forward on profile guided inlining? LLVM Dev List Archives	1	118	September 1, 2015
[GSoC] Inlining: using locally optimal inlining decisions LLVM Dev List Archives	1	135	May 25, 2021
Extending the LLVM dialect InlinerInterface MLIR	6	678	January 16, 2023
[RFC] Target Description and Cost Model in MLIR MLIR	14	1127	March 13, 2024

Inliner cost model

Related topics