General question about enabling partial inlining

Hi,

I noticed performance gains in some SPEC benchmarks, without significant code size bloat, when performing partial inlining aggressively, especially when the original callee spills CSRs (callee-saved registers) in the entry block. I guess partial inlining is not enabled by default mainly because of code size. Is there any other issue that prevents the pass from being enabled? Do we have any plan or ongoing work to enable partial inlining?
Thanks,
Jun
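
For readers unfamiliar with the transformation, here is a minimal source-level sketch of the early-return case described above. It is a hand-written illustration, not output from the pass: the names `weight`, `total`, `totalOutlined`, `callerBefore`, and `callerAfter` are all hypothetical, and the sketch assumes `weight` is not itself inlined, so that `sum` and the loop state stay live across a call (the situation in which a register allocator may choose callee-saved registers and save them in the prologue).

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper; stands in for any call made on the heavy path.
std::int64_t weight(std::int64_t x) { return x * x + 1; }

// Original callee: a cheap early-return check followed by a heavier body.
std::int64_t total(const std::vector<std::int64_t> *v) {
  if (v == nullptr || v->empty())
    return 0; // early return: needs none of the state used below
  std::int64_t sum = 0;
  for (std::int64_t x : *v)
    sum += weight(x); // sum and the loop state are live across this call
  return sum;
}

// Hand-written stand-in for the region that would be outlined:
// everything after the early-return check.
std::int64_t totalOutlined(const std::vector<std::int64_t> *v) {
  std::int64_t sum = 0;
  for (std::int64_t x : *v)
    sum += weight(x);
  return sum;
}

// Call site before the transformation: always pays for the full call.
std::int64_t callerBefore(const std::vector<std::int64_t> *v) {
  return total(v) + 1;
}

// Call site after the transformation: the check is inlined, and only the
// heavy path calls the outlined function (and runs its prologue).
std::int64_t callerAfter(const std::vector<std::int64_t> *v) {
  if (v == nullptr || v->empty())
    return 0 + 1;
  return totalOutlined(v) + 1;
}
```

Both call sites compute the same result; the difference is that the early-return path in `callerAfter` never enters a function whose prologue might save callee-saved registers.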

Hi Jun,

We’re actually looking at enhancing the partial inlining pass right now (see http://lists.llvm.org/pipermail/llvm-dev/2017-August/116515.html).

We’d be interested in turning on the pass by default some time in the future, if our enhancements prove beneficial.

Cheers,

Graham Yiu
LLVM Compiler Development
IBM Toronto Software Lab
Office: (905) 413-4077 C2-707/8200/Markham
Email: gyiu@ca.ibm.com

[In reply to the message sent via llvm-dev on 09/13/2017 01:12:02 PM]

Hi Graham,

Thanks for sharing this. Are you planning to enable the pass only with PGO? Even without PGO, I noticed some performance gains when we aggressively partially inline the early-return part, especially when the callee spills CSRs in the entry block. At a high level, I have two questions:

  1. What is the main obstacle that prevents the pass from being enabled by default?
  2. Would it make sense to give a bonus in the cost model when we detect the possibility of spilling CSRs in the entry block?

Thanks,

Jun


Hi Jun,

If we were to enable this by default, I think we’d push to enable it with or without PGO. I’m not sure what’s currently preventing the partial inlining pass from being enabled by default, actually. David Li may have a better sense of this, as he’s been actively working on it most recently.

As for the cost model, right now it’s platform independent, so I’m assuming you’ll need some sort of hook for platform-specific costs/bonuses to detect spilling of different types of registers. Also, I’m not quite sure how you’ll do this type of detection at the IR level. Are you thinking some sort of heuristic?

Graham Yiu

[In reply to Jun Lim, 10/03/2017 12:21:38 PM]

> As for the cost model, right now it’s platform independent, so I’m assuming you’ll need some sort of hook for platform-specific costs/bonuses to detect spilling of different types of registers. Also, I’m not quite sure how you’ll do this type of detection at the IR level. Are you thinking some sort of heuristic?

Agreed, it should be a target-specific hook if doing so makes sense. As far as I can see, the CSRCost used in RegAllocGreedy is extremely low, so in most cases we allocate a CSR if a live range spans a function call. I guess we can estimate this by checking whether a user of a value defined in the entry block is reachable from a block with a function call.
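
The reachability check suggested above could be sketched roughly as follows. This is only an illustration on a toy control-flow graph, not LLVM's API: `Block`, its fields, and `mayNeedCSR` are hypothetical names, and the sketch works at block granularity, so it only considers blocks strictly after a call block and ignores a use that follows the call inside the same block.

```cpp
#include <cstddef>
#include <queue>
#include <vector>

// Toy CFG (an assumption for illustration, not LLVM's data structures).
// Blocks are identified by index; block 0 is the entry block.
struct Block {
  std::vector<std::size_t> succs; // indices of successor blocks
  bool hasCall = false;           // block contains a function call
  bool usesEntryValue = false;    // block uses a value defined in the entry
};

// Returns true if some block that uses an entry-defined value is reachable
// from a successor of a block containing a call. In that case the value is
// likely live across the call and may end up in a callee-saved register.
bool mayNeedCSR(const std::vector<Block> &cfg) {
  std::vector<bool> seen(cfg.size(), false);
  std::queue<std::size_t> work;

  // Seed the worklist with the successors of every call block.
  for (std::size_t b = 0; b < cfg.size(); ++b) {
    if (!cfg[b].hasCall)
      continue;
    for (std::size_t s : cfg[b].succs) {
      if (!seen[s]) {
        seen[s] = true;
        work.push(s);
      }
    }
  }

  // Breadth-first search over everything reachable after a call.
  while (!work.empty()) {
    std::size_t b = work.front();
    work.pop();
    if (cfg[b].usesEntryValue)
      return true;
    for (std::size_t s : cfg[b].succs) {
      if (!seen[s]) {
        seen[s] = true;
        work.push(s);
      }
    }
  }
  return false;
}
```

A real implementation would need to work on actual def-use chains and would still be an estimate, since register allocation decisions are not visible at the IR level.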


> Hi Graham,
>
> Thanks for sharing this. Are you planning to enable the pass only with PGO? Even without PGO, I noticed some performance gains when we aggressively partially inline the early-return part, especially when the callee spills CSRs in the entry block. At a high level, I have two questions:
>
>   1. What is the main obstacle that prevents the pass from being enabled by default?
>   2. Would it make sense to give a bonus in the cost model when we detect the possibility of spilling CSRs in the entry block?

More enhanced shrink-wrapping will probably take care of this, so using partial inlining to do that seems like a wrong motivation :)

David


> Hi Jun,
>
> If we were to enable this by default, I think we'd push to enable it with or without PGO. I'm not sure what's currently preventing the partial inlining pass from being enabled by default, actually. David Li may have a better sense of this, as he's been actively working on it most recently.

Longer term, the partial-inliner should really be just a 'function
outlining pass' that serves as an enabler for more aggressive inlining.
That of course can be enabled by default (assuming good cost
model/heuristics) before the regular inliner.

David



> More enhanced shrink-wrapping will probably take care of this, so using partial inlining to do that seems like a wrong motivation :)

In some cases, shrink-wrapping cannot move the spills and we need to spill in the entry block. In such cases, partial inlining might help avoid executing the CSR spills. For example, I also saw a 10% improvement in SPEC2006/astar when we completely avoid executing CSR spills in the entry block using partial inlining. However, as far as I know, the enhanced shrink-wrapping will enable more partial shrink-wrapping.

