[GSoC] Inlining: using locally optimal inlining decisions

Dear All,

I’m Liqiang Tao, a graduate student at the University of Tokyo, Japan. You can call me Liq or Tao, which is easy to pronounce I think.
My project is about exploring whether it is helpful to come up with locally optimal inlining decisions with the profile, which generalize the existing LLVM inline deferral logic.
You could access my full proposal here - https://docs.google.com/document/d/1nhJKSdPTtWzTbAsY3vvTlXbLZFI9GQMnU5F4d-3JLWc/edit#heading=h.us242diaa3ay
I think it would be an exciting summer with my great mentors Kazu Hirata and Mircea Trofin.

Best Wishes,
Liqiang Tao

Hi Liq,

This is a cool project, good luck!

At Azul we spent quite a bit of time looking into inlining. At this point we run a custom inliner pass which is similar to the paper in many aspects.

This is a loaded paper as it touches multiple aspects of inlining. Some of the aspects are applicable to the LLVM inliner others are not. I’m curious what you mean by porting this algorithm to the LLVM inliner. These are the main aspects as I remember them.

  • In the context of just-in-time compilation the full call graph is not available to the inliner. Because of that inliners in just-in-time compilers tend to be top-down. A part of this paper is heuristics on how to explore the call graph starting from a single top level method. This is not relevant to the LLVM inliner because it has the call graph available.

  • The clustering aspect. This can be viewed as a simulation of bottom up inlining without actually inlining. Clustering is performed in bottom-up order on the subgraph explored by the heuristics mentioned above. Bottom-up inlining achieves very similar results by representing clusters as functions with inlined callees. So I am not sure that this aspect is relevant for the LLVM inliner.

  • The cost/benefit model. They do cost/benefit analysis by specializing callees and propagating the facts from the callers into the specialized functions. Then they count how many simplifications happened after this specialization. LLVM does something similar in the inline cost analysis but without specializing the functions. Graal’s inliner propagates the facts to the callees, runs the simplifications and then observes how many simplifications took place. LLVM’s inliner looks at the callees and the facts available in the caller, and analyzes what kind of simplifications would happen once these facts are available after inlining. The inline cost analysis in LLVM has a very rudimentary benefit analysis. This aspect can be improved.

  • Deep inlining trials. Graal’s inliner propagates facts into callees multiple levels deep. For example, if a constant is passed as an agreement into a call foo and foo passes this argument into bar where the value is actually used, the constant will be propagated all the way to bar. This makes it possible to recognize simplifications multiple levels deep. This is a very interesting ability. Note that a bottom up inliner might be less sensitive to this, because it tends lift the code up to the callers, effectively compressing deep call chains. Because of that a bottom-up inliner can sometimes get away with a single level inlining trial.

You mention grouping locally related functions in the document. Looks like this is mostly about improving the cost/benefit model. Do I understand it right?