Extending the LLVM dialect InlinerInterface

A revision was recently merged that implements a rudimentary InlinerInterface for the LLVM dialect, using an allowlist approach that accepts pure operations only.
This interface is of course (intentionally) incomplete, since there are many additional cases where inlining would be legal.

For our downstream project, we need to allow inlining of a wider range of LLVM operations. However, since LLVMInlinerInterface is registered during dialect initialization (in LLVMDialect::initialize()), it is not extensible downstream.

Is the long-term vision to extend LLVMInlinerInterface towards the feature set of LLVM’s inliner? Or is this not desirable (e.g., due to the potential maintenance cost)?
If the former is the case, could this be done gradually, via an allowlist approach that adds more features from the LLVM inliner to the upstream interface?
If the latter is the case, we would need to extend or override the interface downstream. Would the right approach be to make the registration optional, rather than part of dialect initialization?



Restricting to “pure” seems overly conservative to me; I don’t see why we should limit it in the long term.
As with most things in MLIR, development is “demand driven”: this was added to support an immediate need, and I see no reason not to extend it further (the comments in ⚙ D139187 Allow inline of all pure ops from the LLVM dialect. seem to support this as well).

What will ultimately be interesting/challenging is having a pluggable cost model for the inliner.


Nobody needed more than just pure operations, which are also intuitively inlinable. I see no point in extending the inliner interface of an upstream dialect downstream; one should update it upstream instead.

As Mehdi says, we will eventually need a cost model for this. However, it can be introduced as an extension of the current interface, e.g., as a set of additional isProfitable methods. The current pass can then be treated as a greedy inliner, and a cost-model-driven inliner can follow later.

LLVM has a module inliner. They call it isMoreDesirable, and of course there will be ML. These are the baby steps towards ML.


As Alex said, before thinking about a cost model to decide what should be inlined, the first steps are to widen support for what can be inlined and to handle side effects and metadata attributes correctly.

Baby steps: ⚙ D141115 [mlir:LLVM] Rudimentary inlining support for LLVM load store.


We have started looking more deeply into inlining LLVM dialect operations that have alias scope metadata attached. LLVM’s inliner clones alias scope metadata (llvm-project/InlineFunction.cpp at dc891846b811cdf945e2cdacd9b227c2d0966e71 · llvm/llvm-project · GitHub) to obtain unique scopes at every call site.

The LLVM dialect models alias scope metadata using a module-level metadata op that contains scope and domain operations in its region. These scope and domain ops are symbols and can be referenced from the entire module. The following simplified example illustrates what alias scope metadata looks like:

module {
  llvm.func @aliasScope(%arg1 : !llvm.ptr<i32>) {
    %0 = llvm.mlir.constant(0 : i32) : i32
    llvm.store %0, %arg1 {
      alias_scopes = [@metadata::@scope1],
      noalias_scopes = [@metadata::@scope2, @metadata::@scope3]
    } : !llvm.ptr<i32>
    llvm.return
  }

  llvm.metadata @metadata {
    llvm.alias_scope_domain @domain { description = "The domain" }
    llvm.alias_scope @scope1 { domain = @domain }
    llvm.alias_scope @scope2 { domain = @domain }
    llvm.alias_scope @scope3 { domain = @domain }
  }
}

When thinking about implementing this with the current inliner interface and inlining pass, we wondered whether this would trigger threading problems.

The inlining pass is a module pass, which in principle allows us to update the global metadata operation. However, the inliner pass implements custom internal parallelism, inlining different SCCs of the call graph in parallel (if we understand it correctly). Assuming we extend the inlining interface to clone the metadata ops, multiple threads may try to concurrently add new operations to the body of the global metadata operation, which we believe is not thread safe.

Are we misunderstanding something, and is there in fact no problem at all? If not, any ideas on how to circumvent the problem?

(This topic is somewhat related to the modeling of cyclic dependencies in debug info (Handling Cyclic Dependencies in Debug Info - #2 by zero9178), since the way metadata is modeled may have implications for thread safety. That said, there would be no problem if the inliner did not exploit SCC parallelism.)


You probably want to consult with @River707 on both the inliner and parallelism.