Draft of custom Python dialects

Hey @ftynse and @mehdi_amini, as we were discussing offline, I took a stab at creating some of the machinery for overlaying generated python dialect/ops code onto the mlir native extension: https://reviews.llvm.org/D89932

This is mostly plumbing, but the test shows how you would navigate down to a dialect in the context, getting a dynamically loaded Dialect extension class that extends the core Dialect type and can have more capabilities stuffed into it.

The idea is that you can get custom dialects with something like:

ctx.dialect.std

And this would dynamically resolve to a Dialect class in an extension overlay (the search path for which can be augmented at runtime). The module holding that class would also have classes for each of the known ops and would use some meta-programming-ish decorators to tie them back to the Dialect, so that you could ultimately do something like the following (we don’t have a builder facility in the Python API yet, and we need to define one for this to be more than an example):

ctx.dialect.std.AddFOp.build(loc, lhs, rhs)

The AddFOp class would then also implement instance properties for accessing named operands/results/attributes/etc. We can similarly extend the query API to find the appropriate extension Op class and return it in place of a base Operation instance… or something like that (that will require some fiddling to get right).

I think that regardless of where we go on the actual extension code, tablegen integration, etc., we need a facility like this, and I’m curious what you all think of the approach.

In addition, we need to decide how we want to implement builders in order to get much further.

I think the extension mechanism is a neat idea, although I wouldn’t be shocked if I needed to write

import mlir
import mlir.dialects.std
import mlir.dialects.scf
...

to explicitly load the necessary dialects.

I am not super-convinced by the builder syntax; it has too much redundancy and too many dots for my taste. A thing that stands out for me is dialect being a field of ctx; I would expect to fetch the context from loc. I understand there needs to be a hook somewhere to trigger the extension loading mechanism, and that we want to load dialects into a specific context… Maybe we can later wrap it in something nicer using pure Python.

Wouldn’t we do:

std = ctx.dialect.std
add_op = std.AddFOp.build(loc, lhs, rhs)
# Alternatively:
add_op = std.AddFOp(loc, lhs, rhs)

Now that you mention it, it seems fairly obvious. I’m in favor of switching to the direct import syntax, and then re-purposing the current “extension” mechanism I proposed to instead be a list of parent modules to search for dialect implementation modules. I think we need that connection point to handle the query case (i.e. returning a typed Op class when traversing the IR), but on further inspection, I don’t believe wiring it up that way provides much benefit for building, which, as you say, can just have people directly importing and grabbing the class they want.

We can provide some helpers/decorators/base classes to make it nicer under the covers.

I think the bigger question is what the user API for these typed build functions should be. Ideally, without too much fuss, I should be able both to create an op without attaching it and to have some kind of facility for tracking the insertion point like OpBuilder does. Maybe some spooky mixture of the two using with blocks…

from mlir.ir import *
from mlir.dialects.std import *

add_op = AddFOp(loc, lhs, rhs)  # Only creates - does not insert

ip = InsertionPoint(some_other_op)
add_op = AddFOp(loc, lhs, rhs, ip=ip)  # Inserts/moves the insertion point

with InsertionPoint(some_other_op):
  add_op = AddFOp(loc, lhs, rhs)  # Inserts/moves the implicit insertion point

I don’t see a lot of value in having a dedicated OpBuilder-like thing as we have in C++, since most of the functionality is already easily accessible in the Python API, except for insertion point management.

We also have the complexity that function overloading is not really compatible with the way C++ build functions work. Ideally, I would like to avoid a lot of signature mangling at the Python level to try to resolve an overload. I think I’d rather just have the constructor be responsible for the “default” build entry-point, possibly overloading based on optional parameters that have defaults. Other build methods, imo, are better as auxiliary static methods. I know that the ODS setup assumes C++-style overloading, and I’m not sure what to do about that. We should probably have the low-level, fully parameterized constructor accessible under a common static function name, as I suspect that higher-level DSLs will want to use that exclusively rather than relying on whatever sugared form we present as the default user entry-point.
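
To make that concrete, here is a rough sketch of the split I have in mind; the build_generic name and the optional result_type parameter are hypothetical, and it assumes an OpView base class and a create_operation entry point on the context:

from mlir.ir import *

class AddFOp(OpView):
  OPERATION_NAME = "std.addf"

  def __init__(self, loc, lhs, rhs, result_type=None):
    # Sugared "default" entry point: infer the result type from lhs
    # unless the optional parameter overrides it.
    result_type = result_type if result_type is not None else lhs.type
    super().__init__(loc.context.create_operation(
        "std.addf", loc, operands=[lhs, rhs], results=[result_type]))

  @staticmethod
  def build_generic(loc, operands, result_types):
    # Fully parameterized, low-level form under a common static name,
    # for higher-level DSLs that want to bypass the sugared form.
    return loc.context.create_operation(
        "std.addf", loc, operands=operands, results=result_types)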

Finally, having used this style of API before, I’ve found that the most frequent bug I encountered, no matter how many times I tried to remember, was due to the lack of the implicit conversion we have on the C++ side from an Op to the Value of its first result. The number of times I accidentally ended up passing an Op where a Value belongs and getting an obscure error was really annoying. Example:

intermediate_value = SomeOp(loc, ...)
result = OtherOp(loc, intermediate_value)   # intermediate_value is actually still an `Op`
return result  # Oops. Just returned an Op from a function that is almost always supposed to return a Value

We could probably get around this by having some kind of protocol, applied consistently, for converting to a value (like to_value()) anywhere in the API where a Value is expected. It has annoying corner cases, though.
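
For illustration, one shape such a hook could take; the _to_value helper name is hypothetical, and it assumes a Value class and op views that expose .operation.results:

def _to_value(x):
  # Hypothetical coercion applied anywhere the API expects a Value:
  # pass Values through, unwrap single-result ops, and fail loudly at
  # the call site instead of producing an obscure downstream error.
  if isinstance(x, Value):
    return x
  results = x.operation.results
  if len(results) != 1:
    raise ValueError(f"Expected a Value or a single-result op, got {x}")
  return results[0]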

I’ll update the patch to the explicit import form that Alex points out and simplify the extension machinery so that it just fulfills the purpose of finding a custom Op class given a dialect/operation name, which should satisfy the traversal cases. I’d like to reach some consensus on the builder syntax before writing too many versions of that, though.

I like this variant :slight_smile:

Maybe I’m missing something, but isn’t it getResult()? It also assumes that you have a single result in this form.

Yes, it is getResult(). In a previous Python binding, I just called it the result property, but it’s the same thing. My only point was that I found this to be the most error-prone part of using the API. Without static typing, the mismatch can “travel” quite a way and lead to surprisingly hard-to-debug errors. I just want to see if we can make this better.

Ok, I think I got it all plumbed to work. Updated the patch: https://reviews.llvm.org/D89932 (tests not updated yet). The patch contains quite a few things that might appear unrelated but that I bumped up against and need to land together.

I don’t have the insertion point stuff in yet (that is fairly separate, and I can do it in a follow-on).

Example:

# Boiler-plate (much of this can be simplified with additional sugar or further patches)
from mlir.ir import *
ctx = Context()
ctx.allow_unregistered_dialects = True
loc = ctx.get_unknown_location()
i32 = IntegerType.get_signed(ctx, 32)
m = ctx.create_module(loc)
m_block = m.operation.regions[0].blocks[0]
ip = 0
def createIntOp():
  global ip
  op = ctx.create_operation(
      "foo.intop", loc, results=[i32])
  m_block.operations.insert(ip, op)
  ip += 1
  return op
lhs = createIntOp().results[0]
rhs = createIntOp().results[0]

### Main point of this example:
# Note: Can also "from mlir.dialects import std"
std = ctx.d.std
AddFOp = std.AddFOp
### Construct a known op via constructor.
addf_op = AddFOp(loc, lhs, rhs)
m_block.operations.insert(ip, addf_op.operation)
print("Constructed op:", addf_op)
print(f"Operands: lhs={addf_op.lhs}, rhs={addf_op.rhs}")
print(f"Result: {addf_op.result}")

# Use the Raw version to cast the operation to the AddFOp view.
# This form will be used under the covers for casting/isa and
# returning an appropriate view from APIs that return operations.
cast_op = AddFOp._Raw(addf_op.operation)  # TODO: Make this implicit?
print("Casted op:", cast_op)
print(f"Operands: lhs={cast_op.lhs}, rhs={cast_op.rhs}")
print(f"Result: {cast_op.result}")
print(f"Module:\n{m}")
###

I ended up enabling either the import-oriented flow (i.e. from mlir.dialects import std) or the context-oriented one, which just flowed out naturally (i.e. ctx.d.std.AddFOp).

Here is an example Python module for the std.addf op (the goal is to tablegen something like this someday):

from . import _cext

@_cext.register_dialect
class _Dialect(_cext.ir.Dialect):
  DIALECT_NAMESPACE = "std"

@_cext.register_operation(_Dialect)
class AddFOp(_cext.ir.OpView):
  OPERATION_NAME = "std.addf"

  def __init__(self, loc, lhs, rhs):
    super().__init__(loc.context.create_operation(
      "std.addf", loc, operands=[lhs, rhs], results=[lhs.type]))

  @property
  def lhs(self):
    return self.operation.operands[0]

  @property
  def rhs(self):
    return self.operation.operands[1]

  @property
  def result(self):
    return self.operation.results[0]

You’ve probably seen my affection for context managers, so I’d be happy with something like what you describe below. This may scale to blocks and ops with regions:

block = ctx.Block([FloatType.getF32(), FloatType.getF32()]) 
with InsertAtEndOf(block):
  add_op = AddFOp(loc, block.args[0], block.args[1])

I am also open to having an explicit “context” variable if that helps.

with InsertionPoint(other_op) as ictx:
  ictx.AddFOp(loc, lhs, rhs)

IMO, OpBuilder is an insertion point manager plus an extension point via hooks/listeners. If we don’t plan on exposing something like ConversionPatternRewriter, we won’t need it.

We’d need to define what “default” means.

Given that we start from ODS, I am tempted to emit the overload-resolution code that looks at Python types and dispatches to one of the possible versions.
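
For example, the generated dispatch could be a simple type switch; a hypothetical sketch, with the overload set invented for illustration and Type/create_operation assumed from the bindings:

def _addf_build_dispatch(loc, *args):
  # Hypothetical generated code: inspect the Python types of the
  # arguments and select the matching ODS build variant.
  if len(args) == 3 and isinstance(args[0], Type):
    result_type, lhs, rhs = args                   # build(resultType, lhs, rhs)
  elif len(args) == 2:
    (lhs, rhs), result_type = args, args[0].type   # build(lhs, rhs)
  else:
    raise TypeError(f"No matching AddFOp build overload for {args!r}")
  return loc.context.create_operation(
      "std.addf", loc, operands=[lhs, rhs], results=[result_type])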

Can this be caught by type annotations? (Actually, can we have type annotations on pybind methods?)

This looks mostly good to me.

One mild concern is using the indexed insertion, which will have to traverse the entire block from the start every time. We added insertAfter/insertBefore to work around this, IIRC.

That is just an artifact of stripping the example down once I realized I would want to address the insertion point management properly in a follow-up.

I’ll finish the patch and send out.

Good question that I don’t know the answer to off hand.

I think what I am actually reaching for is some kind of OpConvertible and ValueConvertible “protocol” (which would just manifest as casting through .to_operation() and .to_value() methods). Given that, with this patch, I need something to implicitly convert from the new OpView classes, I likely need that for operations anyway. Adding it for things that convert to a Value would be nice, and it would give us a nice place to emit an exact error if the constraints are violated (!= 1 result, for example).
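
As a sketch of what that protocol could look like (class and method names are tentative; OpView is the base from the patch):

class OpConvertible:
  def to_operation(self):
    raise NotImplementedError

class ValueConvertible:
  def to_value(self):
    raise NotImplementedError

class AddFOp(OpView, OpConvertible, ValueConvertible):
  OPERATION_NAME = "std.addf"

  def to_operation(self):
    # Op views just hand back the wrapped Operation.
    return self.operation

  def to_value(self):
    # The exactly-one-result constraint is checked here, so a violation
    # produces a precise error instead of one far downstream.
    results = self.operation.results
    if len(results) != 1:
      raise ValueError("to_value() requires an op with exactly one result")
    return results[0]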

I’m not opposed, but I just haven’t really become a tablegen expert for this stuff. If I get it, build-wise, to the level of “generate constructor here”, is this something you can help with?

Then let’s stick with a simple InsertionPoint facility. That can generalize to a full builder in the future if ever needed (I don’t think it will get there).

I’ve been known to flirt with context managers myself :slight_smile: (sometimes too much, so I usually approach cautiously and make sure I’ve reduced it to the heart of the matter before committing).

In this case, I can’t think of a good reason not to handle the insertion point this way: every alternative clutters the API or implementation quite a bit. I’ll put some more care into the implementation so it is self-consistent (i.e. I typically prefer that using a helper like InsertionPoint as a context manager be sugar for what you can do explicitly if you desire). Example:

ip = InsertAtEndOfBlock(b)
ip.insert(AddFOp(...))

For completeness, we’ll also need a with DisableInsertionPoint(): if going with the context manager approach.
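
A minimal sketch of that duality, where the with form is just sugar over the explicit insert(); the implementation details, including the _current slot, are illustrative only:

class InsertionPoint:
  _current = None  # a thread-local stack in a real implementation

  def __init__(self, block):
    self.block = block

  def insert(self, op_view):
    # Explicit form: append the op at this point in the block.
    self.block.operations.insert(len(self.block.operations),
                                 op_view.operation)

  def __enter__(self):
    # Sugar: ops constructed inside the with block pick this up as
    # the implicit insertion point.
    self._prev, InsertionPoint._current = InsertionPoint._current, self
    return self

  def __exit__(self, *exc):
    InsertionPoint._current = self._prev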

I vote for doing it this way, starting with Operation. We are currently cheating with Block and Region by not allowing them to be created detached. That is likely defensible for Region, but we should revisit it for Block if taking this approach.

On this topic, could we eliminate the explicit context in most places? Could a manager be used in such cases, for example:

i32 = IntegerType.get_signed(ctx, 32)
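
Something like the following is what I have in mind; purely hypothetical, since neither DefaultContext nor a ctx-less get_signed overload exists in the patch:

_default_ctx = None

class DefaultContext:
  def __init__(self, ctx):
    self.ctx = ctx

  def __enter__(self):
    # Install ctx as the ambient default for the duration of the block.
    global _default_ctx
    self._prev, _default_ctx = _default_ctx, self.ctx
    return self.ctx

  def __exit__(self, *exc):
    global _default_ctx
    _default_ctx = self._prev

with DefaultContext(ctx):
  i32 = IntegerType.get_signed(32)  # hypothetical ctx-less overload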

Yay, we’re at the point where we can talk about ergonomics/sugar :slight_smile:

I’m not opposed; I just want to get a bit more functionally complete, have a look, and make adjustments. There’s a lot of low-hanging fruit like this.

Sent for review: https://reviews.llvm.org/D89932