[RFC] Building MLIR operation: observed caveats and proposed solution

ggladilov · July 5, 2025, 12:45am

Hello, MLIR community!

I’d like to discuss with you some observations and ideas on writing C++ code in a MLIR-based compiler to create operations. I’m looking forward to learn from your feedback and even maybe discuss the proposal if deemed useful.

I do have some noticeable amount of code below, hope that’s not a problem as I wanted to get feedback before even getting close to build a patch for review. Suggestions for better format of discussion are welcome.

Note: code presented below does not assume using specific code-style or naming

Problem statement

Inconsistency

To illustrate the problem let me go through a few examples:

Let’s start with simple op defined in tablegen:

def TestOp : SomeDialect_Op<"test_op"> {
  let arguments = (ins
    I32:$requiredOperand0,
    Optional<I32>:$optionalOperand1
  );
  let results = (outs I32);
}

Here we have 2 operands, first requiredOperand0, then optionalOperand1. We would be able to build an instance of this op using following C++ code:

builder.create<TestOp>(
  location, 
  resultType,
  someValue
);

So far, everything is nice, we were able to provide single mlir::Value (someValue) that would become our requiredOperand0.

Looking at generated static member-functions for this TestBuilderOp we see:

static void build(::mlir::OpBuilder &odsBuilder, ::mlir::OperationState &odsState, ::mlir::Type resultType0, ::mlir::Value requiredOperand0, /*optional*/::mlir::Value optionalOperand1);
static void build(::mlir::OpBuilder &odsBuilder, ::mlir::OperationState &odsState, ::mlir::TypeRange resultTypes, ::mlir::Value requiredOperand0, /*optional*/::mlir::Value optionalOperand1);
static void build(::mlir::OpBuilder &, ::mlir::OperationState &odsState, ::mlir::TypeRange resultTypes, ::mlir::ValueRange operands, ::llvm::ArrayRef<::mlir::NamedAttribute> attributes = {});

What was surprising to me is optionalOperand1 did not get default value generated in any of the signatures above. Our C++ builder.create call worked because the last overload was picked with implicit conversion from mlir::Value to mlir::ValueRange.

Another even more surprising behavior to me was that this would work even with following tablegen code:

def TestOp : SomeDialect_Op<"test_op"> {
  let arguments = (ins
    Optional<I32>:$optionalOperand1,
    I32:$requiredOperand0
  );
  let results = (outs I32);
}

We swapped operands and now optionalOperand1 is the first one, which leads to following build static member-functions generated:

static void build(::mlir::OpBuilder &odsBuilder, ::mlir::OperationState &odsState, ::mlir::Type resultType0, /*optional*/::mlir::Value optionalOperand1, ::mlir::Value requiredOperand0);
static void build(::mlir::OpBuilder &odsBuilder, ::mlir::OperationState &odsState, ::mlir::TypeRange resultTypes, /*optional*/::mlir::Value optionalOperand1, ::mlir::Value requiredOperand0);
static void build(::mlir::OpBuilder &, ::mlir::OperationState &odsState, ::mlir::TypeRange resultTypes, ::mlir::ValueRange operands, ::llvm::ArrayRef<::mlir::NamedAttribute> attributes = {});

The reason it still works and the same builder.create call assigns mlir::Value to requiredOperand0, not optionalOperand1 is implementation of getODSOperandIndexAndLength that accounts for mix of variadic and non-varadic operands.

However, there’s some inconsistency IMO, because changing tablegen code doesn’t impact the behavior of builder.create call above, but impacts the behavior of case when both mlir::Value are provided:

builder.create<TestOp>(
  location, 
  resultType, 
  someValue1, 
  someValue2
);

In this case with the last tablegen sample where optionalOperand1 is specified before requiredOperand0, someValue1 is assigned to optionalOperand1 and someValue2 is assigned to requiredOperand0. With the first tablegen sample the behavior is different and someValue1 is assigned to requiredOperand0 instead.

Another tricky moment is picking generic build overload for builder.create calls that try set only required operands does not always work.

If we have more than one requried operand:

def TestOp : SomeDialect_Op<"test_op"> {
  let arguments = (ins
    I32:$requiredOperand0,
    I32:$requiredOperand1,
    Optional<I32>:$optionalOperand2
  );
  let results = (outs I32);
}

We get following build overloads:

static void build(::mlir::OpBuilder &odsBuilder, ::mlir::OperationState &odsState, ::mlir::Type resultType0, ::mlir::Value requiredOperand0, ::mlir::Value requiredOperand1, /*optional*/::mlir::Value optionalOperand2);
static void build(::mlir::OpBuilder &odsBuilder, ::mlir::OperationState &odsState, ::mlir::TypeRange resultTypes, ::mlir::Value requiredOperand0, ::mlir::Value requiredOperand1, /*optional*/::mlir::Value optionalOperand2);
static void build(::mlir::OpBuilder &, ::mlir::OperationState &odsState, ::mlir::TypeRange resultTypes, ::mlir::ValueRange operands, ::llvm::ArrayRef<::mlir::NamedAttribute> attributes = {});

And now just passing 2 mlir::Value objects to set 2 required operands won’t work anymore:

builder.create<TestOp>(
  location, 
  resultType, 
  someValue1, 
  someValue2
);

We cannot pick first 2 overloads of build because they require more operands (all of them require optionalOperand2) and the last one (generic) cannot be picked because only someValue1 would be implicitly converted mlir::ValueRange, the someValue2 would try to convert to argument for attributes and fail.

In order to make builder.create call above to work we need explicitly pass mlir::ValueRange:

builder.create<TestOp>(
  location, 
  resultType, 
  mlir::ValueRange{someValue1, someValue2}
);

While default value for corresponding argument of build static member-function is not generated for Optional or OptionalAttr, it is generated for UnitAttr and DefaultValuedAttr.

Error handling

Other things that I would like to see improved are related to error handling. First scenario is the case when incorrect amount of arguments were passed during op creation. E.g. for the last case, if we do:

builder.create<TestOp>(
  location, 
  resultType, 
  someValue1
);

// or

builder.create<TestOp>(
  location, 
  resultType, 
  mlir::ValueRange{someValue1}
);

That would trigger an error, since there’s no mlir::Value provided for requiredOperand1, however the error is caught at run-time:

Ops.cpp.inc:1637: static void mlir::xxx::TestOp::build(::mlir::OpBuilder &, ::mlir::OperationState &, ::mlir::TypeRange, ::mlir::ValueRange, ::llvm::ArrayRef< ::mlir::NamedAttribute>): Assertion `operands.size() >= 2u && "mismatched number of parameters"' failed.

Second error scenario is when we do get compile-time error, but the error message is somewhat indirect. Consider this:

def TestOp : SomeDialect_Op<"test_op"> {
  let arguments = (ins
    I32:$requiredOperand0,
    I32:$requiredOperand1,
    I32:$requiredOperand2
  );
  let results = (outs I32);
}

// code generated for this

static void build(::mlir::OpBuilder &odsBuilder, ::mlir::OperationState &odsState, ::mlir::Type resultType0, ::mlir::Value requiredOperand0, ::mlir::Value requiredOperand1, ::mlir::Value requiredOperand2);
static void build(::mlir::OpBuilder &odsBuilder, ::mlir::OperationState &odsState, ::mlir::TypeRange resultTypes, ::mlir::Value requiredOperand0, ::mlir::Value requiredOperand1, ::mlir::Value requiredOperand2);
static void build(::mlir::OpBuilder &, ::mlir::OperationState &odsState, ::mlir::TypeRange resultTypes, ::mlir::ValueRange operands, ::llvm::ArrayRef<::mlir::NamedAttribute> attributes = {});

When we will try to compile

builder.create<TestOp>(
  location, 
  resultType, 
  someValue1.getResult(), 
  someValue2.getResult()
);

We will get following compiler error:

llvm-project/mlir/include/mlir/IR/Builders.h:464:5: error: no matching function for call to 'build'
  464 |     OpTy::build(*this, state, std::forward<Args>(args)...);
      |     ^~~~~~~~~~~
note: in instantiation of function template specialization 'mlir::OpBuilder::create<mlir::xxx::TestOp, mlir::IntegerType &, mlir::Value, mlir::Value>' requested here
  205 |       auto testOp = builder.create<TestOp>(
      |                             ^
note: candidate function not viable: no known conversion from 'mlir::Value' to '::llvm::ArrayRef< ::mlir::NamedAttribute>' for 5th argument
 1023 |   static void build(::mlir::OpBuilder &, ::mlir::OperationState &odsState, ::mlir::TypeRange resultTypes, ::mlir::ValueRange operands, ::llvm::ArrayRef<::mlir::NamedAttribute> attributes = {});
      |               ^                                                                                                                        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
note: candidate function not viable: requires 6 arguments, but 5 were provided
 1021 |   static void build(::mlir::OpBuilder &odsBuilder, ::mlir::OperationState &odsState, ::mlir::Type resultType0, ::mlir::Value requiredOperand0, ::mlir::Value requiredOperand1, ::mlir::Value requiredOperand2);
      |               ^     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
note: candidate function not viable: requires 6 arguments, but 5 were provided
 1022 |   static void build(::mlir::OpBuilder &odsBuilder, ::mlir::OperationState &odsState, ::mlir::TypeRange resultTypes, ::mlir::Value requiredOperand0, ::mlir::Value requiredOperand1, ::mlir::Value requiredOperand2);

It does tell us what is wrong (amount of passed arguments), but error message is arguably not easy to read (it does get worse with more build overloads provided by the user). Also, it cannot tell us which operand is missing exactly.

More realistic example inspired by a real codebase

Some modified sample of code in real project that motivated me to write this post:

builder.create<OpType>(
  builder.getUnknownLoc(),
  indexAttr, 
  /*taskLocation*/ nullptr, 
  barrierConfigConstOp.getOutput(),
  bufferOp.getBuffer(), 
  previousTask, 
  waitBarriers, 
  updateBarriers,
  /*startAfter*/ 0,
  /*cleanAfter*/ 0, 
  false, 
  false, 
  false, 
  0, 
  DMAAccMode::DISABLE, 
  nullptr, 
  nullptr,
  /*transactionAttr*/ nullptr, 
  dmaDescriptorAttr, 
  nullptr, 
  nullptr, 
  false, 
  nullptr, 
  enqueueBarrier,
  wlmPageAttr
);

Clearly, there’s a design problem of OpType having too many arguments, but I want to discuss different topic. This sample is used as just an illustration that situation can actually get extreme.

Note: even this specific example can get worse if more arguments have default value, like 0, nullptr or false. MLIR generates multiple build overloads that different in accepting either underlying type like bool or mlir::Attribute. So even if developer got arguments count right, there’s still a problem of getting to the right overload by e.g. using false instead of nullptr for UnitAttr (or vice versa) for different arguments.

So, to conclude so far following problems were observed:

Difficulties to set only required operands without writing any code for missing optional operands
Not very user-friendly compile-time error messages or run-time errors instead of compile-time ones in case of incorrect builder call.
Inconsistency in behavior regarding how list of arguments to builder.create map into op definition in ODS
Difficulties with picking the right build overload due to different attributes using underlying type or mlir::Attribute

Most if not all of it applies to attributes as well. Personally, I bothered by items 1 and 2 above the most.

Builder pattern as alternative solution

One idea I have been exploring to solve or at least improve problems above is inspired by (IMO quite interesting) talk by Ben Dean on cppcon 2018 Easy to Use, Hard to Misuse: Declarative Style in C++.

Basically, the idea is to write Builder pattern, but one that leverages strong types to guarantee at compile-time all required arguments were provided and generate more user-friendly error message if not.

Note: this idea is WIP and code samples below were tested in a limited scope, written manually and not auto-generated by tblgen, but hopefully is enough to kick-off the discussion.

To see how it could work, let’s take a look at this case:

def TestOp : SomeDialect_Op<"test_op", [SameVariadicOperandSize]> {
  let arguments = (ins
    Optional<I32>:$optionalOperand0,
    I32:$requiredOperand1,
    Optional<I32>:$optionalOperand2,
    I32:$requiredOperand3,
    OptionalAttr<I32Attr>:$optionalAttribute4
    I32Attr:$requiredAttribute5
  );
  let results = (outs I32);
}

// build member-functions generated

static void build(::mlir::OpBuilder &odsBuilder, ::mlir::OperationState &odsState, ::mlir::Type resultType0, /*optional*/::mlir::Value optionalOperand0, ::mlir::Value requiredOperand1, /*optional*/::mlir::Value optionalOperand2, ::mlir::Value requiredOperand3, /*optional*/::mlir::IntegerAttr optionalAttribute4, ::mlir::IntegerAttr requiredAttribute5);
static void build(::mlir::OpBuilder &odsBuilder, ::mlir::OperationState &odsState, ::mlir::TypeRange resultTypes, /*optional*/::mlir::Value optionalOperand0, ::mlir::Value requiredOperand1, /*optional*/::mlir::Value optionalOperand2, ::mlir::Value requiredOperand3, /*optional*/::mlir::IntegerAttr optionalAttribute4, ::mlir::IntegerAttr requiredAttribute5);
static void build(::mlir::OpBuilder &odsBuilder, ::mlir::OperationState &odsState, ::mlir::Type resultType0, /*optional*/::mlir::Value optionalOperand0, ::mlir::Value requiredOperand1, /*optional*/::mlir::Value optionalOperand2, ::mlir::Value requiredOperand3, /*optional*/::mlir::IntegerAttr optionalAttribute4, uint32_t requiredAttribute5);
static void build(::mlir::OpBuilder &odsBuilder, ::mlir::OperationState &odsState, ::mlir::TypeRange resultTypes, /*optional*/::mlir::Value optionalOperand0, ::mlir::Value requiredOperand1, /*optional*/::mlir::Value optionalOperand2, ::mlir::Value requiredOperand3, /*optional*/::mlir::IntegerAttr optionalAttribute4, uint32_t requiredAttribute5);
static void build(::mlir::OpBuilder &, ::mlir::OperationState &odsState, ::mlir::TypeRange resultTypes, ::mlir::ValueRange operands, ::llvm::ArrayRef<::mlir::NamedAttribute> attributes = {});

Then, following code could be auto-generated for each operation:

// Ops.h.inc

template <uint64_t bits>
struct TestOpBuilderHelper;

template <>
struct TestOpBuilderHelper<0> {
  // only required arguments / results need to be tracked
  static constexpr auto RESULT_TYPE_BIT = size_t{1 << 0};
  static constexpr auto REQUIRED_OPERAND_1_BIT = size_t{1 << 1};
  static constexpr auto REQUIRED_OPERAND_3_BIT = size_t{1 << 2};
  static constexpr auto REQUIRED_ATTRIBUTE_5_BIT = size_t{1 << 3};
  static constexpr auto ALL_BITS = RESULT_TYPE_BIT | REQUIRED_OPERAND_1_BIT |
                                   REQUIRED_OPERAND_3_BIT |
                                   REQUIRED_ATTRIBUTE_5_BIT;

  ::mlir::Type resultType;
  ::mlir::Value optionalOperand0;
  ::mlir::Value requiredOperand1;
  ::mlir::Value optionalOperand2;
  ::mlir::Value requiredOperand3;
  ::mlir::IntegerAttr optionalAttribute4;
  ::mlir::IntegerAttr requiredAttribute5;

  ::mlir::xxx::TestOp build(::mlir::Location location,
                            ::mlir::OpBuilder &odsBuilder);
};

using TestOpBuilder = TestOpBuilderHelper<TestOpBuilderHelper<0>::ALL_BITS>;

template <uint64_t N>
struct TestOpBuilderHelper : TestOpBuilderHelper<N - 1> {
  using Base = TestOpBuilderHelper<0>;

  // required arguments clear corresponding bit

  TestOpBuilderHelper<N & ~Base::RESULT_TYPE_BIT>
  setResultType(::mlir::Type resultType) {
    this->resultType = resultType;
    return *this;
  }

  // optional arguments return itself as-is

  auto setOptionalOperand0(::mlir::Value value) {
    this->optionalOperand0 = value;
    return *this;
  }

  TestOpBuilderHelper<N & ~Base::REQUIRED_OPERAND_1_BIT>
  setRequiredOperand1(::mlir::Value value) {
    this->requiredOperand1 = value;
    return *this;
  }

  auto setOptionalOperand2(::mlir::Value value) {
    this->optionalOperand2 = value;
    return *this;
  }

  TestOpBuilderHelper<N & ~Base::REQUIRED_OPERAND_3_BIT>
  setRequiredOperand3(::mlir::Value value) {
    this->requiredOperand3 = value;
    return *this;
  }

  auto setOptionalAttribute4(::mlir::IntegerAttr value) {
    this->optionalAttribute4 = value;
    return *this;
  }

  TestOpBuilderHelper<N & ~Base::REQUIRED_ATTRIBUTE_5_BIT>
  setRequiredAttribute5(::mlir::IntegerAttr value) {
    this->requiredAttribute5 = value;
    return *this;
  }

  ::mlir::xxx::TestOp build(::mlir::Location, ::mlir::OpBuilder &) {
    if constexpr (N & Base::REQUIRED_OPERAND_1_BIT) {
      static_assert(false,
                    "an attempt to build TestOp with no requiredOperand1 set");
    } else if constexpr (N & Base::REQUIRED_OPERAND_3_BIT) {
      static_assert(false,
                    "an attempt to build TestOp with no requiredOperand3 set");
    } else if constexpr (N & Base::REQUIRED_ATTRIBUTE_5_BIT) {
      static_assert(
          false, "an attempt to build TestOp with no requiredAttribute5 set");
    } else {
      static_assert(false, "unexpected error");
    }
  }
};

// Ops.cpp.inc

::mlir::xxx::TestOp
TestOpBuilderHelper<0>::build(::mlir::Location location,
                              ::mlir::OpBuilder &odsBuilder) {
  return odsBuilder.create<::mlir::xxx::TestOp>(
      location, resultType, optionalOperand0, requiredOperand1,
      optionalOperand2, requiredOperand3, optionalAttribute4,
      requiredAttribute5);
}

With this, C++ code to build TestOp would be:

TestOpBuilder{}
  .setResultType(resultType)
  .setRequiredOperand1(someValue1)
  .setRequiredOperand3(someValue2)
  .setRequiredAttribute5(someAttribute)
  .build(location, builder);

It has some advantages: self-documenting code due to member-functions names, consistent behavior in terms what operand is assigned, no code for optional arguments, no dependency on order of calls.

Regarding error-handling, if we, for example, would make a mistake and forget to set some required operand:

TestOpBuilder{}
  .setResultType(resultType)
  .setRequiredOperand1(someValue1)
  //   .setRequiredOperand3(someValue2)
  .setRequiredAttribute5(someAttribute)
  .build(location, builder);

It’ll be following compile-time error:

error: static assertion failed: an attempt to build TestOp with no requiredOperand3 set
  114 |       static_assert(false,
      |                     ^~~~~
note: in instantiation of member function 'TestOpBuilderHelper<4>::build' requested here
  221 |           .build(location, builder);

Note: as you can see so far the idea is to have new patter as addition to current infrastructure, not replacement!

There are definitely some remaining opens:

How well that’s going to work with properties? Current expectation is properties are different in a way they may not be cheap to copy in generated setters, maybe using move-assignments/constructors would solve it.
What if something does not have default constructor / move operations? Default constructor maybe solvable through using std::optional, but not sure. Hopefully, lack of move operations is not a real issue (it will be just a copy?).
This implementation have the limit of up to 64 required things to be set - is it a problem? I’d say since new pattern is not a replacement, but rather opt-in feature, that’s okay. Especially, that having more than 64 required arguments is too much anyway, IMO.
What about new builder interacting with existing related code, OpBuilder, rewriters or anything else?
Is template instantiation or template code here in general is a problem?

That’s all I did so far in this direction and I would like to get some feedback before proceeding with further experiments as there could be some unexpected to me blockers for this proposal, existing solution to the problems or better ideas in general.

j2kun · July 5, 2025, 1:27am

I do like the builder style of object construction better and would support it as an option for operations (and attributes, types, etc). I prefer a little bit more verbosity to avoid the common compilation errors associated with getting argument order wrong (and the hard-to-parse errors that result from it).

makslevental · July 5, 2025, 2:39am

I always felt like builder pattern just works around the fact that cpp doesn’t support named/keyword args. So maybe this trick GitHub - cheshirekow/kwargs: python style kwargs in C++ could work? In the far far future (probably 2035 lol) we’ll even be able to do this natively Towards C++ kwargs | Brandon G. Nguyen.

mehdi_amini · July 5, 2025, 3:35pm

Thanks for the very clear post, it’s super high quality feedback and you’re quite on point overall!
I would say: “C++ is hard, and there are lot of tradeoffs to navigate”.

I always loved the approach of “easy to use, hard to misuse” APIs, and so I welcome any improvement in this direction. I find it really appealing that your builder API can be type safe like this!

Something else that such a builder API can solve that you haven’t mentioned is the IDE autocompletion: the templated indirection brought by the create<Op> method is tripping clangd and in general IDE completion unfortunately, which reduces the ergonomic of these APIs. It seems that your builder proposal wouldn’t suffer from this.

I have a question about your implementation: why don’t you have an OperationState as a member of the builder, and populate it as you go? We could skip the two-steps you have right now of collecting all the operands before passing them to a “build” and handle the “build” method directly in the builder.
Now it’s possible that you’re fluent API right now isn’t eliding the copy of the builder between each method call?

Very promising overall!

Some minor notes on your comments below (not a rebuttal of the overall impression you bring of course):

This is an edge case from having a single mandatory result, because of the implicit conversion from mlir::Value to mlir::ValueRange, if there were multiple non-optional operands it wouldn’t work that way I believe: you wouldn’t get the ValueRange overload and you’d get a compilation failure in the first place, forcing you to pass the optional as a Value{}.

The “optional” is really meant to be a verifier thing: it allows you to pass a Value{} but it is still explicit in the API, which you found out in the following example

ggladilov:

In order to make builder.create call above to work we need explicitly pass mlir::ValueRange:
builder.create<TestOp>(
  location, 
  resultType, 
  mlir::ValueRange{someValue1, someValue2}
);

The more usual / expected way is rather:

builder.create<TestOp>(
  location, 
  resultType, 
  /*requiredOperand0=*/someValue1, 
  /*requiredOperand1=*/someValue2,
  /*optionalOperand2=*/Value{}
);

This is also the only way to set some optional value but not others (when multiple are available).

jpienaar · July 5, 2025, 8:42pm

There is also a long standing open bug on populating optional attributes in the generated builders that one doesn’t need to specify them (I keep on forgetting if the phabricator change I had from years ago did that and I didn’t submit, or if partial/new change needed).

Builder style/fluent style APIs definitely allow for different options here. (The autocomplete one is fixable, but we require help from clang folks :-)).

3&4 could definitely use some TLC: most of these were implemented based on op usage at given point, convenience members added. So this is useful to consider for refinements there too (independent of builder style - we have folks who complain the existing style is too verbose and working on some alternatives to be presented ). I am surprised about the inconsistency part, beyond human added ones, the logic is rather direct, so more cases flagged there that confuses the better!

ggladilov · July 5, 2025, 9:11pm

Thanks for the comment!

I prefer a little bit more verbosity to avoid the common compilation errors associated with getting argument order wrong (and the hard-to-parse errors that result from it).

I’m not sure if I get this part. Could you please provide more specific example?

ggladilov · July 5, 2025, 9:15pm

@makslevental thank you for suggestions!

I’d agree keyword args would be interesting addition to C++ standard

Thanks for the heads up, I’ll check this, so far without going too deep it seems to me those kwargs would still suffer from being not verbose enough for required operands, because regular C++ positional argument passing is used, but maybe I just got it wrong. Would need to clarify if it also support reporting errors at compile-time as well.

ggladilov · July 5, 2025, 9:26pm

Thanks for such heart welcoming reply @mehdi_amini

TBH I’m not sure I see how API I proposed will help with IDE integration since it’s still using member-function template OpBuilder::create underneath. But if whatever final change will do it, I’ll only be happier.

The answer is simple: the results I showed are coming from me playing with Toy chapters to see if the idea is reasonable/works at all. I didn’t put much of thought into how to integrate Builder/Fluent API with existing MLIR code (e.g. OpBuilder), what I did was just simpliest/dumbiest thing to do in order to actually create operation. I also tried to highlight in my open #4 that I may need help/direction here as I’m not quite familiar with how MLIR works for operation creations (specifically what is OperationState, etc.) here:

What about new builder interacting with existing related code, OpBuilder, rewriters or anything else?

But since you already gave the idea, I’ll definitely would think about it, look at OperationState and see how to make implementation better, thanks! Other ideas are welcome as well.

Interesting! For now it occurs to me new proposed API would allow to “write nothing” for Optional operands, hopefully that’s not a problem, since it’s sort of less explicit.

UPD: actually, fluent API isn’t necessary less explicit, if conclusion would be to still require providing values for Optional, it’s possible, just instead of /*name*/ someValue among argument list, it’d be setName(someValue), which is still better IMO

mehdi_amini:

The more usual / expected way is rather:
builder.create<TestOp>(
  location, 
  resultType, 
  /*requiredOperand0=*/someValue1, 
  /*requiredOperand1=*/someValue2,
  /*optionalOperand2=*/Value{}
);
This is also the only way to set some optional value but not others (when multiple are available).

I’d admit that the snippet above is one of the reasons why I like fluent API more, it bothers me we need to write comments to explain the code, I’d prefer function names.

ggladilov · July 5, 2025, 9:34pm

Yeah, I didn’t said it directly, but I agree there’re some things that could be done to existing API to make it better.

I assume by 3&4 you mean problems I tried to identify with current code - yes I agree that this RFC does not block or diminishes the value of improving existing code creating operations as it’s considered opt-in feature: who wants to use fluent/builder API use it, others use whatever still exists.

@mehdi_amini @jpienaar

If you’re aware can you please share more on this? If there’re some other people working on improvements/refactoring on the same area/with similar goals - should I still continue my work or it’s better to have some sync or wait for those people to present their results first? I’m trying to avoid the situation when different people working on the same thing at the same time and then one side is just a throwaway. I actually were concerned with this before writing this post, this is why I tried to check discource for any recent activity in similar direction and found nothing.

jpienaar · July 5, 2025, 9:43pm

I’d not block: it’s rather different the other group I’m aware off (replacing an existing, opinionated builder API while moving to MLIR). The best way to find out the gaps here and be able to eval is having more to poke at!

I think Mehdi is referring to create being such a generic forwarding method that requires introspection into template param’s class 's member functions. Here one could have it as class members level, so it may already work.

mehdi_amini · July 5, 2025, 10:23pm

If I type TestOpBuilder{}. then the IDE will offer me the list of APIs available on the builder to auto-complete, and when I select setRequiredAttribute5( it’ll display me the signature of this method and I know what to provide. Then when I close this method and add a . in the fluent API chain it’ll offer me again the list of APIs available on the builder.
Right now, when I type builder.create<TestOp>( it does not know that it need to show me the auto-complete of the build method overloads (because it looks at the create method, which is templated, and it stops there). As @jpienaar said it could be fixed by teaching clangd to look through the template here, but it’s been years and it still hasn’t been done, so I wouldn’t hold my breath

For related work in MLIR, we looked into builders 4 years ago, in the context of an early DSL we had in MLIR to make building IR more ergonomic which was named “EDSC”, see these two threads: Evolving Builder APIs based on lessons learned from EDSC and [RFC] DSL-style / dialect-specific builders

makslevental · July 5, 2025, 11:09pm

Taking arith::ConstantIntOp as an example, why not just duplicate create with the args fully spelled out i.e. add

class ConstantIntOp {
  ...
  static arith::ConstantIntOp create(Builder& builder, Location location, int64_t value, unsigned width) {
    OperationState state(location, getOperationName());
    arith::ConstantIntOp::build(builder, state, value, width);
    auto *op = builder.create(state);
    auto result = dyn_cast<arith::ConstantIntOp>(op);
    return result;
  }
  ...
}

to the cpp emitted by tablegen? It’s basically the same signature we currently have (except with builder passed as the first arg instead of builder.create<arith::ConstantIntOp>) and it autocompletes (i.e., shows hints).

makslevental · July 6, 2025, 1:21am

Here’s a sketch of the above idea

https://github.com/llvm/llvm-project/pull/147168

If it pleases the jury (i.e. the PR can land) I’ll add tests and polish and etc.

ggladilov · July 6, 2025, 1:59pm

@mehdi_amini

I looked at it and unfortunately it seems like this approach is not without “tradeoffs”. I do agree keeping just mlir::OperationState inside new builder class is more appealing than separate member-variables for each operand and attribute, however, mlir::OperationState API is ordered. mlir::OperationState::addOperands just appends, using it inside new builder would means following two snippets are not equivalent anymore:

// sets someValue1 -> requiredOperand1 & someValue2 -> requiredOperand3

TestOpBuilder{}
  .setResultType(resultType)
  .setRequiredOperand1(someValue1)
  .setRequiredOperand3(someValue2)
  .setRequiredAttribute5(someAttribute)
  .build(location, builder);

// sets someValue2 -> requiredOperand1 & someValue1 -> requiredOperand3 

TestOpBuilder{}
  .setResultType(resultType)
  .setRequiredOperand3(someValue2)
  .setRequiredOperand1(someValue1)
  .setRequiredAttribute5(someAttribute)
  .build(location, builder);

This sort of goes against the idea of the design. IMO if we provide named member-functions to populate specific operand, then assignment happens by name, not index and can be done in any order.

Then, as I see it now, if we still would try to go with this idea and assume setters must be called in a specific order and guarantee correctness at compile-time (compile-time guarantees are a big deal to me), we still get the same problems as existing building code about having to write code for optional operands:

// imagine fluent API where we require specific order of member-function calls

TestOpBuilder{}
  .setResultType(resultType)       // must be called 1st
  .setRequiredOperand1(someValue1) // must be called 2nd
  .setOptionalOperand2(nullptr)    // must be called 3rd, cannot skip
  .setRequiredOperand3(someValue2) // must be called 4th
  .setOptionalAttribute(nullptr)   // must be called 5th, etc.
  .setRequiredAttribute5(someAttribute)
  .build(location, builder);

So we get more verbose. On the bright side of this approach, I think it may make it easier to support properties, but I’m not 100% sure if there’s a big difference as I didn’t evaluate yet if properties would require any extra effort.

My take on these 2 options is if having one member-variable per each argument and result is not performance problem, API where methods could be called in any order & no need to write code for optional pieces is better. What do you think?

Not sure what you referred to exactly, but I spotted a mistake in my original code snippets. Setters in fluent API must return object by reference, not value, in this case no TestOpBuilderHelper copy happens. Is that what you asked about?
I see possible copy of mlir::Value and mlir::Attribute objects as in my snippets they are stored once in TestOpBuilderHelper and then get inserted into mlir::OperationState behind mlir::OpBuilder::create member-function template, but initially I assumed it’s not a problem as these objects are cheap to copy. Corret me if I’m wrong.

ggladilov · July 6, 2025, 2:03pm

@makslevental

That’s great to see my post already helped new PRs to be opened . I will still try to continue work on my proposal as it seems like alternative API still make sense even if your change above is merged. Let me know if you have any objections.

kuhar · July 7, 2025, 1:45am

I wonder if we could get some of the benefits like better IDE/lsp integration without full builder methods that would require significant refactoring of existing code.

For example split the ‘insertion’/‘replacement’ from op construction via something like rewriter.create<OpT>().build(op args). We should be able to upgrade to this form fairly mechanically.

jpienaar · July 7, 2025, 7:32am

Definitely. I don’t think we should go down route needing such large changes. That’s part where creativity would be needed

Also we shouldn’t make a change to work around improvements elsewhere, so pinged the folks we had asked previously about autocomplete support to find out cost/complexity. I haven’t checked if the kwarg style comments are checked with the create as they normally are with functions and the one Maks showed with kwargs is quite interesting, especially if there were no naming overlap issues.

ftynse · July 7, 2025, 8:51am

Very nice write-up, thank you! I am highly supportive of evolving away from the current builder API, as one may notice based on my two failed attempts to do so cited by Mehdi above. I also have a penchant for fluent-style APIs and keep adding them in various places, so all the nice things there to convince me. There’s a but below.

In my opinion, the lack of autocomplete/documentation overlay on build/rewrite methods is the most egregious usability problem of MLIR API. These are by far the most commonly used methods, and there’s no tooling supporting them properly. I’ve made it a habit to first type OpName::build( so completion works, and then change it to builder.create<OpName> when I’m done. The builder pattern and/or fluent APIs address that, and I don’t necessarily mind more verbosity. I’d love that problem solved.

Better API for optional arguments would be nice to have for operations with long arguments lists.

The confusing edge case due to the ValueRange implicit constructor is also problematic, I had to debug a few of those and it wasn’t easy.

On the other hand, I wouldn’t insist so much on compile-time checking of missing mandatory arguments, especially if this increases the price/overhead elsewhere. Mandatory arguments are but one aspect of the op verification process. We may have operands that are required if a certain attribute is provided. We may have operand lists that must have the same length. See, for example, view/slice-style ops. We may have all sorts of validity rules that are fundamentally not checkable at compile time. So there little point in checking only that aspect at compile time. Furthermore, this will lead to logic duplication with the verifier because we will still have to check the presence of mandatory arguments in the verifier since there are other ways of creating an operation.

The objections to previous attempts of changing the op creation API boiled down to:

We wanted to have one true way of doing that instead of multiple dialect/component specific ones (I personally don’t agree with most arguments there because most things in MLIR can be done in multiple ways, but that’s what seemed to be the consensus back at the time).
The code bloat and compile time cost from having additional abstractions could have been significant.

If 1. still holds today, we then also need

A transition strategy to the newly adopted style.

Overall, something like what Max is proposing in his PR feels like it’s closer to a good balance for the things listed above. (Not the kwarg syntax that won’t help with autocompletion.) It doesn’t help with optional arguments though. For those, maybe we could generate fluent-style APIs for ops that have, e.g., >4 arguments, and then pass the object to the creation function. If the arguments are properties/attributes, it may be even sufficient to just have a builder function taking (an rvalue reference to) the property object.

Another point I could make is that an op with dozens of arguments may be a sign of missing abstractions. Maybe the properties should be grouped into a bigger attributes. Maybe the operands should be aggregated into a single named tuple-like type, etc.

mehdi_amini · July 7, 2025, 9:38am

Oh right, this is difficult to manage in a fluent API indeed, thanks for looking into it though!
(I wouldn’t want to impose an ordering on the fluent API indeed)

Yes this is what I was wondering about

This seems correct to me, I would hope that inlining everything would lead to a zero-copy in most cases?

This is not trivial, if you look at the create method (or [mlir][tblgen] add concrete create methods by makslevental · Pull Request #147168 · llvm/llvm-project · GitHub ) you’ll see that there is something to be done after the build method return.
We could change the build method signature entirely (returning an operation), but that would be a large breaking change which the PR above avoids.

kuhar · July 7, 2025, 3:14pm

I admit that I haven’t looked closely into the implementation details, but I’d think that this shouldn’t be harder than having the new create<T>() return a templated builder wrapper object with a new build(...) method that at the end can end up calling the same underlying functions as .create<T>(...) that we have today. But that won’t go as far as that the RFC suggest with full builder pattern.

Topic		Replies	Views
Writing OpBuilders in mlir MLIR	4	1095	October 20, 2021
Finalizing operand/result ordering in Python bindings MLIR	8	553	January 22, 2021
The way to inspect c++ interfaces for certain operations MLIR mlir	6	249	November 21, 2023
MLIR Beginner: MLIR backend for custom accelerator MLIR	16	1087	April 10, 2024
Beginner and do not know how to start MLIR	9	1293	March 27, 2021

[RFC] Building MLIR operation: observed caveats and proposed solution

Problem statement

Inconsistency

Error handling

Builder pattern as alternative solution

Related topics