Non-attribute properties and the inherent-discardable attribute split -- what needs doing

krzysz00 · August 8, 2025, 7:58pm

To provide background material for an upcoming Open Design Meeting and to get a lot of these plans out of my head, I’d like to summarize the state of both non-attribute properties and the inherent attribute / discardable attribute split, what work remains on these related fronts, and where I’ve been trying to get to.

Background

(If you’ve already been involved in discussions on this topic, you can skip these sections)

The inherent/discardable attribute split

For several years now, MLIR operations have been storing their inherent attributes (for most purposes, those that’re defined in the TableGen file) within a Properties struct. Each operation defines its own Properties struct, which is stored as trailing data that’s allocated within the operation. This was an improvement over the old system, where all attributes were combined into a single DictionaryAttr that was attached to the operation, meaning that accessing an attribute required a hash lookup and that any change to an operation’s attributes required recomputing the entire dictionary and creating another unique object that probably didn’t need that.

This was intended - and, IIRC, it did - improve performance.

This separation, however, wasn’t complete. There was - and still is - a lot of code that expected to be able to get “the set of all attributes on operation” or “the attribute dictionary”, and so the methods that returned such things were modified to construct them on the fly, incurring the very performance penalties that were part of the motivation for the split.

Non-attribute properties

The fact that inherent attributes are stored in the Properties struct was added in the RFC process for operation properties, which were initially planned to be a struct of plain C++ data that lived inline in operations. Such non-attribute properties are supported - that is, you can have a properties struct with fields like bool nontemporal; or std::string tag; in it, but the infrastructure for using such fields is still in the process of being developed.

One major motivation for such non-attribute properties is performance: small data of a specific type can live directly inline on an operation and won’t require hash lookups or uniquing. For example, the iterator_type = ["parallel", "parallel", "reduction"] array on a linalg.generic operation si currently an ArrayAttr of EnumAttrs, which means that both the array itself and each value within it are uniqued in the MLIR context. It could just be stored as a SmallVector<util::IteratorType> property - or, in tablegen, ArrayProp<IteratorTypeProp>:$iterator_type (if IteratorTypeProp were defined), saving us a whole bunch of uniquing for what’s usually an array of two or three integers.

Properties are now mainly defined as subclasses of Property in TableGen. Properties have a storage type (the type stored in the Properties struct) and a possibly-distinct interface type (which is what’s returned by getters, passed to setters, and used as an argument in builders). They also define:

Parsing and printing logic
Their serialization to bytecode
Constraints and default values (if applicable)
A conversion method to/from atttributes

The conversion to/from attributes is needed in order to allow operations with properties structs to be printed and parsed generically. This is done through the getPropertiesAsAttr() and setPropertiesFromAttr() methods, which serialize an operation’s Properties struct into a DictionaryAttr. This could also be used when there’s a need to introspect all the properties of an operation, but I don’t think there are uses for this.

Improvements to the non-attribute properties infrastructure

There’ve been a bunch of improvements to the usability of non-attribute properties. They gained the ability to hold constraints (and thus participate in op verification), the ability to be used in tablegen assembly formats (so if you have a I64Prop:$label, you can just use $label in the assembly format to parse/print it), and the ability to create EnumProps (which led to untangling enum generation from attribute generation), and the ability to match on properties in declarative rewrite rules.

However, these improvements haven’t fully brought non-attribute properties to parity with attributes, since there’s still future work needed. Some of this work is also needed to properly bring apart the inherent/discardable attribute split, since it involves handling the discardable attributes of on operation and the values in the Properties struct (whether they are attributes or not) more distinctly.

Note that unregistered operations do use an inherent/discardable attribute split - they just use a DictionaryAttr as their properties “struct”. (We could consider making that a wrapper around StringMap<Attribute> instead but that’s a minor tweak)

My proposed plan of future work

(If you’re not too interested in these details, you can skip to “Finale” below)

Improvements to operation creation and building

Currently, some forms of Operation::create() call setArrs(), which does a bunch of work to split up a dictionary that contains a potential mix of inherent and discardable attributes, and, in some cases, populateDefaultAttrs() (which’ll put default attributes into such a dictionary). The analogous OperationName::populateDefaultProperties(OpaqueProperties), which would initialize any fields in the properties struct) is not used anywhere.

This is a bunch of work that can be completely skipped if the caller of create knows that the inherent and discardable attributes of an operation have been kept separate. This will improve performance by getting rid of a bunch of O(numAttrs) filtering code.

This will involve adding a field to OperationState and an argument to create called something like bool hasAttrSplit = false;.

Then, in OperationState, you’ll generally cause this field to be set either by calling useProperties<Properties>(Properties& newProperties), which lets you point an an existing Properties struct to copy (for example, when cloning), or though a new method Properties& createDefaultProperties(), which will create the properties struct of interest within the state and return a reference to it.

So, in a builder, you’d have

Properties& props = state.createDefaultProperties<Properties>();
props.setFoo(foo);
if (bar) {
  props.setBar(bar);
}
...

instead of the current methods, which either just call getOrAddProperties<Properties> or still use result.addAttribute (which forces non-trivial work on the setAttrs() flow).

In order to enable generic code and bindings (see later), OperationState will also want to have the ability to take a OpaqueProperties properties, TypeID propertiesKind so that code that just grabs the properties struct of some Operaiton* and then recreates a new instance of the same operation can actually do so without needing to resolve a template.

Since the new behavior only occurs if you opt in to it, we can gradually transition to a world where inherent and discardable attributes are kept entirely separate by gradually moving various bits of code to a createDefaultProperties()-based system.

Autogenerated builders (and so on) can be moved to use the new style, since they already write properties into the properties struct
Manually-written builders can be audited for calls to result.addAttribute() which’ll be replaced by changes to the properties struct
Parsers … are the next section

More efficient operation parsing

Currently, we have std::optional<Attribute> genInherentAttr(name); and void setInherentAttr(name, value), which aren’t used much.

I first propose that these should be generalized to std::optional<Attribute> getPropertyAsAttr(name), and LogicalResult setPropertyFromAttr(name, value);, with the LogicalResult on the latter indicating whether or not an inherent attribute called name actually existed. (These’d also go on OperationState)

Then, these can be used to implement inherent/discardable attribute splitting during operation parsing. This would be done by teaching parseAttrDict (and parseOptionalAttrDict and so on) to take a callback, which would take a parsed (name, value) pair and do something with it.

When parsing an attr-dict (as defined in tablegen), the callback would try to setPropertyFromAttr() each parsed key/value pair, and, if there was no such property, add that pair to the list of discardable attribute. This, combined with createDefaultProperties(), would allow the inherent/discardable split logic to happen at parse time, not operation creation time. Furthermore, if the assembly format had a prop-dict, we would parse the attr-dict directly into the discardable attribute list, skipping all those checks.

Similarly, parsing a properties struct serialized as attributes would be able to use this callback to reject invalid keys and to avoid constructing a DictionaryAttr that is needed for setPropertiesFromAttr() - that lets us avoid needing to construct (and keep around forever) a rather useless DictionaryAttr that currently gets created.

Printing and `prop-dict`

One of the goals of this infrastructure work is to eliminate the need to construct the attribute forms of non-attribute properties and the need to construct the DictionaryAttr form of an operation’s properties.

One mechanism that could help achieve this is the prop-dict directive, which creates prints out a dictionary of the inherent attributes an non-attribute properties as <{k1 = v1, k2 = v2, ...}> (excluding anything mentioned in the assembly format), leaving attr-dict for the discardable attributes. This matches the generic form of operations.

Mont operations upstream have not been migrated to a separate prop-dict directive. This might even be a good thing, since I’d like tho propose some changes to what prop-dict does.

My vision is that prop-dict should act something like struct([all attributes/properties not mentioned elsewhere]).

That is, if I have IndexAttr:$foo, I64Prop:$bar, UnitProp:$baz, currently, the prop-dict would print them as <{bar = 2 : i64, foo = 1 : index}>. However, with the new prop-dict, that would become <foo = 1, bar = 2> (because, as in the assembly format, the types are known). This would also allow non-attribute properties to use their own printing code, instead of needing to fall back to an uglier generic form.

For backwards compatibilty, there’d be a bit printPropDictGenericly available. Furtheremore, to make copy-pasting from generic IR easier, when parsing a prop-dict, if there’s a { right after the <, that will be parsed as a DictionaryAttr that’ll get used to setPropertiesFromAttr() as it is currently. Then, we’d start parsing key/value pairs, using the key to determine which parsing logic to use for values.

(We could also get rid of all the parseProperties and printProperties stuff that’s currently running around, and migrate any users of those “if you have a freestanding function of this name it’ll get called” quasi-hooks - there aren’t any upstream - to custom<...>(prop-dict))

The advantage to this change is that we’d decouple printing and parsing of non-attribute properties from the attribute infrastructure, and, even when dealing with attribute, we’d avoid constructing a bunch of pointless DictionaryAttrs, saving us memory and cache entries that don’t do anything.

FFI Bindings

One issue that’s come up with non-attribute properties is that no one’s put in the work to make them interact reasonably with the Python (and/or C) bindings.

This is the area I’ve thought the least about in this plan, but I think what’s needed is to expose creation, getters, and setters for properties structs to C. This’d require tablegen-ing C headers, but that’s probably fine.

So if you had

def Example : My_Op<...> {
  let arguments = ins(..., I64Prop:$a, IndexAttr:$b, StringProp:$c, EnumProp<MyI32Enum>:$d
}

you’d get

enum MlirMyDialectMyI32Enum {
  ....
}
MlirMyDialectMyOpProperties mlirMyDialectInitializeMyOpProperties(context);
int64_t ...PropertiesGetA(MDMOP);
/// Getters omitteded, names abbreviated
void ...PropertiesSetA(MDMOP, int64_t value);
void ...PropertiesSetB(MDMOP, mlirIntegerAttr value);
void ...PropertiesSetC(MDMOP, MlirMyDialectMyI32Enum value);
void ...PropertiesSetD(MDMOP, MlirStringRef value);
MDMOP mlirMyDialectGetMyOpProperties(mlirOperation);

and so on. Once everything looks like C functions, it hopefully won’t be too hard to translate to other languages?

Alternatively, for a performance cost, we could expose setPropertyFromAttr() and the like and have the Python and C users just construct the attributes corresponding to non-attribute properties.

I’m less sure about what the “good” or feasible solutions are here.

Audits of `getAttrs()`, `getAttrDictionary()` and so on

Once we’ve got the builder improvements in place, we’ll want to take a look upstream to see where people are calling getAttrs(), getAttrDictionary(), and other such methods. If they’re just being used in a “maintain the attributes of some operation while tweaking it”, that can be replaced by a call to getOpaqueProperties() and getDiscardableAttrs() with the new builder methods.

This also means that, for example, ConvertToLLVMPattern would, instead of having an attribute converter and some special-case handling for overflow flags (which the LLVM dialect stores an a non-attribute property), it’d have convertProperties(const SrcProps& oldProps, TargetProps& newProps) as something you plug in to specify how to pull relevant data from one struct to the other. This’d then allow these conversion patterns to respect the inherent/discardable split. (Trying to make this generalization is how I noticed the need to create in the first place.)

Finale: the propocalypse

The long-term goal of all this work is to bring non-attribute properties up to parity with inherent attributes and to untangle existing inefficiencies. Once this is done, we should be able to bring about (on a dialect-by-dialect or even op-by-op basis) the propcalypse. This is the breaking change we replace any FooAttr that really doesn’t need to be an attribute (EnumAttrs or other scalars, basically any UnitAttr, many `ArrayAttr usages, are the ones I’ve got my eye on) with a non-attribute property. This means that these values will start being stored inline and mutably, and will never need to go through some weird uniquing.

I expect that this, along with the changes above, will improve the performance and memory usage of MLIR-based systems. Sadly, we’re not quite in a state where we can do that yet.

krzysz00 · August 8, 2025, 8:02pm

(cc @mehdi_amini @fabianmc )

mehdi_amini · August 9, 2025, 7:14pm

Thanks for the write-up, you’re touching on a bunch of important point of the (unfinished) migration. Here are some of my notes:

There is more to it than just runtime performance (even though it’s a pretty big thing!): there is also the fact that attributes are inherently leaking memory into the MLIRContext. That can prevents from reusing the MLIRContext across compilation because it could lead to an ever increasing memory footprint. Properties are tied to the lifetime of an operation, and that opens up for shared_ptr kind of pattern where you can release the memory from a pool when no operation is using the property anymore (have the best of both worlds).

Note that unregistered operations do use an inherent/discardable attribute split - they just use a DictionaryAttr as their properties “struct”.

Actually I believe they use an Attribute, and not necessarily a DictionaryAttr, since the properties storage does not have to fit a DictionaryAttribute in the first place: there is no expectation that you can have a dictionary-like view of properties.

I agree with the overall idea, but I don’t quite get why we need an extra field when we could discriminate on whether a properties pointer has been passed in or not?

What’s the difference with the existing OperationState::getOrAddProperties<T>()?

You say you want to generalize it, but you don’t really explain the proposed new behavior.
Accessing attributes is one thing, but I would be concerned about introducing implicit attribute conversion from individual properties in the system: this would be very inefficient.

I don’t follow the idea here: there is prop-dict already, and what I’d like to do is to actually error out in TableGen when a property or inherent attribute isn’t listed explicitly in the declarative assembly unless prop-dict is set. Basically we should deprecate the ability to have inherent attribute in attr-dict in the first place!

I think you’re making again an assumption about a DictionaryAttr being present which isn’t necessarily the case. But more importantly: parsing is rarely something we optimize for (textual format isn’t meant to be a production tool in general) and more importantly the attribute-to-property case is even less of an expected important case: it exists in the first place to support the generic format only (–mlir-print-op-generic).

krzysz00 · August 9, 2025, 10:01pm

Thanks for all the notes!

I suppose I mentally had that under the performance umbrella, but it’s worth pointing out explicitly.

True. I was tripped up by the common case, and the fact that populateInherentAttrs() exists.

Given that we have an operation with IndexAttr:$a, IndexAttr:$b, and assembly format $a attr-dict, the parser will put the value of b into the attributes section of the state but the value of a into the properties struct.

That is, it is currently not an invariant of the system that if you have created a properties struct, you’ll only be using that struct to store inherent attributes. Hence the need for opt-in behavior and the changes to the parser logic.

getOrAddProperties<T>() doesn’t - unless I’ve massively misread the code in question - construct the default values of inherent attributes.

The new behavior is that, in the case of non-attribute properties, you’ll be getting the property and calling it’s to-attribute code or setting the property by way of its from-attribute code, respectively.

We already have mass properties-from-attributes - including non-attribute properties - with setPropertiesFromAttr, but that’s designed to merge in parsed values in one go and requires summoning a DictionaryAttr for your attribute list

And that’s a nasty flag day to pull on people that breaks backwards-compatibility in a rather big way.

So, I do agree with you in the long term, and have a gentler deprecation plan (that also means we can skip all this callback stuff)

Improve prop-dict parsing so it’s not just parsing an attribute when there’s an assembly format (per below)
Have some opt-in mechanism for “warn on missing prop-dicts and read from the attr-dict instead” to allow migrating textual IR in a more gradual manner
Migrate upstream dialects over to a prop-dict / attr-dict separation
TableGen warning if you don’t have a prop-dict but have inherent attributes that would go in it
Once that’s been around for a while, upgrade it to an error

I don’t think that intent has survived contact with production, especially since the whole bytecode setup is both recent and rarely used.

Not true. llvm-project/mlir/tools/mlir-tblgen/OpFormatGen.cpp at main · llvm/llvm-project · GitHub , llvm-project/mlir/tools/mlir-tblgen/OpFormatGen.cpp at main · llvm/llvm-project · GitHub , llvm-project/mlir/include/mlir/IR/OpDefinition.h at main · llvm/llvm-project · GitHub - which show property-to-attribute and attribute-to-property conversion being used ot implement prop-dict in assembly formats.

(That’s not a good thing, and I’d like to change it, but also I’m trying to be careful about not breaking people here)

mehdi_amini · August 11, 2025, 4:33pm

OK, I see what you mean, but I’m not sure why we should do it in on the OperationState immediately instead of later when actually creating the operation?

Right: I do have strong concerns about such APIs. The conversion to/from attribute is something I add for properties only to deal more easily with generic printing and help with some things during the transition: this isn’t something that I’d like to see sprinkled everywhere and inserted in the production path as a long-term plan.

Sure: provide an opt-out to people as a flag, but leave them out of the optimized path. That is we don’t need to change the infrastructure to be efficient for backward compatibility cases.

How so?
I don’t know of any production path that is sensitive to textual printing/parsing: that does not mean they exists, but that’s absolutely not gonna be a reason that we should start optimizing the efficiency of it. Improvements are welcome as long as they don’t complexity anything (API surface, maintainability, etc.): that is performance of the textual printing/parsing is very much secondary to everything else.
As far as I know: Flang, ClangIR, JAX/XLA, CuteDSL, etc. are all MLIR-projects that are (correctly) avoiding the textual assembly in the production path.

krzysz00 · August 11, 2025, 5:10pm

I think both are reasobale options. If we want to do this at operation creation, we’ll want to scrap the unused populateDefaultProperties(), and instead go in for OperationName::initializeDefaultsForUnsetAttrs(OpaqueProperties) (I’m open to the bikeshed) and a setInherentDiscardableSplitGuarantee(bool) on the state.

One reason I liked doing this on the OperationState was that it would allow things like clones to skip the default-construction step since they already have a known-good set of properties … but that’s marginal and I’m not strongly invested in the exact shape of the changes to operation state and create.

I think there’s an argument for the ability to introspect properties in a general way, but there might be some other solution there. I think that the degree to which we want that sort of thing will be a good chunk of the ODM on Thursday.

The other reason you’d want marshaling into attributes is FFI binding, maybe.

And re production text format usage, I may also just have a funky perspective there, where I’ll often see textual IR for models (PyTorch-flavoref, IREE-flavored, and so on) being passed around as a useful artifact.

Topic		Replies	Views
Non-attribute properties? MLIR	12	351	July 28, 2025
[RFC] Introducing MLIR Operation Properties MLIR	23	4266	December 15, 2023
Python bindings + Properties MLIR	13	384	October 20, 2025
[RFC] Deprecating `usePropertiesForAttributes = 0` MLIR	9	412	February 19, 2026
MLIR Open Meeting: Properties design discussion and next steps Announcements mlir	9	654	August 27, 2025