[RFC] Introducing MLIR Operation Properties

This is a proposal, not completely fleshed out on every aspect (PDL and DRR for example), but worth discussing since this is touching on a very core aspect of our modeling.
I tentatively plan to present this in the Open Meeting next Thurday, but I may have to push it out one week.

Background

Operations extensibility has been centered around the use of _Attribute_s since the beginning: this is the only mechanism to attach “data” to an operation. The operation only keeps a reference to the data: the Attribute stores the data uniquely in the MLIRContext. When an Attribute is created, the data is hashed and checked against a dedicated storage in the context if it is already present in which case the existing data is used, otherwise the data is copied into the storage. As such all data is owned by the MLIRContext and only gets destroyed when the context is deleted. One advantage of this is that a simple pointer comparison is enough to compare two attributes for equality, which is advantageous for large data.

A new kind of Attribute was recently introduced around the concept of “Resource”. This is still a work in progress and the documentation does not exist yet, here is the dialect interface. The idea is to decouple 1) the storage of the data from the IR (the data isn’t printed to the MLIR file) and 2) the lifetime of the data from the MLIRContext: the resource does not copy the data into the context and as such we can create such an Attribute from already existing large blob of data in memory. The data can also be shared across multiple MLIRContext. This is convenient when referring to large data stored in files (like ML weights) that can be mmap and referred to by the IR.

An Operation actually holds a single Attribute: a DictionaryAttr. All interactions through accessors are wrappers around this DictionaryAttr. For example, let’s take arith.cmpi:

def Arith_CmpIOp
  : Arith_CompareOpOfAnyRank<"cmpi",
                             [DeclareOpInterfaceMethods<InferIntRangeInterface>]> {
  let summary = "integer comparison operation";
  let arguments = (ins Arith_CmpIPredicateAttr:$predicate,
                       SignlessIntegerLikeOfAnyRank:$lhs,
                       SignlessIntegerLikeOfAnyRank:$rhs);

Which appears in the textual assembly this way:

// Custom form of scalar "signed less than" comparison.
%x = arith.cmpi slt, %lhs, %rhs : i32

// Generic form of the same operation.
%x = "arith.cmpi"(%lhs, %rhs) {predicate = 2 : i64} : (i32, i32) -> i1

// Custom form of vector equality comparison.
%x = arith.cmpi eq, %lhs, %rhs : vector<4xi64>

// Generic form of the same operation.
%x = "arith.cmpi"(%lhs, %rhs) {predicate = 0 : i64}
    : (vector<4xi64>, vector<4xi64>) -> vector<4xi1>

The defined Attribute here models the predicate for the comparison, accessors are generated:

  ::mlir::arith::CmpIPredicate getPredicate();
  void setPredicate(::mlir::arith::CmpIPredicate attrValue);

The arith::CmpIOp::fold() method shows examples of this usage: getPredicate() and setPredicate(). The underlying implementation works by looking up the Attribute in the Dictionary by name.

Here is how the lookup is optimized:

  • Getting the key to use in the dictionary (a StringAttr), while avoiding getPredicate() having to query the MLIRContext and rehash the "predicate" string key.

    An Operation has an OperationName attached to it: this is the structure defined in the MLIRContext when an Operation is registered. It holds the traits, interfaces, etc. specific to an Operation. It also contains an array of StringAttr: these are the names of the attributes used as keys in the Dictionary, stored in a predefined order (for this operation there is a single entry matching the "predicate" StringAttr). This allows to directly retrieve the key at the price of 3 pointers dereferencing.

  • The Dictionary stores the attributes in an array sorted by keys, as long as there are less than 16 entries in the dictionary, we use a linear scan comparing just the pointer value of the StringAttr key. Otherwise we fallback to a custom binary search based on more expensive StringRef comparisons.

Now for the setter implementation:

  • The provided enum value has to be converted to an Attribute:

    mlir::arith::CmpIPredicateAttr::get(context, value) which will lookup the storage for this attribute in the MLIRContext, hash the value and try to find an already existing Attribute or create a new unique one.

  • The StringAttr key for the Dictionary entry is looked up the same way as for the lookup above.

  • Since Attributes are immutable, the DictionaryAttr attached to the operation must be entirely rebuilt. We’ll first copy the entire dictionary into a NamedAttrList (basically a SmallVector of NamedAttribute). The NamedAttrList is then modified in place to add or modify the attribute. This involves looking up the entry if it exists (using the same process as the lookup above) and modifying it in place, or inserting a new entry in the vector in the sorted position.

  • Finally the NamedAttrList is converted to a DictionaryAttr by looking up the storage in the MLIRContext, hashing the entire array of attributes, and returning an existing Attribute or inserting the array into the storage and returning the new DictionaryAttr.

These accessors are quite involved, this is why it is preferable to cache the result of the getter as much as possible:

  …
  … (...,  op.getPredicate(), …);
  …
  … (...,  op.getPredicate(), …);

would be better written:

  auto pred = op.getPredicate();
  …
  … (...,  pred, …);
  …
  … (...,  pred, …);

Also, changing multiple attributes using the setter is very inefficient. With an operation that has two i64 integer attributes axis1 and axis2, the sequence:

  op.setAxis1(0);
  op.setAxis2(1);

will go through the process described above twice and build two new dictionaries that will “leak” permanently in the MLIRContext. The most efficient way to update multiple attributes involves avoiding the convenient accessors:

  int64_t newAxis1 = 0, newAxis2 = 1;

  // Copy the dictionary into a vector of attributes.
  NamedAttrList attrs(op.getAttrDictionary());

  // Mutate the vector of attributes in-place

  // Using a string key for example.
​​  attrs.set("axis1",  IntegerAttr::get(IntegerType::get(ctx, 64), newAxis1);
  // It is more efficient to use precomputed StringAttr keys using accessors like
  // getAxis2AttrName() instead of a string key.
  attrs.set(op.getAxis2AttrName(),  IntegerAttr::get(IntegerType::get(ctx, 64), newAxis2);

  // Build a new DictionaryAttr in the context.
  DictionaryAttr dict = attrs.getDictionary(ctx);

  // Update the operation in-place by swapping-in the new Dictionary.
  op.setAttrs(dict);

This is clearly a lot of boilerplate, and it’s not surprising that the convenience accessors are pervasively used everywhere despite the cost.

Introducing Properties

While attributes are a very flexible and extensible mechanism to attach data to an Operation, the price to pay (both in terms of runtime cost associated with the accessors and the lifetime tied to the MLIRContext) does not make it necessarily a good tradeoff for most common small data attached to operations like the predicate on arith.cmpi example above.

This proposal isn’t completely fleshed out and comes with a few possibilities moving forward, each with their own tradeoffs. I will try to expose this in details, so that we can evaluate our options during the next open meeting.

I will first introduce a new concept I call “properties”: the idea is to allow operations to carry some data without involving any attribute:

  • The data is allocated inline with the Operation and can be used as an alternative to attributes to store data that is specific to an operation.

  • An Attribute can also be stored inside the properties storage if desired, but any other kind of data can be present as well.

  • This offers a way to store and mutate data without uniquing in the Context contrary to attributes.

  • The lifetime of the data is tied to the Operation.

D141742 is the implementation for this new feature and the OpPropertiesTest.cpp has an example where a struct with a std::vector&lt;> as well as a std::shared_ptr&lt;const std::string> attached to an operation and mutated in-place.

In terms of ODS, the integration is functional, but there is room for further improvements. Right now it’ll look like this for example:

​​// Op with a properties struct defined inline.
def TestOpWithProperties : TEST_Op<"with_properties"> {
  let assemblyFormat = "prop-attr attr-dict";
  let properties = (ins
    Property<"int64_t">:$a,
    StrAttr:$b, // Attributes can directly be used here.
    ArrayProperty<"int64_t", 4>:$array // example of an array
  );
}

All the boiler plate is generated automatically, and the properties can be accessed through a new accessor named getProperties(). In the example above it’ll return a mutable reference to a struct looking like:

struct Properties {
  int64_t a;
  StringAttr b;
  int array[4];
}

The generic syntax for operation is modified to print the properties as an Attribute in between &lt; > after the operand list (this implies that unregistered operations will have an extra Attribute storage available). The TestOpWithProperties above will be printed generically as:

"test.with_properties"() <{a = 32 : i64, 
                           array = array<i64: 1, 2, 3, 4>,
                           b = "foo"}> : () -> ()

Of course custom printers and parsers have full freedom as usual. The ODS declarative assembly does not support yet referring to individual properties members. Mutating the properties for this Operation in an optimal way is trivial and does not involve any gotchas:

  auto opWithProp = dyn_cast<TestOpWithProperties>(op.get());
  // Get a mutable reference to the properties for this operation and modify it
  // in place one member at a time.
  TestOpWithProperties::Properties &prop = opWithProp.getProperties();
  prop.a += 42;
  prop.b = StringAttr::get(ctx, "some label");
  prop.array[2] += prop.a;

Right now, build methods generated through ODS haven’t been updated to take Properties as input, however the properties can be updated post-creation in a similar way as the mutation above:

  auto opWithProp = builder.create<TestOpWithProperties>(loc);
  // Get a mutable reference to the properties for this operation and modify it
  // in place one member at a time.
  TestOpWithProperties::Properties &prop = opWithProp.getProperties();
  prop.a = 42;
  prop.b = StringAttr::get(ctx, "some label");
  prop.setArray({1,2,3,4);

We will update ODS generators to emit build methods that will support:

  auto opWithProp = builder.create<TestOpWithProperties>(loc, /*a=*/42;
                        /*b=*/StringAttr::get(ctx, "some label"),
                        /*array=*/{1,2,3,4});

Not Discardable

One aspect of the DictionaryAttr attached to an operation is that it mixes inherent and discardable attributes. As defined in LangRef:

_ The attribute entries are considered to be of two different kinds based on whether their dictionary key has a dialect prefix:_
  • inherent attributes are inherent to the definition of an operation’s semantics. The operation itself is expected to verify the consistency of these attributes. An example is the predicate attribute of the arith.cmpi op. These attributes must have names that do not start with a dialect prefix.
  • discardable attributes have semantics defined externally to the operation itself, but must be compatible with the operations’s semantics. These attributes must have names that start with a dialect prefix. The dialect indicated by the dialect prefix is expected to verify these attributes. An example is the gpu.container_module attribute.

On the other hand, the Properties data-structure is non-discardable and not designed to be introspectable opaquely (that is the client needs to know the Operation class to access the properties, possible through an OpInterface). It is an alternative solution to inherent attributes.

When to use Properties

Properties are almost always like an appropriate replacement for inherent attributes. In particular, Attribute can still be used inside properties, allowing to make every possible tradeoff within the Properties framework. Ultimately, defining a DictionaryAttr as Properties for an Operation would be strictly equivalent to the existing storage, but the current Attribute storage will always stay relevant for discardable attributes anyway.

When to not use Properties

  • When you intend to expose an attribute opaquely through the dictionary. However it is likely that we may prefer to rely on OpInterface to provide access to the attribute. For example for handling FastMathFlags, we likely should have a FastMathOpInterface wrapping the access to the flags instead of using name lookups in the dictionary.
  • When you need to model something that will be exposed through an AttributeInterface (you can’t attach interfaces to properties right now). Although, the Attribute can be stored inside the Properties and exposed as such.
  • PDL introspection is not supported right now for something not stored in the properties as an attribute. DRR support hasn’t been explored either.

Implementation Notes

Here is a walkthrough of the example in mlir/unittests/IR/OpPropertiesTest.cpp:

// A c++ struct with 4 members.
struct TestProperties {
  int a = -1;
  float b = -1.;
  std::vector<int64_t> array = {-33};
  // A shared_ptr to a const object is safe: it is equivalent to a value-based
  // member. A non-const shared_ptr would be unsafe: after cloning an Operation,
  // any mutation of the pointed value from the original would apply to the clone 
  // as well.
  // Here the label will be deallocated when the last operation
  // referring to it is destroyed. However there is no builtin pool-allocation:
  // this is offloaded to the client.
  std::shared_ptr<const std::string> label;
};

Before being able to use this as properties on an operation, three functions must be defined:

// Compute a hash for the structure: this is needed for
// computing OperationEquivalence, think about CSE.
llvm::hash_code computeHash(const TestProperties &prop);
// Convert the structure to an attribute: this is used when printing 
// an operation in the generic form.
Attribute getPropertiesAsAttribute(MLIRContext *ctx,
                                   const TestProperties &prop);
// Convert the structure from an attribute: this is used when
// parsing an operation from the generic form.
LogicalResult setPropertiesFromAttribute(TestProperties &prop,
                           Attribute attr,
                           InFlightDiagnostic *diagnostic);

Any operation can very easily register this structure through a simple type alias declaration:


/// A custom operation for the purpose of showcasing how to use "properties".
class OpWithProperties : public Op<OpWithProperties> {
public:
  // Begin boilerplate
  …
  // End boilerplate

  // This alias is the only definition needed for enabling "properties" for this
  // operation.
  using Properties = TestProperties;
};

The properties is then directly accessible, here is an example involving accessing and mutating the data attached to an operation:

  auto opWithProp = dyn_cast<OpWithProperties>(op.get());
  // Get a mutable reference to the properties for this operation and modify it
  // in place one member at a time.
  TestProperties &prop = opWithProp.getProperties();
  prop.a += 42;
  prop.b = 42.;
  prop.array.push_back(42); // std::vector privately attached to this operation.
  prop.label = std::make_shared<std::string>("foo bar");

The getProperties() accessor is cheap: it retrieves directly a member of the operation by computing an offset, this is no different from getting any other member of an Operation (result, operands, type, …). From there, everything is purely native C++: there is nothing hidden and nothing specific to MLIR involved (in particular the MLIRContext is not involved at all).

While by default, the parsing and printing will use the ​​getPropertiesAsAttribute()/setPropertiesFromAttribute() and rely on existing attribute printing/parsing format, 2 extra functions can also be used to customize the format:

static void customPrintProperties(OpAsmPrinter &p,
                                  PropertiesWithCustomPrint &prop);
static ParseResult customParseProperties(OpAsmParser &parser,
                                         PropertiesWithCustomPrint &prop);

See the TestDialect.cpp file in the revision above for an example.

In terms of ODS, the integration is functional, but there is room for further improvements. See some examples in TestOps.td, including this one for example:

​​// Op with a properties struct defined inline.
def TestOpWithProperties : TEST_Op<"with_properties"> {
  let assemblyFormat = "prop-attr attr-dict";
  let properties = (ins
    Property<"int64_t">:$a,
    StrAttr:$b, // Attributes can directly be used here.
    ArrayProperty<"int64_t", 4>:$array // example of an array
  );
}

TableGen will generate the following C++:

class TestOpWithProperties : public Op<TestOpWithProperties,  /*traits*/… > {
public:
  // Begin boilerplate
  …
  // End boilerplate

  struct Properties {
    using aTy = int64_t;
    using bTy = ::mlir::StringAttr;
    using arrayTy = int64_t[4];
    aTy a;
    bTy b;
    arrayTy array;

    int64_t getA() {
      auto &storage = this->a;
      return storage;
    }
    void setA(const int64_t &value) {
      auto &storage = this->a;
      storage = value;
    }
    ::mlir::StringAttr getB() {
      auto &storage = this->b;
      return storage;
    }
    void setB(const ::mlir::StringAttr &value) {
      auto &storage = this->b;
      storage = value;
    }
    ::llvm::ArrayRef<int64_t> getArray() {
      auto &storage = this->array;
      return storage;
    }
    void setArray(const ::llvm::ArrayRef<int64_t> &value) {
      auto &storage = this->array;
      ::llvm::copy(value, storage);
    }
  };

  …

The setPropertiesFromAttr, getPropertiesAsAttr, and computePropertiesHash are also automatically generated, dispatching to every single individual member. In this case no C++ code needs to be written by hand by the user, the boilerplate is entirely generated.

Performance Downsides & Considerations

Registered Operations that don’t use Properties don’t pay any price for the existence of this feature in the codebase. Unregistered operations (including in registered dialects) will have a single new attribute member added (we don’t know if they will use or not some property storage).

While the advantages of using Properties over Attribute should seem fairly straightforward by now, there are some downsides when using Properties:

  • Memory footprint may increase: Operation allocations get larger than before. While “properties” should stay small there is a tradeoff between “pooling” the allocation in the context (and leaking memory there “forever”) and inlining it within the operation. Note that the proposed scheme would allow a Dialect to implement some intermediate scheme: constant data could be pooled in the context but using reference counting to free the data when all operations referring to them are deleted.
  • Comparison is no longer “single pointer”: checking that two operations have the same Properties requires calling the Properties comparison operator, while with Attributes comparing the DictionaryAttr pointer is enough.
  • As seen on OperationName, more hooks are introduced to interact with Operation adding some extra runtime cost to come operations:
    • When creating an operation, we initialize the properties by calling its default constructor (through an indirect call) before calling the assignment operator.
    • When cloning an operation, we call the assignment operator and copy the properties.
    • When deleting an operation, we call the properties destructor.
    • OperationEquivalence (called by CSE for example) will hash the properties (through an indirect call).

Breaking the Model Further: opt-in all inherent attributes into Properties

The proposal above is designed with backward compatibility in mind: landing this as-is wouldn’t break any existing code out there immediately. This allows for a gradual adoption of this feature. However there is an argument that we’re breaking some fundamental assumptions of the underlying MLIR model: trying to piecewise clone or propagate information from one operation to another will be broken. Existing MLIR code that manipulates and transforms operations opaquely this way could become subtly broken when a dialect starts to adopt properties. A more immediately breaking but more consistent approach would be:

  • Stop storing the inherent attributes (so the ones declared in ODS) in the DictionaryAttr. Instead store them as individual members of the properties. The DictionaryAttr would be exclusively reserved for Discardable attributes.
  • Maintain Operation::getAttr(StringRef) behavior and so backward compatibility by first checking if the name is a registered inherent attribute, in which case return it from the properties, otherwise return it from the discardable attributes DictionaryAttr.
  • Rename Operation::getAttrs() to Operation::getDiscardableAttrs(), and Operation::setAttrs(DictionaryAttr) to Operation::setDiscardableAttrs(DictionaryAttr), so that any uses of the former APIs would be a build breakage and would have to be updated, allowing to audit and update to account for properties. This is the only breakage that downstream clients would incur.

This also presents with the benefit that every existing use of inherent attributes in ODS would immediately be turned into properties, maximizing the benefit of the feature. Although a proper migration to properties involves changing the ODS definition to declare members as “properties” so that they aren’t stored as attributes any more (for example storing the arith.cmpi comparator as a C++ enum inline instead of an IntegerAttr).

We can also very well get there in steps gradually, adopting properties as opt-in first before deprecating and removing the storage of inherent attributes in the DictionaryAttr (through an opt-in at the dialect level for example).

6 Likes

Huge +1 from me. I know there have been a lot of people who’ve been asking how to attach “arbitrary” data (mostly things that don’t really fit the attribute concept, like a pointer for example!) to operations and this would model this perfectly. As one of those mad man myself, having an Attribute within my dialect that stores just Opertion* never sat right with me either (both for technically being UB due to pointer zapping and just leaking memory a lot).

I have just a few questions/observations.
In your first suggested implementation that is backwards compatible, each Op class generated by TableGen would have a getProperties() method that would return the struct which then has getters and setters for accessing it.
I am wondering whether there is any reason why any access to properties would have to go through that instead of just generating the getters and setters in the Op class?

Next for your opt-in all attributes as properties, am I understanding correctly that the only breaking change would then be that the attributes can’t be introspected via the DictionaryAttr anymore?

And would this version also allow properties within the arguments DAG field instead of having a separate properties field?
I think the latter is a bit awkward while the former would make things a lot easier when migrating from attributes to properties as well (one would just have to change a I32Attr to Property<"int32_t">). We’d then also preserve the total ordering of all operands, properties and attributes (although the only use of this I can currently think of would be for DDR).

The reason is to “fix” something that never sat well with me initially: the fact that the setters/getters are “polluting” the class namespace. That is they may conflict with methods inherited from other traits/interfaces and may lead to surprising issues sometimes. I think we should have from the beginning organized this with “adaptors”: op.ods().getPredicate() or something like that.
This is up for debate of course, and adding the generation of forwarding accessors should be a trivial change to my current patch! :slight_smile:

Yes. We could preserve the existing accessors, we could support Operation::getAttr("predicate"), but we would remove DictionaryAttr Operation::getAttrs().

Yes, that would make sense I think.

1 Like

Haha and see I actually think of those as being the namespace ones and the rest as not op specific and should be namespaces/scoped as the accessors are for what is intrinsic to the op and the generated class is a view of the general for the specific op. I’d have had op.A iff op.ods().A would have made sense and the rest scoped (perhaps behind ->) but don’t know if possible with inheritance.

We also talked about bytecode here for these (and sorry I haven’t read yet, and quite a bit of info so will be great to discuss in ODM, perhaps one after next is good to give some discussion time, but will try to make time either way).

Great proposal! I’m generally in favor of this. Having to go through immutable data structures in a language that doesn’t have first class support and optimization for these has been bothering me for a while. I do have some questions though.

  • How difficult would it be to estimate the IR size (memory footprint) increase and the runtime improvement thanks to not reconstructing the dictionary? It would me much easier to make the decision when we clearly know we are trading off N% memory for M% time.
  • Attribute creation in the context is thread-safe, operation mutation is not be it for operands or properties. Do you foresee any multithreading-related issues with properties?
  • How do we handle the lack of opacity in C API and language bindings relying on those? We can currently create and manipulation operations through the C API easily thanks to generic opaque methods. With properties being non-opaque, we would have to define C APIs for each individual operation with properties similarly to what we do for attributes and types today. GIven that there are exponentially more operations, this looks like a significant overhead.
  • The prototype prototype currently avoids modifying the build functions, which implies that either all properties must have a valid default value or we may create operations in an “invalid” internal state using the builder (which should be impossible under the current model). Any suggestions on how to address that?
  • Do “system” attributes like operand/result_segment_sizes become properties?
  • One of the benefits of properties compared to attributes is being able to free up the memory they occupy. Can’t we solve this separately by reference-counting attributes in the context?
  • What is the longer-term user-friendly way to differentiate properties and attributes? I’m concerned with the conceptual complexity of an operation having both inherent/discardable attributes (or just discardable attributes) and properties that may or may not be attributes themselves. Maybe we should separate the notion of inherent/discardable (operation/property) from the context-level storage somehow. Ideally, we’d have an explanation that is as easy as the current “types contain compile-time information about a value, attributes contain compile-time information about an operation”.

My initial gut reaction is a slight -1 because:

  1. I think it is generally bad practice to be arbitrarily in-place mutating operations, which is the path that seems most optimized by this proposal.

  2. This proposal moves us further away from having a fixed ontology of the IR since it mixes in random C++ code in the properties. Having the IR be overall introspectable and free from opaque C++ code helps with bindings and interop.

I agree with the various issues you have pointed out though. I think of all the options, using something like properties to completely replace the concept of inherent attributes would be great. Realistically use of discardable attributes is almost always a bug (who has tests that verify that their system still works if they are discarded? or that documents/tests that their passes provide some useful contract about not discarding?), so moving away from that concept and more towards “Operations have a fixed set of introspectable properties declared in ODS, and we have some slow-path monkeypatching mechanism that can be used to hold discardable attributes” would be something I would support. It would be a huge change but I think it would overall clean up the system.

Seems quite difficult to me, as it would be highly use-case dependent: that is we can create synthetic example in the extreme in both directions.
We could take IREE for example as a test-bench, but we’d need to first rewrite all the involved dialects: we’ll get there eventually but that’s a non-trivial thing to do (beyond a quick experiment).

I don’t think so: no API on Operation is thread-safe, including changing attributes already.

I haven’t thought about this yet, we can bind these generically right by going through attribute but that’s not very efficient!
We may be able to auto-generate the C API for most properties from ODS though?

I was planning on changing build functions, this is just not implemented yet (didn’t want to invest too much in the prototype, pending discussions here). Do you foresee any specific issue here?

They could (and probably should!).

We may be able to, but when we looked into this in the past with River that was a non-trivial change with respect to how attribute storage works (for example bump pointer allocators…).

We are already beyond this with respect to inherent/discardable attribute? Actually I think the current situation isn’t great because we blend the two together and people have the “easy” conceptual modeling you mention instead of the “real thing”.

Can you clarify what do you mean by “arbitrarily in-place mutating operations”? That is: we expose right now a setter on the Op like this for example: CmpIOp::setPredicate(::mlir::arith::CmpIPredicate predicate), which seems like an “arbitrary in-place mutation” to me.
(actually Operation::setAttr(name, value) is probably is more “arbitrary” in terms of in-place mutation as there is no encapsulation)

It seems to me that you have here an “idealized” view of Attributes which makes sense only if we limit to the builtin attributes (and even there, we have resources/blob…). Unfortunately attributes can already model “any C++ code” and aren’t introspectable themselves!
As far as I know, the only limit that comes Attribute that is limited is 1) immutability & uniquing 2) lifetime tied to the MLIRContext.
(And for 1) it isn’t even absolute I believe: one can work around and mutate Attribute in-place somehow, with some ugliness).

Maybe something more blunt: print something when we reconstruct the dictionary and then grep/sort? Would at least give us an idea of how much this is happening and let us estimate the benefit?

I mean I don’t think it is good practice for passes to do lots of in-place mutations at all. I know there exist passes that go and attach attributes to “every op” but I don’t think those are very good practice.

I agree that is the current state, but it makes bindings harder. I think it would be useful to not move more in that direction.

I am surprised by this: the whole concept of any transformation is to mutate the code! Whether is it replacing an operands, updating a type, or changing an inherent attribute.
Even canonicalization does that!

For example turn a predicate from (-a < -b) to (a >= b)

  %minus_a = arith.negf %a : f32
  %minus_b = arith.negf %b : f32
  %cmp = arith.cmpf lt, %minus_a, %minus_b : f32
  return %cmp : i1

=>

  %cmp = arith.cmpf ge, %a, %b : f32
  return %cmp : i1

A canonicalization implementation here would just update the operands in-place and update the predicate in place as well.

I suspect we’re using some different terminology here and talking past each other?
Unless maybe you’d be advocating for a more “functional” compiler where Operation are “immutable” and we create new IR to replace existing one instead of mutating in-place?
In which case in my example above you would not mutate arith.cmpi but instead create a new one, RAUW the original one, and delete it?

What do you have in mind here? Because I can think of some passes using attribute for “temporary marking” and things like that, but that’s not for inherent attributes right? (and so not for properties).

Yes I think that mutating inherent attributes (or properties) is more ok, like in your cmpf example as a microoptimization. But I think that the level of in-place updating in your op like TestOpWithProperties is really atypical. Or, to say it another way, I don’t think that optimizing inherent attribute mutations is really going to be the big win of this proposal (that’s just my intuition, maybe you have measurements otherwise) – I think that the big win will be clearly separating inherent attributes and making them into “properties” that are stored more efficiently even in the read-only case.

Yes, I think that “temporary marking” with discardable attributes is kind of an antipattern (actually, I would even call it contradictory, since the concept of “discardable” is that they can be discarded, but usually the markings are load-bearing).

I can’t make the call later today due to a conflict, but I’m very excited about this work. This will be a huge step forward for MLIR’s IR efficiency, both by eliminating silly ‘uniquing’ trips through the context and by reducing string compares and pointer hops through the dictionary. It’s a big effort, but I think it will be a huge step forward!

How would arbitrary C++ properties interact with the serialization/deserialization? It feels like Attributes tend to be created with the text representation in mind, but maybe there are some that break it already. You briefly covered it in your presentation, but it would be nice to know if there are any guarantees.

Custom printing is always under the control of the users, but generic printing is also always possible because we enforce that the properties are convertible to Attribute. Search for setPropertiesFromAttr and getPropertiesAsAttr in this topic as well as in the patch to see these in action.

Of course someone could always inject any C++ object that would not properly serialize/deserialize, but that’s also already possible with Attributes today (you can manually write an Attribute in C++ and bypass every mechanism), it’s just not possible to exercise this through .mlir files anymore…

1 Like

+1, (although not exactly this problem, quite related) in compiling large (~1GB) code bases in circt, a lot of runtime and memory goes into making attributes for calling the folders, attribute which are otherwise unused for the reset of the execution. It would be nice to reduce the reliance on long-term, uniqued storage for transient purposes.

This has also been problematic and annoying. If for no other reason than the namespace of statically named, operation required data is mixed with arbitrary string-keyed extensible data.

1 Like

The patch is now somehow complete I believe, or at least for what I’d like to land initially (I know of some followup around the declarative assembly format) and it is supposed to be backward compatible.
It is also updated taking into account the feedback received during the Open Meeting presentation. In particular a new ODS flag for dialects (usePropertiesForAttributes) will automatically use properties to store inherent attribute. We’ll make this the default in the future and deprecate using the DictionaryAttr to store inherent attributes. This dictionary will be reserved to discardable attributes. We also introduced methods on the Operation class to access inherent attribute and discardable attribute through specific APIs to split their namespace, we’ll deprecate the generic getAttr, setAttr, etc. in the future. At the moment these are all made backward compatible.

The stack of changes you can see on Phabricator updates all the dialects in-tree, see for Linalg for example.

I have a question about this on the side: I’m not familiar enough with pre-patch implementation limitations, but does this take us considerably farther away from an observable IR or does it not make any difference in the grand scheme of things?

ATM all observability I know of is through OpBuilder listeners such as with PatternRewriter. They already can’t catch any direct mutations you do on ops. But that’s fine, they can still convert results, for example. Was attribute creation ever intercepted anywhere?

I think you can frame this more generally than « observability »: this is touching on introspection in general.
That said introspection of attributes and types has always been intrinsically limited: they are opaque structures defined by dialects.
Still we’re removing at least the ability to query the entries in a dictionary.

There isn’t any listener for modifying attributes at the moment, but properties will make it impossible to implement [RFC] Introduce the concept of IR listeners in MLIR comprehensively.

Patch is ready to land now, I have a chain of patches to migrate all the dialects in-tree. This seems to be mostly backward compatible from an API point of view (we’ll likely deprecate a few things later, but that’ll be worth another thread).
The breaking change will be in the generic printer syntax!

1 Like