[RFC] Data Layout Modeling

Motivation

A data layout description in MLIR is long overdue: there exist types that represent data in memory, but no specification on how exactly it is stored. At the LLVM dialect level, we reuse DataLayout from LLVM IR, which may lead to surprising behavior if transformations on the higher-level representations did not account for the DataLayout that will only be introduced later.

A set of decisions related to the target and layout, for example the bit width of the index, is currently encoded as pass options when converting to the LLVM dialect and is therefore invisible in the IR.

A data layout model will help clarify the address computation model in memref, enable support of custom element types in memref, and open the door for generic modeling of custom types with memory-reference semantics. (Tangentially to data layout, being able to identify types that can represent memory references is also important for alias analysis. Containers that require elements to have data layout are likely memory references.)

Similarly, this can also help built-in container types whose semantics are related to size or other layout information, e.g., vectors, support dialect-specific element types.

Requirements

  • The data layout mechanism should support MLIR’s open type system: we don’t know in advance which types will want to use it and how their layout can be parameterized (e.g., having size/abi-alignment/preferred-alignment tuple is likely not enough).
  • The data layout should be controllable at different levels of granularity, for example nested modules may have different layouts.
  • The data layout should be optional, i.e. types should have a well-specified default “natural” layout that can be used in absence of a layout descriptor.

Observations

MLIR does not allow one to express a type class, i.e. a set of all possible instantiations of the given parametric IR type such as IntegerType or FloatType, in the IR. It is not desirable to list all instantiations of a type class as their number may be huge (e.g., we support up to i16777215). At the same time, the relation between different instances of a type class when it comes to layout is specific to the type (e.g., integers may want to round to the closest power-of-two bits, structures may want to pad elements, etc.).

Data layout can contain entries that are not specific to any type, such as endianness.

Proposal

Operations Defining the Data Layout

The proposed mechanism is based on existing MLIR concepts: attribute, operation and type interfaces. Operations willing to support a concrete data layout implement a DataLayoutOperationInterface interface, which allows one to opaquely obtain the data layout to use for regions (transitively) nested in the operation. The layout parameters are described by an array attribute containing DataLayoutEntryAttr instances. Each instance is essentially a pair of PointerUnion<Identifier, Type> and Attribute. The first element in the pair identifies the entry, and the second element is an arbitrary attribute that describes alignment parameters in a type-specific way. Data layout entries specific to a type or type class use Type as the first element of the pair; generic entries use an Identifier.

For example, ModuleOp will likely support the data layout attribute and may resemble the following in textual IR format:

module attributes { datalayout.spec = [
  #datalayout.entry<"endianness", "big">,
  #datalayout.entry<i8, {size = 8, abi = 8, preferred = 32}>,
  #datalayout.entry<i32, {size = 32, abi = 32, preferred = 32}>,
  #datalayout.entry<memref<f32>, {model = "bare"}>
]} {
  // ops
}

Types Subject To Data Layout

Types willing to use a layout must implement the DataLayoutTypeInterface by implementing the following functions:

static LogicalResult verifyLayoutEntries(ArrayRef<DataLayoutEntryAttr>);
size_t getSizeInBits(ArrayRef<DataLayoutEntryAttr>) const;
size_t getRequiredAlignment(ArrayRef<DataLayoutEntryAttr>) const;

and may additionally implement:

size_t getPreferredAlignment(ArrayRef<DataLayoutEntryAttr>) const;

The verification function is used to ensure the well-formedness of the list of relevant entries, e.g. the absence of duplicate entries or the use of the expected attribute kind to describe the type-specific layout properties. All the other functions are expected to return the corresponding value in bits. Their argument is an unordered list of DataLayoutEntryAttrs with the first element either belonging to the same type class (e.g., IntegerType will receive entries for i8, i32, i64, etc. when present) or being generic (i.e., all types receive all generic entries). Therefore, types cannot be affected by layout properties of other types otherwise than by querying those properties through the interface. The list may be empty if the layout is not specified, and the functions are still expected to return meaningful values, e.g. the natural alignment of the type, without failing the verification. Additional methods can be added later to this interface.

Each type class implements, and must document, an algorithm to compute layout-related properties. This algorithm is fixed and can use as parameters the parameters of the type instance (e.g., the integer bit width) and the data layout entries. The mechanism of interpreting the data layout entries is specified by the type class and is opaque to MLIR’s general mechanism.
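
As an illustration only, the fixed algorithm of an integer type class could look like the sketch below. Everything in it is an assumption made for the example and is not part of the proposal: IntEntry is a hypothetical stand-in for a DataLayoutEntryAttr keyed by an integer type, the function names are made up, and the "natural" layout (round the width up to a power of two, at least 8 bits) is just one plausible default.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a DataLayoutEntryAttr whose key is an integer
// type: bitWidth identifies the instance (i8, i32, ...), the other fields
// mirror the {size, abi} values from the module example.
struct IntEntry {
  unsigned bitWidth;
  uint64_t sizeInBits;
  uint64_t abiAlignInBits;
};

// Assumed "natural" layout used when no entry matches: round the width up
// to a power of two, with a minimum of 8 bits.
inline uint64_t naturalSizeInBits(unsigned bitWidth) {
  assert(bitWidth > 0 && "integer types have a positive width");
  uint64_t size = 8;
  while (size < bitWidth)
    size *= 2;
  return size;
}

// Fixed algorithm of the type class: an entry for exactly this width wins,
// otherwise the natural layout is used. An empty list is valid input.
inline uint64_t getIntSizeInBits(unsigned bitWidth,
                                 const std::vector<IntEntry> &entries) {
  for (const IntEntry &entry : entries)
    if (entry.bitWidth == bitWidth)
      return entry.sizeInBits;
  return naturalSizeInBits(bitWidth);
}

// Same lookup for the required (ABI) alignment; the natural alignment is
// assumed to be equal to the natural size.
inline uint64_t getIntRequiredAlignment(unsigned bitWidth,
                                        const std::vector<IntEntry> &entries) {
  for (const IntEntry &entry : entries)
    if (entry.bitWidth == bitWidth)
      return entry.abiAlignInBits;
  return naturalSizeInBits(bitWidth);
}
```

The point of the sketch is that the entries only configure the computation; the way they are interpreted stays inside the type class.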

Querying Data Layout Properties

The DataLayoutTypeInterface defines final methods that can be used to query layout properties of a type:

size_t getSizeInBits(Region &scope) const;
size_t getRequiredAlignment(Region &scope) const;
size_t getPreferredAlignment(Region &scope) const; 

// Potentially provide the default implementation.
size_t getSizeInBytes(Region &scope) const { 
  return ceil_div(getSizeInBits(scope), 8);
}

Note that these functions do not accept a list of data layout entries. Instead, the interface accepts a region in which the request is scoped (different regions may belong to, e.g., different modules with different data layouts) and identifies the relevant data layout entries using the following procedure:

  • find the first ancestor operation of scope that implements the DataLayoutOperationInterface interface;
  • obtain the layout attribute from this op;
  • continue looking for further ancestors and extract layout attributes from those ops;
  • combine the attributes; if there are two entries with the same key, the innermost in the region nesting sequence is chosen and the rest are discarded.
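
The procedure above can be sketched roughly as follows. This is a simplified model, not MLIR code: Op, hasLayout and LayoutSpec are hypothetical names, keys and values are plain strings instead of attributes, and the scope is represented by the operation enclosing the region.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical model of a layout specification: key -> value.
using LayoutSpec = std::map<std::string, std::string>;

// Hypothetical model of an operation in the nesting hierarchy.
struct Op {
  const Op *parent = nullptr; // enclosing operation, nullptr at the top level
  bool hasLayout = false;     // models implementing DataLayoutOperationInterface
  LayoutSpec layout;          // entries attached to this operation
};

// Walks from the scope outwards and combines the layouts of all ancestors
// that provide one. std::map::insert keeps already present keys, so an
// entry found closer to the scope is never overwritten by an outer one.
inline LayoutSpec collectLayout(const Op *scope) {
  LayoutSpec combined;
  for (const Op *op = scope; op != nullptr; op = op->parent) {
    if (!op->hasLayout)
      continue;
    combined.insert(op->layout.begin(), op->layout.end());
  }
  return combined;
}
```

With an outer module specifying an index width of 64 and a nested module specifying 32, a query scoped in the nested module sees 32, while entries defined only on the outer module remain visible.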

We may also consider additional type- and dialect-specific mechanisms of how the nested data layouts specifications are combined, but this is excluded from this proposal for simplicity.

Corollary and Example: MemRef of MemRef

Enabling memref-of-memref is a frequent request that has been blocked by the lack of a clear mechanism to allocate such objects and correctly index them due to the unknown size of a single value of a memref type (depending on the lowering convention, memref is treated as either a descriptor containing dynamic shape and stride information, or as a bare pointer to the first element; the size of the pointer may also be unspecified at levels higher than the LLVM dialect). It can be achieved by relaxing MemRefType to accept as element type any type that implements DataLayoutTypeInterface, making it itself implement this interface, and defining the size of the index type (assuming it is equal to the pointer size) and the lowering convention in the data layout.

The size computation algorithm, fixed for MemRefType as required by the mechanism, is as follows:

  • The data layout is expected to contain at most one entry, with a dictionary attribute containing a key “model” associated with a string attribute with value either “bare” or “descriptor”.
  • In absence of the entry, “descriptor” model is assumed.
  • If the model is “bare”, the size of the memref type is equal to that of the index type (query the mechanism recursively, IndexType is assumed to implement DataLayoutTypeInterface).
  • If the model is “descriptor”, the size of the memref type is equal to (3 + 2 * memref-rank) * size of the index type (recursive query + using parameters of the type).
  • The required alignment is always equal to that of the index type.
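
A minimal sketch of this algorithm, under stated assumptions: the function names are hypothetical, the “model” entry is passed as an optional string rather than as a dictionary attribute, and the size and alignment of the index type are passed in directly instead of being obtained through a recursive query.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <string>

enum class MemRefModel { Bare, Descriptor };

// Interprets the optional "model" entry; the absence of the entry means
// the "descriptor" model, as stated in the list above.
inline MemRefModel resolveModel(const std::optional<std::string> &model) {
  if (!model)
    return MemRefModel::Descriptor;
  assert((*model == "bare" || *model == "descriptor") &&
         "model must be either \"bare\" or \"descriptor\"");
  return *model == "bare" ? MemRefModel::Bare : MemRefModel::Descriptor;
}

// Size of a value of memref type (not of the data it refers to).
inline uint64_t getMemRefSizeInBits(unsigned rank,
                                    const std::optional<std::string> &model,
                                    uint64_t indexSizeInBits) {
  if (resolveModel(model) == MemRefModel::Bare)
    return indexSizeInBits;
  return (3 + 2 * static_cast<uint64_t>(rank)) * indexSizeInBits;
}

// The required alignment is always that of the index type.
inline uint64_t getMemRefRequiredAlignment(uint64_t indexAlignInBits) {
  return indexAlignInBits;
}
```

For instance, with a 64-bit index, any rank-1 memref in the “descriptor” model takes (3 + 2 * 1) * 64 = 320 bits regardless of its element type, whereas in the “bare” model it takes 64 bits.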

Please note that this is an example illustrating the proposal. Objections specifically to modeling memrefs should not preclude the infrastructural proposal from being implemented (without the model for memrefs).

Corollary for Type Casting

The semantics of bitcast / reinterpret_cast can now be clearly defined for types with data layout as interpreting the bit representation of the type as another type.

In addition, the data layout mechanism can be used to (dis)allow certain casts in dialect conversions.

Thanks for the write up! This is gonna be a big piece of infra :slight_smile:

Why is this something to implement for each type? What you’re describing here seems fairly generic?

That means that there has to be a default answer to all these queries right?

It isn’t clear to me what is the namespacing aspect for these keys in order to avoid conflict on keys across types.
Another aspect is how are the entries in the data layout handled by transformations when transforming the operation defining the layout, in particular in presence of arbitrary keys.

Hi Alex,

Thank you for tackling this. I think this could use an in-ODM-discussion. When you’re thinking about this, I’d encourage you to think beyond just data layout sorts of applications (where you need size and alignment) and to also think about how this generalizes to more complex things, like lowering a “CIL dialect” type to LLVM IR calling convention attributes. It’s the same problem, the latter is just an uglier form of this.

I see that you’re taking an approach similar to LLVM IR DataLayout, which tries to capture this in a declarative way. I’m concerned that this is not expressive enough, and that it brings forward a failed model (one that doesn’t handle C/C++ ABI lowering and other open type systems very well). The LLVM model was motivated by not wanting to require tools like llvm-opt to link in target information, but that doesn’t really make sense, because the amount of target information required isn’t bounded (e.g. just grep for Target in include/llvm/IR), and because the higher-level codegen done by MLIR will want even more than LLVM (e.g. cache hierarchy information).

Have you considered an alternate approach based on the notion of having a declared target, which then provides something equivalent to (but much better designed than) TargetMachine? This would look something like:

module attributes { target.abi = "linux-x86_64-somethingorother" }

Clients would interface with an llvm::TargetMachine-like interface that exposes various query APIs (including one for basic data layout, a different one for memory hierarchy information, one for LLVM IR calling convention lowering, etc). All of these would be failable (if the right target info isn’t linked into the current binary), in which case the transformations should behave conservatively.

One aspect of this is that dialect types would register an interface that describes how they are lowered to more simple things. For example, the layout for cdialect.complex<int> would default to being lowered to two ints in a row. A memref would default to lowering to the { pointer, unknown dims ... } and would call recursively into the data layout lowering stuff to lower their subcomponents.

The nice thing about making the target-specific part of this imperative is that you can give a DataLayout API that works similarly to the LLVM approach, but which can be implemented in arbitrary ways by the target. If it wants to handle “cdialect.complex” different than “cdialect.struct<int, int>” that would be very simple.

The consequence of this would be three things:

  1. A nice simple API for clients to use, which we all want. This would have to be failable of course, both because the target info may not be linked in, but also because the data layout may not be specified.

  2. Dialect types would specify a default lowering in terms of how their components would be lowered by trivial legalization.

  3. The target would specify as much specificity as they want about the various domain problems they care about.

The nice thing about this is that the general approach scales to all the target-description problems (incl things like __builtin descriptions etc) that eventually are needed in a high level compiler like Clang.

In any case, we should probably talk more about this in an ODM sometime, I’m just curious if you’ve thought about it from this lens,

-Chris

Duplicate entries may indeed be something to verify generically. However, only the type (we can also consider the dialect containing the type) has knowledge about the data layout properties relevant for the type.

Yes. That’s what I mean in:

The wording may not be sufficiently clear. There is always an association between a type instance and properties within a data layout entry, so we will have something like #datalayout.entry<memref<f32>, {model = "bare"}>. There cannot be any conflicts because this specific attribute is used only for memrefs; other types use other attributes, not necessarily dictionaries.

Good question. There does not seem to be much we can do here. It is possible, for example, to also provide a hook that checks whether the layouts before and after are “compatible” and let specific types implement the hook as, e.g., it’s okay to decrease the alignment requirement but not increase it. That hook will have to be called by whoever performs the transformation, which may be just fiddling with the attribute list outside of any infrastructure we may want to attach this to, in order to ensure that the data layout after transformation is compatible with that before it.

I’ll start from the end. I have thought about it to some extent, but decided that I would prefer not to go too wide in scope before we have concrete use cases where the resulting design can be exercised. This doesn’t mean we should just ignore potential extensions, but the discussion could benefit from being focused on a single issue. We have hit the problems specific to data layout and composability repeatedly in the past couple of months, so we can immediately see how this would work for them.

Below are some specific ideas.

The approach the RFC discusses looks like it actually can scale to more complex things as long as they are related to types. Taking the cdialect.complex<int> example, it can have the attribute that describes specifically how it is lowered to the LLVM dialect, in addition to other things: #datalayout.entry<cdialect.complex<int>, {llvm_equiv = [!llvm.int, !llvm.int], alignment = 42}>.

Calling convention is a bit harder because it looks like we want it attached to operations rather than types. (Although we can also have it on the function type, this is a separate discussion). I suppose we could use something like “dialect-namespace . op-name” as key for related entries.

I would argue that declarative vs. imperative is mostly separable from expressivity. The proposal explicitly supports MLIR’s open type system and there is no restriction on the information associated with each type, so I’d like to see a specific example of expressivity limitation.

Note also that the proposal is not entirely declarative. The data layout attribute contains mostly the “configuration” for an otherwise fixed, imperative algorithm that computes the layout information based on that configuration. It feels this can also scale to cover most of the things you describe.

Let’s say we want the “default” MLIR pipeline (assuming there is one) for linux_x86_64, but also use std tuple for cdialect.complex, and use “bare pointer” convention for memrefs, and assume index is 32-bit for some reason. So we get {target.abi = "linux-x86_64-complex_is_tuple-memref_bare-index_32"}. Short of registering a new ABI for each combination of options, this starts to look like parsing a string to extract the same data I’d want to represent in a dedicated attribute.

I saw that the key is PointerUnion<Identifier, Type> and you have this example #datalayout.entry<"endianness", "big">, where the key does not seem to be a type?

Yes, these are intended for some generic layout/target properties that types can’t require to be present. I don’t anticipate there will be many of these. We can consider prefixing them with a dialect (with unprefixed being built-in, same as for types) since the dialect is the only thing with names unique in context.

Which dialect would you prefix the “endianness” example with?

It would be a built-in:

if I don’t modify the example. I think discussing where this specific property should live is slightly beyond the point of the current RFC. It does not propose this property, only the mechanism, which I agree needs more detail on prefixing in general.

The way I see it, this data layout problem can basically be described, in terms of mechanism, as “module-scoped parameterization of interfaces [that can be persisted in IR]”. The current proposal is applying this to a specific problem of determining bit size / alignment of types after lowering, but I don’t think that’s essential. We could use the same mechanism for parameterizing lowering to LLVM.

For example, instead of #datalayout.entry<memref<f32>, {model = "bare"}> it feels more like we should have

module attributes {
  datalayout.spec = #datalayout.spec_table<[#datalayout.entry...]>,
  llvm.spec = #llvm.spec_table<[
    #llvm.type_lowering_entry<memref<f32>, {model = "bare"}>
  ]>
}

Then, when you query for the bit size of memref, one of the strategies we have for resolving that is:

  • query LowerToLLVMTypeInterface, which requires passing in a #llvm.spec_table attribute.
  • In this case, memref would implement that interface and say "yes, I know how to lower myself to LLVM with {model = "bare"}"
  • then use BitSizeAndAlignmentTypeInterface (analogous to LLVM’s datalayout) to resolve the bit size and alignment of the new type
    • All LLVM types would implement BitSizeAndAlignmentTypeInterface, and that interface would take a #datalayout.spec_table attribute as an argument to configure themselves.
  • using the information above, compute the final bit size of memref<f32>

I don’t expect memref itself to implement BitSizeAndAlignmentTypeInterface. And LowerToLLVMTypeInterface is independently reusable for lowering like Chris wants. (the question of how memref will implement LowerToLLVMTypeInterface without polluting builtin with a dep on the llvm dialect is a separate question, but an important one…)

Ultimately, what we want is pretty simple, just interfaces that are efficiently parameterized by some sort of persistent IR annotation on the module (or perhaps a scoped set of modules). In this case, the #foo.spec_table custom attributes create efficient C++ data structures, which are mandatory arguments of the methods on the corresponding interfaces.

We don’t need to couple BitSizeAndAlignmentTypeInterface with LowerToLLVMTypeInterface or make them part of some overarching target abstraction.

I think the important thing here is that each interface is likely to have a separate set of ways in which it is parameterized, so module attributes { target.abi = "linux-x86_64-somethingorother" } is insufficient, or at least needs to be layered on top of the individual attributes that configure each interface / dialect we are going to be using in the pipeline.

It seems to me that this is showing a coupling between the memref type properties inside the system (the bit size) and the lowering strategy, which is hard-coded here.
In such case I would rather have the datalayout encode the bit size directly, and have the frontend or whoever sets up the pipeline populate the data layout accordingly.
For example if you build your compiler pipeline with a bare lowering, you could populate the datalayout from the LowerToLLVMTypeInterface.
The point is that this is all resolved ahead of time, and not on the fly by the client analyses/transformations which really shouldn’t have to know about all the possible lowering interfaces.

I think we need on-the-fly. For example, in the non-bare convention, a memref can have many possible ranks and element types, you can’t just resolve all possible ones ahead of time. I can’t see how, without a callback into the memref type, one could compactly encapsulate knowledge like memref<?xf32> has the same size as memref<4xf32> and memref<?xi32>, but different from memref<?x?xf32>.

What I was saying about having BitSizeAndAlignmentTypeInterface query LowerToLLVMTypeInterface implies no coupling between them. We just don’t currently have a way to do it without coupling them. It’s conceptually simple: a pipeline author configures BitSizeAndAlignmentTypeInterface with a “fallback” or “extension” (for lack of a better word) that is just a class with some virtual methods that can be consulted as fallbacks if a type doesn’t implement the interface itself.
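
To make the idea concrete, here is a rough sketch of an interface lookup with pipeline-configured fallbacks. It is purely illustrative and not an existing MLIR facility: BitSizeRegistry, implement and addFallback are made-up names, and types are identified by strings to keep the example self-contained.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <map>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// A fallback answers the query for some types and declines for the rest.
using SizeFallback =
    std::function<std::optional<uint64_t>(const std::string &)>;

// Hypothetical registry modeling "type implements the interface directly,
// otherwise consult the fallbacks configured by the pipeline author".
class BitSizeRegistry {
public:
  // Models a type implementing the interface itself.
  void implement(const std::string &type, uint64_t sizeInBits) {
    direct[type] = sizeInBits;
  }

  // Models an externally injected fallback, e.g. one that first lowers the
  // type to something else and then asks for the size of the result.
  void addFallback(SizeFallback fallback) {
    fallbacks.push_back(std::move(fallback));
  }

  // Direct implementations take precedence; fallbacks are tried in
  // registration order; the query fails if nobody can answer.
  std::optional<uint64_t> getSizeInBits(const std::string &type) const {
    auto it = direct.find(type);
    if (it != direct.end())
      return it->second;
    for (const SizeFallback &fallback : fallbacks)
      if (std::optional<uint64_t> size = fallback(type))
        return size;
    return std::nullopt;
  }

private:
  std::map<std::string, uint64_t> direct;
  std::vector<SizeFallback> fallbacks;
};
```

The two interfaces stay decoupled in this sketch: the registry knows nothing about how a fallback computes its answer, only that it may be consulted.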

This is general functionality that would be useful to upgrade our interface system anyway. It seems like a similar mechanism would allow the LLVM dialect to inject a way of resolving LowerToLLVMTypeInterface for builtin types without requiring builtin to depend on llvm.

This is analogous to a problem that arises in programming language design with “trait”-like systems. Using Rust as an example that I’m familiar with, you can say “If a type implements this other trait, then the type implements my trait too”. You can also implement your traits on the builtin types without changing the builtin types. MLIR doesn’t have that kind of flexibility, and I think it’s going to be hard to scale our trait ecosystem without that.

I’m missing what this actually means and how this is relevant: we can’t know this at compile time.

The proposal is about MLIR types carrying relevant data layout information. What’s captured in the spec shouldn’t be concerned about downstream lowering to LLVM except of course being guided by how it was designed in LLVM for reference.

This looks great to me! A few comments.

  1. I think the RFC will benefit from a paragraph (right before or after “Requirements”) on what information we’d like to see captured in the first place as part of the data layout, both to start with and going forward in the near future. E.g., you have size, alignment (required and preferred), endianness, and something custom like model for memrefs for now.

  2. What happens if the data layout align attribute says 8 for i64, but an alloc on a memref<i64> op says alignment = 64 : i64 or alignment = 16 : i64?

  3. What about alignment info for vector types as part of data layout and how is that reconciled with the alignment info for the elemental types? Similarly, memref types?

  4. On a very minor note, consider renaming: abi -> align_abi / alignment_abi, preferred -> align_prefer / alignment_preferred.

Isn’t the key for a dictionary attribute always required to be an identifier? How does one encode the type here without being able to create a (dummy) constant of that type? Do specific hardcoded names map to specific types? E.g., memref<i32> vs memref<f32>.

This is a really nice topic, I’d propose we spend some time brainstorming more on this during the ODM tomorrow?

SGTM

I think this is an interesting extension, but I don’t really see what kind of infrastructural support it would require. The only common thing between such parameterized interfaces seems to be the scoped lookup. It looks very straightforward to implement, about a dozen LoC, and different interfaces may want different “compose” rules. Unless we have many such interfaces, it looks easier to just write the lookup for each of them.

Whether we want a LowerToLLVMTypeInterface is mostly an orthogonal discussion IMO. The memref “convention” is a quirk we have to live with because MLIR doesn’t have a way to annotate non-aliasing function arguments and translate that information to the LLVM IR. Once it does, the “convention” will hopefully disappear.

Why not?

And a side question, how do we do memref-of-memref or memref-of-custom-type?

We can’t know the size of the data the memref points to, but we can know the size of the memref object itself, e.g., sizeof(pointer).

This is a good point. I suppose any information that is relevant to answering the queries exposed by the interface (size, minimum alignment, preferred alignment).

Alignment requirements are implicitly “minimal”, so the final alignment is the least common multiple of the required alignments, i.e. the largest one if we assume power-of-two alignments only. If something is required to be aligned at 8, and happens to be also aligned at 64, it’s not a big deal. Same here: the attribute required 8 but the op gave it 64, so 8 is still respected.

Up to the type definition in both cases. For vectors, we may consider having flags/enums in the attribute that says how to treat them, e.g. same alignment as elements, different explicitly specified alignment, power-of-two-closest-to-num-elements times element alignment, etc. For memrefs, I’d go for same alignment as pointer/index (note that this does not specify how the data is aligned, only the memref itself).

The top-level attribute is not a dictionary, but a list of custom attributes, each of which is conceptually a key-value pair.

TypeAttr

These are types. Entries for both memref<i32> and memref<f32> (as well as any other MemRefType) will be sent to MemRefType::getSizeInBits(ArrayRef<DataLayoutEntryAttr>). It’s up to the type how to interpret that. For memrefs specifically, I am thinking of only allowing one entry, regardless of the actual type, because they don’t need to change depending on element type.

Thank you for the great discussion today @ftynse. A conversation can be much higher bandwidth than forum posts sometimes, but I still miss whiteboards :slight_smile:

-Chris

I was thinking in terms of keeping the information normalized – we need something that knows how to lower the memref into more primitive types, and so we should be able to infer the BitSizeAndAlignmentTypeInterface from that. However, I think that Chris provided some experience in today’s talk that some amount of denormalization here is useful, which made sense to me. Thus, I have changed my mind about that statement (or at least am on the fence / not feeling very strongly about it).

Thanks for driving this, @ftynse! It looks very promising!

A few comments on the “bare” memrefs:

  1. Annotating non-aliasing function arguments (temporary workaround) is not the only use case. Another use case is to provide an alternative lowering for targets that are not able to deal with the “complexity” of the default memref descriptor. Another one is to model invocations of arbitrary functions from external libraries which take bare pointers as arguments instead of memref descriptors.
  2. As of today, the bare pointer calling convention only impacts memrefs at the boundaries of a call/function, not all the memrefs. Therefore, the memref lowering to a bare pointer is not a generic type property right now. It depends on the operation in which the memref is being used. This was implemented like this to minimize the customization impact on the LLVM lowering. We could generalize this to apply to all the types, if needed, and have a much better/cleaner implementation if we could decouple the memref lowering from the LLVM lowering itself, as we discussed in the past. This would be great!
  3. To decouple the implementation, as you suggested in the past, we would need a way to represent pointers before the LLVM dialect and add operations to extract/insert pointers from/into a memref. However, this would need a separate discussion.

I hope this clarifies the situation.