# TensorList as tensor of tensors?

If MLIR allowed a tensor of tensors, could TensorLists simply fall out of that?

List of tensors can be

```
tensor<5xtensor<2x3xf32>>
```

or

```
tensor<?xtensor<2x3xf32>>
```

List of scalars can be

```
tensor<?xtensor<f32>>
```

List of list of tensors can be

```
tensor<?xtensor<?xtensor<2x3xf32>>>
```

Shape and type inference would also flow through nicely.

Currently tensors cannot have another tensor as an elementType.

But is this something we could do? (Vector of tensors would make sense too I guess but we mostly deal with tensors)

Something unclear to me is: what is the difference between `tensor<5xtensor<2x3xf32>>` and `tensor<5x2x3xf32>`?
(or rather why do we need to differentiate / when does it matter?)

Good question, let me add context. When one adds control flow, even just by using `scf.while`, then in the gradient of the while loop you want to use the intermediates from each iteration of the forward pass.

So that is essentially a stack, which is done with a stateless TensorList: push in the forward while and pop in the gradient while (which reverses the order nicely).

So while

```
tensor<5x2x3xf32>
```

is a tensor with an extra dimension, a TensorList is a list of tensors of type `tensor<2x3xf32>` which one wants to push to and pop from.

You could work with concat and slice, at the cost of many copies, but a list allows you to just pass around pointers and avoid a whole bunch of copies at runtime…

In summary, it is a different semantic structure: it clarifies what the “layout” of the objects/data is. One is a collection of tensors; the other is an extra dimension, and thus a “1 bigger tensor”.
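The push-in-forward, pop-in-gradient pattern described above can be sketched in plain Python. This is only an illustration under stated assumptions: scalars stand in for tensors, a Python `list` stands in for the TensorList, and the loop body (repeated squaring) and the names `forward` and `backward` are hypothetical, not taken from any MLIR dialect.

```python
# Illustrative sketch: scalars stand in for tensors, and a Python list
# stands in for the TensorList used as a stack of intermediates.

def forward(x, n):
    """Square x n times, saving the input of every iteration."""
    stack = []
    for _ in range(n):
        stack.append(x)  # push the intermediate needed by the gradient
        x = x * x
    return x, stack


def backward(stack, grad_out=1.0):
    """Walk the iterations in reverse by popping the saved intermediates."""
    grad = grad_out
    while stack:
        x = stack.pop()        # last pushed, first popped
        grad = grad * 2.0 * x  # d(x*x)/dx = 2x
    return grad


y, saved = forward(3.0, 2)      # y = 3 ** 4
dy_dx = backward(list(saved))   # copy so `saved` stays inspectable
```

Popping visits the intermediates in the reverse of the order in which they were pushed, which is exactly the order the gradient loop needs.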

That makes sense, but I’m questioning why the right type to model this is then “tensor” and not “tensor_list” or something like that.

It’s a little bit hard to reason about this in the abstract. In TF for example, the operations defined for tensorlists make it pretty clear that it is more list-like than tensor-like: in TF 2.0 eager mode, in fact, it is just implemented as a Python `list`. There are various access patterns on them that can/should be canonicalized down to simple (non-nested) tensors.

In IREE, where we have taken opinions on runtime semantics, we lower all such things coming from TF to a real `list` type containing ref-counted, allocated arrays. There is not enough machinery upstream to represent such things cleanly, and pushing it further would put upstream firmly in the position of taking opinions that it currently avoids (not saying it shouldn’t get there, but that it is not a simple gap: it forces a number of other design decisions).

Right, we can have a special tensor_list type, but what we really wanted was just a tensor of tensors: a tensor is a collection of elements, the element here being a tensor.

Nested loops are going to have lists of lists for intermediates… It is recursive at that point. Extending the tensor type just seemed like a clean idea to some of us, except MLIR has validation code forbidding tensors of tensors completely.

It is a question of reusing existing types versus making a new tensor_list type. Currently MLIR doesn’t have a canonical solution for this space. Maybe that is on purpose, but gradients of control flow have to be solved at some point anyway.

The TF dialect played some games with the `opaque` type and `tf.variant`, but it kinda seems unintuitive TBH: it completely drops types for more than 1 level of recursion and somehow mixes optionals and lists (it might just be us not getting it).

Tensor of tensors is just very clear in what it means and a lot of the code for shapes just carries forward.

Agreed, whether it be tensor of tensors or a tensor_list type, something should be done in MLIR (builtin types?) for this, it is a common problem which needs a standard solution sooner or later.

I also agree a tensor of tensors (or a list) is essentially a set of refcounted tensors under the hood; that can be a runtime/lowering detail left up to the lower dialects or runtimes.

Probably if tensor of tensors were legal, everyone could just use those whenever lists happened, unless we are missing something.

You still haven’t mentioned which environment you are coming from. Is this one of the ML frameworks, something custom, etc.? To my knowledge, they all do this differently and runtimes also all have their own mechanisms. We were unable to find a universal mechanism to reduce them all to and just chose to embrace the weird that exists.

Back when I used to work closely on this aspect of TensorFlow, we had discussed doing away with the variant-based type erasure and just giving it a real tensor list type in the IR, but it is still a very TensorFlow-specific topic and would belong with that project (some of the ops and rules it implements are really quite weird and defy a completely generic implementation that would be used anywhere else). IREE does some type inference to get it into this form, but there was never any consensus to actually change TensorFlow at the source, and since the topic is moot on the others, we stopped putting energy into it.

Right now llvm/mlir doesn’t really have a “north star” ML integrations project which tries to model such frontend and runtime characteristics. Perhaps it should, but it isn’t entirely clear what opinions it would take on the esoteric parts like this.

The dialect, “for discussion’s sake”, would be closest to something between the TF dialect and the TOSA/HLO dialects, with a bunch of ML ops like conv2D, pool, etc., along with the basic math ops which work on tensors of course.

We are probably actually looking for that same “north star”; whatever is missing we add for ourselves and try to upstream when possible.

We don’t want to change TF; we just want to represent lists for ourselves. There seem to be 2 ways (that we know of): have a tensor of tensors, or have a list type.

Since tensors of tensors are illegal we will probably be forced to do the latter, but we just wanted to ask whether the former might be nicer and could be made legal. I haven’t heard from anyone why tensor of tensors might be bad, apart from the work involved…

It isn’t clear to me that a list is the same as a “tensor of tensors”. For example, wouldn’t a list have a dynamic size? You mentioned before wanting to push and pop.

Then there is the question of uniformity: would `tensor<2xtensor<?xf32>>` guarantee that the inner tensors have the same size?

OK, thanks. I didn’t mean to pry but was just wondering if it was a case I had already studied at some point.

I kind of agree with Mehdi below on the nested tensor semantics questions. When I’ve seen this modeled before with just tensors, it has been really restrictive and complicated and not been a great match to the source language – kind of one of those cases that is really calling for a real type and supporting ops to capture what is desired. Your case may be different, but sounds similar to ones I’ve struggled through semantic mismatches on.

I’d introduce a list type, supporting ops, and folding/simplifications to simple dense tensors for conforming cases. I don’t know if such a thing belongs upstream (i.e. in the tensor dialect), but it might – especially if being designed fresh/cleanly. The other examples we’ve had all came with a lot of historical baggage and were never really defined to a level of fidelity that we expect for the core MLIR project.
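To make “folding to simple dense tensors for conforming cases” concrete, here is a rough Python sketch. It assumes 1-D tensors modeled as Python lists; the helper name `try_fold_to_dense` is hypothetical and does not correspond to any existing MLIR pass or op.

```python
from typing import List, Optional

Tensor1D = List[float]  # stand-in for a rank-1 tensor


def try_fold_to_dense(tensor_list: List[Tensor1D]) -> Optional[List[Tensor1D]]:
    """Return an n x k dense layout if every element has the same length.

    Returns None when the list is empty or ragged, i.e. when it does not
    conform and has to stay a real list.
    """
    if not tensor_list:
        return None
    first_len = len(tensor_list[0])
    if any(len(t) != first_len for t in tensor_list):
        return None
    return [list(t) for t in tensor_list]


uniform = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
ragged = [[1.0], [1.0, 2.0]]

dense = try_fold_to_dense(uniform)     # conforming: a 3x2 dense value
not_dense = try_fold_to_dense(ragged)  # non-conforming: stays a list
```

A real simplification would work on static type information rather than runtime values, but the conformance check is the same idea.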

Great question again, and to that I point to the history of TF again: they used to have stateful TensorArrays with dynamic sizes, but then they moved to stateless TensorLists.

When one pushes to or pops from these lists (variant based), one gets back a new list with a new size. So semantically it is NOT a dynamically sized list; each list value literally has a fixed size.

The TF runtime notices that the input list is never used again and then mutates the list in place; that is a lower-level dialect or runtime optimization. Thus a tensor of tensor objects can be a valid representation.
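A minimal sketch of these stateless semantics, assuming immutable Python tuples as stand-ins for both the list and its element tensors. The names `push_back` and `pop_back` are illustrative and are not TensorFlow's actual API.

```python
# Illustrative sketch: every operation returns a NEW list value, so each
# value has a fixed size and the input is never modified.

def push_back(lst, item):
    """Return a new list value that is one element longer."""
    return lst + (item,)


def pop_back(lst):
    """Return (new shorter list value, popped element)."""
    if not lst:
        raise IndexError("pop from an empty list")
    return lst[:-1], lst[-1]


l0 = ()
l1 = push_back(l0, (1.0, 2.0))
l2 = push_back(l1, (3.0, 4.0))
l3, top = pop_back(l2)
```

Here `l1` is still a valid one-element list after `l2` is created. An in-place update is only safe as an optimization when the compiler or runtime can prove the old value is dead.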

As to the question of uniformity:

It is uniform in that every element of `tensor<2xtensor<?xf32>>` has type `tensor<?xf32>`.

Beyond that, it can be a tensor with 2 tensors in it where the 2 tensors CAN have 2 different sizes (same rank though). This is absolutely necessary because it is an actual use case: imagine a 2-iteration while loop with a concat going on inside it. The intermediates emitted for use in the gradient pass and added to this list will have types `tensor<1xf32>` and `tensor<2xf32>`.

This is actually something missing in the scf dialect type validation we pointed out in previous posts and are trying to upstream a potential solution for.
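The two-iteration concat loop can be sketched as follows. It assumes Python lists as stand-ins for rank-1 tensors and a hypothetical loop body that appends one element per iteration.

```python
# Illustrative sketch: the loop-carried value grows by one element per
# iteration, so the saved intermediates have different sizes.

def forward_concat_loop(x, iterations):
    saved = []
    for _ in range(iterations):
        saved.append(list(x))  # intermediate kept for the gradient pass
        x = x + [1.0]          # concat: the carried value changes size
    return x, saved


result, saved = forward_concat_loop([0.0], 2)
shapes = [len(t) for t in saved]  # sizes of the saved intermediates
```

The saved elements share a rank but not a size, so they cannot be stacked into a single dense tensor without padding.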

So to your point, we don’t intend to match any source language here; rather, we want the source to follow what is easy in MLIR, so in our specific case we have no baggage.

Given that, if we had a tensor of tensors and the flexibility to do the “right thing” in the API, what would block tensor of tensors from being a valid type in MLIR?

The problem with a list type is that I can’t see (especially for a stateless list) how it is actually different from a tensor of tensors… and I would prefer reusing existing concepts to making a new one.

I am not married to tensor of tensors. I am happy to get a reason why it is a bad idea so I may live in peace and focus on a specific list type.

I’m not married one way or another either, and I’m also not the best person to reason through the implications/intents behind the built-in tensor type (some other folks do have opinions on that and it is probably best to wait for non-weekend hours for a real discussion).

My main visibility into the usage comes from what we do with such types during lowering. There is always the TensorFlow runtime approach, which uses a lot of runtime sleight of hand to allow its tensor types to represent anything. Coming more from a compiler-oriented system, one of the first things we do is separate such types depending on whether they contain things with value semantics or reference semantics in our setup: for a high performance implementation, there is very little overlap between those worlds and most of the machinery is dedicated to the containers that hold value/non-ref components, with the ref-containing containers all being simplified down to some variant of flat lists with runtime support for managing the ref-counts.

From that perspective, so long as the choices in the type system allow us to write transformations/canonicalizations which flatten everything that can be flattened, and so long as we have unambiguous ways to separate the worlds during lowering between value/ref-containing tensors and perform other transformations, such as realizing a mutable list-of-array-refs from the original program – it is fine with me. I will note that other frontends like PyTorch and various NumPy-oriented things just take the easy path and either a) use the mutability of their tensors and differentiation model to express these kinds of computations, or b) just use a real mutable list type in the frontend.

My experience with TensorFlow in this area has been very non-positive, and especially with seeing other options, my baggage is that I tend to bias towards a more complete frontend type system at this level vs an ever expanding definition of `tensor` into things that actually don’t have anything to do with the primary goal of numerical optimizations of tensors. I may be over-correcting.

A couple of heuristics I’ve used in the past to determine whether things “fit” in the `tensor` type:

• How would one express constants of the expanded type, and does that work with the way the attributes are modeled?
• Is this a job for a dialect type – either a domain-specific replacement for `TensorType` or a dialect type for the element type (tensors may contain arbitrary dialect types, where it is then assumed that the dialect has defined the semantics)?
• Is the expanded type congruent with the decisions made in various bufferization approaches for realizing concrete buffers for tensors (or is this defining more of a type island)?

Does MLIR have a `ragged_tensor` type? Perhaps tensor lists could be modeled as ragged tensors? [edit: was `ragged_tensor<?xtensor<?xf32>`, but I did not mean to be that specific]

No, it doesn’t – and I’m not aware of any MLIR-based frontends that have modeled that in their IR either.

You definitely have some great points there, seems we just have to go with a custom List type for now, but I think it might be worth consideration to model lists in core MLIR somehow.

This has come up a few times in the things I have visibility into, and it is surprisingly hard to model something abstract that is both generically useful and not tightly coupled to source or target concepts that MLIR core does not currently take opinions on (i.e. memory management semantics, mutability, value vs ref, ownership, heterogeneous/homogeneous, etc).

Some of the existing items:

• (npcomp) `basicpy.list` : decidedly quite “python” (also differentiated from `tuple`) when considering the operations defined for it. Currently standing in for TorchScript lists as well (but needs to be extended with type constraints).
• (iree) `vm.list` : Models the IREE virtual machine’s built-in list type with ops for manipulating it. A subset of the functionality is also mirrored on the `iree.list` type, which is the “public” analog to `vm.list` (i.e. suitable for interop with IREE from the outside). This is type erased at runtime (variant), mutable, resizable, and able to store primitive or reference-counted VM objects. It crashes on illegal accesses.
• (iree) `tf_tensorlist.list` : Attempts to model a TensorFlow TensorList as a discrete type following some type inference to raise it from `tf` `tensor`/`variant` types. When compiling we lower this form to something like IREE’s `vm.list` (but which predated `vm.list` and we are working to normalize things).

The last one may be somewhat like what you are looking for, but to my eye it is quite domain specific and isn’t really a “universal” list type. It is very different both in level and capabilities from the two others listed here. We’d be open to contributing `tf_tensorlist.list` somewhere more useful, but discussions with the TF team to normalize any of this didn’t go anywhere and we needed something. That was a while ago, though, and there may be different results now…

In the absence of a real universally useful abstraction, it is perfectly fine to have domain specific types and ops – in fact, it is even preferred if there is enough “weird” that needs to be modeled such that a more common form would lose information.


Ignoring the question of how best to model the real use cases (like a list-like thing holding tensors that are necessary for backprop, which is a valid use case)…

I would support removing the “no tensors of tensors” verifier restriction if that happens to help anyone (not saying it’s a good approach). We allow arbitrary dialect-specific types, which clearly indicates that there are no real requirements or rationale for restricting the contained type. E.g. a user could implement a `!mydialect.tensor_element_type_wrapper<tensor<?xf32>>` as the element type to fool this verification – let’s just not have the constraint at all.

Or to put it another way: if the semantics of `tensor` are sufficiently broad to permit `!mydialect.tensor_element_type_wrapper<tensor<?xf32>>` as an element type, then clearly they must be broad enough to permit tensor itself as an element type.
