[RFC] Re-enable zero-sized dim tensors for some TOSA operators

Why?

Internally at Cruise, we’ve been using empty tensors created with TOSA when mapping to kernels that have toggle-able features. For example, consider a bias addition for a kernel that has biased and non-biased variants.

Roughly speaking we have some IR:

%0 = "tosa.const"() <{value = dense<> : tensor<0xi32>}> : () -> tensor<0xi32>
%2 = tcp.custom_op("kernel") %0 %other : tensor<0xi32>, tensor<32x32xi32> -> tensor<32x32xi32>

After several lowering passes, it becomes:

%0 = cruise.allocate_buffer() : !cruise.buffer<0xi32>
%ptr = cruise.get_buffer_ptr(%0) : !llvm.ptr
%2 = llvm.call @kernel(%ptr, %other) : (!llvm.ptr, !llvm.ptr) -> i1

The kernel checks whether the tensor is empty to determine whether a particular feature (like bias) is required.

While zero-sized tensors don’t make sense for most TOSA operators, we do have tensor.empty, which supports creating a 0-sized tensor. Should tosa.const also allow creating a 0-sized tensor in a similar manner?
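
For reference, a minimal sketch of what tensor.empty already permits today (the i32 element type here is just for illustration):

// tensor.empty allows a static zero-sized dimension in its result type.
%e = tensor.empty() : tensor<0xi32>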

This topic came up during the TOSA 1.0 design review. At the time, a decision was made to disable support for 0-dim tensors. The reasoning is fairly straightforward: it was unclear how this construct could fit into the functional description of any operator.

How would the construction of this be bounded? Is it only required for a const or input feeding a non-TOSA construct such as a custom op invocation? In other words, if a tosa.const with this were ever connected to another TOSA op, can the verifier of the latter legitimately flag invalidity?

This topic came up during the TOSA 1.0 design review. At the time, a decision was made to disable support for 0-dim tensors. The reasoning is fairly straightforward: it was unclear how this construct could fit into the functional description of any operator.

Makes sense. At least to me, tosa.const is the only TOSA operator where a zero-dim tensor could have any useful semantics.

How would the construction of this be bounded? Is it only required for a const or input feeding a non-TOSA construct such as a custom op invocation? In other words, if a tosa.const with this were ever connected to another TOSA op, can the verifier of the latter legitimately flag invalidity?

In our Cruise use case, all instances involve a tosa.const feeding a non-TOSA construct, specifically custom op invocations. If a zero-sized-dim tosa.const result is connected to a TOSA op, that op’s verifier should fail. Looking at the implementation, if I’m not mistaken, this change would only loosen the type constraint on the tosa.const result (TensorOf rather than TosaTensorOf). The verifiers for inputs to other TOSA ops would remain the same, so all non-tosa.const zero-sized-dim verifications would remain in place.
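
To make the intended behavior concrete, here is a hypothetical sketch (the tosa.add below is only an illustration of a TOSA consumer whose operand verifier would still reject zero-sized dims):

// Intended to remain legal: the zero-sized const feeds only a non-TOSA custom op.
%empty = "tosa.const"() <{value = dense<> : tensor<0xi32>}> : () -> tensor<0xi32>
%ok = tcp.custom_op("kernel") %empty %other : tensor<0xi32>, tensor<32x32xi32> -> tensor<32x32xi32>

// Intended to stay illegal: the consuming TOSA op's operand verifier still
// rejects zero-sized dims, so this would fail verification.
%bad = tosa.add %empty, %empty : (tensor<0xi32>, tensor<0xi32>) -> tensor<0xi32>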

That’s probably too unconstrained. You can leverage existing capabilities in TosaTypesBase.td to define a variant of TosaTensorOf that does not have HasNo0Dimensions, and use that as the output type of tosa::ConstOp.

I am not a contributor to TOSA, nor would I claim any expertise on it, but I really think that for the rest of the ecosystem (tensor/linalg/memref), zero-sized tensors and memrefs are quite a footgun. Also, stepping back, it is not clear to me that a zero-sized tensor actually captures anything meaningful about the computation, and it has the smell of a compiler-introduced footgun. If we can, we should really try to cut off such paths.

Indeed.
“0 dim? What does that even do?”: me, the first time we faceplanted on a unit test that had this. I have no idea how <Nx0xMxDType> would allocate either.

This is the current TOSA choice.

Also, stepping back, it is not clear to me that a zero-sized tensor actually captures anything meaningful about the computation, and it has the smell of a compiler-introduced footgun

In our case, we have some Python tooling (being intentionally vague, as it’s closed source) for mapping to custom C++ kernels, and we use an empty tensor as a toggle for optional features. For example, if a kernel optionally applies a bias, a non-empty bias tensor enables the addition, while a zero-sized tensor disables it. This pattern relies specifically on zero-sized tensors to indicate that an optional feature should be skipped.

Concretely, in Python we might have something like:

empty_optional = torch.zeros(0) 
kernel(other, empty_optional)

which lowers to MLIR like:

%empty_optional = "tosa.const"() <{value = dense<> : tensor<0xi32>}> : () -> tensor<0xi32>
%2 = tcp.custom_op("kernel") %other %empty_optional : tensor<32x32xi32>, tensor<0xi32> -> tensor<32x32xi32>

It may be a footgun :sweat_smile:, but it’s not a compiler-introduced footgun in this case; the user wrote that program.

I have no idea how <Nx0xMxDType> would allocate either

Given IR like the following:

%0 = cruise.allocate_buffer() : !cruise.buffer<0xi32>
%ptr = cruise.get_buffer_ptr(%0) : !llvm.ptr
%2 = llvm.call @kernel(%ptr, %other) : (!llvm.ptr, !llvm.ptr) -> i1

our runtime passes @kernel a pointer to a C++ Tensor type with a .empty() method.

if (!bias.empty()) {
  // add bias
}

so at the runtime level the allocation just creates an empty container type, like a vector<int> a; in C++.


This is a very odd use case, and the “emptiness semantics” we’re trying to achieve might run counter to TOSA’s intended design. If so, I’m happy to close this and look into alternatives.