[RFC] Re-enable zero-sized dim tensors for some TOSA operators

Why?

Internally at Cruise, we’ve been using empty tensors created with TOSA when mapping to kernels that have toggle-able features. For example, consider a bias addition for a kernel that has biased and non-biased variants.

Roughly speaking we have some IR:

%0 = "tosa.const"() <{value = dense<> : tensor<0xi32>}> : () -> tensor<0xi32>
%2 = tcp.custom_op("kernel") %0 %other : tensor<0xi32>, tensor<32x32xi32> -> tensor<32x32xi32>

After several lowering passes, it becomes:

%0 = cruise.allocate_buffer() : !cruise.buffer<0xi32>
%ptr = cruise.get_buffer_ptr(%0) : !llvm.ptr
%2 = llvm.call @kernel(%ptr, %other) : (!llvm.ptr, !llvm.ptr) -> i1

The kernel checks whether the tensor is empty to determine whether a particular feature (like bias) is required.

While zero-sized tensors don’t make sense for most TOSA operators, we do have tensor.empty, which supports creating a 0-sized tensor. Should tosa.const also allow creating a 0-sized tensor in a similar manner?
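
For reference, a minimal sketch of what tensor.empty already permits today (the i32 element type here is just for illustration):

// tensor.empty allows a static zero-sized dimension in its result type.
%e = tensor.empty() : tensor<0xi32>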

This topic came up during the TOSA 1.0 design review. At the time, a decision was made to disable support for 0-dim tensors. The reasoning is fairly straightforward: it was unclear how this construct could fit into the functional description of any operator.

How would the construction of this be bounded? Is it only required for a const or input feeding a non-TOSA construct such as a custom op invocation? In other words, if a tosa.const with this were ever connected to another TOSA op, can the verifier of the latter legitimately flag invalidity?

This topic came up during the TOSA 1.0 design review. At the time, a decision was made to disable support for 0-dim tensors. The reasoning is fairly straightforward: it was unclear how this construct could fit into the functional description of any operator.

Makes sense. At least to me, tosa.const is the only TOSA operator where a zero-dim tensor could have any useful semantics.

How would the construction of this be bounded? Is it only required for a const or input feeding a non-TOSA construct such as a custom op invocation? In other words, if a tosa.const with this were ever connected to another TOSA op, can the verifier of the latter legitimately flag invalidity?

In our Cruise use case, all instances involve a tosa.const feeding a non-TOSA construct, specifically custom op invocations. If a zero-sized-dim tosa.const result is connected to a TOSA op, that op’s verifier should fail. Looking at the implementation, if I’m not mistaken, this change would only loosen the type constraint on the tosa.const result (TensorOf rather than TosaTensorOf). The verifiers for inputs to other TOSA ops would remain the same, so all non-tosa.const zero-sized-dim verifications would remain in place.
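
To make the intended behavior concrete, here is a hypothetical sketch (the tosa.add below is only an illustration of a TOSA consumer whose operand verifier would still reject zero-sized dims):

// Intended to remain legal: the zero-sized const feeds only a non-TOSA custom op.
%empty = "tosa.const"() <{value = dense<> : tensor<0xi32>}> : () -> tensor<0xi32>
%ok = tcp.custom_op("kernel") %empty %other : tensor<0xi32>, tensor<32x32xi32> -> tensor<32x32xi32>

// Intended to stay illegal: the consuming TOSA op's operand verifier still
// rejects zero-sized dims, so this would fail verification.
%bad = tosa.add %empty, %empty : (tensor<0xi32>, tensor<0xi32>) -> tensor<0xi32>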

That’s probably too unconstrained. You can leverage existing capabilities in TosaTypesBase.td to define a variant of TosaTensorOf that does not have HasNo0Dimensions, and use that as the output type of tosa::ConstOp.

I am not a contributor to TOSA, nor would I claim any expertise on it, but I really think that for the rest of the ecosystem (tensor/linalg/memref), zero-sized tensors and memrefs are quite a footgun. Also, stepping back, it is not clear to me that a zero-sized tensor actually captures anything meaningful about the computation, and it has the smell of a compiler-introduced footgun. If we can, we should really try to cut off such paths.

Indeed.
“0 dim? What does that even do?”: me, the first time we faceplanted on a unit test that had this. I have no idea how <Nx0xMxDType> would allocate either.

This is the current TOSA choice.

Also, stepping back, it is not clear to me that a zero-sized tensor actually captures anything meaningful about the computation, and it has the smell of a compiler-introduced footgun

In our case, we have some Python tooling (being intentionally vague, as it’s closed source) for mapping to custom C++ kernels, and we use an empty tensor as a toggle for optional features. For example, if a kernel optionally applies a bias, a non-empty bias tensor enables the addition, while a zero-sized tensor disables it. This pattern relies specifically on zero-sized tensors to indicate that an optional feature should be skipped.

Concretely, in Python we might have something like:

empty_optional = torch.zeros(0) 
kernel(other, empty_optional)

which lowers to MLIR like:

%empty_optional = "tosa.const"() <{value = dense<> : tensor<0xi32>}> : () -> tensor<0xi32>
%2 = tcp.custom_op("kernel") %other %empty_optional : tensor<32x32xi32>, tensor<0xi32> -> tensor<32x32xi32>

It may be a footgun :sweat_smile:, but it’s not a compiler-introduced footgun in this case; the user wrote that program.

I have no idea how <Nx0xMxDType> would allocate either

Given IR like the following:

%0 = cruise.allocate_buffer() : !cruise.buffer<0xi32>
%ptr = cruise.get_buffer_ptr(%0) : !llvm.ptr
%2 = llvm.call @kernel(%ptr, %other) : (!llvm.ptr, !llvm.ptr) -> i1

our runtime passes @kernel a pointer to a C++ Tensor type with a .empty() method.

if (!bias.empty()) {
  // add bias
}

so at the runtime level the allocation just creates an empty container type, like a vector<int> a; in C++.


This is a very odd use case, and the “emptiness semantics” we’re trying to achieve might run counter to TOSA’s intended design. If so, I’m happy to close this and look into alternatives.