Failed to run tensor.insert in scf.for loop

taiqzheng · April 21, 2024, 1:05pm

I’m using tensor.insert to update tensor values, but lowering fails with "failed to legalize operation 'builtin.unrealized_conversion_cast' that was explicitly marked illegal".

Here’s the code snippet (which updates three elements of the initial tensor):

func.func @example() -> (tensor<4xf64>) {
%c0 = arith.constant 0 : index
%c1 = arith.constant 1 : index
%c3 = arith.constant 3 : index
%f0 = arith.constant 0.000000e+00 : f64

%num = arith.constant dense<[37.62, 15.43, -2.401]> : tensor<3xf64>
%index = arith.constant dense<[0, 2, 3]> : tensor<3xindex>

%splat = tensor.splat %f0 : tensor<4xf64>
%newTensor = scf.for %iv = %c0 to %c3 step %c1 
    iter_args(%tmpTensor = %splat) -> (tensor<4xf64>){
  %scalar = tensor.extract %num[%iv] : tensor<3xf64>
  %idx = tensor.extract %index[%iv] : tensor<3xindex>
  %nextTensor = tensor.insert %scalar into %tmpTensor[%idx] : tensor<4xf64>
  scf.yield %nextTensor : tensor<4xf64>
}
return %newTensor : tensor<4xf64>

}

The lowering passes I use:

-tensor-bufferize -arith-expand -linalg-bufferize -tensor-bufferize -convert-linalg-to-loops -func-bufferize -arith-bufferize -convert-scf-to-cf -expand-strided-metadata -memref-expand -arith-expand -convert-cf-to-llvm -convert-arith-to-llvm -finalize-memref-to-llvm -convert-func-to-llvm -reconcile-unrealized-casts

The complete error message:

example.mlir:2:11: error: failed to legalize operation 'builtin.unrealized_conversion_cast' that was explicitly marked illegal
%c0 = arith.constant 0 : index

When I remove the scf.for loop (along with the code inside it), everything works fine with the passes above. So the problem seems to come from the tensor.insert operation, even though the error message points at arith.constant. I’m open to any guidance or suggestions you might have, and I’m eager to learn from your expertise.

asiemien · May 2, 2024, 1:41pm

Unrealized conversion cast errors tend not to be very informative.
But overall, your snippet looks correct at first glance. The fact that it lowered all the way to reconcile-unrealized-casts suggests that the problem might lie in one of the lowerings or in the pipeline itself.

So, let’s start by reproducing the error. Running the provided lowering passes on the code snippet, I get a similar failed to legalize operation 'builtin.unrealized_conversion_cast' error.
In this case, it is often useful to step back one pass and see what the IR looks like just before the reconcile-unrealized-casts pass.

After rerunning the same passes without the last one, I’d expect all the ops to be in the LLVM dialect. However, a quick glance over the IR brings up some bufferization leftovers:

%75 = bufferization.to_memref %71 : memref<4xf64>
%76 = builtin.unrealized_conversion_cast %75 : memref<4xf64> to !llvm.struct<(ptr, ptr, i64, array<1 x i64>, array<1 x i64>)>
%77 = llvm.extractvalue %37[1] : !llvm.struct<(ptr, ptr, i64, array<1 x i64>, array<1 x i64>)>

This suggests that something might be wrong with bufferization, or rather with how it is invoked. These days it is much better to rely on one-shot bufferization.
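The "quick glance" for leftovers described above can also be scripted. The following is a minimal sketch, not an MLIR tool: it is a rough text heuristic that assumes one operation per line in the printed IR, and the helper name non_llvm_ops is made up for illustration. It lists every operation name that does not belong to the llvm dialect.

```python
import re

# Rough heuristic, not a real MLIR parser: capture the `dialect.op` name that
# follows an optional `%results =` prefix at the start of a line.
OP_NAME = re.compile(r"^\s*(?:%[^=]*=\s*)?\"?([A-Za-z_]\w*\.[\w.]+)")


def non_llvm_ops(ir_text):
    """Return the sorted names of operations that are not in the llvm dialect."""
    leftovers = set()
    for line in ir_text.splitlines():
        match = OP_NAME.match(line)
        if match and not match.group(1).startswith("llvm."):
            leftovers.add(match.group(1))
    return sorted(leftovers)


# The three IR lines quoted in the post above.
sample = """\
%75 = bufferization.to_memref %71 : memref<4xf64>
%76 = builtin.unrealized_conversion_cast %75 : memref<4xf64> to !llvm.struct<(ptr, ptr, i64, array<1 x i64>, array<1 x i64>)>
%77 = llvm.extractvalue %37[1] : !llvm.struct<(ptr, ptr, i64, array<1 x i64>, array<1 x i64>)>
"""

print(non_llvm_ops(sample))
```

On the three lines quoted above it reports bufferization.to_memref and builtin.unrealized_conversion_cast. Operations printed without a dialect prefix (for example a bare module or return) are not detected by this heuristic.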

Let’s replace all the individual bufferization passes with -one-shot-bufferize and also enable its bufferize-function-boundaries option to allow the function signature (here, the tensor return value) to be converted to memref.
The new pipeline will be:

-one-shot-bufferize=bufferize-function-boundaries -arith-expand -convert-linalg-to-loops -convert-scf-to-cf -expand-strided-metadata -memref-expand -arith-expand -convert-cf-to-llvm -convert-arith-to-llvm -finalize-memref-to-llvm -convert-func-to-llvm -reconcile-unrealized-casts

This successfully produces IR with all operations lowered to the LLVM dialect.

If more unrealized conversion cast errors occur, the process above can be repeated to identify other missing or misordered passes.
When a pipeline produces unexpected or invalid results, it is often also helpful to examine the IR after each individual pass, or at least at key points such as right before and after bufferization.
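Stepping back through the pipeline one pass at a time can be sketched as follows. This is only an illustration: it builds the command lines as strings and does not run mlir-opt. The helper name prefix_commands is hypothetical, and example.mlir is the file name taken from the error message earlier in the thread. mlir-opt also has a -mlir-print-ir-after-all flag that dumps the IR after every pass in a single run, which may be more convenient.

```python
import shlex

# The working pipeline from this thread, as one string of mlir-opt flags.
PIPELINE = (
    "-one-shot-bufferize=bufferize-function-boundaries -arith-expand "
    "-convert-linalg-to-loops -convert-scf-to-cf -expand-strided-metadata "
    "-memref-expand -arith-expand -convert-cf-to-llvm -convert-arith-to-llvm "
    "-finalize-memref-to-llvm -convert-func-to-llvm -reconcile-unrealized-casts"
)


def prefix_commands(input_file, pipeline, tool="mlir-opt"):
    """Build one command line per pipeline prefix, shortest prefix first.

    Hypothetical helper: it only assembles strings and runs nothing.
    """
    passes = shlex.split(pipeline)
    return [
        " ".join([tool, input_file, *passes[:end]])
        for end in range(1, len(passes) + 1)
    ]


commands = prefix_commands("example.mlir", PIPELINE)

# The second-to-last entry is the full pipeline minus
# -reconcile-unrealized-casts, i.e. the "step back one pass" case.
print(commands[-2])
```

Running the printed commands from longest to shortest and inspecting each output shows the first pass after which unexpected operations remain.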

Hopefully this small debugging walkthrough is enough to give you some insight into how to handle similar issues in the future.

taiqzheng · May 6, 2024, 12:32pm

Thank you very much for your help. The ‘one-shot-bufferize’ pass is indeed useful, and it works especially well with the tensor dialect. My previous idea was to convert the tensors into memrefs outside the ‘scf.for’ loop and then process the memrefs within the loop. This works, but in an ugly way: all other parts of the program operate at the tensor level, and then a memref suddenly shows up here. Once again, thank you for the detailed debugging walkthrough. I think I can handle similar problems now (rather than simply trying different passes).

asiemien · May 6, 2024, 12:51pm

Generally, it is better to avoid mixing tensor and memref abstractions unless there is a strong reason to do it. But occasionally such an approach can have its uses.

And if mixing these abstractions is unavoidable, the bufferization dialect can help bridge the two worlds, e.g., with bufferization.to_memref, bufferization.to_tensor, or bufferization.materialize_in_destination.
