tfl.mul to TOSA lowering in MLIR

I have the following op in the TFL dialect:

%1 = tfl.mul(%0, %0) {fused_activation_function = "NONE"} : (tensor<1x2x!quant.uniform<i8:f32, 0.0043027559295296669:-128>>, tensor<1x2x!quant.uniform<i8:f32, 0.0043027559295296669:-128>>) -> tensor<1x2x!quant.uniform<i8:f32, 0.0047209961339831352:-128>>

But when I convert this to TOSA, I get:
%1 = "tosa.rescale"(%0) {double_round = false, input_zp = -128 : i32, multiplier = [1073741824 : i32], output_zp = 0 : i32, per_channel = false, scale32 = true, shift = [30 : i32]} : (tensor<1x2xi8>) -> tensor<1x2xi32>
%2 = "tosa.rescale"(%0) {double_round = false, input_zp = -128 : i32, multiplier = [1073741824 : i32], output_zp = 0 : i32, per_channel = false, scale32 = true, shift = [30 : i32]} : (tensor<1x2xi8>) -> tensor<1x2xi32>
%3 = "tosa.mul"(%1, %2) {shift = 0 : i32} : (tensor<1x2xi32>, tensor<1x2xi32>) -> tensor<1x2xi32>
%4 = "tosa.rescale"(%3) {double_round = true, input_zp = 0 : i32, multiplier = [1077952512 : i32], output_zp = -128 : i32, per_channel = false, scale32 = true, shift = [38 : i32]} : (tensor<1x2xi32>) -> tensor<1x2xi8>

I would like to understand why the i8 inputs are rescaled to i32 before being passed to tosa.mul.

Thanks

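The lowering uses tosa.rescale to subtract out the zero point. With an i8 input and a zero point of -128, the difference ranges from 0 to 255, so you need at least a 9-bit value to hold the result; here it is widened to i32. The scale factor is set to 1.0 (multiplier = 1073741824 = 2^30 with shift = 30), so it doesn’t change the values.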

The TFLite kernels perform the same zero-point subtraction inside the operator. TOSA makes it explicit.
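
To make the arithmetic concrete, here is a rough Python sketch of the lowered sequence, using the multiplier, shift and zero-point values from the IR above. The helper names (rescale, clamp_i8, lowered_mul) are made up for illustration and are not part of any TOSA or TFLite API. It also assumes plain round-to-nearest, so it does not model double_round or per-channel scaling and may not match a real implementation bit for bit.

```python
# Simplified model of the lowered ops; all helper names are hypothetical.
# Assumption: single rounding only (double_round is not modelled).

def rescale(value, input_zp, multiplier, shift, output_zp):
    """Subtract input_zp, scale by multiplier / 2**shift, then add output_zp."""
    centered = value - input_zp          # for i8 with zp = -128 this is 0..255
    rounding = 1 << (shift - 1)          # round to nearest
    scaled = (centered * multiplier + rounding) >> shift
    return scaled + output_zp


def clamp_i8(value):
    """Saturate to the signed 8-bit range."""
    return max(-128, min(127, value))


def lowered_mul(q_lhs, q_rhs):
    """Mirror %1..%4 above for a single pair of i8 elements."""
    lhs = rescale(q_lhs, -128, 1 << 30, 30, 0)   # scale 1.0, only removes the zero point
    rhs = rescale(q_rhs, -128, 1 << 30, 30, 0)
    product = lhs * rhs                          # i32 multiply, as in tosa.mul
    return clamp_i8(rescale(product, 0, 1077952512, 38, -128))


print(rescale(127, -128, 1 << 30, 30, 0))  # 255, which does not fit in i8
print(lowered_mul(127, 127))
```

The first print shows why the widening is needed: the zero-point-free value of the largest i8 input is already outside the i8 range before any multiplication happens.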