I have the following op in the tfl dialect:
%1 = tfl.mul(%0, %0) {fused_activation_function = "NONE"} : (tensor<1x2x!quant.uniform<i8:f32, 0.0043027559295296669:-128>>, tensor<1x2x!quant.uniform<i8:f32, 0.0043027559295296669:-128>>) -> tensor<1x2x!quant.uniform<i8:f32, 0.0047209961339831352:-128>>
But when I convert this to tosa, I get:
%1 = "tosa.rescale"(%0) {double_round = false, input_zp = -128 : i32, multiplier = [1073741824 : i32], output_zp = 0 : i32, per_channel = false, scale32 = true, shift = [30 : i32]} : (tensor<1x2xi8>) -> tensor<1x2xi32>
%2 = "tosa.rescale"(%0) {double_round = false, input_zp = -128 : i32, multiplier = [1073741824 : i32], output_zp = 0 : i32, per_channel = false, scale32 = true, shift = [30 : i32]} : (tensor<1x2xi8>) -> tensor<1x2xi32>
%3 = "tosa.mul"(%1, %2) {shift = 0 : i32} : (tensor<1x2xi32>, tensor<1x2xi32>) -> tensor<1x2xi32>
%4 = "tosa.rescale"(%3) {double_round = true, input_zp = 0 : i32, multiplier = [1077952512 : i32], output_zp = -128 : i32, per_channel = false, scale32 = true, shift = [38 : i32]} : (tensor<1x2xi32>) -> tensor<1x2xi8>
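As far as I can tell, the multiplier and shift values in the converted ops decode as follows. This is only a rough Python sketch of my reading: it assumes the effective scale of a tosa.rescale is multiplier / 2^shift and ignores the exact integer rounding (double_round, scale32). The helper names effective_scale and simulate are mine and do not come from any library.

```python
# Rough decoding of the (multiplier, shift) pairs in the tosa.rescale ops.
# Assumption: effective scale = multiplier / 2**shift; exact rounding is ignored.

def effective_scale(multiplier: int, shift: int) -> float:
    """Hypothetical helper: floating-point scale implied by multiplier and shift."""
    return multiplier / (1 << shift)

# Quantization scales taken from the tfl.mul operand and result types.
input_scale = 0.0043027559295296669
output_scale = 0.0047209961339831352

in_rescale = effective_scale(1073741824, 30)   # the two i8 -> i32 rescales
out_rescale = effective_scale(1077952512, 38)  # the final i32 -> i8 rescale

# Scale one would expect for the product of two quantized values.
expected_out = input_scale * input_scale / output_scale

def simulate(q: int) -> int:
    """Float approximation of the lowered sequence for one i8 element of %0."""
    widened = int((q - (-128)) * in_rescale)        # subtract input_zp, widen to i32
    product = widened * widened                     # tosa.mul on i32 with shift = 0
    result = round(product * out_rescale) + (-128)  # rescale and add output_zp
    return max(-128, min(127, result))              # clamp to the i8 range

if __name__ == "__main__":
    print("input rescale scale :", in_rescale)
    print("output rescale scale:", out_rescale)
    print("expected out scale  :", expected_out)
    print("simulate(-128), simulate(127):", simulate(-128), simulate(127))
```

If this reading is right, the two input rescales have a scale of exactly 1.0, so they only subtract the zero point while widening to i32, and the final rescale applies approximately input_scale * input_scale / output_scale.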
Why are the i8 inputs rescaled to i32 before being passed to tosa.mul?
Thanks