Tosa.rescale op

Hi,

The input IR looks like this:

func @main(%arg0: tensor<1x2x!quant.uniform<u8:f32, 0.0043027559295296669>> {iree.identifier = "serving_default_input_1:0", tf_saved_model.index_path = ["input_1"]}) -> (tensor<1x2x!quant.uniform<u8:f32, 0.0086055118590593338>> {iree.identifier = "PartitionedCall:0", tf_saved_model.index_path = ["add"]}) attributes {tf.entry_function = {inputs = "serving_default_input_1:0", outputs = "PartitionedCall:0"}, tf_saved_model.exported_names = ["serving_default"]} {
%0 = "tosa.rescale"(%arg0) {double_round = false, input_zp = 0 : i32, multiplier = [1073741824 : i32], output_zp = -128 : i32, per_channel = false, scale32 = true, shift = [30 : i32]} : (tensor<1x2x!quant.uniform<u8:f32, 0.0043027559295296669>>) -> tensor<1x2x!quant.uniform<i8:f32, 0.0043027559295296669:-128>>
%1 = "tfl.quantize"(%0) {qtype = tensor<1x2x!quant.uniform<i8:f32, 0.0043027559295296669:-128>>} : (tensor<1x2x!quant.uniform<i8:f32, 0.0043027559295296669:-128>>) -> tensor<1x2x!quant.uniform<i8:f32, 0.0043027559295296669:-128>>
%2 = tfl.add(%1, %1) {fused_activation_function = "NONE"} : (tensor<1x2x!quant.uniform<i8:f32, 0.0043027559295296669:-128>>, tensor<1x2x!quant.uniform<i8:f32, 0.0043027559295296669:-128>>) -> tensor<1x2x!quant.uniform<i8:f32, 0.0086055118590593338:-128>>
%3 = "tfl.quantize"(%2) {qtype = tensor<1x2x!quant.uniform<u8:f32, 0.0086055118590593338>>} : (tensor<1x2x!quant.uniform<i8:f32, 0.0086055118590593338:-128>>) -> tensor<1x2x!quant.uniform<i8:f32, 0.0086055118590593338:-128>>
%4 = "tosa.rescale"(%3) {double_round = false, input_zp = -128 : i32, multiplier = [1073741824 : i32], output_zp = 0 : i32, per_channel = false, scale32 = true, shift = [30 : i32]} : (tensor<1x2x!quant.uniform<i8:f32, 0.0086055118590593338:-128>>) -> tensor<1x2x!quant.uniform<u8:f32, 0.0086055118590593338>>
return %4 : tensor<1x2x!quant.uniform<u8:f32, 0.0086055118590593338>>
}

I used iree-import-tflite --mlir-print-ir-after-all add.tflite -o add.mlir to get the TOSA dialect IR:

// -----// IR Dump After TosaLegalizeTFTFLPass //----- //
func @main(%arg0: tensor<1x2x!quant.uniform<u8:f32, 0.0043027559295296669>> {iree.identifier = "serving_default_input_1:0", tf_saved_model.index_path = ["input_1"]}) -> (tensor<1x2x!quant.uniform<u8:f32, 0.0086055118590593338>> {iree.identifier = "PartitionedCall:0", tf_saved_model.index_path = ["add"]}) attributes {tf.entry_function = {inputs = "serving_default_input_1:0", outputs = "PartitionedCall:0"}, tf_saved_model.exported_names = ["serving_default"]} {
%0 = "tosa.rescale"(%arg0) {double_round = false, input_zp = 0 : i32, multiplier = [1073741824 : i32], output_zp = -128 : i32, per_channel = false, scale32 = true, shift = [30 : i32]} : (tensor<1x2x!quant.uniform<u8:f32, 0.0043027559295296669>>) -> tensor<1x2x!quant.uniform<i8:f32, 0.0043027559295296669:-128>>
%1 = "tosa.rescale"(%0) {double_round = true, input_zp = -128 : i32, multiplier = [1073741824 : i32], output_zp = -128 : i32, per_channel = false, scale32 = true, shift = [30 : i32]} : (tensor<1x2x!quant.uniform<i8:f32, 0.0043027559295296669:-128>>) -> tensor<1x2x!quant.uniform<i8:f32, 0.0043027559295296669:-128>>
%2 = "tosa.rescale"(%1) {double_round = false, input_zp = -128 : i32, multiplier = [1073741824 : i32], output_zp = 0 : i32, per_channel = false, scale32 = true, shift = [11 : i32]} : (tensor<1x2x!quant.uniform<i8:f32, 0.0043027559295296669:-128>>) -> tensor<1x2xi32>
%3 = "tosa.rescale"(%1) {double_round = false, input_zp = -128 : i32, multiplier = [1073741824 : i32], output_zp = 0 : i32, per_channel = false, scale32 = true, shift = [11 : i32]} : (tensor<1x2x!quant.uniform<i8:f32, 0.0043027559295296669:-128>>) -> tensor<1x2xi32>
%4 = "tosa.add"(%2, %3) : (tensor<1x2xi32>, tensor<1x2xi32>) -> tensor<1x2xi32>
%5 = "tosa.rescale"(%4) {double_round = true, input_zp = 0 : i32, multiplier = [1073741824 : i32], output_zp = -128 : i32, per_channel = false, scale32 = true, shift = [50 : i32]} : (tensor<1x2xi32>) -> tensor<1x2x!quant.uniform<i8:f32, 0.0086055118590593338:-128>>
%6 = "tosa.rescale"(%5) {double_round = true, input_zp = -128 : i32, multiplier = [1073741824 : i32], output_zp = -128 : i32, per_channel = false, scale32 = true, shift = [30 : i32]} : (tensor<1x2x!quant.uniform<i8:f32, 0.0086055118590593338:-128>>) -> tensor<1x2x!quant.uniform<i8:f32, 0.0086055118590593338:-128>>
%7 = "tosa.rescale"(%6) {double_round = false, input_zp = -128 : i32, multiplier = [1073741824 : i32], output_zp = 0 : i32, per_channel = false, scale32 = true, shift = [30 : i32]} : (tensor<1x2x!quant.uniform<i8:f32, 0.0086055118590593338:-128>>) -> tensor<1x2x!quant.uniform<u8:f32, 0.0086055118590593338>>
return %7 : tensor<1x2x!quant.uniform<u8:f32, 0.0086055118590593338>>
}

May I know how the shift, output_zp, input_zp, and double_round values are calculated for tosa.rescale when it is generated from tfl.quantize?

Thanks

For the TFL quantize op, the legalization code can be found in the TensorFlow tree. The area of interest is here: tensorflow/legalize_tfl.cc at master · tensorflow/tensorflow (github.com)
This appears to be a QuantizeOp where both input and output are quantized tensors, so the input and output zero points come from the input and output tensors. The shift and multiplier are calculated from the ratio of the input and output scales. The calculation for this lives in the MLIR tree: llvm-project/QuantUtils.cpp at main · llvm/llvm-project (github.com)
double_round is always true when converting the TFL quantize op, because that matches the behavior of the TFL implementation.
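
To make the scale-ratio step concrete, here is a rough Python sketch of how a multiplier and shift can be derived from a floating-point scale. It is not the code in QuantUtils.cpp; the function names compute_multiplier_and_shift and rescale_value are made up for illustration, and the sketch assumes the 32-bit (scale32 = true) case, a ratio taken as input_scale / output_scale, and a scale small enough that the shift stays at 1 or above. It also leaves out any clamping of very large shifts and the double_round adjustment, and uses plain single rounding.

```python
import math
from typing import Tuple


def compute_multiplier_and_shift(scale: float) -> Tuple[int, int]:
    """Approximate `scale` as multiplier / 2**shift (hypothetical helper)."""
    if scale <= 0.0:
        raise ValueError("scale must be positive")
    # frexp gives scale == mantissa * 2**exponent with 0.5 <= mantissa < 1.
    mantissa, exponent = math.frexp(scale)
    # Scale the mantissa to a 31-bit fixed-point value, rounding half up.
    multiplier = int(math.floor(mantissa * (1 << 31) + 0.5))
    if multiplier == (1 << 31):
        # The mantissa rounded up to 1.0, so renormalize.
        multiplier //= 2
        exponent += 1
    # Fold the 2**31 fixed-point factor into a right shift.
    shift = 31 - exponent
    return multiplier, shift


def rescale_value(q_in: int, input_zp: int, output_zp: int,
                  multiplier: int, shift: int,
                  qmin: int = -128, qmax: int = 127) -> int:
    """Apply a rescale to one value with single rounding (assumes shift >= 1)."""
    acc = (q_in - input_zp) * multiplier
    acc = (acc + (1 << (shift - 1))) >> shift
    return max(qmin, min(qmax, acc + output_zp))


# Scales taken from the first tfl.quantize in the IR above (i8 -> i8, same scale).
input_scale = 0.0043027559295296669
output_scale = 0.0043027559295296669
multiplier, shift = compute_multiplier_and_shift(input_scale / output_scale)
print(multiplier, shift)  # 1073741824 30
```

With equal input and output scales the ratio is 1.0, which gives multiplier = 1073741824 and shift = 30, the same values that appear on the rescale ops produced from tfl.quantize in the dump. The zero points are not computed at all in this sketch; as described above, they are read from the input and output quantized types.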

Eric