Missing translations when using tf-opt to translate to the TOSA dialect

Hi! I’m new here, so please let me know if this question is better suited elsewhere.

I am trying to translate some of the Hugging Face TensorFlow models to TOSA MLIR, but I am running into some trouble: it seems like the translations of some TensorFlow operations are dropped.
First, I use the following Python code to export the model as a pbtxt:

import tensorflow as tf
from tensorflow.python.framework.convert_to_constants import convert_variables_to_constants_v2
from transformers import AutoConfig, TFAutoModel

config = AutoConfig.from_pretrained('bert-base-cased')
model = TFAutoModel.from_config(config)
# Convert model to ConcreteFunction
full_model = tf.function(lambda x: model(x))
full_model = full_model.get_concrete_function(
    tf.TensorSpec(model.dummy_inputs["input_ids"].shape, model.dummy_inputs["input_ids"].dtype))
# Freeze the ConcreteFunction, inlining the variables as constants
frozen_func = convert_variables_to_constants_v2(full_model)
print("Frozen model inputs: ")
print(frozen_func.inputs)
print("Frozen model outputs: ")
print(frozen_func.outputs)
tf.io.write_graph(graph_or_graph_def=frozen_func.graph,
                  logdir='',
                  name="frozen_model_bert.pbtxt",
                  as_text=True)

Then I use tf-mlir-translate to translate the resulting pbtxt into the TF dialect with this command:

tf-mlir-translate --tf-input-arrays=x:0 --tf-input-shapes=3,5 --tf-output-arrays=Identity:0,Identity_1:0 --tf-enable-shape-inference-on-import --graphdef-to-mlir frozen_model_bert.pbtxt -o bert_tf.mlir

Finally, I use tf-opt to translate the TF dialect to TOSA with this command:

tf-opt --tf-to-tosa-pipeline --tf-executor-break-up-islands --tf-executor-graph-pruning  bert_tf.mlir -o bert_tosa.mlir

I am able to generate the TOSA MLIR file without errors, but the resulting MLIR has a mix of the tf dialect and the tosa dialect. For example, here are a couple of snippets of the generated MLIR:

%outputs_62, %control_63 = tf_executor.island wraps "tosa.const"() {value = dense<0> : tensor<i32>} : () -> tensor<i32>
%outputs_64, %control_65 = tf_executor.island wraps "tf.Range"(%outputs_62, %outputs_60, %outputs_58) {device = ""} : (tensor<i32>, tensor<i32>, tensor<i32>) -> tensor<5xi32>
%outputs_66, %control_67 = tf_executor.island wraps "tf.ExpandDims"(%outputs_64, %outputs_26) {device = ""} : (tensor<5xi32>, tensor<i32>) -> tensor<1x5xi32>
%outputs_68, %control_69 = tf_executor.island wraps "tf.GatherV2"(%outputs_34, %outputs_66, %outputs_32) {batch_dims = 0 : i64, device = ""} : (tensor<512x768xf32>, tensor<1x5xi32>, tensor<i32>) -> tensor<1x5x768xf32>
%outputs_70, %control_71 = tf_executor.island wraps "tf.Tile"(%outputs_68, %outputs_56) {device = ""} : (tensor<1x5x768xf32>, tensor<3xi32>) -> tensor<3x5x768xf32>
%outputs_72, %control_73 = tf_executor.island wraps "tosa.const"() {value = dense<0.000000e+00> : tensor<768xf32>} : () -> tensor<768xf32>
%outputs_1195, %control_1196 = tf_executor.island wraps "tosa.add"(%outputs_1193, %outputs_40) : (tensor<3x5x768xf32>, tensor<3x5x768xf32>) -> tensor<3x5x768xf32>
%outputs_1197, %control_1198 = tf_executor.island wraps "tf.Mean"(%outputs_1195, %outputs_52) {device = "", keep_dims = true} : (tensor<3x5x768xf32>, tensor<1xi32>) -> tensor<3x5x1xf32>
%outputs_1199, %control_1200 = tf_executor.island wraps "tosa.identity"(%outputs_1197) : (tensor<3x5x1xf32>) -> tensor<3x5x1xf32>
%outputs_1201, %control_1202 = tf_executor.island wraps "tosa.sub"(%outputs_1195, %outputs_1199) : (tensor<3x5x768xf32>, tensor<3x5x1xf32>) -> tensor<3x5x768xf32>
%outputs_1203, %control_1204 = tf_executor.island wraps "tosa.mul"(%outputs_1201, %outputs_1201) {shift = 0 : i32} : (tensor<3x5x768xf32>, tensor<3x5x768xf32>) -> tensor<3x5x768xf32>

Is there a way to get a full translation to TOSA? Looking at legalize_tf.cc, it seems like most of these ops should be supported. Am I going about translating the model the right way, or is there a better approach?

(Drive-by) I don’t know the TOSA flow well, but this seems to need different passes. I’d expect one to run graph pruning first and not do any island breakup (since you aren’t exporting back to GraphDef but heading to TOSA). Did you find this same flow in the test directory?

The tests in the TOSA test directory only cover going from the TensorFlow (or TFLite) MLIR dialect to TOSA, and that only seems to require the tf-to-tosa-pipeline pass. The output from tf-mlir-translate was in the tf_executor dialect, and I had used the tf-executor-break-up-islands and tf-executor-graph-pruning passes to clean up the output a bit; the missing translations are still there without those options enabled. Could the fact that I am running the TOSA legalization on the tf_executor dialect be causing issues? And is there any way to make tf-mlir-translate output the pure TensorFlow dialect instead of the tf_executor dialect?

This should be a no-op on the output of tf-mlir-translate when importing a GraphDef, but it is also the opposite of what’s needed, I believe. You should try something like tf-standard-pipeline instead.

No, translate is meant to be a very direct mapping from an external format to MLIR; it should be as simple as possible while converting between data formats. Getting to the TF dialect requires additional steps (and the current import translation already does a bit much; as an aside, we are going to change it to produce the TFG dialect, which is even more direct, and then converting to the executor and TF dialects would be passes).

Using the pass pipeline that Mehdi suggested will perform island coarsening, and after that more legalizations should work. Let us know. One could also look at the TFLite converter and the passes it runs; of course there are special cases there, and at some point the pipelines diverge and become TFLite-specific, but it is an end-to-end flow from a TF Graph/GraphDef to the TFL dialect.

Thank you for all the suggestions so far! I was able to use the tf-standard-pipeline in tf-opt to convert from the tf_executor dialect to the standard TF dialect. That is, after running tf-mlir-translate, first I do:

tf-opt --tf-standard-pipeline bert_tf.mlir -o bert_tf_opt_standard.mlir

Then I do:

tf-opt --tf-to-tosa-pipeline bert_tf_opt_standard.mlir -o bert_tf_opt_tosa.mlir

When I run the tf-to-tosa-pipeline now, I only get one error:

bert_tf_opt.mlir:989:12: error: 'tf.MatMul' op MatMul: a/b/output rank must match
    %889 = "tf.MatMul"(%888, %78) {device = "", transpose_a = false, transpose_b = false} : (tensor<3x768xf32>, tensor<768x768xf32>) -> tensor<3x768xf32>
           ^
bert_tf_opt.mlir:989:12: note: see current operation: %1260 = "tf.MatMul"(%1259, %84) {device = "", transpose_a = false, transpose_b = false} : (tensor<3x1x768xf32>, tensor<768x768xf32>) -> tensor<3x768xf32>

On this section of code:

%888 = "tf.StridedSlice"(%886, %cst_3, %cst_2, %cst_1) {begin_mask = 1 : i64, device = "", ellipsis_mask = 0 : i64, end_mask = 1 : i64, new_axis_mask = 0 : i64, shrink_axis_mask = 2 : i64} : (tensor<3x5x768xf32>, tensor<2xi32>, tensor<2xi32>, tensor<2xi32>) -> tensor<3x768xf32>
%889 = "tf.MatMul"(%888, %78) {device = "", transpose_a = false, transpose_b = false} : (tensor<3x768xf32>, tensor<768x768xf32>) -> tensor<3x768xf32>

It looks like the TOSA legalization of tf.StridedSlice didn’t preserve the output shape of the TF op (tensor<3x768xf32>); the result had shape tensor<3x1x768xf32> instead. Adding a check for this condition and inserting a reshape op when the shapes differ made the error go away. (I inserted the following code at the end of the reverseNegativeStride function in legalize_common.cc in the tosa directory):

// The TOSA result can end up with a different shape than the original TF
// op's result (e.g. when shrink_axis_mask removed a dimension), so reshape
// back to the expected output shape when the two differ.
RankedTensorType tf_out_type =
    op->getResult(0).getType().dyn_cast<RankedTensorType>();
RankedTensorType tosa_out_type = input.getType().dyn_cast<RankedTensorType>();
if (tf_out_type && tosa_out_type &&
    tf_out_type.getShape() != tosa_out_type.getShape()) {
  auto output_reshape_op = CreateOpAndInfer<tosa::ReshapeOp>(
      rewriter, op->getLoc(),
      RankedTensorType::get(tf_out_type.getShape(),
                            tf_out_type.getElementType()),
      input, rewriter.getI64ArrayAttr(tf_out_type.getShape()));
  input = output_reshape_op.getResult();
}
return input;

Now I can generate full TOSA code from the model with only one TF op left untranslated (tf.Erf), but it doesn’t look like this op is implemented in legalize_tf.cc yet, so I guess that is expected. Snippet of the resulting output:

%964 = "tosa.mul"(%962, %8) {shift = 0 : i32} : (tensor<3x5x3072xf32>, tensor<1x1x1xf32>) -> tensor<3x5x3072xf32>
%965 = "tf.Erf"(%964) {device = ""} : (tensor<3x5x3072xf32>) -> tensor<3x5x3072xf32>
%966 = "tosa.add"(%965, %7) : (tensor<3x5x3072xf32>, tensor<1x1x1xf32>) -> tensor<3x5x3072xf32>
%967 = "tosa.mul"(%963, %966) {shift = 0 : i32} : (tensor<3x5x3072xf32>, tensor<3x5x3072xf32>) -> tensor<3x5x3072xf32>
%968 = "tosa.reshape"(%967) {new_shape = [1, 15, 3072]} : (tensor<3x5x3072xf32>) -> tensor<1x15x3072xf32>
%969 = "tosa.matmul"(%968, %17) : (tensor<1x15x3072xf32>, tensor<1x3072x768xf32>) -> tensor<1x15x768xf32>
%970 = "tosa.reshape"(%969) {new_shape = [3, 5, 768]} : (tensor<1x15x768xf32>) -> tensor<3x5x768xf32>
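
Since tf.Erf was the only op left, I also experimented with sketching a legalization for it, modeled on the other patterns in legalize_tf.cc. To be clear, everything below is hypothetical: the pattern name ConvertTFErfOp is made up, CreateOpAndInfer is the helper the existing TOSA legalizations use, and the math is the well-known tanh polynomial approximation erf(x) ~= tanh(1.12838*x + 0.10091*x^3) rather than an exact erf, so treat it as illustrative only:

// Hypothetical pattern; tf.Erf is not legalized upstream, so none of these
// names are guaranteed to match a real implementation.
LogicalResult ConvertTFErfOp::matchAndRewrite(
    Operation* op, PatternRewriter& rewriter) const {
  auto tf_erf_op = cast<TF::ErfOp>(op);
  RankedTensorType output_type =
      tf_erf_op.getResult().getType().dyn_cast<RankedTensorType>();
  // Only handle ranked f32 tensors in this sketch.
  if (!output_type || !output_type.getElementType().isF32()) return failure();

  Location loc = op->getLoc();
  Value x = tf_erf_op.x();

  // Splat constants shaped to the input's rank so TOSA's broadcasting rules
  // (equal ranks, size-1 dimensions) are satisfied.
  SmallVector<int64_t> ones(output_type.getRank(), 1);
  auto const_type = RankedTensorType::get(ones, rewriter.getF32Type());
  auto make_const = [&](float v) -> Value {
    return rewriter.create<tosa::ConstOp>(
        loc, const_type, DenseElementsAttr::get(const_type, v));
  };

  // erf(x) ~= tanh(1.12838 * x + 0.10091 * x^3).
  auto x2 = CreateOpAndInfer<tosa::MulOp>(rewriter, loc, output_type, x, x, 0);
  auto x3 =
      CreateOpAndInfer<tosa::MulOp>(rewriter, loc, output_type, x2, x, 0);
  auto t1 = CreateOpAndInfer<tosa::MulOp>(rewriter, loc, output_type, x,
                                          make_const(1.12838f), 0);
  auto t3 = CreateOpAndInfer<tosa::MulOp>(rewriter, loc, output_type, x3,
                                          make_const(0.10091f), 0);
  auto sum = CreateOpAndInfer<tosa::AddOp>(rewriter, loc, output_type, t1, t3);
  auto result =
      CreateOpAndInfer<tosa::TanhOp>(rewriter, loc, output_type, sum);

  rewriter.replaceOp(op, {result.getResult()});
  return success();
}

A real implementation would presumably also need to handle non-f32 element types and the quantized profile, which this sketch ignores.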

FYI, a single invocation should work as well; you can chain invocations of passes and pipelines: tf-opt --tf-standard-pipeline --tf-to-tosa-pipeline bert_tf.mlir -o bert_tf_opt_tosa.mlir

Sorry about the delay in responding to this - I didn’t see the question over the holidays!

We do the following when running this pipeline:

tf-mlir-translate --graphdef-to-mlir [--tf-output-arrays=<insert output nodes> --tf-input-arrays=<input nodes> --tf-input-shapes=<input shapes> --tf-enable-shape-inference-on-import] yourmodel.pb | tf-opt --tf-executor-to-functional-conversion --tf-to-tosa-pipeline -o yourmodel.tosa.mlir
