How to access an element inside an MLIR Value instance?

I’m currently writing a rewrite pattern and struggling to access an element inside an MLIR Value instance.

func.func @main(%arg0: tensor<1x4x4x64xf32> {tf_saved_model.index_path = ["input_2"]}) -> (tensor<1x4x4x64xf32> {tf_saved_model.index_path = ["group_normalization"]}) attributes {tf.entry_function = {inputs = "serving_default_input_2:0", outputs = "StatefulPartitionedCall:0"}, tf_saved_model.exported_names = ["serving_default"]} {
  %cst = arith.constant dense<[1, 4, 4, 64]> : tensor<4xi32>
  %cst_0 = arith.constant dense<0.000000e+00> : tensor<32x2xf32>
  %cst_1 = arith.constant dense<[1, 4, 4, 32, 2]> : tensor<5xi64>
  %cst_2 = arith.constant dense<[1, 1, 1, 32, 2]> : tensor<5xi64>
  %cst_3 = arith.constant dense<1.000000e-03> : tensor<1x1x1x32x1xf32>
  %cst_4 = arith.constant dense<[1, 2, 4]> : tensor<3xi32>
  %cst_5 = arith.constant dense<[1, 4, 4, 32, 2]> : tensor<5xi32>
  %0 = call @func_0_CPU_FLOAT(%arg0, %cst_5, %cst_4, %cst_3, %cst_2, %cst_1, %cst_0, %cst) {tac.device = "CPU", tac.inference_type = "FLOAT", tac.interface_name = "func_0"} : (tensor<1x4x4x64xf32>, tensor<5xi32>, tensor<3xi32>, tensor<1x1x1x32x1xf32>, tensor<5xi64>, tensor<5xi64>, tensor<32x2xf32>, tensor<4xi32>) -> tensor<1x4x4x64xf32>
  return %0 : tensor<1x4x4x64xf32>
}
func.func private @func_0_CPU_FLOAT(%arg0: tensor<1x4x4x64xf32>, %arg1: tensor<5xi32>, %arg2: tensor<3xi32>, %arg3: tensor<1x1x1x32x1xf32>, %arg4: tensor<5xi64>, %arg5: tensor<5xi64>, %arg6: tensor<32x2xf32>, %arg7: tensor<4xi32>) -> tensor<1x4x4x64xf32> attributes {tac.cost = 0x4B18A2E4 : f32, tac.device = "CPU", tac.inference_type = "FLOAT", tac.interface_name = "func_0"} {
  %cst = arith.constant dense<[1, 4, 4, 32, 2]> : tensor<5xi32>
  %cst_0 = arith.constant dense<[1, 2, 4]> : tensor<3xi32>
  %cst_1 = arith.constant dense<[1, 1, 1, 32, 2]> : tensor<5xi64>
  %cst_2 = arith.constant dense<[1, 4, 4, 32, 2]> : tensor<5xi64>
  %cst_3 = arith.constant dense<[1, 4, 4, 64]> : tensor<4xi32>
  %0 = "tfl.reshape"(%arg0, %cst) {tac.device = "CPU", tac.inference_type = "FLOAT"} : (tensor<1x4x4x64xf32>, tensor<5xi32>) -> tensor<1x4x4x32x2xf32>
  %1 = "tfl.mean"(%0, %cst_0) {keep_dims = true, tac.device = "CPU", tac.inference_type = "FLOAT"} : (tensor<1x4x4x32x2xf32>, tensor<3xi32>) -> tensor<1x1x1x32x1xf32>
  %2 = tfl.sub(%0, %1) {fused_activation_function = "NONE", tac.device = "CPU", tac.inference_type = "FLOAT"} : (tensor<1x4x4x32x2xf32>, tensor<1x1x1x32x1xf32>) -> tensor<1x4x4x32x2xf32>
  %3 = "tfl.square"(%2) {tac.device = "CPU", tac.inference_type = "FLOAT"} : (tensor<1x4x4x32x2xf32>) -> tensor<1x4x4x32x2xf32>
  %4 = "tfl.mean"(%3, %cst_0) {keep_dims = true, tac.device = "CPU", tac.inference_type = "FLOAT"} : (tensor<1x4x4x32x2xf32>, tensor<3xi32>) -> tensor<1x1x1x32x1xf32>
  %5 = tfl.add %4, %arg3 {fused_activation_function = "NONE", tac.device = "CPU", tac.inference_type = "FLOAT"} : tensor<1x1x1x32x1xf32>
  %6 = "tfl.rsqrt"(%5) {tac.device = "CPU", tac.inference_type = "FLOAT"} : (tensor<1x1x1x32x1xf32>) -> tensor<1x1x1x32x1xf32>
  %7 = "tfl.broadcast_to"(%6, %cst_1) {tac.device = "CPU", tac.inference_type = "FLOAT"} : (tensor<1x1x1x32x1xf32>, tensor<5xi64>) -> tensor<1x1x1x32x2xf32>
  %8 = "tfl.broadcast_to"(%7, %cst_2) {tac.device = "CPU", tac.inference_type = "FLOAT"} : (tensor<1x1x1x32x2xf32>, tensor<5xi64>) -> tensor<1x4x4x32x2xf32>
  %9 = tfl.mul %0, %8 {fused_activation_function = "NONE", tac.device = "CPU", tac.inference_type = "FLOAT"} : tensor<1x4x4x32x2xf32>
  %10 = "tfl.broadcast_to"(%1, %cst_1) {tac.device = "CPU", tac.inference_type = "FLOAT"} : (tensor<1x1x1x32x1xf32>, tensor<5xi64>) -> tensor<1x1x1x32x2xf32>
  %11 = tfl.mul %10, %7 {fused_activation_function = "NONE", tac.device = "CPU", tac.inference_type = "FLOAT"} : tensor<1x1x1x32x2xf32>
  %12 = tfl.sub(%arg6, %11) {fused_activation_function = "NONE", tac.device = "CPU", tac.inference_type = "FLOAT"} : (tensor<32x2xf32>, tensor<1x1x1x32x2xf32>) -> tensor<1x1x1x32x2xf32>
  %13 = "tfl.broadcast_to"(%12, %cst_2) {tac.device = "CPU", tac.inference_type = "FLOAT"} : (tensor<1x1x1x32x2xf32>, tensor<5xi64>) -> tensor<1x4x4x32x2xf32>
  %14 = tfl.add %9, %13 {fused_activation_function = "NONE", tac.device = "CPU", tac.inference_type = "FLOAT"} : tensor<1x4x4x32x2xf32>
  %15 = "tfl.reshape"(%14, %cst_3) {tac.device = "CPU", tac.inference_type = "FLOAT"} : (tensor<1x4x4x32x2xf32>, tensor<4xi32>) -> tensor<1x4x4x64xf32>
  return %15 : tensor<1x4x4x64xf32>
}

This is the input graph (a GroupNormalization), and I’m trying to extract the epsilon value, which is the 1.000000e-03 in %cst_3, from %5 = tfl.add.
After getting the operand with mlir::Value epsilon = add_op.getRhs();,
I believe there’s no direct method to access the elements inside it. How can I get the value of its first element?
I’ve tried casting it to a ConstantOp and to a DenseElementsAttr, but both approaches failed.
Fail case 1)

DenseElementsAttr dense = epsilon.cast<DenseElementsAttr>();
// error: no viable conversion from 'mlir::Value' to 'mlir::Attribute'
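
This cast can never succeed: a mlir::Value is an SSA value, not an mlir::Attribute, so the dense elements live in an attribute on the operation that defines the value. As a minimal sketch of the intended path, assuming epsilon were produced by an arith.constant in the same function (which, as it turns out below, it is not):

// Sketch: Value -> defining op -> attribute (assumes a local arith.constant).
float eps = 0.0f;
if (auto const_op = epsilon.getDefiningOp<mlir::arith::ConstantOp>()) {
  auto dense = llvm::dyn_cast<mlir::DenseElementsAttr>(const_op.getValue());
  if (dense && dense.isSplat())
    eps = dense.getSplatValue<float>(); // 1.0e-3 for this splat tensor
}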

Fail case 2)

auto op = epsilon.getDefiningOp();
auto const_op = llvm::dyn_cast<mlir::arith::ConstantOp>(op);
// op is a nullptr
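
Note that llvm::dyn_cast asserts on a null pointer, so the getDefiningOp() result needs to be guarded before the cast; the templated form is a null-safe shorthand for both steps:

// Null-safe: yields a null ConstantOp if epsilon has no defining op
// or the defining op is of a different kind.
auto const_op = epsilon.getDefiningOp<mlir::arith::ConstantOp>();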

I don’t quite see %cst_3 as an input to tfl.add in this IR?

If this fails, then op is not what you think it is. Add an op->dump(); right there, or set a breakpoint in a debugger and call op->dump().

Yes, using a debugger I found it’s a nullptr, as in the comment.

%cst_3 is passed to tfl.add as %arg3

So just dump it?

There is some confusion here: %arg3 is an argument of the function; I don’t see how it relates to %cst_3?

I got this. It seems it couldn’t grab any information because op is a nullptr?

(lldb) p op->dump()
error: Execution was interrupted, reason: EXC_BAD_ACCESS (code=1, address=0x30).
The process has been returned to the state before expression evaluation.
(lldb) p op
(mlir::Operation *) $0 = nullptr

There are two func operations in the MLIR I attached at the beginning. %cst_3 is declared in the @main func, and when @func_0_CPU_FLOAT is called, %cst_3 is passed as the 4th argument.

Sure, getDefiningOp returns nullptr; that confirms what I was saying before: it is a function argument.
Try dumping epsilon?

Sure, but unless you implement an inter-procedural traversal yourself, I’m not sure how you’d get from the use of %arg3 on the tfl.add to the call-site argument and then to the definition.
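
For illustration only, such a traversal could look roughly like the sketch below, assuming the callee is only reached through func.call ops in the same module; the helper name is made up:

#include "mlir/Dialect/Func/IR/FuncOps.h"  // func::FuncOp, func::CallOp
#include "mlir/IR/BuiltinOps.h"            // ModuleOp

// Hypothetical helper: map a block argument of a function back to the value
// passed at a call site. Assumes func.call callers in the same module and
// simply takes the last matching call found.
static mlir::Value traceToCallSite(mlir::BlockArgument arg, mlir::ModuleOp module) {
  auto func = llvm::dyn_cast<mlir::func::FuncOp>(arg.getOwner()->getParentOp());
  if (!func)
    return {};
  mlir::Value caller_operand;
  module.walk([&](mlir::func::CallOp call) {
    if (call.getCallee() == func.getSymName())
      caller_operand = call.getOperand(arg.getArgNumber());
  });
  return caller_operand;
}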

This may help: Understanding the IR Structure - MLIR?

(lldb) p epsilon.dump()
<block argument> of type 'tensor<1x1x1x32x1xf32>' at index: 3
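
For reference, the same check can be done in code rather than by dumping:

// A Value that is not produced by an op is a BlockArgument.
if (auto block_arg = llvm::dyn_cast<mlir::BlockArgument>(epsilon)) {
  unsigned index = block_arg.getArgNumber(); // 3 here, i.e. %arg3
}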

Thank you. Let me clarify my understanding: is it right that there is no direct way to access the definition of a value that is passed in as a function argument, and that implementing an inter-procedural traversal would be required?

Think about this (in C syntax):

void use(int arg);

void bar(int arg) {
  use(arg);
}

void foo() {
  int a = 2;
  int b = 3;
  bar(a);
  bar(b);
}

Now, when you process use(arg) and you check the operand arg, what do you expect to see? How do you expect an API (which is really a simple accessor) to give you more of an answer than “this is arg, the first parameter of the function”?

If the arg were a class instance, its members would be accessible. So, in my case where a ConstOp is passed as an argument, I was thinking that if there were a method like getValue(), I could access the value of the argument.

I sent you the example to help you think about “what would you even return here?”: there are two calls to the function with two different arguments! There is nothing that would make sense to return when processing the use(arg) in my example.

Anyway, the MLIR tutorial, as well as the doc I linked above (Understanding the IR Structure - MLIR), should really help here.
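
Putting the pieces together for this particular IR, a rough end-to-end sketch (under the same assumptions as above: a single func.call call site, a splat constant, and the hypothetical traceToCallSite helper) might be:

#include "mlir/IR/Matchers.h"  // matchPattern, m_Constant

mlir::Value epsilon = add_op.getRhs();
if (auto block_arg = llvm::dyn_cast<mlir::BlockArgument>(epsilon))
  epsilon = traceToCallSite(block_arg, module); // follow %arg3 back to %cst_3
mlir::DenseElementsAttr attr;
if (epsilon && mlir::matchPattern(epsilon, mlir::m_Constant(&attr)) && attr.isSplat()) {
  float eps = attr.getSplatValue<float>(); // 1.0e-3
}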