Correctness properties of `fold` methods

I’m trying to figure out what the correctness expectations are for `*::fold` methods with respect to the types of the attributes they produce. As a motivating example, I have the following function, which uses a TOSA dialect operation:

func.func @main(%arg0: tensor<1000x12544xf32>) -> tensor<1000x12544xf32> {
  %cst_0 = arith.constant dense<0.000000e+00> : tensor<1x1xf32>
  %5 = "tosa.mul"(%arg0, %cst_0) {shift = 0 : i32} : (tensor<1000x12544xf32>, tensor<1x1xf32>) -> tensor<1000x12544xf32>
  return %5 : tensor<1000x12544xf32>
}

where the tosa.mul is being folded away to

func.func @main(%arg0: tensor<1000x12544xf32>) -> tensor<1000x12544xf32> {
  %0 = "tosa.const"() <{value = dense<0.000000e+00> : tensor<1x1xf32>}> : () -> tensor<1000x12544xf32>
  return %0 : tensor<1000x12544xf32>
}

The result type of the `tosa.const` seems correct (it matches the original op). But the dense constant held by the operation has a different type: the fold simply reuses the original attribute without broadcasting it to the result shape. Is it legal for the attributes coming out of a `::fold` method to not match the expected type of the op?

It seems like an additional burden on downstream folders to determine the ‘correct’ type of the input attributes coming from prior folds, especially in dialects like TOSA with broadcasting behavior.

I think you’re right here; this looks like a bug in the fold. The broadcast that happened in the original `tosa.mul` has been lost. In some places, we’ve suggested using `tosa.add` and `tosa.mul` as ways to implement a broadcast, since there isn’t an explicit broadcast op. It wouldn’t be very helpful if those immediately got folded away.
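For comparison, a fold result that keeps the broadcast might look roughly like the following. This is a hand-written sketch, not the output of any existing pass. It assumes the folder builds a new splat attribute whose type matches the op's result type instead of reusing the `1x1` attribute as-is.

```mlir
func.func @main(%arg0: tensor<1000x12544xf32>) -> tensor<1000x12544xf32> {
  // The splat attribute's type now agrees with the op's result type.
  %0 = "tosa.const"() <{value = dense<0.000000e+00> : tensor<1000x12544xf32>}> : () -> tensor<1000x12544xf32>
  return %0 : tensor<1000x12544xf32>
}
```

With a result like this, downstream folders could rely on the attribute type alone and would not need to re-derive the broadcast shape.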