[emitC][bufferization] convert arith.constant to bufferization.to_memref to emitC

I’m trying to lower this code to emitC:

module {
  func.func @forward_pass(%arg0: memref<4xi8>, %arg1: memref<4xi8>) {
    %cst = arith.constant dense<[1, 0, 0, 0]> : tensor<4xi8>
    %0 = bufferization.to_memref %cst : tensor<4xi8> to memref<4xi8>
    linalg.add ins(%arg0, %0 : memref<4xi8>, memref<4xi8>) outs(%arg1 : memref<4xi8>)
    return
  }
}

using this pipeline:

  one-shot-bufferize,
  convert-linalg-to-loops,
  canonicalize,
  convert-scf-to-cf,
  convert-memref-to-emitc,
  convert-func-to-emitc,
  convert-arith-to-emitc

However, the OneShotBufferize pass converts the arith.constant into a memref.global, which makes the convert-memref-to-emitc pass fail:

linalg.mlir:4:10: error: failed to legalize operation 'memref.global' that was explicitly marked illegal
    %0 = "arith.constant"() <{value = dense<[1, 0, 0, 0]> : tensor<4xi8>}> : () -> tensor<4xi8>
         ^
linalg.mlir:4:10: note: see current operation: "memref.global"() <{alignment = 64 : i64, constant, initial_value = dense<[1, 0, 0, 0]> : tensor<4xi8>, sym_name = "__constant_4xi8", sym_visibility = "private", type = memref<4xi8>}> : () -> ()

Am I missing a pass here?

There is a pattern in -convert-memref-to-emitc that converts memref.global to emitc.global. I suspect that the pattern does not match because of this:

    if (op.getAlignment().value_or(1) > 1) {
      // TODO: Extend GlobalOp to specify alignment via the `alignas` specifier.
      return rewriter.notifyMatchFailure(
          op.getLoc(), "global variable with alignment requirement is "
                       "currently not supported");
    }

You could try running -one-shot-bufferize="buffer-alignment=1".
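
For reference, a sketch of how that option might look if the pipeline from the original post is passed to mlir-opt via --pass-pipeline. The builtin.module wrapper and the line breaks are assumptions made for readability; the pass order is copied from the post, and pass options go in braces in this syntax:

    builtin.module(
      one-shot-bufferize{buffer-alignment=1},
      convert-linalg-to-loops,
      canonicalize,
      convert-scf-to-cf,
      convert-memref-to-emitc,
      convert-func-to-emitc,
      convert-arith-to-emitc
    )

With an alignment of 1, op.getAlignment().value_or(1) > 1 evaluates to false, so the memref.global pattern should no longer bail out at that check.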


That helped, yes, but then it fails on the memref.alloc. It seems that emitC has no lowering for it:

// -----// IR Dump Before ConvertMemRefToEmitC (convert-memref-to-emitc) //----- //
module {
  memref.global "private" constant @__constant_4xi8 : memref<4xi8> = dense<[1, 0, 0, 0]> {alignment = 1 : i64}
  func.func @forward_pass(%arg0: memref<4xi8>, %arg1: memref<4xi8>) {
    %c1 = arith.constant 1 : index
    %c4 = arith.constant 4 : index
    %c0 = arith.constant 0 : index
    %0 = memref.get_global @__constant_4xi8 : memref<4xi8>
    %alloc = memref.alloc() {alignment = 1 : i64} : memref<4xi8>
    memref.copy %0, %alloc : memref<4xi8> to memref<4xi8>
    emitc.for %arg2 = %c0 to %c4 step %c1 {
      %1 = memref.load %arg0[%arg2] : memref<4xi8>
      %2 = memref.load %alloc[%arg2] : memref<4xi8>
      %3 = arith.addi %1, %2 : i8
      memref.store %3, %arg1[%arg2] : memref<4xi8>
    }
    return
  }
}


linalg.mlir:5:18: error: failed to legalize operation 'memref.alloc' that was explicitly marked illegal
    %alloc_cst = "bufferization.to_memref"(%0) : (tensor<4xi8>) -> memref<4xi8>
                 ^
linalg.mlir:5:18: note: see current operation: %5 = "memref.alloc"() <{alignment = 1 : i64, operandSegmentSizes = array<i32: 0, 0>}> : () -> memref<4xi8>

Is there a way to force the OneShotBufferize pass to use memref.alloca instead of memref.alloc?

You can try adding the PromoteBuffersToStack pass to your pipeline: promote-buffers-to-stack{max-alloc-size-in-bytes=1024 max-rank-of-allocated-memref=1}
See this prototype: llvm-project/mlir/test/Conversion/TosaToEmitC/tosa-to-emitc.mlir at 72b8086f31ccfc1a1719ff658c94c5e5c08abfa2 · simon-camp/llvm-project · GitHub
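
A rough sketch of where the pass could go in the pipeline from the original post. The placement right after bufferization and the func.func nesting are assumptions (the nesting is only needed if the pass is anchored on functions in your LLVM version); the option values are the ones suggested above:

    builtin.module(
      one-shot-bufferize{buffer-alignment=1},
      func.func(promote-buffers-to-stack{max-alloc-size-in-bytes=1024 max-rank-of-allocated-memref=1}),
      convert-linalg-to-loops,
      canonicalize,
      convert-scf-to-cf,
      convert-memref-to-emitc,
      convert-func-to-emitc,
      convert-arith-to-emitc
    )

The intent is that small, statically shaped memref.alloc ops become memref.alloca before convert-memref-to-emitc runs.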


That worked, thanks! The example is very useful. Is it in the process of being upstreamed?

Are any additions to the emitC dialect needed for the TosaToEmitC example?

Fixing the memref.alloc issue leads to the next one:

// -----// IR Dump Before ConvertMemRefToEmitC (convert-memref-to-emitc) //----- //
module {
  memref.global "private" constant @__constant_4xi8 : memref<4xi8> = dense<[1, 0, 0, 0]>
  func.func @forward_pass(%arg0: memref<4xi8>, %arg1: memref<4xi8>) {
    %c1 = arith.constant 1 : index
    %c4 = arith.constant 4 : index
    %c0 = arith.constant 0 : index
    %0 = memref.get_global @__constant_4xi8 : memref<4xi8>
    %alloca = memref.alloca() : memref<4xi8>
    memref.copy %0, %alloca : memref<4xi8> to memref<4xi8>
    emitc.for %arg2 = %c0 to %c4 step %c1 {
      %1 = memref.load %arg0[%arg2] : memref<4xi8>
      %2 = memref.load %alloca[%arg2] : memref<4xi8>
      %3 = arith.addi %1, %2 : i8
      memref.store %3, %arg1[%arg2] : memref<4xi8>
    }
    return
  }
}


linalg.mlir:5:18: error: failed to legalize operation 'memref.copy' that was explicitly marked illegal
    %alloc_cst = "bufferization.to_memref"(%0) : (tensor<4xi8>) -> memref<4xi8>
                 ^
linalg.mlir:5:18: note: see current operation: "memref.copy"(%4, %6) : (memref<4xi8>, memref<4xi8>) -> ()

Skipping the bufferization.to_memref by defining the constant directly as a memref works better:

memref.global "private" constant @cst : memref<4xi8> = dense<[1, 0, 0, 0]>
func.func @forward_pass(%arg0: memref<4xi8>, %arg1: memref<4xi8>) {
  %0 = memref.get_global @cst : memref<4xi8>
  linalg.add ins(%arg0, %0 : memref<4xi8>, memref<4xi8>) outs(%arg1 : memref<4xi8>)
  return
}

There is the following PR for a working TOSA to EmitC test: [mlir][EmitC] Add pass that combines all available emitc conversions by simon-camp · Pull Request #117549 · llvm/llvm-project · GitHub

More complex examples will again contain copy ops though.

IIRC, the functions that generate copies between memrefs (and also new allocations) can be configured in One-Shot Bufferize. If the copies were emitted as linalg.copy ops, they could be lowered to loops and then to emitC.
I don’t think these hooks are currently exposed as pass options on the command line, though.
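
To illustrate the idea, here is a hand-written sketch (not tool output) of what the memref.copy from the dump above would amount to once it is expressed as a loop. The value names are taken from that dump, the loop body is an assumption about the eventual lowering, and %c0, %c4, and %c1 are the index constants defined earlier in the function:

    %0 = memref.get_global @__constant_4xi8 : memref<4xi8>
    %alloca = memref.alloca() : memref<4xi8>
    // Element-wise copy that replaces: memref.copy %0, %alloca
    scf.for %i = %c0 to %c4 step %c1 {
      %v = memref.load %0[%i] : memref<4xi8>
      memref.store %v, %alloca[%i] : memref<4xi8>
    }

In this form only loads, stores, and a loop remain, which are the same kinds of ops that already appear in the linalg.add loop in the dumps above.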