Doodah
January 6, 2025, 4:09pm
1
I’m trying to lower this code to EmitC:
module {
  func.func @forward_pass(%arg0: memref<4xi8>, %arg1: memref<4xi8>) {
    %cst = arith.constant dense<[1, 0, 0, 0]> : tensor<4xi8>
    %0 = bufferization.to_memref %cst : tensor<4xi8> to memref<4xi8>
    linalg.add ins(%arg0, %0 : memref<4xi8>, memref<4xi8>) outs(%arg1 : memref<4xi8>)
    return
  }
}
using this pipeline:
one-shot-bufferize,
convert-linalg-to-loops,
canonicalize,
convert-scf-to-cf,
convert-memref-to-emitc,
convert-func-to-emitc,
convert-arith-to-emitc
However, the OneShotBufferize pass converts the arith.constant into a memref.global, which makes the convert-memref-to-emitc pass fail:
linalg.mlir:4:10: error: failed to legalize operation 'memref.global' that was explicitly marked illegal
%0 = "arith.constant"() <{value = dense<[1, 0, 0, 0]> : tensor<4xi8>}> : () -> tensor<4xi8>
^
linalg.mlir:4:10: note: see current operation: "memref.global"() <{alignment = 64 : i64, constant, initial_value = dense<[1, 0, 0, 0]> : tensor<4xi8>, sym_name = "__constant_4xi8", sym_visibility = "private", type = memref<4xi8>}> : () -> ()
Am I missing a pass here?
There is a pattern in -convert-memref-to-emitc that converts memref.global to emitc.global. I suspect that the pattern does not match because of this:
if (op.getAlignment().value_or(1) > 1) {
  // TODO: Extend GlobalOp to specify alignment via the `alignas` specifier.
  return rewriter.notifyMatchFailure(
      op.getLoc(), "global variable with alignment requirement is "
                   "currently not supported");
}
You could try running -one-shot-bufferize="buffer-alignment=1".
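For context on what the TODO refers to: an `alignment = 64` attribute on a global would correspond, on the C side, to an alignment specifier on the emitted array. Below is a hand-written C11 sketch of what such a declaration could look like. It is not output from the EmitC emitter, and the names `constant_4xi8` (an illustrative stand-in for `@__constant_4xi8`) and `constant_is_aligned` are hypothetical.

```c
#include <stdalign.h>
#include <stdint.h>

/* Hypothetical C counterpart of a 64-byte-aligned constant global.
   With an alignment of 1, no alignas specifier would be needed. */
static alignas(64) const int8_t constant_4xi8[4] = {1, 0, 0, 0};

/* Returns nonzero if the array starts on a 64-byte boundary. */
int constant_is_aligned(void) {
  return ((uintptr_t)constant_4xi8 % 64) == 0;
}
```

Since the pattern currently has no way to express the specifier, dropping the alignment requirement at bufferization time is the simpler route.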
Doodah
January 7, 2025, 1:14pm
3
That helped, yes, but now it fails on memref.alloc. It seems there is no lowering to EmitC for that op:
// -----// IR Dump Before ConvertMemRefToEmitC (convert-memref-to-emitc) //----- //
module {
  memref.global "private" constant @__constant_4xi8 : memref<4xi8> = dense<[1, 0, 0, 0]> {alignment = 1 : i64}
  func.func @forward_pass(%arg0: memref<4xi8>, %arg1: memref<4xi8>) {
    %c1 = arith.constant 1 : index
    %c4 = arith.constant 4 : index
    %c0 = arith.constant 0 : index
    %0 = memref.get_global @__constant_4xi8 : memref<4xi8>
    %alloc = memref.alloc() {alignment = 1 : i64} : memref<4xi8>
    memref.copy %0, %alloc : memref<4xi8> to memref<4xi8>
    emitc.for %arg2 = %c0 to %c4 step %c1 {
      %1 = memref.load %arg0[%arg2] : memref<4xi8>
      %2 = memref.load %alloc[%arg2] : memref<4xi8>
      %3 = arith.addi %1, %2 : i8
      memref.store %3, %arg1[%arg2] : memref<4xi8>
    }
    return
  }
}
linalg.mlir:5:18: error: failed to legalize operation 'memref.alloc' that was explicitly marked illegal
%alloc_cst = "bufferization.to_memref"(%0) : (tensor<4xi8>) -> memref<4xi8>
^
linalg.mlir:5:18: note: see current operation: %5 = "memref.alloc"() <{alignment = 1 : i64, operandSegmentSizes = array<i32: 0, 0>}> : () -> memref<4xi8>
Is there a way to force the OneShotBufferize pass to use memref.alloca instead of memref.alloc?
You can try adding the PromoteBuffersToStack pass to your pipeline: promote-buffers-to-stack{max-alloc-size-in-bytes=1024 max-rank-of-allocated-memref=1}
See this prototype: llvm-project/mlir/test/Conversion/TosaToEmitC/tosa-to-emitc.mlir at 72b8086f31ccfc1a1719ff658c94c5e5c08abfa2 · simon-camp/llvm-project · GitHub
Doodah
January 8, 2025, 11:44am
5
That worked, thanks! The example is very useful. Is it in the process of being upstreamed?
Are any additions to the EmitC dialect needed for the TosaToEmitC example?
Fixing the memref.alloc issue leads to the next one:
// -----// IR Dump Before ConvertMemRefToEmitC (convert-memref-to-emitc) //----- //
module {
  memref.global "private" constant @__constant_4xi8 : memref<4xi8> = dense<[1, 0, 0, 0]>
  func.func @forward_pass(%arg0: memref<4xi8>, %arg1: memref<4xi8>) {
    %c1 = arith.constant 1 : index
    %c4 = arith.constant 4 : index
    %c0 = arith.constant 0 : index
    %0 = memref.get_global @__constant_4xi8 : memref<4xi8>
    %alloca = memref.alloca() : memref<4xi8>
    memref.copy %0, %alloca : memref<4xi8> to memref<4xi8>
    emitc.for %arg2 = %c0 to %c4 step %c1 {
      %1 = memref.load %arg0[%arg2] : memref<4xi8>
      %2 = memref.load %alloca[%arg2] : memref<4xi8>
      %3 = arith.addi %1, %2 : i8
      memref.store %3, %arg1[%arg2] : memref<4xi8>
    }
    return
  }
}
linalg.mlir:5:18: error: failed to legalize operation 'memref.copy' that was explicitly marked illegal
%alloc_cst = "bufferization.to_memref"(%0) : (tensor<4xi8>) -> memref<4xi8>
^
linalg.mlir:5:18: note: see current operation: "memref.copy"(%4, %6) : (memref<4xi8>, memref<4xi8>) -> ()
Doodah
January 8, 2025, 7:43pm
6
Skipping the bufferization.to_memref by defining the constant directly as a memref works better:
memref.global "private" constant @cst : memref<4xi8> = dense<[1, 0, 0, 0]>
func.func @forward_pass(%arg0: memref<4xi8>, %arg1: memref<4xi8>) {
  %0 = memref.get_global @cst : memref<4xi8>
  linalg.add ins(%arg0, %0 : memref<4xi8>, memref<4xi8>) outs(%arg1 : memref<4xi8>)
  return
}
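For orientation, here is roughly the C that this memref-only version is meant to end up as. This is a hand-written sketch, not actual output of the EmitC emitter: the parameter types, the loop shape, and the array name `cst` (mirroring `@cst`) are assumptions.

```c
#include <stddef.h>
#include <stdint.h>

/* Stands in for the private constant global @cst. */
static const int8_t cst[4] = {1, 0, 0, 0};

/* Element-wise add of arg0 and the constant into arg1, i.e. what
   linalg.add amounts to once it is lowered to a loop over four elements. */
void forward_pass(const int8_t arg0[4], int8_t arg1[4]) {
  for (size_t i = 0; i < 4; ++i) {
    /* The operands are promoted to int for the addition; the cast narrows
       the sum back to 8 bits. */
    arg1[i] = (int8_t)(arg0[i] + cst[i]);
  }
}
```

Because the constant is read straight from the global, no temporary buffer and no copy are involved, which is why this variant avoids both memref.alloc and memref.copy.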
There is the following PR for a working TOSA to EmitC test: [mlir][EmitC] Add pass that combines all available emitc conversions by simon-camp · Pull Request #117549 · llvm/llvm-project · GitHub
More complex examples will again contain copy ops though.
IIRC, the function that generates copies between memrefs (and also the one for new allocations) can be configured in One-Shot Bufferize. Linalg copy ops could then be lowered to loops and from there to EmitC.
Though I don’t think these are exposed as pass options on the command line currently.
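To illustrate the idea of lowering copies to loops: if the copy into the temporary buffer is expressed as a plain loop instead of memref.copy, the C side only needs a local array and two loops. Below is a hand-written sketch under that assumption. The names `forward_pass_with_copy`, `constant_4xi8`, and `tmp` are hypothetical, and this is not emitter output.

```c
#include <stddef.h>
#include <stdint.h>

/* Stands in for the private constant global @__constant_4xi8. */
static const int8_t constant_4xi8[4] = {1, 0, 0, 0};

void forward_pass_with_copy(const int8_t arg0[4], int8_t arg1[4]) {
  /* Local array playing the role of the memref.alloca buffer. */
  int8_t tmp[4];

  /* Copy expressed as an explicit loop instead of memref.copy. */
  for (size_t i = 0; i < 4; ++i) {
    tmp[i] = constant_4xi8[i];
  }

  /* Element-wise add, reading the second operand from the temporary. */
  for (size_t i = 0; i < 4; ++i) {
    arg1[i] = (int8_t)(arg0[i] + tmp[i]);
  }
}
```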