I’m trying to generate platform-specific code that can be lowered to EmitC (targeting accelerators or any bus peripheral at a specific address, with no dynamic allocation).
The C code I want to end up with looks like this:
int8_t my_data[] = {1, 2, 3, 4};
int32_t my_accelerator_address = 0x1234;
memcpy((void*)my_accelerator_address, my_data, 4);
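For reference, here is a self-contained sketch of that target C. It rests on a few assumptions that are not in my snippet above: the address is held as a `uintptr_t` rather than `int32_t` so the pointer cast is well-defined, and the names `ACCEL_BASE`, `fake_accelerator`, and `write_to_accelerator` are hypothetical. On the real target `ACCEL_BASE` would be the fixed bus address; a host-side buffer stands in here only so the sketch can be compiled and run on a PC.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical address macro. On the real target this would be the fixed
   bus address, e.g. ((uintptr_t)0x1234). When it is not provided, a
   host-side buffer stands in so the sketch can be run and inspected. */
#ifndef ACCEL_BASE
static uint8_t fake_accelerator[16];
#define ACCEL_BASE ((uintptr_t)fake_accelerator)
#endif

/* Constant payload, statically allocated (no dynamic allocation). */
static const int8_t my_data[] = {1, 2, 3, 4};

/* Copy the payload to the accelerator window at ACCEL_BASE. */
void write_to_accelerator(void) {
    memcpy((void *)ACCEL_BASE, my_data, sizeof my_data);
}
```

Calling `write_to_accelerator()` once copies the four bytes to whatever `ACCEL_BASE` points at. This is roughly the shape of output I would like the EmitC lowering to produce.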
I came up with the following MLIR:
%5 = arith.constant dense<[1, 2, 3, 4]> : memref<4xi8>
%6 = arith.constant 1048576 : i32
%7 = arith.constant 4 : i32
func.func private @memcpy(%8 : i32, %9 : memref<?xi8>, %10 : i32)
func.call @memcpy(%6, %5, %7) : (i32, memref<4xi8>, i32) -> ()
But this gives the following error, and I’m not sure this is the best approach in general:
experiments/xdsl_exploration/emitc_memcpy.mlir:8:3: error: 'func.call' op operand type mismatch: expected operand type 'memref<?xi8>', but provided 'memref<4xi8>' for operand number 1
func.call @memcpy(%6, %5, %7) : (i32, memref<4xi8>, i32) -> ()
^
experiments/xdsl_exploration/emitc_memcpy.mlir:8:3: note: see current operation: "func.call"(%1, %0, %2) <{callee = @memcpy}> : (i32, memref<4xi8>, i32) -> ()
I also tried casting %5 to memref<?xi8>, but there is no lowering of that cast to EmitC.
Would it be better to define a memory space for the accelerator? If so, how do I specify the concrete address in the conversion to EmitC?