How to write platform-specific code in MLIR EmitC

I’m trying to generate platform-specific code that can be lowered to EmitC, targeting accelerators or other bus peripherals at fixed addresses, with no dynamic allocation.
The generated C should look like this:

int8_t my_data[] = {1, 2, 3, 4};
int32_t my_accelerator_address = 0x1234;

memcpy((void*)my_accelerator_address, my_data, 4);

I came up with the following code:

  %5 = arith.constant dense<[1, 2, 3, 4]> : memref<4xi8>
  %6 = arith.constant 1048576 : i32
  %7 = arith.constant 4 : i32
  func.func private @memcpy(%8 : i32, %9 : memref<?xi8>, %10 : i32)
  func.call @memcpy(%6, %5, %7) : (i32, memref<4xi8>, i32) -> ()

but this gives the error below, and I’m not sure this is the best approach in general:

experiments/xdsl_exploration/emitc_memcpy.mlir:8:3: error: 'func.call' op operand type mismatch: expected operand type 'memref<?xi8>', but provided 'memref<4xi8>' for operand number 1
  func.call @memcpy(%6, %5, %7) : (i32, memref<4xi8>, i32) -> ()
  ^
experiments/xdsl_exploration/emitc_memcpy.mlir:8:3: note: see current operation: "func.call"(%1, %0, %2) <{callee = @memcpy}> : (i32, memref<4xi8>, i32) -> ()

I also tried to cast the type of %5 to memref<?xi8>, but there is no lowering of that cast to EmitC.
Would it be better to define a memory space for the accelerator? But then how do I specify the concrete address in the conversion to EmitC?

You can use an emitc.call_opaque op to generate a function call without adding a declaration (example on Compiler Explorer).

I don’t know whether an arith.constant with memref type is well defined, however. At least it cannot be lowered to EmitC. On top of that, I don’t see a way to initialize a local array variable from a dense attribute in EmitC currently; that should be easily fixable, though.
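
To make the target concrete, here is a minimal C sketch of the kind of output the question is after, written so that it also compiles and runs on a host. It is not what EmitC emits today; it only illustrates one possible shape. The helper names copy_to_device and send_my_data are hypothetical, the payload is placed at file scope as an assumption (to sidestep the missing local array initializer), and uintptr_t replaces the int32_t address so the integer-to-pointer cast stays valid with 64-bit pointers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Constant payload at file scope (assumption: avoids needing a local
 * array initializer in the generated code). */
const int8_t my_data[4] = {1, 2, 3, 4};

/* Hypothetical helper: copy n bytes to a device at a fixed bus address.
 * The address is passed as an integer and cast to a pointer here. */
void copy_to_device(uintptr_t device_address, const void *src, size_t n) {
    assert(src != NULL);
    memcpy((void *)device_address, src, n);
}

/* Hypothetical wrapper: send the constant payload to the given address. */
void send_my_data(uintptr_t device_address) {
    copy_to_device(device_address, my_data, sizeof my_data);
}
```

On the target this would be called as send_my_data(0x1234); on a host, passing the address of an ordinary buffer cast to uintptr_t exercises the same code path without touching real hardware.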

I see. I was hoping I could insert the accelerator-specific code at a higher level than EmitC and lower it to EmitC afterwards.
An example I’m currently thinking of is a matmul accelerator. I would start with something like this:

  memref.global "private" constant @const_matrix : memref<2x2xi32> = dense<[[1, 2], [3, 4]]>

  func.func @test(%input : memref<2x2xi32>, %result: memref<2x2xi32> )  {
    %const_memref = memref.get_global @const_matrix : memref<2x2xi32>
    linalg.matmul
      ins(%const_memref, %input : memref<2x2xi32>, memref<2x2xi32>)
      outs(%result : memref<2x2xi32>)

    return
  }

And convert that to something like this:

  memref.global "private" constant @const_matrix : memref<2x2xi32> = dense<[[1, 2], [3, 4]]>

  func.func private @memcpy(%8 : i32, %9 : memref<?x?xi32>, %10 : i32)

  func.func @test(%input : memref<2x2xi32>, %result: memref<2x2xi32> )  {
    %const_memref = memref.get_global @const_matrix : memref<2x2xi32>

    %my_accelerator_address = arith.constant 0x1234 : i32
    %size = arith.constant 4 : i32

    // copy the constant matrix to the accelerator
    func.call @memcpy(%my_accelerator_address, %const_memref, %size) : (i32, memref<2x2xi32>, i32) -> ()
    // copy the input matrix to the accelerator
    func.call @memcpy(%my_accelerator_address, %input, %size) : (i32, memref<2x2xi32>, i32) -> ()

    return
  }

Then I would lower that far enough to be able to emit C code.
But maybe it is easier to work directly with EmitC.
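
For comparison, the C that such a lowering would eventually need to produce might look roughly like the sketch below. Everything about the device is an assumption made for illustration: the operand offsets, the function names write_matrix and load_operands, and the use of word-wise volatile stores instead of memcpy. Two details differ from the MLIR above on purpose: a 2x2 matrix of int32_t is 4 elements but 16 bytes, so a byte count of 4 would copy only one element, and the two operands go to different offsets so the second copy does not overwrite the first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAT_ELEMS 4u /* 2x2 matrix, stored row-major */

/* Hypothetical device layout: operand A first, operand B directly after it. */
#define OPERAND_A_OFFSET 0u
#define OPERAND_B_OFFSET (MAT_ELEMS * sizeof(int32_t)) /* 16 bytes */

/* Counterpart of the memref.global constant, flattened row-major. */
const int32_t const_matrix[MAT_ELEMS] = {1, 2, 3, 4};

/* Write one matrix to the given address using volatile 32-bit stores,
 * so the compiler may not drop or merge the device writes. */
void write_matrix(uintptr_t address, const int32_t *src) {
    assert(src != NULL);
    volatile int32_t *dst = (volatile int32_t *)address;
    for (size_t i = 0; i < MAT_ELEMS; ++i) {
        dst[i] = src[i];
    }
}

/* Load both matmul operands into the (hypothetical) accelerator. */
void load_operands(uintptr_t accel_base, const int32_t *input) {
    write_matrix(accel_base + OPERAND_A_OFFSET, const_matrix);
    write_matrix(accel_base + OPERAND_B_OFFSET, input);
}
```

On the target, accel_base would be the fixed bus address (0x1234 in the example above); on a host, the address of an int32_t array of eight elements can stand in for the device when checking the layout.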