Reading data using external cpp file

Hi All,
I am trying to read data using an external C++ file. The C++ code works fine on its own; however, when I print the data from MLIR, the array contains only zeros.

External cpp file -

extern "C" int readData(std::vector<int> vect)
{
    std::string myText; 
    // Read from the text file
    std::ifstream MyReadFile("filename.txt");
    while (getline(MyReadFile, myText))
    {
        vect.push_back(stoi(myText));
    }
    MyReadFile.close();
    return 0;
}

MLIR code to read the file and print -

int generateMLIRForQ6(mlir::MLIRContext &context, mlir::OwningOpRef<mlir::ModuleOp> &module, Location loc){
    OpBuilder builder(&context);

    auto funcType = builder.getFunctionType(std::nullopt, std::nullopt);
    auto funcn = builder.create<func::FuncOp>(loc, "main", funcType);

    Type elementType = builder.getI64Type();
    auto memTp = MemRefType::get({10}, elementType);
    auto loadFuncnType = builder.getFunctionType(memTp, elementType);
    auto loadFunc = builder.create<func::FuncOp>(loc, "readData", loadFuncnType);
    loadFunc.setPrivate();

    // auto memTp2 = MemRefType::get({10, 3}, elementType);
    UnrankedMemRefType castMemrefType = UnrankedMemRefType::get(memTp.getElementType(), /*memorySpace=*/0);
    auto printMemRefFuncnType = builder.getFunctionType(castMemrefType, std::nullopt);
    auto printMemRefFunc = builder.create<func::FuncOp>(loc, "printMemrefI64", printMemRefFuncnType);
    printMemRefFunc.setPrivate();

    Block *entryBlock = funcn.addEntryBlock();
    // Region *funcBody = entryBlock->getParent();
    builder.setInsertionPointToEnd(entryBlock);

    
    Value mem = builder.create<memref::AllocOp>(loc, memTp);
    auto x = builder.create<func::CallOp>(loc, loadFunc, ValueRange{mem});

    
    Value castedMem = builder.create<memref::CastOp>(loc, castMemrefType, mem);

    auto printFuncCall = builder.create<func::CallOp>(loc, printMemRefFunc, castedMem);
    builder.create<func::ReturnOp>(loc);

    module->push_back(funcn);
    module->push_back(loadFunc);
    module->push_back(printMemRefFunc);
    return 0;
} 

Output it produced -

Unranked Memref base@ = 0x55b92baa0eb0 rank = 1 offset = 0 sizes = [10] strides = [1] data = 
[0,  0,  0,  0,  0,  0,  0,  0,  0,  0]

I cannot figure out where it is failing. Could you please help me find the issue?
Thanks in advance.

Regards,
Sudip

I may be missing something, but where does it read the file?
(It’d be easier if you printed the textual IR and provided some information on how you executed it.)

Thank you so much for the quick response.
This is the MLIR file generated -

module {
  func.func @main() {
    %alloc = memref.alloc() : memref<10xi64>
    %0 = call @readData(%alloc) : (memref<10xi64>) -> i64
    %cast = memref.cast %alloc : memref<10xi64> to memref<*xi64>
    call @printMemrefI64(%cast) : (memref<*xi64>) -> ()
    return
  }
  func.func private @readData(memref<10xi64>) -> i64
  func.func private @printMemrefI64(memref<*xi64>)
}

I am compiling it with the commands below -

./bin/casair-opt --convert-scf-to-cf --convert-cf-to-llvm --convert-arith-to-llvm --finalize-memref-to-llvm --convert-func-to-llvm='use-opaque-pointers=1' --convert-vector-to-llvm='use-opaque-pointers=1'  ../resources/sumOfTwoNumInMLIR.mlir -o  ../resources/sumOfTwoNum.llvm

./bin/casair-translate --mlir-to-llvmir --opaque-pointers ../resources/sumOfTwoNum.llvm -o ../resources/sumOfTwoNumLLVMIR.ll

../../llvm-project/build/bin/clang++ ../resources/sumOfTwoNumLLVMIR.ll ../../llvm-project/build/tools/mlir/lib/ExecutionEngine/CMakeFiles/mlir_c_runner_utils.dir/CRunnerUtils.cpp.o /home/sudip/Research/opensource/casair/llvm-project/build/tools/mlir/lib/ExecutionEngine/CMakeFiles/mlir_runner_utils.dir/RunnerUtils.cpp.o /home/sudip/Research/opensource/casair/casair/build/lib/CasaIR/CMakeFiles/obj.MLIRCasaIR.dir/ReadData.cpp.o -o ../resources/sumOfTwoNum.out

Please let me know if you need more information. Thanks.

I don’t know how you implemented the readData function, but if you look into the IR generated by your first step it should look like:

    %13 = llvm.call @readData(%6, %6, %0, %1, %2) : (!llvm.ptr, !llvm.ptr, i64, i64, i64) -> i64
    %14 = llvm.alloca %2 x !llvm.struct<(ptr, ptr, i64, array<1 x i64>, array<1 x i64>)> : (i64) -> !llvm.ptr
    llvm.store %12, %14 : !llvm.struct<(ptr, ptr, i64, array<1 x i64>, array<1 x i64>)>, !llvm.ptr
    llvm.call @printMemrefI64(%2, %14) : (i64, !llvm.ptr) -> ()
    llvm.return

You can see that the calling convention for the two calls is quite different; it may or may not be what you expect.
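To make the difference concrete: after lowering, a single memref<10xi64> argument arrives as five scalar arguments, so a C++ definition matching the llvm.call @readData above could look roughly like the sketch below. The parameter names are illustrative assumptions, not taken from the original code, and the return type is int64_t only because the MLIR declaration returns i64. The file name is the one used earlier in the thread.

```cpp
#include <cassert>
#include <cstdint>
#include <fstream>
#include <string>

// Sketch: one memref<10xi64> is expanded into five arguments
// (allocated pointer, aligned pointer, offset, size, stride).
// Parameter names are illustrative.
extern "C" int64_t readData(int64_t *allocated, int64_t *aligned,
                            int64_t offset, int64_t size, int64_t stride) {
    (void)allocated;  // only relevant when freeing the buffer
    assert(aligned != nullptr);
    std::ifstream in("filename.txt");
    if (!in.is_open())
        return -1;  // the file could not be opened
    std::string line;
    int64_t i = 0;
    // Never write more elements than the memref holds.
    while (i < size && std::getline(in, line)) {
        aligned[offset + i * stride] = std::stoll(line);
        ++i;
    }
    return i;  // number of values stored
}
```

Returning the number of values read (or -1) is a design choice of this sketch; it makes a missing file visible instead of silently leaving the buffer untouched.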

Thanks for the reply. I posted the MLIR code earlier. Here is the LLVM dialect code as well -

%13 = llvm.call @readData(%6, %6, %0, %1, %2) : (!llvm.ptr, !llvm.ptr, i64, i64, i64) -> i64
%14 = llvm.alloca %2 x !llvm.struct<(ptr, ptr, i64, array<1 x i64>, array<1 x i64>)> : (i64) -> !llvm.ptr
llvm.store %12, %14 : !llvm.struct<(ptr, ptr, i64, array<1 x i64>, array<1 x i64>)>, !llvm.ptr
llvm.call @printMemrefI64(%2, %14) : (i64, !llvm.ptr) -> ()
llvm.return

which is almost identical to your code.
This is the readData method in my cpp code -

extern "C" int readData(int64_t vect[10])
{
    std::string myText; 
    std::ifstream MyReadFile("filename.txt");
    int i = 0;
    // Stop after 10 values so the write never runs past the buffer.
    while (i < 10 && getline(MyReadFile, myText))
    {
        vect[i] = stoll(myText);
        i++;
    }
    MyReadFile.close();
    return 0;
}

Please let me know if you think the code has some issues.

We figured out the issue: it happened because we need to provide an absolute path when opening the file in the external C++ code. Thanks again for looking into it.
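A relative file name is resolved against the working directory of the running process, not against the directory of the source file, which is why an absolute path helped here. A small check along the following lines makes that kind of failure visible. This is a sketch: openOrReport is a hypothetical helper name, and it assumes a C++17 compiler for std::filesystem.

```cpp
#include <cassert>
#include <filesystem>
#include <fstream>
#include <iostream>
#include <string>

// Hypothetical helper: try to open `name`; on failure, print the absolute
// path that the name resolved to and the current working directory.
bool openOrReport(const std::string &name, std::ifstream &in) {
    assert(!name.empty());
    in.open(name);
    if (in.is_open())
        return true;
    std::cerr << "could not open " << std::filesystem::absolute(name)
              << " (working directory: " << std::filesystem::current_path()
              << ")\n";
    return false;
}
```

Without such a check, a std::ifstream that failed to open simply yields no lines, so the reading loop never runs and the buffer keeps whatever values it had.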

Hi,

I am trying to pass two memrefs as arguments in a function call. When printing the results, the first memref gives the proper values; however, the second one returns 0. When I pass them individually, both work fine.

Here is the MLIR -

module {
  func.func @main() {
    %alloc = memref.alloc() : memref<10xi64>
    %alloc_0 = memref.alloc() : memref<10xi64>
    %alloc_1 = memref.alloc() : memref<10xf64>
    %alloc_2 = memref.alloc() : memref<10xf64>
    %0 = call @readFile(%alloc_0, %alloc) : (memref<10xi64>, memref<10xi64>) -> i64
    %c0 = arith.constant 0 : index
    %c10 = arith.constant 10 : index
    %c1 = arith.constant 1 : index
    scf.for %arg0 = %c0 to %c10 step %c1 {
      %1 = memref.load %alloc_0[%arg0] : memref<10xi64>
      vector.print %1 : i64
      %2 = memref.load %alloc[%arg0] : memref<10xi64>
      vector.print %2 : i64
    }
    return
  }
  func.func private @readFile(memref<10xi64>, memref<10xi64>) -> i64
}

Here is the generated llvm -

module attributes {llvm.data_layout = ""} {
  llvm.func @printNewline()
  llvm.func @printI64(i64)
  llvm.func @malloc(i64) -> !llvm.ptr
  llvm.func @main() {
    %0 = llvm.mlir.constant(0 : index) : i64
    %1 = llvm.mlir.constant(10 : index) : i64
    %2 = llvm.mlir.constant(1 : index) : i64
    %3 = llvm.mlir.null : !llvm.ptr
    %4 = llvm.getelementptr %3[10] : (!llvm.ptr) -> !llvm.ptr, i64
    %5 = llvm.ptrtoint %4 : !llvm.ptr to i64
    %6 = llvm.call @malloc(%5) : (i64) -> !llvm.ptr
    %7 = llvm.mlir.null : !llvm.ptr
    %8 = llvm.getelementptr %7[10] : (!llvm.ptr) -> !llvm.ptr, i64
    %9 = llvm.ptrtoint %8 : !llvm.ptr to i64
    %10 = llvm.call @malloc(%9) : (i64) -> !llvm.ptr
    %11 = llvm.mlir.null : !llvm.ptr
    %12 = llvm.getelementptr %11[10] : (!llvm.ptr) -> !llvm.ptr, f64
    %13 = llvm.ptrtoint %12 : !llvm.ptr to i64
    %14 = llvm.call @malloc(%13) : (i64) -> !llvm.ptr
    %15 = llvm.mlir.null : !llvm.ptr
    %16 = llvm.getelementptr %15[10] : (!llvm.ptr) -> !llvm.ptr, f64
    %17 = llvm.ptrtoint %16 : !llvm.ptr to i64
    %18 = llvm.call @malloc(%17) : (i64) -> !llvm.ptr
    %19 = llvm.call @readFile(%10, %10, %0, %1, %2, %6, %6, %0, %1, %2) : (!llvm.ptr, !llvm.ptr, i64, i64, i64, !llvm.ptr, !llvm.ptr, i64, i64, i64) -> i64
    llvm.br ^bb1(%0 : i64)
  ^bb1(%20: i64):  // 2 preds: ^bb0, ^bb2
    %21 = llvm.icmp "slt" %20, %1 : i64
    llvm.cond_br %21, ^bb2, ^bb3
  ^bb2:  // pred: ^bb1
    %22 = llvm.getelementptr %10[%20] : (!llvm.ptr, i64) -> !llvm.ptr, i64
    %23 = llvm.load %22 : !llvm.ptr -> i64
    llvm.call @printI64(%23) : (i64) -> ()
    llvm.call @printNewline() : () -> ()
    %24 = llvm.getelementptr %6[%20] : (!llvm.ptr, i64) -> !llvm.ptr, i64
    %25 = llvm.load %24 : !llvm.ptr -> i64
    llvm.call @printI64(%25) : (i64) -> ()
    llvm.call @printNewline() : () -> ()
    %26 = llvm.add %20, %2  : i64
    llvm.br ^bb1(%26 : i64)
  ^bb3:  // pred: ^bb1
    llvm.return
  }
  llvm.func @readFile(!llvm.ptr, !llvm.ptr, i64, i64, i64, !llvm.ptr, !llvm.ptr, i64, i64, i64) -> i64 attributes {sym_visibility = "private"}
}

I am not sure where it is going wrong.

Again, you’re only looking at part of the story, and the wall of IR isn’t very useful. The focus should be on the interaction between MLIR and your C++; your previous example worked “by luck”. The LLVM call was:

%13 = llvm.call @readData(%6, %6, %0, %1, %2) : (!llvm.ptr, !llvm.ptr, i64, i64, i64) -> i64

But your C++ was:

extern "C" int readData(int64_t vect[10])

It “worked” because the first argument is the memref pointer, so you could get away with ignoring the remaining arguments (aligned pointer, offset, size, and stride) in this case. However, when passing two memrefs, the call is now:

    %19 = llvm.call @readFile(%10, %10, %0, %1, %2, %6, %6, %0, %1, %2) : (!llvm.ptr, !llvm.ptr, i64, i64, i64, !llvm.ptr, !llvm.ptr, i64, i64, i64) -> i64

Are you set up on the C++ side to handle this list of arguments? See also: LLVM IR Target - MLIR
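For illustration, a C++ declaration that lines up with the ten-argument call could be sketched as follows. The parameter names are assumptions, and the body writes placeholder values instead of reading a file so that the example stays self-contained.

```cpp
#include <cassert>
#include <cstdint>

// Sketch: each memref<10xi64> contributes five arguments (allocated pointer,
// aligned pointer, offset, size, stride), so two memrefs give ten.
extern "C" int64_t readFile(int64_t *allocA, int64_t *alignedA,
                            int64_t offsetA, int64_t sizeA, int64_t strideA,
                            int64_t *allocB, int64_t *alignedB,
                            int64_t offsetB, int64_t sizeB, int64_t strideB) {
    (void)allocA;
    (void)allocB;
    assert(alignedA != nullptr && alignedB != nullptr);
    for (int64_t i = 0; i < sizeA; ++i)
        alignedA[offsetA + i * strideA] = i;        // placeholder data
    for (int64_t i = 0; i < sizeB; ++i)
        alignedB[offsetB + i * strideB] = 100 + i;  // placeholder data
    return 0;
}
```

With a one-pointer C++ signature, only the first of these ten arguments is read, so the second buffer (the sixth and seventh arguments) is never reached.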

Thank you very much for pointing out the issue.
I fixed the issue by modifying the parameter list of the readFile method in the C++ file. I have one question about the lowering from MLIR to LLVM. I am passing only two arguments, %0 = call @readFile(%alloc_0, %alloc) : (memref<10xi64>, memref<10xi64>) -> i64; however, the LLVM call has 10 arguments. For alloc_0 there is the memref pointer twice plus the three integers that I declared for the 'for' loop in MLIR, and the same is repeated for alloc. I don’t understand why these extra arguments were inserted in the LLVM code.

What do you expect exactly? Can you spell out what the call should look like?
(I assume you studied the link I posted above: LLVM IR Target - MLIR?)

My expectation from MLIR to LLVM was something like the below -
MLIR -

%0 = call @readFile(%alloc_0, %alloc) : (memref<10xi64>, memref<10xi64>) -> i64

and LLVM -

%4 = llvm.getelementptr %3[10] : (!llvm.ptr) -> !llvm.ptr, i64
%5 = llvm.ptrtoint %4 : !llvm.ptr to i64
%6 = llvm.call @malloc(%5) : (i64) -> !llvm.ptr
%7 = llvm.mlir.null : !llvm.ptr
%8 = llvm.getelementptr %7[10] : (!llvm.ptr) -> !llvm.ptr, i64
%9 = llvm.ptrtoint %8 : !llvm.ptr to i64
%19 = llvm.call @readFile(%10, %6) : (!llvm.ptr, !llvm.ptr) -> i64

(I am going through the documentation you have mentioned above LLVM IR Target - MLIR)

Let me copy/paste the example from the doc here:

func.func @foo(%arg0: memref<?xf32>) -> () {
  "use"(%arg0) : (memref<?xf32>) -> ()
  return
}

Gets converted to the following (using type alias for brevity):

!llvm.memref_1d = !llvm.struct<(ptr<f32>, ptr<f32>, i64, array<1xi64>, array<1xi64>)>

llvm.func @foo(%arg0: !llvm.ptr<f32>,  // Allocated pointer.
               %arg1: !llvm.ptr<f32>,  // Aligned pointer.
               %arg2: i64,             // Offset.
               %arg3: i64,             // Size in dim 0.
               %arg4: i64) {           // Stride in dim 0.
  // Populate memref descriptor structure.
  %0 = llvm.mlir.undef : !llvm.memref_1d
  %1 = llvm.insertvalue %arg0, %0[0] : !llvm.memref_1d
  %2 = llvm.insertvalue %arg1, %1[1] : !llvm.memref_1d
  %3 = llvm.insertvalue %arg2, %2[2] : !llvm.memref_1d
  %4 = llvm.insertvalue %arg3, %3[3, 0] : !llvm.memref_1d
  %5 = llvm.insertvalue %arg4, %4[4, 0] : !llvm.memref_1d

  // Descriptor is now usable as a single value.
  "use"(%5) : (!llvm.memref_1d) -> ()
  llvm.return
}

A one-dimensional memref function parameter is shown to be converted to five individual components.
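Seen from the C++ side, the same five components can be gathered into a small struct that mirrors the descriptor. The sketch below is a hand-written stand-in with assumed names (MemRef1D, at) and uses int64_t elements to match the memref<10xi64> buffers in this thread rather than the f32 of the documentation example.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative mirror of the rank-1 descriptor
// struct<(ptr, ptr, i64, array<1 x i64>, array<1 x i64>)>.
struct MemRef1D {
    int64_t *allocated;  // pointer returned by the allocator
    int64_t *aligned;    // pointer used for element access
    int64_t offset;      // offset of the first element, in elements
    int64_t sizes[1];    // size in dim 0
    int64_t strides[1];  // stride in dim 0, in elements
};

// Element i is stored at aligned[offset + i * strides[0]].
inline int64_t &at(MemRef1D &m, int64_t i) {
    assert(i >= 0 && i < m.sizes[0]);
    return m.aligned[m.offset + i * m.strides[0]];
}
```

For a freshly allocated memref<10xi64> the offset is 0 and the stride is 1, which matches the constants %0, %1 and %2 (0, 10 and 1) passed in the calls earlier in the thread.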

Got it. Thanks for the clarification.

Hello, could you explain how to call the readData function from my MLIR?

I put readData and generateMLIRForQ6 into the same C++ file and compiled it, but I got the error: “JIT session error: Symbols not found: [ readData ]”

Hi, of course, I will explain.

I assume you want to call an external C++ function for a specific purpose. For us, it is reading a file: we read integer data from a file and store it in an array allocated on the MLIR side. The C++ function looks like this -

extern "C" int readData(int64_t vect[10])
{
    std::string myText; 
    // Read from the text file
    std::ifstream MyReadFile("filename.txt");
    int i = 0;
    // Stop after 10 values so the write never runs past the buffer.
    while (i < 10 && getline(MyReadFile, myText))
    {
        vect[i] = stoll(myText);
        i++;
    }
    MyReadFile.close();
    return 0;
}

Now you have to write C++ code that uses the MLIR builder API to emit a call to the external function. It looks like this -

int yourMLIRMethod(mlir::MLIRContext &context, mlir::OwningOpRef<mlir::ModuleOp> &module, Location loc){
    OpBuilder builder(&context);

    auto funcType = builder.getFunctionType(std::nullopt, std::nullopt);
    auto funcn = builder.create<func::FuncOp>(loc, "main", funcType);

    Type elementType = builder.getI64Type();
    auto memTp = MemRefType::get({10}, elementType);
    auto loadFuncnType = builder.getFunctionType(memTp, elementType);
    auto loadFunc = builder.create<func::FuncOp>(loc, "readData", loadFuncnType);
    loadFunc.setPrivate();

    UnrankedMemRefType castMemrefType = UnrankedMemRefType::get(memTp.getElementType(), /*memorySpace=*/0);
    auto printMemRefFuncnType = builder.getFunctionType(castMemrefType, std::nullopt);
    auto printMemRefFunc = builder.create<func::FuncOp>(loc, "printMemrefI64", printMemRefFuncnType);
    printMemRefFunc.setPrivate();

    Block *entryBlock = funcn.addEntryBlock();
    builder.setInsertionPointToEnd(entryBlock);

    
    Value mem = builder.create<memref::AllocOp>(loc, memTp);
    auto x = builder.create<func::CallOp>(loc, loadFunc, ValueRange{mem});

    
    Value castedMem = builder.create<memref::CastOp>(loc, castMemrefType, mem);

    auto printFuncCall = builder.create<func::CallOp>(loc, printMemRefFunc, castedMem);
    builder.create<func::ReturnOp>(loc);

    module->push_back(funcn);
    module->push_back(loadFunc);
    module->push_back(printMemRefFunc);
    return 0;
} 

Remember that when you declare the external C++ function in MLIR, you must use its exact name.

builder.create<func::FuncOp>(loc, "readData", loadFuncnType);

In addition, I was reading 10 records, therefore, I declared a memory of size 10.

auto memTp = MemRefType::get({10}, elementType);
auto loadFuncnType = builder.getFunctionType(memTp, elementType);

Note that loadFuncnType is the type used to declare the external function, and it describes the parameter of that function.
However, there is one important thing to remember: when the MLIR is lowered to the LLVM dialect, each memref argument is expanded into several arguments [see Mehdi Amini’s explanation in this thread]. The call will look like this -

%13 = llvm.call @readData(%6, %6, %0, %1, %2) : (!llvm.ptr, !llvm.ptr, i64, i64, i64) -> i64

So if you have to pass multiple parameters then you have to adjust the external cpp file accordingly.
Please let me know if you have any other questions.

Hello Sudip,

I know exactly how to use the MLIR APIs such as memref, and lower the IR to LLVM IR.

I just need to know how you organize your code and compile it.

It is impossible to put “readData” and “yourMLIRMethod” into the same cpp file and then compile this file.
The readData function should be compiled into LLVM IR and we need to concatenate the LLVM IR of readData function with MLIR built in “yourMLIRMethod” function.

How did you achieve the above procedure?

Thank you very much.

Hello Sudip,

I solved my problem.

Maybe we are using different LLVM versions.

The linker option -Xlinker --export-dynamic should be added so that the extern "C" function can be called from the IR.

Thank you very much.

readData is in a different cpp file.