Pybind error when two extensions access the same compiled object (mlir.ir.Module.parse and parseSourceFile<mlir::ModuleOp>)

A further discovery about my previous question:

Fail to recognize func.return as a block terminator in ParseSourceFile()

I found the cause of this error:
loc("tosa_elided.mlir":27:5): error: block with no terminator, has "func.return"(%23) : (tensor<1x8x16x4xf32>) -> ()
It happens because I import two Python packages into the same script, and both reach parseSourceFile() from parser.cpp. The two packages are: 1. My custom pybind shared object, which calls parseSourceFile(). 2. The built-in mlir.ir, through mlir.ir.Module.parse().

Here is the example:
I have a tosa.mlir:

module attributes {torch.debug_module_name = "simple"} {
  func.func @forward(%arg0: tensor<1x3x16x16xf32>) -> tensor<1x8x16x4xf32> {
    %0 = "tosa.const"() <{value = dense_resource<__elided__> : tensor<8x3x3x3xf32>}> : () -> tensor<8x3x3x3xf32>
    %1 = "tosa.const"() <{value = dense_resource<__elided__> : tensor<1x16x8xf32>}> : () -> tensor<1x16x8xf32>
    %2 = "tosa.const"() <{value = dense_resource<__elided__> : tensor<1x8x4xf32>}> : () -> tensor<1x8x4xf32>
    %3 = "tosa.const"() <{value = dense<1.000010e+00> : tensor<8x1x1xf32>}> : () -> tensor<8x1x1xf32>
    %4 = "tosa.const"() <{value = dense<[0, 3, 1, 2]> : tensor<4xi32>}> : () -> tensor<4xi32>
    %5 = "tosa.const"() <{value = dense<[0, 2, 3, 1]> : tensor<4xi32>}> : () -> tensor<4xi32>
    %6 = "tosa.const"() <{value = dense_resource<__elided__> : tensor<8xf32>}> : () -> tensor<8xf32>
    %7 = "tosa.const"() <{value = dense_resource<__elided__> : tensor<8xf32>}> : () -> tensor<8xf32>
    %8 = "tosa.const"() <{value = dense<[-0.121429406, 0.0909240693, 0.0867559984, 0.169947132]> : tensor<4xf32>}> : () -> tensor<4xf32>
    %9 = "tosa.transpose"(%arg0, %5) : (tensor<1x3x16x16xf32>, tensor<4xi32>) -> tensor<1x16x16x3xf32>
    %10 = "tosa.conv2d"(%9, %0, %6) <{dilation = array<i64: 1, 1>, pad = array<i64: 1, 1, 1, 1>, stride = array<i64: 1, 1>}> : (tensor<1x16x16x3xf32>, tensor<8x3x3x3xf32>, tensor<8xf32>) -> tensor<1x16x16x8xf32>
    %11 = "tosa.transpose"(%10, %4) : (tensor<1x16x16x8xf32>, tensor<4xi32>) -> tensor<1x8x16x16xf32>
    %12 = "tosa.rsqrt"(%3) : (tensor<8x1x1xf32>) -> tensor<8x1x1xf32>
    %13 = "tosa.mul"(%11, %12) <{shift = 0 : i32}> : (tensor<1x8x16x16xf32>, tensor<8x1x1xf32>) -> tensor<1x8x16x16xf32>
    %14 = "tosa.clamp"(%13) <{max_fp = 3.40282347E+38 : f32, max_int = 2147483647 : i64, min_fp = 0.000000e+00 : f32, min_int = 0 : i64}> : (tensor<1x8x16x16xf32>) -> tensor<1x8x16x16xf32>
    %15 = "tosa.reshape"(%14) <{new_shape = array<i64: 1, 128, 16>}> : (tensor<1x8x16x16xf32>) -> tensor<1x128x16xf32>
    %16 = "tosa.matmul"(%15, %1) : (tensor<1x128x16xf32>, tensor<1x16x8xf32>) -> tensor<1x128x8xf32>
    %17 = "tosa.reshape"(%16) <{new_shape = array<i64: 1, 8, 16, 8>}> : (tensor<1x128x8xf32>) -> tensor<1x8x16x8xf32>
    %18 = "tosa.add"(%17, %7) : (tensor<1x8x16x8xf32>, tensor<8xf32>) -> tensor<1x8x16x8xf32>
    %19 = "tosa.clamp"(%18) <{max_fp = 3.40282347E+38 : f32, max_int = 2147483647 : i64, min_fp = 0.000000e+00 : f32, min_int = 0 : i64}> : (tensor<1x8x16x8xf32>) -> tensor<1x8x16x8xf32>
    %20 = "tosa.reshape"(%19) <{new_shape = array<i64: 1, 128, 8>}> : (tensor<1x8x16x8xf32>) -> tensor<1x128x8xf32>
    %21 = "tosa.matmul"(%20, %2) : (tensor<1x128x8xf32>, tensor<1x8x4xf32>) -> tensor<1x128x4xf32>
    %22 = "tosa.reshape"(%21) <{new_shape = array<i64: 1, 8, 16, 4>}> : (tensor<1x128x4xf32>) -> tensor<1x8x16x4xf32>
    %23 = "tosa.add"(%22, %8) : (tensor<1x8x16x4xf32>, tensor<4xf32>) -> tensor<1x8x16x4xf32>
    return %23 : tensor<1x8x16x4xf32>
  }
}

I also have a custom “pymlir” shared object built from pybind11 C++ source, which defines a load function that calls parseSourceFile():

  void load(std::string filename) {
    DialectRegistry registry;

    registry.insert<func::FuncDialect, FORWARD::FORWARDDialect,
                    quant::QuantizationDialect, memref::MemRefDialect,
                    tensor::TensorDialect, tosa::TosaDialect>();
    context_ = std::make_unique<MLIRContext>(registry);

    OwningOpRef<ModuleOp> module_OOR;
    module_OOR = parseSourceFile<mlir::ModuleOp>(filename, context_.get());
    ...

Then I want to parse the TOSA MLIR file in two ways in the same script. The first is through the custom “pymlir” pybind API above, and the second is the built-in mlir.ir.Module.parse(context, ctx). I found that the error depends on the order in which I import the two packages:

1st error type
If I import pymlir first, mlir.ir.Module.parse fails:

import pymlir
import mlir
import mlir.ir

if __name__ == '__main__':
    with open("tosa_elided.mlir", 'r') as f:
        context = f.read()

    print("before mlir.ir.Module.parse:")
    ctx = mlir.ir.Context()
    mlir.ir.Module.parse(context, ctx)

    print("before pymlir.py_module.load")
    md = pymlir.py_module()
    md.load("tosa_elided.mlir")

The error comes from the built-in mlir.ir.Module.parse:

before mlir.ir.Module.parse:
Traceback (most recent call last):
  File "/home/jhlou/forward-opt/build/bin/trans.py", line 15, in <module>
    mlir.ir.Module.parse(context, ctx)
mlir._mlir_libs._site_initialize.<locals>.MLIRError: Unable to parse module assembly:
error: "-":27:5: block with no terminator, has "func.return"(%23) : (tensor<1x8x16x4xf32>) -> ()
 note: "-":27:5: see current operation: "func.return"(%23) : (tensor<1x8x16x4xf32>) -> ()

2nd error type
If I import mlir.ir first, the C++ parseSourceFile() in my custom pybind API fails:

import mlir.ir
import pymlir
import mlir


if __name__ == '__main__':
    with open("tosa_elided.mlir", 'r') as f:
        context = f.read()

    print("before mlir.ir.Module.parse:")
    ctx = mlir.ir.Context()
    mlir.ir.Module.parse(context, ctx)

    print("before pymlir.py_module.load")
    md = pymlir.py_module()
    md.load("tosa_elided.mlir")

Now the error comes from pymlir:

before mlir.ir.Module.parse:
before pymlir.py_module.load
loc("tosa_elided.mlir":27:5): error: block with no terminator, has "func.return"(%23) : (tensor<1x8x16x4xf32>) -> ()
file:tosa_elided.mlir, module: 0
python3: /home/jhlou/forward-opt/bindings/pymlir/pymlir.cpp:124: void py_module::load(std::string): Assertion `module_' failed.
Aborted (core dumped)

In general, the parseSourceFile() call that fails always belongs to the package imported later. This is interesting, because it looks as if the parseSourceFile() object code is claimed by whichever package is imported first. I hope you can help me solve this odd problem.
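One way to confirm that the failure comes from loading both extensions into one process, rather than from the file itself, is to run each parser in a fresh interpreter that imports only one of the two packages. The sketch below assumes the pymlir module and the tosa_elided.mlir file from the scripts above; run_isolated is a hypothetical helper name and is not part of either package.

```python
import subprocess
import sys


def run_isolated(snippet: str) -> subprocess.CompletedProcess:
    """Run `snippet` in a fresh Python interpreter and capture its output."""
    return subprocess.run(
        [sys.executable, "-c", snippet],
        capture_output=True,
        text=True,
    )


# Only mlir.ir is imported in this child process.
BUILTIN_ONLY = """
import mlir.ir
with open("tosa_elided.mlir") as f:
    text = f.read()
mlir.ir.Module.parse(text, mlir.ir.Context())
"""

# Only the custom extension is imported in this child process.
PYMLIR_ONLY = """
import pymlir
md = pymlir.py_module()
md.load("tosa_elided.mlir")
"""

if __name__ == "__main__":
    for name, snippet in [("mlir.ir", BUILTIN_ONLY), ("pymlir", PYMLIR_ONLY)]:
        result = run_isolated(snippet)
        print(name, "exit code:", result.returncode)
        print(result.stderr, end="")
```

If both child processes exit with code 0 while the combined script fails, that would point at an interaction between the two shared objects rather than at the parser or the input file.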

Same diagnostic as the other thread (why a new one by the way? still the same issue isn’t it?).

The two shared libraries (the python extension you import) have duplicated MLIR code and interact badly with each other.
It’s hard to advise without digging into how everything is built here (is there a libMLIR.so, or are the extensions statically linking MLIR? Are the MLIR symbols hidden in these libraries? etc.)
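Whether the MLIR symbols are hidden can be checked from outside the build. As a rough sketch, assuming a Linux system with binutils nm on the PATH, the helper below lists the defined dynamic symbols of a shared object whose mangled names contain 4mlir (the Itanium mangling of the mlir namespace). Both function names, and the library path in the note that follows, are hypothetical.

```python
import subprocess


def parse_exported_mlir_symbols(nm_output: str) -> list[tuple[str, str]]:
    """Return (type, name) pairs for symbols in the mlir namespace.

    Expects the text printed by `nm -D --defined-only <lib>`, where each
    line has the form `<address> <type> <name>`.
    """
    found = []
    for line in nm_output.splitlines():
        parts = line.split()
        if len(parts) < 3:
            continue  # skip blank or malformed lines
        sym_type, name = parts[-2], parts[-1]
        if "4mlir" in name:
            found.append((sym_type, name))
    return found


def exported_mlir_symbols(lib_path: str) -> list[tuple[str, str]]:
    """Run nm on `lib_path` and filter its dynamic symbol table."""
    out = subprocess.run(
        ["nm", "-D", "--defined-only", lib_path],
        capture_output=True,
        text=True,
        check=True,
    ).stdout
    return parse_exported_mlir_symbols(out)
```

Running exported_mlir_symbols on each extension (for example a hypothetical pymlir.so and the shared object behind mlir.ir) and getting long lists from both would suggest that each one exports its own copy of MLIR. An empty list would suggest the symbols are hidden.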

Sorry about making another topic.

is there a libMLIR.so or are the extension statically linking MLIR?

Just statically linking MLIR. There is no libMLIR.so.

I don’t think so