I have implemented a custom dialect that ultimately targets LLVM. It contains an LLVM lowering pass that is an identical copy of the LLVMLoweringPass from the Toy tutorial.
My dialect contains an operation that allocates memory and fills it with a user-defined value, most often zero. The actual allocation lowers to memref.alloca, and the filling of the memory will be implemented by an scf.parallel. A minimal reproduction case is:
func.func @main() -> i64 {
  %tensor = memref.alloca() : memref<2x2x2xf64>
  %lower = arith.constant 0 : index
  %upper = arith.constant 2 : index
  %step = arith.constant 1 : index
  scf.parallel (%arg0, %arg1, %arg2) = (%lower, %lower, %lower) to (%upper, %upper, %upper) step (%step, %step, %step) {
    %zero = arith.constant 2.0 : f64
    memref.store %zero, %tensor[%arg0, %arg1, %arg2] : memref<2x2x2xf64>
  }
  %i1 = arith.constant 1 : index
  %i2 = arith.constant 0 : index
  %i3 = arith.constant 1 : index
  %element = memref.load %tensor[%i1, %i2, %i3] : memref<2x2x2xf64>
  %r = arith.fptosi %element : f64 to i64
  return %r : i64
}
I invoke my equivalent of mlir-opt and instruct it to lower the code above to LLVM. The conversion patterns I add are:
RewritePatternSet patterns(&getContext());
mlir::populateAffineToStdConversionPatterns(patterns);
mlir::populateSCFToControlFlowConversionPatterns(patterns);
mlir::index::populateIndexToLLVMConversionPatterns(typeConverter, patterns);
mlir::arith::populateArithToLLVMConversionPatterns(typeConverter, patterns);
mlir::populateMemRefToLLVMConversionPatterns(typeConverter, patterns);
mlir::cf::populateControlFlowToLLVMConversionPatterns(typeConverter, patterns);
mlir::populateFuncToLLVMConversionPatterns(typeConverter, patterns);
Unfortunately, the conversion of the scf.parallel fails. In contrast, mlir-opt --convert-scf-to-cf repeat.mlir manages to replace the scf.parallel operation.
To see what is going on, I added the -debug-only=dialect-conversion option to pin down the issue:
//===-------------------------------------------===//
Legalizing operation : 'scf.parallel'(0x7fb74270abf0) {
* Fold {
} -> FAILURE : unable to fold
* Pattern : 'scf.parallel -> ()' {
** Insert : 'scf.yield'(0x7fb743305990)
** Insert : 'scf.for'(0x7fb743305ee0)
** Insert : 'scf.yield'(0x7fb743306000)
** Insert : 'scf.for'(0x7fb743306050)
** Insert : 'scf.yield'(0x7fb7433061a0)
** Insert : 'scf.for'(0x7fb7433061f0)
** Erase : 'scf.yield'(0x7fb74270a980)
** Replace Argument : '<block argument> of type 'index' at index: 0'(in region of 'scf.parallel'(0x7fb74270abf0)
** Replace Argument : '<block argument> of type 'index' at index: 1'(in region of 'scf.parallel'(0x7fb74270abf0)
** Replace Argument : '<block argument> of type 'index' at index: 2'(in region of 'scf.parallel'(0x7fb74270abf0)
** Replace : 'scf.parallel'(0x7fb74270abf0)
//===-------------------------------------------===//
Legalizing operation : 'scf.yield'(0x7fb743305990) {
"scf.yield"() : () -> ()
* Fold {
} -> FAILURE : unable to fold
} -> FAILURE : no matched legalization pattern
//===-------------------------------------------===//
} -> FAILURE : failed to legalize generated operation 'scf.yield'(0x00007FB743305990)
} -> FAILURE : pattern failed to match
} -> FAILURE : no matched legalization pattern
//===-------------------------------------------===//
It seems that it fails to legalize scf.yield. But what does this even mean? How can such a conversion fail with the conversion patterns I mentioned above? And why is mlir-opt able to lower it successfully?
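To make the question concrete, the nesting in the log can be read as a recursive check: an illegal op is legalized by a pattern only if every op that pattern generates is itself legal or can be legalized in turn. The following is a toy sketch of that reading in plain C++. It does not use the MLIR API. ToyTarget, ToyPatterns, legalize, and the three helper functions are hypothetical names made up for illustration, and the op lists are simplified guesses based on the log rather than the real pattern outputs.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy model only: these types are hypothetical and are not part of MLIR.
struct ToyTarget {
  std::set<std::string> legalOps;    // ops explicitly marked legal
  std::set<std::string> illegalOps;  // ops explicitly marked illegal
  bool unknownOpsLegal = false;      // are unlisted ops treated as legal?

  bool isLegal(const std::string &op) const {
    if (illegalOps.count(op)) return false;
    if (legalOps.count(op)) return true;
    return unknownOpsLegal;
  }
};

// A toy "pattern" maps a root op name to the names of the ops it creates.
using ToyPatterns = std::map<std::string, std::vector<std::string>>;

// Returns true if `op` is legal, or if a pattern exists for it and every
// op generated by that pattern can be legalized recursively.
bool legalize(const std::string &op, const ToyTarget &target,
              const ToyPatterns &patterns, int depth = 0) {
  if (target.isLegal(op)) return true;
  if (depth > 16) return false;  // guard against cyclic toy patterns
  auto it = patterns.find(op);
  if (it == patterns.end()) return false;  // "no matched legalization pattern"
  for (const std::string &generated : it->second)
    if (!legalize(generated, target, patterns, depth + 1))
      return false;  // "failed to legalize generated operation"
  return true;
}

// Simplified stand-ins for the patterns involved in the log above.
ToyPatterns examplePatterns() {
  return {
      {"scf.parallel", {"scf.yield", "scf.for"}},
      {"scf.for", {"cf.br", "cf.cond_br"}},
      {"cf.br", {"llvm.br"}},
      {"cf.cond_br", {"llvm.cond_br"}},
  };
}

// Assumption: a strict target where only LLVM-level ops are legal.
ToyTarget strictTarget() {
  ToyTarget t;
  t.legalOps = {"llvm.br", "llvm.cond_br"};
  return t;
}

// Assumption: a permissive target where only selected scf ops are illegal
// and everything else is accepted as-is.
ToyTarget permissiveTarget() {
  ToyTarget t;
  t.illegalOps = {"scf.parallel", "scf.for"};
  t.unknownOpsLegal = true;
  return t;
}
```

In this model, legalize("scf.parallel", strictTarget(), examplePatterns()) returns false, because the generated scf.yield is neither legal nor matched by any pattern, while the same call with permissiveTarget() returns true. Whether this difference in targets is what actually separates my pass from mlir-opt is exactly what I am unsure about.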