LLVM dialect constant on unsigned/signed ints

george · June 29, 2020, 11:13pm

I noticed that LLVM dialect allows the following constant declaration:

%1 = llvm.mlir.constant(42 : si32) : !llvm.i32
%2 = llvm.mlir.constant(42 : ui32) : !llvm.i32

Since LLVM doesn’t make a distinction between signed and unsigned types, I am wondering if having signed/unsigned integer in attribute is the intended behaviour?

Following the post by @bondhugula on unsigned int constantOp in standard dialect, I assume that is the issue with the verifier?
This bug propagates from standard dialect as well: being unnoticed in standard its absolutely fine to do this:

$ cat bin/std.mlir

func @test() {
	%2 = constant 42 : si32
	%3 = constant 42 : ui32
	return
}


$ bin/mlir-opt bin/std.mlir -convert-std-to-llvm

module {
  llvm.func @test() {
    %1 = llvm.mlir.constant(42 : si32) : !llvm.i32
    %2 = llvm.mlir.constant(42 : ui32) : !llvm.i32
    llvm.return
  }
}

clattner · June 29, 2020, 11:59pm

This seems like a bug, the verifier for llvm.mlir.constant should require a signless attribute.

ftynse · June 30, 2020, 7:50am

There are very few checks in the LLVM dialect verifiers. The initial thinking was that LLVM itself would complain if one tries to construct wrong IR, and we did not want to replicate LLVM’s work. You can, for example, roundtrip llvm.mlir.constant("foo") : !llvm.i32. The translation to LLVM IR will likely complain about that, but not the verifier.

george · June 30, 2020, 9:51am

Thanks! However, I am not really sure I know how the roundtrip can complain about these cases? As far as I understand, MLIR constants are not translated to actual LLVM?
For example I have this

$ cat bin/llvm.mlir

module {
  llvm.func @const() {
    %0 = llvm.mlir.constant(0 : i32) : !llvm.i32
    %1 = llvm.mlir.constant("foo") : !llvm.i32
    llvm.return
  }
}


$ bin/mlir-translate -mlir-to-llvmir bin/llvm.mlir

; ModuleID = 'LLVMDialectModule'
source_filename = "LLVMDialectModule"

declare i8* @malloc(i64)

declare void @free(i8*)

define void @const() !dbg !3 {
  ret void, !dbg !7
}

!llvm.dbg.cu = !{!0}
!llvm.module.flags = !{!2}

!0 = distinct !DICompileUnit(language: DW_LANG_C, file: !1, producer: "mlir", isOptimized: true, runtimeVersion: 0, emissionKind: FullDebug)
!1 = !DIFile(filename: "LLVMDialectModule", directory: "/")
!2 = !{i32 2, !"Debug Info Version", i32 3}
!3 = distinct !DISubprogram(name: "const", linkageName: "const", scope: null, file: !4, line: 2, type: !5, scopeLine: 2, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !0, retainedNodes: !6)
!4 = !DIFile(filename: "...", directory: "...")
!5 = !DISubroutineType(types: !6)
!6 = !{}
!7 = !DILocation(line: 5, column: 6, scope: !8)
!8 = !DILexicalBlockFile(scope: !3, file: !4, discriminator: 0)

stellaraccident · June 30, 2020, 12:15pm

This seems like a case where we may want to revise the original thinking? Since MLIR’s integer types are tri-state (signed, unsigned and signless) and LLVM can only map signless (by design), I feel that I more expect the verifier to complain vs the converter. This is different than letting verification be open for types wholly unknown to the LLVM dialect, imo, where making legality be defined in terms of convertibility replicates fewer checks.

In either case, wherever the check should happen, we need to make sure it does.

clattner · June 30, 2020, 5:05pm

Yeah, I agree - early verification is better than last verification for a lot of reasons. I interpret Alex’s comment as saying “this has never been a priority so it hasn’t been done” – @ftynse I assume you’d be ok with others improving verification here over time?

ftynse · June 30, 2020, 5:35pm

By roundtrip, we usually understand mlir-opt filename.mlir that doesn’t do any transformation. Just parsing, verification and printing. The verifier should complain about invalid IR.

In LLVM IR, the constants are a separate class of values that are created and uniqued in the context (unlike MLIR, where they are just values created by a special operation). They won’t be printed unless there is an operation that uses them, and unused constants are essentially dropped if you print and re-parse the module. If you return the constant from the function, you’ll get

define i32 @const() !dbg !3 {
  ret [3 x i8] c"foo", !dbg !7
}

which LLVM (quite surprisingly for me) allows to be constructed, but which is rejected by LLVM’s opt with value doesn't match function result type 'i32.

Indeed, I’m providing the context for why we are were we are today. I am very happy if somebody can work on this! The first-party LLVM types RFC is actually a step in that direction, I expect most operations will have their type constraints encoded in ODS, which is easier to change. (Well, to be honest, this can be done even with the existing type system, but will be more brittle because of locking, in case one compares types by creating a new type, and verbose). Constants are a special case, I don’t think they will interfere much with the proposed type change.

Interestingly, we have the exact opposite of this. Types outside of the LLVM dialect will fail the verifier on any op in the dialect.

bondhugula · July 1, 2020, 6:35pm

This is actually also an issue with the std.constant verifier which isn’t supposed to allow ui32. This bug is open: https://bugs.llvm.org/show_bug.cgi?id=46222

In addition, I also posted the following on the other thread:

"For TF models coming into xla_hlo, it’s a common pattern in nearly every model to obtain the dimension size of a tensor as an unsigned int and then use that as a divisor later. The divisor folds to an unsigned int constant. When lowering to std , one could use diviu , but there has to be a zero extension to i64 which is available for an i32 but not ui32 - so, are we missing a cast from ui32 to i32 and it’s not clear to me where it should live."

Topic		Replies	Views
Constant op with unsigned int's? MLIR	2	730	June 25, 2020
Standard types not converted to LLVM MLIR	2	526	February 15, 2021
Llvm dialect lowering and constantOp MLIR	1	287	June 21, 2021
Verification and isa problem MLIR	2	403	May 21, 2022
[RFC] mlir::FloatAttr and legal types MLIR	14	1415	February 24, 2020

LLVM dialect constant on unsigned/signed ints

Related Topics