Arithmetic referencing dso_local function causes compilation error on Linux/x64

jack.w · July 9, 2024, 6:28am

I have encountered an odd issue where LLVM fails to compile arithmetic referencing a local function on Linux/x64.
Here is a minimal (hopefully reproducible) snippet:

; min_test.ll
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

define dso_local void @myFunction() {
    ret void
}
define i64 @main() {
    %1 = ptrtoint ptr @myFunction to i64
    %2 = sub i64 %1, 2147483648    ; = 0x80000000
    %3 = lshr i64 %2, 1    ; `add` also fails
    ret i64 %3
}

I am compiling on Linux/x64 with:

/llvm-17/bin/clang++ -O1 -c min_test.ll -o min_test.o

This gives the following error:

<unknown>:0: error: value of -2147483671 is too large for field of 4 bytes.
error: cannot compile inline asm
1 error generated.

I have found that:

It fails with LLVM 15, 16, and 17 (I haven’t tested other versions).
The snippet fails on Linux/x64, but compiles successfully on macOS/arm64.
Compilation only fails when myFunction is declared as dso_local.
Interestingly, after adding 1 to the integer constant (2147483649 = 0x80000001), the program compiles successfully.

I expected the pointer to myFunction to be loaded into a register, with the arithmetic instructions then operating on that register. (This is indeed what happens when the integer constant is changed so that the code compiles.) I am not sure where the 4-byte field comes from.

If anyone has any ideas why this might be happening I would appreciate your thoughts!

Thanks

phoebe · July 9, 2024, 2:52pm

The error comes from the assembler. Note that both the GNU and LLVM assemblers give the same error: Compiler Explorer

jack.w · July 10, 2024, 3:53am

Thank you!
I have run using debug LLVM 17 with logging enabled. It looks like the EarlyCSE pass simplifies the arithmetic to a ConstantExpr:

EarlyCSE Simplify:   %1 = ptrtoint ptr @myFunction to i64  to: i64 ptrtoint (ptr @myFunction to i64)
EarlyCSE Simplify:   %1 = sub i64 ptrtoint (ptr @myFunction to i64), 2147483648  to: i64 sub (i64 ptrtoint (ptr @myFunction to i64), i64 2147483648)
EarlyCSE Simplify:   %1 = lshr i64 sub (i64 ptrtoint (ptr @myFunction to i64), i64 2147483648), 1  to: i64 lshr (i64 sub (i64 ptrtoint (ptr @myFunction to i64), i64 2147483648), i64 1)

SelectionDAG has 11 nodes:
    t0: ch,glue = EntryToken
        t14: i64 = X86ISD::WrapperRIP TargetGlobalAddress:i64<ptr @myFunction> 0
      t12: i64 = add t14, Constant:i64<-2147483648>
    t6: i64 = srl t12, Constant:i8<1>
  t9: ch,glue = CopyToReg t0, Register:i64 $rax, t6
  t10: ch = X86ISD::RET_GLUE t9, TargetConstant:i32<0>, Register:i64 $rax, t9:1

Then instruction selection maps to an LEA instruction, which fails to assemble as before:

===== Instruction selection ends:
Selected selection DAG: %bb.0 'main:'
SelectionDAG has 13 nodes:
    t0: ch,glue = EntryToken
      t12: i64 = LEA64r Register:i64 $rip, TargetConstant:i8<1>, Register:i64 $noreg, TargetGlobalAddress:i32<ptr @myFunction> -2147483648, Register:i16 $noreg
    t6: i64,i32 = SHR64ri t12, TargetConstant:i8<1>
  t9: ch,glue = CopyToReg t0, Register:i64 $rax, t6
  t16: i32 = Register $noreg
  t10: ch = RET TargetConstant:i32<0>, Register:i64 $rax, t9, t9:1

Adding 1 to the constant causes the instruction selection to prefer LEA then ADD, which compiles successfully:

===== Instruction selection ends:
Selected selection DAG: %bb.0 'main:'
SelectionDAG has 16 nodes:
    t0: ch,glue = EntryToken
        t14: i64 = LEA64r Register:i64 $rip, TargetConstant:i8<1>, Register:i64 $noreg, TargetGlobalAddress:i32<ptr @myFunction> 0, Register:i16 $noreg
        t11: i64 = MOV64ri TargetConstant:i64<-2147483649>
      t12: i64,i32 = ADD64rr t14, t11
    t6: i64,i32 = SHR64ri t12, TargetConstant:i8<1>
  t9: ch,glue = CopyToReg t0, Register:i64 $rax, t6
  t16: i32 = Register $noreg
  t10: ch = RET TargetConstant:i32<0>, Register:i64 $rax, t9, t9:1
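
The difference between the two dumps above is consistent with a signed 32-bit range check on the constant: -2147483648 still fits in a 32-bit displacement and gets folded into the LEA, while -2147483649 does not fit and is loaded with MOV64ri instead. A minimal sketch of that check follows; the helper name fits_int32 is made up for illustration and is not an LLVM function.

```python
INT32_MIN = -(2**31)
INT32_MAX = 2**31 - 1


def fits_int32(value: int) -> bool:
    """Return True if value is representable as a signed 32-bit integer."""
    return INT32_MIN <= value <= INT32_MAX


# Offset folded into the LEA in the failing case.
folded = -2147483648
# Constant materialized with MOV64ri in the passing case.
materialized = -2147483649

print(fits_int32(folded))        # True
print(fits_int32(materialized))  # False
```

The folded offset passes the range check on its own, but the assembler later has to combine it with the distance to the symbol, and that sum can fall outside the 32-bit range.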

Is it expected that LLVM may sometimes produce invalid assembly given valid IR code? Or would this be considered a bug in instruction selection?

topperc · July 10, 2024, 4:34am

Adding -code-model=large (the llc option; with clang the equivalent is -mcmodel=large) will make it work. Without that, it’s generating a 32-bit PC-relative relocation, and the final offset is too large for 32 bits.
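
As a rough worked example, the value in the error message can be reproduced with the usual PC-relative formula S + A - P. The layout below is an assumption, not something shown in the thread: myFunction at section offset 0, main at offset 16, and the LEA as the first instruction of main, with its 4-byte displacement starting 3 bytes into the instruction.

```python
INT32_MIN = -(2**31)
INT32_MAX = 2**31 - 1


def pc32_value(symbol_addr: int, addend: int, place: int) -> int:
    """Value stored in a 32-bit PC-relative field: S + A - P."""
    return symbol_addr + addend - place


# Assumed layout (hypothetical, chosen to match the error message):
symbol_addr = 0          # S: myFunction at the start of the section
folded_offset = -2147483648
addend = folded_offset - 4   # A: RIP points past the 4-byte displacement field
place = 16 + 3           # P: main at offset 16, displacement 3 bytes into the LEA

value = pc32_value(symbol_addr, addend, place)
print(value)                             # -2147483671
print(INT32_MIN <= value <= INT32_MAX)   # False: does not fit in 4 bytes
```

Under these assumptions the result matches the -2147483671 reported by the assembler, which is 23 below INT32_MIN.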

MaskRay · July 11, 2024, 7:02am

The issue isn’t specific to dso_local. You can reproduce it with a local linkage symbol:

; llc -O1
define internal void @myFunction() {
    ret void
}
define i64 @main() {
    %1 = ptrtoint ptr @myFunction to i64
    %2 = sub i64 %1, 2147483648    ; = 0x80000000
    %3 = lshr i64 %2, 1    ; `add` also fails
    ret i64 %3
}

The issue resembles previous offset folding issues: D73606 "[X86] matchAdd: don't fold a large offset into a %rip relative address" and D93931 "[X86] Don't fold negative offset into 32-bit absolute address (e.g. movl $foo-1, %eax)".

Created pull request #98438 (llvm/llvm-project): "[X86] Don't fold offsets that are too closer to INT32_MIN in non-large code models".
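
The general idea of refusing to fold offsets near INT32_MIN can be sketched as a range check with some headroom. This is only an illustration: the function name can_fold_offset and the margin value are hypothetical and are not taken from the pull request.

```python
INT32_MIN = -(2**31)
INT32_MAX = 2**31 - 1


def can_fold_offset(offset: int, margin: int) -> bool:
    """Hypothetical guard: allow folding only if the offset leaves at least
    `margin` bytes of headroom above INT32_MIN, so that subtracting the
    instruction's position cannot push the final value below INT32_MIN."""
    return INT32_MIN + margin <= offset <= INT32_MAX


MARGIN = 1 << 20  # hypothetical headroom, not the value used in LLVM

print(can_fold_offset(-2147483648, MARGIN))  # False: emit a separate add instead
print(can_fold_offset(-16, MARGIN))          # True: small offsets still fold
```

When the guard rejects an offset, the constant would be added with a separate instruction, as in the LEA plus ADD sequence that already assembles correctly.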

jack.w · July 12, 2024, 1:06am

Thank you for looking into this @MaskRay !

tyker · July 12, 2024, 9:49pm

This looks somewhat related to issue #75944 (llvm/llvm-project): "[llc] Signed Overflow detected by UBSan."
