Bug in DAG creation with inline asm?

I’m working through a bug in my target where inline asm doesn’t seem to be handled properly. The most minimal example that shows this is below. In summary, when operands are added to the inlineasm node, the wrong ones seem to be picked: if a value is cast to a smaller integer type (truncated, etc.) and passed as an operand to an inline asm call, the original variable is used, not the truncated one.

LLVM function:

define void @inline_asm_trunc(i32 %x) {
    %y = trunc i32 %x to i8
    call void asm sideeffect "drvl $0", "r"(i8 %y)
    ret void
}

The very first generated DAG ends up being the following:

Initial selection DAG: %bb.0 'inline_asm_trunc:'
SelectionDAG has 12 nodes:
  t0: ch = EntryToken
  t2: i32,ch = CopyFromReg t0, Register:i32 %0
  t3: i8 = truncate t2
  t8: ch,glue = CopyToReg t0, Register:i32 %1, t2
    t10: ch,glue = inlineasm t8, TargetExternalSymbol:i32'drvl $0', MDNode:ch<null>, TargetConstant:i32<1>, TargetConstant:i32<65545>, Register:i32 %1, t8:1
  t11: ch = P2RET t10

t3 is skipped over, even though it is the node I would expect to be used as the operand of the inlineasm node. Is this expected behavior, meaning I’m missing something about how inlineasm works and need to add extra flags or code to handle it? Or is this a bug in creating inlineasm nodes?

I don’t have any custom inline asm code (other than supporting the ‘r’ constraint); I’m just relying on the library code to do it all. Also, my target doesn’t have any special support for i8 or i16 types and its registers are all 32 bits wide, so I’m not sure whether always using a 32-bit register class is causing it to choose the non-truncated value.

If i8 isn’t a legal register type, the DAG won’t leave any i8 values around (with the exception of TargetConstant and co.). You can’t have an illegal value directly copied into a register value type. Implicitly promoting to the next legal register type is what I expect here, but I wouldn’t be shocked if this is a buggy area.
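
A rough way to picture this promotion, as a plain C model rather than anything taken from LLVM: if the i8 truncate is promoted to i32 without any masking, the register handed to the inline asm still carries all 32 original bits, which matches the CopyToReg of t2 in the DAG above. The helper names below (promoted_trunc_i8, zext_i8, sext_i8) are hypothetical and exist only for illustration.

```c
#include <stdint.h>

/* Model of "trunc i32 -> i8" after promotion to a 32-bit register when
 * no masking is inserted: the register keeps every original bit.
 * Assumption: this mirrors the DAG above, where t2 is copied directly. */
static uint32_t promoted_trunc_i8(uint32_t x) {
    return x;
}

/* A consumer that needs a real 8-bit value must clear the upper bits... */
static uint32_t zext_i8(uint32_t reg) {
    return reg & 0xFFu;
}

/* ...or replicate bit 7 into them, done here without relying on
 * implementation-defined narrowing conversions. */
static int32_t sext_i8(uint32_t reg) {
    return ((int32_t)(reg & 0xFFu) ^ 0x80) - 0x80;
}
```

For x = 0x1234, promoted_trunc_i8 yields 0x1234, while zext_i8 of that register yields 0x34; an instruction that reads the whole register sees the difference.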

So what would be the proper way to handle this situation? It arises from the following C snippet:

void test(int x) { // x > 255
    char p1, p2;
    p1 = x;
    p2 = (x >> 8);

    asm("drvh %0"::"r"(p1));
    asm("drvh %0"::"r"(p2));
}

When the above code is compiled, the following DAG (fragment) is created:

  t0: ch = EntryToken
  t2: i32,ch = CopyFromReg t0, Register:i32 %0
  t3: i8 = truncate t2
  t5: i32 = srl t2, Constant:i32<8>
  t6: i8 = truncate t5
  t11: ch,glue = CopyToReg t0, Register:i32 %1, t2
    t13: ch,glue = inlineasm t11, TargetExternalSymbol:i32'drvh $0', MDNode:ch<0x14cf076f8>, TargetConstant:i32<1>, TargetConstant:i32<65545>, Register:i32 %1, t11:1
  t16: ch,glue = CopyToReg t13, Register:i32 %2, t5
    t17: ch,glue = inlineasm t16, TargetExternalSymbol:i32'drvh $0', MDNode:ch<0x14cf079b8>, TargetConstant:i32<1>, TargetConstant:i32<65545>, Register:i32 %2, t16:1

So it seems to be ignoring the cast and using the original value. In TargetLowering, I allow i8, i16, and i32 to use the 32-bit register class for the “r” constraint. Is this incorrect to do? If I only allow i32, then it won’t be able to allocate a register to the char type.

I found a workaround that solves this issue: I changed the constraint to only accept i32 types, and then explicitly cast back up to a 32-bit int:

asm("drvh %0"::"r"((int)p1));

However, it seems like there should be an automatic upcast or something similar for these situations, since a 32-bit register can validly hold an 8-bit value, right?
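
One way to make the widening explicit at the source level, sketched in portable C under the assumption that the instruction wants the upper 24 bits of the register to be zero: compute each operand as an int whose upper bits are already defined, then pass that to the "r" constraint. The helper names low_byte_operand and high_byte_operand are hypothetical. Note that a plain (int)p1 sign-extends or zero-extends depending on whether char is signed on the target, so going through unsigned char pins the behavior down.

```c
/* Zero-extended low byte of x, held in a full-width int. */
static int low_byte_operand(int x) {
    return (int)(unsigned char)x;
}

/* Zero-extended second byte of x; the shift is done on unsigned to
 * avoid implementation-defined right shifts of negative values. */
static int high_byte_operand(int x) {
    return (int)(unsigned char)((unsigned)x >> 8);
}

/* Intended use on the target from this thread (not compilable on a host,
 * since drvh is a target-specific instruction):
 *
 *     asm("drvh %0" :: "r"(low_byte_operand(x)));
 *     asm("drvh %0" :: "r"(high_byte_operand(x)));
 */
```

With x = 0x1234, the two helpers return 0x34 and 0x12, so the value reaching the register no longer depends on how the compiler treats the narrower type.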

This is extremely ill-defined and inconsistent, both between compilers and within compilers. If you’re using an operand that cannot be mapped to a real hardware register type, then all bets are off for the upper bits. See D22084, “Fix atomic_*cmpset32 on riscv64 with clang”, for a lengthy write-up I did of all kinds of ways this behaves weirdly with both GNU-compatible compilers.

Thanks for all the in-depth info. Looks like you came to the same conclusion I did: explicitly cast to the size that the register expects.