illegal code generated for special architecture

Hi!

I'm seeing strange behaviour in my backend that ends in illegal code:

Version 1:
- I lower FrameIndex to TargetFrameIndex (nothing special)
- I generate a special address-register ADD instruction in eliminateFrameIndex() to write FramePointer + offset into a new address-register
- I use explicit load and store and address-registers in my target instruction patterns:
  e.g. (store (add (load AddressRegs:$a), DataRegs:$b), AddressRegs:$dst)

This works quite well, but if I access an array on the stack (LLVM generates FrameIndex to access it):

  int buffer[BUFFER_SIZE];
  for(int i = 0; i < end_loop_index; i++) {
    buffer[i] = i;
  }

then LLVM generates the target instruction "ADD D1, A1, D0", which is illegal: all operands of ADD have to be data-registers (Dx). I've checked more than once that the address-registers are not in the set of data-registers, and the generated instruction is not part of my spec.

I guess this happens because I replace the TargetFrameIndex with the address-register in eliminateFrameIndex without checking the parent operation (ADD in this case); the replacement would be valid for some load/store instructions, but not for ADD. Can I run some legalization function on the illegal instruction?

Version 2:
The situation worsens (or improves, depending on the point of view) when I replace the explicit address-register usage with the usual complex patterns, like ADDRx and MEMx: for the instruction where version 1 generates illegal code, LLVM now reports this error:

LLVM ERROR: Cannot select: 0x7fc7a9034f10: ch = store 0x7fc7a9034d10:1, 0x7fc7a9034d10, 0x7fc7a9034b10, 0x7fc7a9034810<ST1[%arrayidx](align=4)> [ORD=18] [ID=7]
  0x7fc7a9034d10: i32,ch = load 0x7fc7a8c12c08, 0x7fc7a9034410, 0x7fc7a9034810<LD1[%i](align=4)> [ORD=15] [ID=5]
    0x7fc7a9034410: i32 = TargetFrameIndex<3> [ID=4]
    0x7fc7a9034810: i32 = undef [ID=1]
  0x7fc7a9034b10: i32 = add 0x7fc7a9034e10, 0x7fc7a9034d10 [ORD=17] [ID=6]
    0x7fc7a9034e10: i32 = TargetFrameIndex<2> [ID=3]
    0x7fc7a9034d10: i32,ch = load 0x7fc7a8c12c08, 0x7fc7a9034410, 0x7fc7a9034810<LD1[%i](align=4)> [ORD=15] [ID=5]
      0x7fc7a9034410: i32 = TargetFrameIndex<3> [ID=4]
      0x7fc7a9034810: i32 = undef [ID=1]
  0x7fc7a9034810: i32 = undef [ID=1]

This is the instruction where the array's base address and an offset are added to get the address of the i-th element (buffer[i] from the C code above). I basically understand the error message, because no such instruction exists. But I do not understand why the big expression is not split into smaller parts.

I have absolutely no idea why this fails! Why can't LLVM write the operand 0x7fc7a9034e10: i32 = TargetFrameIndex<2> [ID=3] into an address-register (via my eliminateFrameIndex), then copy it to a data-register, perform the add and copy the result back into an address-register?

Thanks in advance,
Boris

Hi Boris,

then LLVM generates the target instruction "ADD D1, A1, D0" which is an illegal instruction - all operands have to be data-registers Dx. I've checked more than once, that address-registers are not in the set of data-registers. The generated instruction is not part of my spec.

I guess this happens, because I replace the TargetFrameIndex by the address-register in eliminateFrameIndex without checking the parent operation (ADD in this case); it would be valid for some load/store instructions, but not for ADD. Can I run some legalization function for the illegal instruction?

This is a fairly common issue. Not necessarily with register classes,
but where different frame-accessing instructions have different
constraints. Existing targets just make sure that only certain
instructions ever get an <fi#N> MachineOperand and check the
MI.getOpcode() value (see rewriteAArch64FrameIndex for example in
lib/Target/AArch64/AArch64InstrInfo.cpp).

It's sometimes even necessary to insert new instructions to handle the
index (e.g. if the offset is too large).
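
As a rough illustration of the two points above (check the opcode of the instruction that uses the frame index, and insert extra instructions where a direct rewrite would be illegal), here is a self-contained toy model. It does not use the LLVM API: `Instr`, `Operand`, `acceptsAddrReg`, the opcode names and the scratch-register parameters are all made up for this sketch. In a real backend the scratch registers would have to come from somewhere, for example the register scavenger; that part is an assumption and is not modelled here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-ins for machine instructions; none of these are LLVM types.
enum class Opcode { Load, Store, Add, AddrAdd, MoveAddrToData };

struct Operand {
  enum Kind { DataReg, AddrReg, FrameIndex, Imm } kind;
  int value; // register number, frame index, or immediate
};

struct Instr {
  Opcode op;
  std::vector<Operand> operands;
};

// Hypothetical rule: only memory instructions may use an address register
// where the frame index used to be.
bool acceptsAddrReg(Opcode op) {
  return op == Opcode::Load || op == Opcode::Store;
}

// Rewrites the frame-index operand `opIdx` of block[pos].
// Returns how many instructions were inserted in front of the user.
std::size_t eliminateFrameIndex(std::vector<Instr> &block, std::size_t pos,
                                std::size_t opIdx, int offset,
                                int scratchAddr, int scratchData) {
  assert(block[pos].operands[opIdx].kind == Operand::FrameIndex);
  const Opcode user = block[pos].op;

  std::vector<Instr> prefix;
  // scratchAddr = frame pointer + offset (frame pointer is implicit here).
  prefix.push_back({Opcode::AddrAdd,
                    {{Operand::AddrReg, scratchAddr}, {Operand::Imm, offset}}});
  Operand replacement{Operand::AddrReg, scratchAddr};

  if (!acceptsAddrReg(user)) {
    // The user (e.g. ADD) only takes data registers: copy the address over.
    prefix.push_back({Opcode::MoveAddrToData,
                      {{Operand::DataReg, scratchData},
                       {Operand::AddrReg, scratchAddr}}});
    replacement = {Operand::DataReg, scratchData};
  }

  block[pos].operands[opIdx] = replacement;
  block.insert(block.begin() + static_cast<std::ptrdiff_t>(pos),
               prefix.begin(), prefix.end());
  return prefix.size();
}
```

With this model, an `Add` that uses `<fi#2>` ends up preceded by an address computation and a copy into a data register, while a `Load` only gets the address computation and keeps an address-register operand.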

I have absolutely no idea why this fails! Why can't LLVM write the operand 0x7fc7a9034e10: i32 = TargetFrameIndex<2> [ID=3] into an address-register (via my eliminateFrameIndex), then copy it to a data-register, perform the add and copy the result back into an address-register?

I'd be a little worried about the existence of TargetFrameIndex in
pre-ISel code. They should only be created in situations you know the
user will be able to cope with. Otherwise just create a FrameIndex,
and let it go through ISel if needed.

This might also explain why you're ending up with an unexpected "ADD
reg, <fi>, reg" in eliminateFrameIndex.

Cheers.

Tim.

Hi Tim,

This is a fairly common issue. Not necessarily with register classes,
but where different frame-accessing instructions have different
constraints. Existing targets just make sure that only certain
instructions ever get an <fi#N> MachineOperand and check the
MI.getOpcode() value (see rewriteAArch64FrameIndex for example in
lib/Target/AArch64/AArch64InstrInfo.cpp).

It's sometimes even necessary to insert new instructions to handle the
index (e.g. if the offset is too large).

rewriteAArch64FrameIndex seems to be dead in LLVM 3.4.2 (llvm_unreachable("Unimplemented rewriteFrameIndex");), so I had a look at rewriteARMFrameIndex, but this is called from eliminateFrameIndex. According to my debug output, the error happens before eliminateFrameIndex is reached.

I'd be a little worried about the existence of TargetFrameIndex in
pre-ISel code. They should only be created in situations you know the
user will be able to cope with. Otherwise just create a FrameIndex,
and let it go through ISel if needed.

This might also explain why you're ending up with an unexpected "ADD
reg, <fi>, reg" in eliminateFrameIndex.

I removed the lowering of FrameIndex to TargetFrameIndex; I had only added it to make version 1 (direct address-register access) work. But I get the same error message again (with FrameIndex instead of TargetFrameIndex):

LLVM ERROR: Cannot select: 0x7fb771834b10: ch = store 0x7fb771834e10:1, 0x7fb771834e10, 0x7fb771835510, 0x7fb771835010<ST1[%arrayidx](align=4)> [ORD=18] [ID=7]
  0x7fb771834e10: i32,ch = load 0x7fb771412c78, 0x7fb771834f10, 0x7fb771835010<LD1[%i](align=4)> [ORD=15] [ID=5]
    0x7fb771834f10: i32 = FrameIndex<3> [ID=1]
    0x7fb771835010: i32 = undef [ID=2]
  0x7fb771835510: i32 = add 0x7fb771835610, 0x7fb771834e10 [ORD=17] [ID=6]
    0x7fb771835610: i32 = FrameIndex<2> [ID=3]
    0x7fb771834e10: i32,ch = load 0x7fb771412c78, 0x7fb771834f10, 0x7fb771835010<LD1[%i](align=4)> [ORD=15] [ID=5]
      0x7fb771834f10: i32 = FrameIndex<3> [ID=1]
      0x7fb771835010: i32 = undef [ID=2]
  0x7fb771835010: i32 = undef [ID=2]

According to my debugging output, SelectADDR fails (returns false) because the ADD does not form a valid addressing mode: the target supports only register-indirect addressing without an offset. But I expect the compiler to create a valid addressing mode by writing the address created by FrameIndex<2> into an address register (pattern exists), copying it to a data register (pattern exists), performing the ADD (pattern exists) and writing the result to an address register (pattern exists). So, how do I make the code generator do that? I need more information about what exactly goes wrong.

Minor update: after going through the code generator with the options -print-machineinstrs -debug-only=isel, it seems to me that the code generator leaves the ADD expression untouched and then fails. Do I have to make the expression legal myself? If yes, where?

Boris
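
One way to read the SelectADDR failure described in the last message: if the address matcher returns false for an ADD, and no other store pattern applies, the whole store has nothing to match against. A common arrangement is for the matcher to fall back to "treat whatever computes the address as a plain register base", so that the ADD is selected separately by its own pattern. The sketch below is a self-contained toy model of that idea only. `Node`, `AddrMatch`, `selectAddrStrict` and `selectAddrWithFallback` are hypothetical names, not LLVM API, and whether the copies between data and address registers are then inserted automatically on this target is an assumption that would need checking.

```cpp
#include <cassert>
#include <vector>

// Toy DAG nodes; none of these are LLVM types.
enum class NodeKind { FrameIndex, Add, Load };
enum class BaseKind { FrameSlot, Register };

struct Node {
  NodeKind kind;
  int value; // frame index number, unused otherwise
  std::vector<const Node *> operands;
};

struct AddrMatch {
  BaseKind kind;    // how the base operand would be emitted
  const Node *base; // node that provides the base address
};

// Mirrors the behaviour described in the thread: an ADD is not a valid
// register-indirect address, so matching fails and nothing can be selected.
bool selectAddrStrict(const Node &addr, AddrMatch &out) {
  if (addr.kind == NodeKind::FrameIndex) {
    out = {BaseKind::FrameSlot, &addr};
    return true;
  }
  if (addr.kind == NodeKind::Add)
    return false;
  out = {BaseKind::Register, &addr};
  return true;
}

// Fallback variant: any node that is not a frame index is accepted as a
// register base, leaving the ADD to be selected as its own instruction.
bool selectAddrWithFallback(const Node &addr, AddrMatch &out) {
  if (addr.kind == NodeKind::FrameIndex) {
    out = {BaseKind::FrameSlot, &addr};
    return true;
  }
  out = {BaseKind::Register, &addr};
  return true;
}
```

For the `buffer[i]` case, the address node is `(add FrameIndex<2>, (load ...))`: the strict matcher rejects it, while the fallback matcher accepts the ADD node itself as the base.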