Question about register allocation

Hi,
I’d like to understand how register allocation works in the case where an instruction is folded into another one. Where in the code would be a good place to start looking?

After ISEL, one of the instructions has another instruction folded into it, which looks like this:

t1: i32,i1,i1,i1,i1 = ADDRR TargetFrameIndex:i32<0>, MOVRI:i32,i1,i1

But during the ‘Assembly Printer’ pass, when emitting the assembly for ADDRR, the assertion at the beginning of getRegisterName() in XXXGenAsmWriter.inc fails because RegNo is 0.
I’d like to know how that happened.

Thanks.

> I'd like to understand how register allocation works in the case where an instruction is folded into another one. Where in the code would be a good place to start looking?

I don't think you've got two instructions folded together. On the
output side that would just be a single instruction.

> After ISEL, one of the instructions has another instruction folded into it, which looks like this
>
> t1: i32,i1,i1,i1,i1 = ADDRR TargetFrameIndex:i32<0>, MOVRI:i32,i1,i1

I've never seen anything displayed like that, and it worries me. I
thought only constants of some kind were inlined, which a MOVRI
instruction almost certainly doesn't qualify as.

You'll want to look at how that's serialized after the DAG phase to
diagnose the issue. To the extent that it's meaningful, it ought to be
entirely equivalent to:

    %v0 = MOVRI [...]
    %t1 = ADDRR frameindex<0>, %v0, [...]

Generally, the "-print-after-all" option for llc is the quickest way
to access this kind of information. The ADDRR instruction should
first show up in the dump taken just after the opaque "DAG ISel" phase
that you've been inspecting.

> But during the 'Assembly Printer' pass, when emitting the assembly for ADDRR, the assertion at the beginning of getRegisterName() in XXXGenAsmWriter.inc fails because RegNo is 0. I'd like to know how that happened.

Assembly printer is very late. In fact it's the very last pass LLVM
runs. Generally any assertion failure there can be traced to bad input
(unless you're literally implementing the AsmPrinter at the time),
which again points to "-print-after-all" as a diagnostic tool.

There's usually something obviously wrong with the final instructions,
which can be traced back to something less-obviously-but-still wrong
before register allocation (often mismatched register classes). The
trick is establishing the first phase where things are incorrect and
working out why.

From what you've said I'd bet on ISel unfortunately (it's a big phase).

Cheers.

Tim.

Thanks Tim,

The -print-after-all option did help. The problem appears after Prologue/Epilogue Insertion:

*** IR Dump After Fast Register Allocator ***:

Machine code for function _Z16test_rotate_leftv: NoPHIs, TracksLiveness, NoVRegs

Frame Objects:

fi#0: size=4, align=4, at location [SP]

$r0 = ADDR %stack.0.a, $r0(tied-def 0), implicit-def dead $cf, implicit-def dead $nf, implicit-def dead $zf, implicit-def dead $of

End machine code for function _Z16test_rotate_leftv.

*** IR Dump After Prologue/Epilogue Insertion & Frame Finalization ***:

Machine code for function _Z16test_rotate_leftv: NoPHIs, TracksLiveness, NoVRegs

Frame Objects:

fi#0: size=4, align=4, at location [SP-4]

$r0 = ADDR $noreg, $r0(tied-def 0), implicit-def dead $cf, implicit-def dead $nf, implicit-def dead $zf, implicit-def dead $of

Somehow ‘%stack.0.a’ became ‘$noreg’

Other than XXXSEFrameLowering::emitPrologue(), what other function is called when entering a function?

Thanks.

Hi Josh,

> $r0 = ADDR %stack.0.a, $r0(tied-def 0), implicit-def dead $cf, implicit-def dead $nf, implicit-def dead $zf, implicit-def dead $of
>
> $r0 = ADDR $noreg, $r0(tied-def 0), implicit-def dead $cf, implicit-def dead $nf, implicit-def dead $zf, implicit-def dead $of
>
> Somehow '%stack.0.a' became '$noreg'

That's normally done by the "eliminateFrameIndex" functions, which do
get called during the prologue/epilogue pass but live in
XYZRegisterInfo.cpp. They get told that "%stack.0.a" lives at SP+N,
and they have to rewrite the specified instruction to encode that
information.

Usually the instruction involved is (or contains) an add-immediate,
and what would be the base of the addition is %stack.whatever at the
start. For example:

   $r0 = ADDri %stack.0.a, 0

Then if eliminateFrameIndex gets called with that instruction and the
variable is actually at sp+12 it'll rewrite it to:

   $r0 = ADDri $sp, 12

modifying two of the operands (the base and the offset).
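
The rewrite described above can be modelled in a few lines. This is
only a toy sketch: ToyOperand, ToyInstr and toyEliminateFrameIndex are
hypothetical names made up for illustration and are not LLVM's
MachineInstr API or any target's real eliminateFrameIndex. It assumes
the immediate operand directly follows the frame-index operand, as in
the ADDri example.

```cpp
// Toy model only: hypothetical stand-ins, not LLVM's classes.
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

struct ToyOperand {
  enum Kind { Register, FrameIndex, Immediate } kind;
  std::string reg;  // register name, used when kind == Register
  int64_t value;    // frame-index number or immediate value
};

struct ToyInstr {
  std::string opcode;
  std::vector<ToyOperand> operands;  // operands[0] is the destination
};

// Rewrites "dst = ADDri %stack.N, imm" into "dst = ADDri $sp, imm + spOffset".
// fiOperandNo is the position of the frame-index operand; the operand after
// it is assumed to be the immediate that can absorb the stack offset.
void toyEliminateFrameIndex(ToyInstr &mi, unsigned fiOperandNo,
                            int64_t spOffset) {
  assert(fiOperandNo + 1 < mi.operands.size() && "no operand for the offset");
  ToyOperand &base = mi.operands[fiOperandNo];
  ToyOperand &offset = mi.operands[fiOperandNo + 1];
  assert(base.kind == ToyOperand::FrameIndex && "expected a frame index");
  assert(offset.kind == ToyOperand::Immediate && "expected an immediate");

  // Replace the abstract stack slot with the stack pointer...
  base = ToyOperand{ToyOperand::Register, "sp", 0};
  // ...and fold the slot's location into the immediate.
  offset.value += spOffset;
}
```

Calling toyEliminateFrameIndex on the ADDri example with operand index
1 and an offset of 12 changes the base to "sp" and the immediate to 12.
An instruction with no immediate operand trips the assertions, because
there is nowhere to put the offset.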

That immediately starts ringing alarm bells with your ADDR, since it
doesn't look like it allows a constant offset. There's no way
eliminateFrameIndex *can* make it compute the local's address, and
really it probably shouldn't be the instruction that %stack.0.a got
folded into in the first place. If I'm reading your example correctly,
what I'd expect to see instead is:

    $rTmp = ADDri %stack.0.a, 0
    $r0 = ADDR $rTmp, $r0(tied-def 0)

Obviously I don't know what you've called your stack-capable
instruction so I'm just guessing ADDri.
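
To make the shape of that expected sequence concrete, here is a toy
sketch that splits a frame index out of an instruction with no
immediate operand after it. Everything in it is hypothetical
(ToyOperand, ToyInstr, toySplitFrameIndex, and the guessed ADDri
opcode). It only illustrates the resulting instructions, not where in
LLVM this should happen; in a real backend the frame index would
normally be kept out of ADDR during instruction selection.

```cpp
// Toy model only: hypothetical stand-ins, not LLVM's classes.
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

struct ToyOperand {
  enum Kind { Register, FrameIndex, Immediate } kind;
  std::string reg;  // register name, used when kind == Register
  int64_t value;    // frame-index number or immediate value
};

struct ToyInstr {
  std::string opcode;
  std::vector<ToyOperand> operands;  // operands[0] is the destination
};

// True if operand i is directly followed by an immediate that could hold
// the stack offset once the frame index is resolved.
bool hasOffsetSlot(const ToyInstr &mi, std::size_t i) {
  return i + 1 < mi.operands.size() &&
         mi.operands[i + 1].kind == ToyOperand::Immediate;
}

// If mi uses a frame index with no offset slot, emit
//   tmpReg = ADDri %stack.N, 0
// in front of it and make mi read tmpReg instead. Only the first such
// operand is handled; more would each need a fresh temporary.
std::vector<ToyInstr> toySplitFrameIndex(const ToyInstr &mi,
                                         const std::string &tmpReg) {
  std::vector<ToyInstr> out;
  ToyInstr rewritten = mi;
  for (std::size_t i = 0; i < rewritten.operands.size(); ++i) {
    ToyOperand &op = rewritten.operands[i];
    if (op.kind != ToyOperand::FrameIndex || hasOffsetSlot(rewritten, i))
      continue;
    out.push_back(ToyInstr{"ADDri",
                           {{ToyOperand::Register, tmpReg, 0},
                            op,
                            {ToyOperand::Immediate, "", 0}}});
    op = ToyOperand{ToyOperand::Register, tmpReg, 0};
    break;
  }
  out.push_back(rewritten);
  return out;
}
```

Given the ADDR from the dump (destination r0, then frame index 0, then
r0) and a temporary named "rTmp", this returns the ADDri that
materialises the address followed by an ADDR that reads rTmp. An ADDri
that already has its immediate comes back unchanged.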

Cheers.

Tim.