Instruction Cleanup Questions

I am working on cleaning up some PPC code generation. Two questions:

1. Which pass is responsible for cleaning up self-moves:
   0x00000000100057c0 <+208>: mr r3,r3

2. Which pass is responsible for cleaning up unconditional jumps that
should be fall-throughs:
   0x0000000010005d88 <+1688>: b 0x10005d8c <._Z11sfoo+1692>
   0x0000000010005d8c <+1692>: ld r3,-32056(r2)

Maybe there are no such passes, but these things appear in optimized
code, and I'm trying to figure out why.

Thanks in advance,
Hal

This should be handled by the MachineBlockPlacement pass (among others). Do you have a reduced test case?
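
For readers following the thread, the cleanup asked about in question 2 amounts to deleting an unconditional branch whose target is the block laid out immediately after it. Below is a minimal Python sketch of that idea. It is a toy model, not how MachineBlockPlacement or AnalyzeBranch is implemented: the block representation, the labels `bb0`/`bb1`, and the function name `drop_fallthrough_branches` are all made up for illustration.

```python
# Toy model, not LLVM code: a function is an ordered list of
# (label, instructions) blocks, and ("b", target) is an unconditional branch.
def drop_fallthrough_branches(blocks):
    cleaned = []
    for i, (label, insts) in enumerate(blocks):
        # Label of the block laid out right after this one, if any.
        next_label = blocks[i + 1][0] if i + 1 < len(blocks) else None
        # A trailing branch to the layout successor is redundant,
        # because execution falls through to that block anyway.
        if insts and insts[-1] == ("b", next_label):
            insts = insts[:-1]
        cleaned.append((label, list(insts)))
    return cleaned


# Hypothetical two-block function mirroring the disassembly above.
blocks = [
    ("bb0", [("cmp", "r3", "r4"), ("b", "bb1")]),        # branch to next block
    ("bb1", [("ld", "r3", "-32056(r2)"), ("b", "bb0")]),  # real back edge
]
result = drop_fallthrough_branches(blocks)
```

In this sketch the branch at the end of `bb0` is removed, while the back edge at the end of `bb1` is kept. A real backend can only do this when its branch-analysis hook describes the terminators accurately.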

Hi Hal,

> I am working on cleaning up some PPC code generation. Two questions:
>
> 1. Which pass is responsible for cleaning up self-moves:
>    0x00000000100057c0 <+208>: mr r3,r3

and the RA should eliminate trivial copies.

Most probably the PPC backend is missing some hooks / descriptions...

Chandler,

Thanks for the pointer! It turns out that this was a problem only in
the context of a local modification (and I could fix it by fixing
AnalyzeBranch and friends). That triggered other problems, but that's
another story...

-Hal

Hi Hal,

>
> I am working on cleaning up some PPC code generation. Two
> questions:
>
> 1. Which pass is responsible for cleaning up self-moves:
> 0x00000000100057c0 <+208>: mr r3,r3
>

> and the RA should eliminate trivial copies.

On PPC, normal moves are encoded as OR instructions where the two
operands being ORed together are the same. These self moves, as it
turns out, come from things like this:

%vreg18<def> = OR8To4 %vreg16, %vreg16; GPRC:%vreg18 G8RC:%vreg16

This is generated from the pattern:

def : Pat<(i32 (trunc G8RC:$in)),
          (OR8To4 G8RC:$in, G8RC:$in)>;

So, as far as RA is concerned, this is a "real" operation (a binary OR
which truncates the result from 64-bit inputs to 32 bits). In
effect, however, this is just a self copy.
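
To make the "moves are ORs" point concrete, here is a small Python sketch that encodes `or RA,RS,RB` using the PowerPC X-form layout (primary opcode 31, extended opcode 444) and checks whether an instruction word is a self-move. The helper names `encode_or`, `encode_mr`, and `is_self_move` are invented for this illustration and do not come from LLVM.

```python
# PowerPC X-form "or RA,RS,RB": primary opcode 31 in the top 6 bits,
# then RS, RA, RB (5 bits each), extended opcode 444, and Rc in bit 0.
def encode_or(ra, rs, rb):
    return (31 << 26) | (rs << 21) | (ra << 16) | (rb << 11) | (444 << 1)


def encode_mr(ra, rs):
    # "mr RA,RS" is the simplified mnemonic for "or RA,RS,RS".
    return encode_or(ra, rs, rs)


def is_self_move(word):
    """True if word is "or rX,rX,rX" with Rc = 0, i.e. "mr rX,rX"."""
    is_or = (word >> 26) == 31 and ((word >> 1) & 0x3FF) == 444
    no_record = (word & 1) == 0  # "or." also updates CR0, so exclude it
    rs = (word >> 21) & 31
    ra = (word >> 16) & 31
    rb = (word >> 11) & 31
    return is_or and no_record and rs == ra == rb


self_move = encode_mr(3, 3)   # the "mr r3,r3" from the disassembly
real_move = encode_mr(31, 1)  # "mr r31,r1", an ordinary move
```

Here `self_move` is the word for `mr r3,r3`, which leaves the value in r3 unchanged, while `real_move` copies r1 into r31.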

How can I fix this?

Thanks again,
Hal

def : Pat<(i32 (trunc G8RC:$in)),
          (EXTRACT_SUBREG G8RC:$in, sub_32)>;

This exposes the copies to the register coalescer and VirtRegMap::rewrite() which eliminates identity copies. You can probably lose the OR8To4 pseudo after that.
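
A toy Python model of why the pattern change matters, under stated assumptions: an instruction is a hypothetical `(opcode, dst, src)` tuple, the 32-bit and 64-bit views of a register share one physical name, and the `rewrite` function below is a stand-in for illustration, not the actual VirtRegMap::rewrite() code. The point is only that a rewriter can drop an instruction it knows to be a copy, but has to keep an opaque pseudo even when source and destination end up in the same register.

```python
# Toy model, not LLVM code. Each instruction is (opcode, dst, src),
# and `assignment` maps virtual registers to physical registers.
def rewrite(insts, assignment):
    out = []
    for opcode, dst, src in insts:
        pdst, psrc = assignment[dst], assignment[src]
        # Only instructions known to be copies may be deleted when the
        # source and destination were assigned the same register.
        if opcode == "COPY" and pdst == psrc:
            continue
        out.append((opcode, pdst, psrc))
    return out


# Both virtual registers from the example were assigned r3.
assignment = {"%vreg16": "r3", "%vreg18": "r3"}

# Opaque pseudo: survives rewriting and is later emitted as "mr r3,r3".
with_pseudo = rewrite([("OR8To4", "%vreg18", "%vreg16")], assignment)

# Sub-register extract lowered to a copy: recognized and dropped.
with_copy = rewrite([("COPY", "%vreg18", "%vreg16")], assignment)
```

With the pseudo, the self-move stays in the output; with a plain copy, the instruction list comes back empty.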

I assume there will be no problems with 32-bit instructions using the low part of 64-bit registers without clearing the high part first.

/jakob

What is this pattern trying to achieve? Is the OR actually necessary at all, or can you use an EXTRACT_SUBREG instead?

--Owen

> On PPC, normal moves are encoded as OR instructions where the two
> operands being ORed together are the same. These self moves, as it
> turns out, come from things like this:
>
> %vreg18<def> = OR8To4 %vreg16, %vreg16; GPRC:%vreg18 G8RC:%vreg16
>
> This is generated from the pattern:
>
> def : Pat<(i32 (trunc G8RC:$in)),
> (OR8To4 G8RC:$in, G8RC:$in)>;
>
> So, as far as RA is concerned, this is a "real" operation (a binary
> OR which truncates the result to 32-bits (from 64-bit inputs)). In
> effect, however, this is just a self copy.
>
> How can I fix this?

> def : Pat<(i32 (trunc G8RC:$in)),
>           (EXTRACT_SUBREG G8RC:$in, sub_32)>;
>
> This exposes the copies to the register coalescer and
> VirtRegMap::rewrite() which eliminates identity copies. You can
> probably lose the OR8To4 pseudo after that.

Thanks!

> I assume there will be no problems with 32-bit instructions using the
> low part of 64-bit registers without clearing the high part first.

I don't think there will be problems: the high part of the register is
not separately addressable. I can certainly imagine ways in which this
could cause problems, but I think that in practice it is okay because
the 32-bit adds, subs, cmps, etc. are all separate instructions. In any
case, I did not write this, so I'm hoping that whoever did thought
this through ;)

-Hal

>
> On PPC, normal moves are encoded as OR instructions where the two
> operands being ORed together are the same. These self moves, as it
> turns out, come from things like this:
>
> %vreg18<def> = OR8To4 %vreg16, %vreg16; GPRC:%vreg18 G8RC:%vreg16
>
> This is generated from the pattern:
>
> def : Pat<(i32 (trunc G8RC:$in)),
> (OR8To4 G8RC:$in, G8RC:$in)>;
>
> So, as far as RA is concerned, this is a "real" operation (a binary
> OR which truncates the result to 32-bits (from 64-bit inputs)). In
> effect, however, this is just a self copy.
>
> How can I fix this?

> What is this pattern trying to achieve? Is the OR actually necessary
> at all, or can you use an EXTRACT_SUBREG instead?

I can, thanks!

-Hal