[Q] x86 peephole deficiency

Hi all,

I am slowly working on a SwitchInst optimizer (http://llvm.org/PR8125)
and now I am running into a deficiency of the x86
peephole optimizer (or jump-threader?). Here is what I get:

         andl $3, %edi
         je .LBB0_4
# BB#2: # %nz
                                         # in Loop: Header=BB0_1 Depth=1
         cmpl $2, %edi
         je .LBB0_6
# BB#3: # %nz.non-middle
                                         # in Loop: Header=BB0_1 Depth=1
         cmpl $2, %edi
         jbe .LBB0_4
# BB#5: # %sw.bb6
         ret

The second 'cmpl' is totally redundant. Which pass is
(or would be) in charge of removing it?
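
To make the redundancy concrete, here is a small Python model of the two flags involved. It is only a sketch, not LLVM code: the helper names cmpl, je and jbe are made up for illustration, and it assumes that conditional jumps only read EFLAGS and that nothing between the two compares writes %edi.

```python
MASK32 = 0xFFFFFFFF

def cmpl(imm, reg):
    """Model AT&T `cmpl $imm, %reg`: compute reg - imm, return (ZF, CF)."""
    reg &= MASK32
    imm &= MASK32
    zf = reg == imm   # ZF: the operands are equal
    cf = reg < imm    # CF: unsigned borrow, i.e. reg is below imm
    return zf, cf

def je(flags):
    zf, _ = flags
    return zf         # taken when ZF is set

def jbe(flags):
    zf, cf = flags
    return cf or zf   # taken when CF or ZF is set

# After `andl $3, %edi` and the fall-through of the first `je`,
# %edi can only be 1, 2 or 3.
for edi in (1, 2, 3):
    first = cmpl(2, edi)       # cmpl in block %nz
    if je(first):
        continue               # jumps to .LBB0_6, %nz.non-middle not reached
    second = cmpl(2, edi)      # cmpl in block %nz.non-middle
    assert second == first     # same operands, no flag write in between
```

Since 'je' leaves the flags untouched, the 'jbe' could simply consume the flags produced by the first compare.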

Cheers,

  Gabor

MachineCSE should be in charge of zapping it.

-Chris

Hi Chris,

I had a look into MachineCSE, but it looks MBB-oriented to me.
The above problem is an inter-block one. Also MCSE seems
to perform value numbering on virtual/physical registers, which
does not map very well to status register bits that are implicitly
defined.
Is there any chance of recasting this issue as a target-independent
(but cmp-specific) peephole problem that just looks into
predecessor blocks and applies (target-hook-like) subsumption
checks for 'cmp' instructions?
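
A rough sketch of what such a peephole could look like, written in Python over a toy instruction model rather than LLVM's MachineInstr. Everything here is hypothetical: the Instr and Block classes, the is_subsumed hook, the opcode sets, and the block name "header" for the block holding the 'andl'. It only handles a block with a single predecessor and only treats an identical compare as subsuming.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Instr:
    opcode: str
    operands: tuple = ()

WRITES_FLAGS = {"cmpl", "andl"}          # je/jbe only read EFLAGS
WRITES_LAST_OPERAND = {"andl", "movl"}   # AT&T syntax: destination comes last

@dataclass
class Block:
    name: str
    instrs: list
    preds: list = field(default_factory=list)

def is_subsumed(cmp, earlier):
    """Hypothetical target hook: only an identical compare subsumes `cmp`."""
    return earlier.opcode == cmp.opcode and earlier.operands == cmp.operands

def leading_cmp_is_redundant(block, blocks):
    """True if the block starts with a cmp already computed by its sole predecessor."""
    if not block.instrs or block.instrs[0].opcode != "cmpl":
        return False
    if len(block.preds) != 1:
        return False                     # stay conservative at merge points
    cmp = block.instrs[0]
    # Scan the predecessor backwards for the last writer of EFLAGS.
    for instr in reversed(blocks[block.preds[0]].instrs):
        if instr.opcode in WRITES_FLAGS:
            return is_subsumed(cmp, instr)
        if instr.opcode in WRITES_LAST_OPERAND and instr.operands[-1] in cmp.operands:
            return False                 # a compare operand was overwritten
    return False                         # flags come from further up, give up

# The blocks from the listing above (CFG edges are assumed).
blocks = {
    "header": Block("header", [Instr("andl", ("$3", "%edi")),
                               Instr("je", (".LBB0_4",))]),
    "nz": Block("nz", [Instr("cmpl", ("$2", "%edi")),
                       Instr("je", (".LBB0_6",))], preds=["header"]),
    "nz.non-middle": Block("nz.non-middle", [Instr("cmpl", ("$2", "%edi")),
                                             Instr("jbe", (".LBB0_4",))], preds=["nz"]),
}
```

With these blocks, the check accepts the compare in %nz.non-middle and rejects the one in %nz, whose predecessor last wrote the flags with 'andl'.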

I am thankful for any hint,

cheers,

  Gabor

I think that extending MachineCSE to do a simple dominator tree walk with llvm::ScopedHashTable would make sense.

Status register bits should be handled just like any other physreg. On x86, for example, a compare is a def of the EFLAGS physreg. On PPC, the condition code register is actually a vreg iirc.
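
A minimal sketch of that idea in Python, using a toy model instead of LLVM's classes. ScopedTable is a tiny stand-in for llvm::ScopedHashTable, and find_redundant_compares, the tuple encoding of instructions, the CFG edges and the block name "header" are all assumptions made for illustration. Because EFLAGS is a physical register rather than an SSA value, the sketch only lets an available compare flow into a block whose single CFG predecessor is its dominator-tree parent. That restriction is there to keep the toy sound; it is not a description of what MachineCSE does.

```python
class ScopedTable:
    """Tiny stand-in for llvm::ScopedHashTable: a stack of dicts."""
    def __init__(self):
        self.scopes = [{}]
    def push(self):
        self.scopes.append({})
    def pop(self):
        self.scopes.pop()
    def insert(self, key, value):
        self.scopes[-1][key] = value
    def lookup(self, key):
        for scope in reversed(self.scopes):   # innermost scope wins
            if key in scope:
                return scope[key]
        return None

JUMPS = {"je", "jbe", "ret"}   # these do not write EFLAGS or registers

def find_redundant_compares(blocks, preds, domtree, root):
    """Walk the dominator tree and return [(block, index)] of redundant cmpl."""
    avail = ScopedTable()
    redundant = []

    def visit(name, parent):
        avail.push()
        if parent is not None and preds[name] != [parent]:
            avail.insert("EFLAGS", None)      # other paths may clobber the flags
        for i, (opcode, operands) in enumerate(blocks[name]):
            if opcode in JUMPS:
                continue
            if opcode == "cmpl":
                expr = (opcode, operands)
                if avail.lookup("EFLAGS") == expr:
                    redundant.append((name, i))
                else:
                    avail.insert("EFLAGS", expr)
            else:
                avail.insert("EFLAGS", None)  # conservatively kill on any other def
        for child in domtree.get(name, []):
            visit(child, name)
        avail.pop()                           # leaving the scope drops its entries

    visit(root, None)
    return redundant

blocks = {
    "header": [("andl", ("$3", "%edi")), ("je", (".LBB0_4",))],
    "nz": [("cmpl", ("$2", "%edi")), ("je", (".LBB0_6",))],
    "nz.non-middle": [("cmpl", ("$2", "%edi")), ("jbe", (".LBB0_4",))],
    "sw.bb6": [("ret", ())],
}
preds = {"header": [], "nz": ["header"],
         "nz.non-middle": ["nz"], "sw.bb6": ["nz.non-middle"]}
domtree = {"header": ["nz"], "nz": ["nz.non-middle"],
           "nz.non-middle": ["sw.bb6"]}
result = find_redundant_compares(blocks, preds, domtree, "header")
```

On this input, result contains the single entry ("nz.non-middle", 0), i.e. the second 'cmpl' from the listing. The blocks .LBB0_4 and .LBB0_6 are not shown in the original output, so they are left out of the model.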

-Chris

It already does that. MachineCSE is a global pass. It’s not a local CSE pass.

Evan