Hi all,
I am slowly working on a SwitchInst optimizer (http://llvm.org/PR8125)
and now I am running into a deficiency of the x86
peephole optimizer (or jump-threader?). Here is what I get:
        andl    $3, %edi
        je      .LBB0_4
# BB#2: # %nz
# in Loop: Header=BB0_1 Depth=1
        cmpl    $2, %edi
        je      .LBB0_6
# BB#3: # %nz.non-middle
# in Loop: Header=BB0_1 Depth=1
        cmpl    $2, %edi
        jbe     .LBB0_4
# BB#5: # %sw.bb6
        ret
The second 'cmpl' is totally redundant. Which pass is
(or would be) in charge of removing it?
Cheers,
Gabor
MachineCSE should be in charge of zapping it.
-Chris
Hi Chris,
I had a look at MachineCSE, but it looks MBB-oriented (local to a
single MachineBasicBlock), whereas the above problem is an
inter-block one. Also MCSE seems
to perform value numbering on virtual/physical registers, which
does not map very well to status register bits that are implicitly
defined.
Is there any chance of recasting this issue as a target-independent
(but cmp-specific) peephole problem that just looks into
predecessor blocks and applies (target-hook-like) subsumption
checks for 'cmp' instructions?
I would be thankful for any hints,
cheers,
Gabor
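To make the proposed predecessor-block check concrete, here is a minimal, standalone C++ sketch. It does not use any LLVM API: Inst, Block, subsumes and the other names are hypothetical, the register lists are written out by hand, and the CFG is an assumption (each of the two later blocks is taken to have the block before it as its only predecessor).

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy model of a machine instruction: its text plus the registers it
// reads (uses) and writes (defs). All names here are hypothetical.
struct Inst {
    std::string text;
    std::vector<std::string> uses;
    std::vector<std::string> defs;
    bool isCompare;
};

struct Block {
    std::vector<Inst> insts;
    std::vector<int> preds;  // indices of CFG predecessors
};

static bool writesAny(const Inst &I, const std::vector<std::string> &Regs) {
    for (const std::string &D : I.defs)
        for (const std::string &R : Regs)
            if (D == R) return true;
    return false;
}

// Stand-in for a target hook: do the flags left by A make B unnecessary?
// This toy version only accepts textually identical compares.
static bool subsumes(const Inst &A, const Inst &B) {
    return A.isCompare && B.isCompare && A.text == B.text;
}

// True if block B starts with a compare that its sole predecessor has
// already computed, with no write to the inputs or flags in between.
bool isRedundantLeadingCompare(const std::vector<Block> &Blocks, int B) {
    const Block &Cur = Blocks[B];
    if (Cur.preds.size() != 1 || Cur.insts.empty() || !Cur.insts[0].isCompare)
        return false;
    int P = Cur.preds[0];
    assert(P >= 0 && P < (int)Blocks.size());
    if (P == B) return false;  // be conservative about self-loops
    const Inst &Cmp = Cur.insts[0];
    // A write to any input or output of the compare invalidates the match.
    std::vector<std::string> Watched = Cmp.uses;
    Watched.insert(Watched.end(), Cmp.defs.begin(), Cmp.defs.end());
    const std::vector<Inst> &PI = Blocks[P].insts;
    for (std::size_t i = PI.size(); i-- > 0;) {  // scan predecessor backwards
        if (subsumes(PI[i], Cmp)) return true;
        if (writesAny(PI[i], Watched)) return false;
    }
    return false;  // nothing found in the predecessor: give up
}

int removeRedundantCompares(std::vector<Block> &Blocks) {
    int Removed = 0;
    for (int B = 0; B < (int)Blocks.size(); ++B)
        if (isRedundantLeadingCompare(Blocks, B)) {
            Blocks[B].insts.erase(Blocks[B].insts.begin());
            ++Removed;
        }
    return Removed;
}

// The listing from this thread, with an assumed straight-line CFG.
std::vector<Block> threadExample() {
    Inst And  = {"andl $3, %edi", {"EDI"}, {"EDI", "EFLAGS"}, false};
    Inst Je4  = {"je .LBB0_4", {"EFLAGS"}, {}, false};
    Inst Cmp  = {"cmpl $2, %edi", {"EDI"}, {"EFLAGS"}, true};
    Inst Je6  = {"je .LBB0_6", {"EFLAGS"}, {}, false};
    Inst Jbe4 = {"jbe .LBB0_4", {"EFLAGS"}, {}, false};
    Block Entry = {{And, Je4}, {}};
    Block Nz = {{Cmp, Je6}, {0}};
    Block NonMiddle = {{Cmp, Jbe4}, {1}};
    return {Entry, Nz, NonMiddle};
}
```

Calling removeRedundantCompares on threadExample() drops only the compare in the third block. The one in %nz stays, because the backward scan of its predecessor hits the andl (which writes both %edi and the flags) before finding any matching compare.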
I think that extending MachineCSE to do a simple dominator tree walk with llvm::ScopedHashTable would make sense.
Status register bits should be handled just like any other physreg. On x86, this is a def of EFLAGS physreg for example. On PPC, the condition code register is actually a vreg iirc.
-Chris
It already does that. MachineCSE is a global pass. It’s not a local CSE pass.
Evan
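The dominator-tree walk with a scoped hash table mentioned above can be sketched roughly as follows. This is a toy, not MachineCSE's actual code: ScopedTable is a simple stand-in for llvm::ScopedHashTable, and Inst, Block, cseWalk and dominatorExample are hypothetical names. It also assumes every value lives in a unique virtual register, so the extra clobber checks that a physreg such as EFLAGS would need are left out.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Minimal stand-in for a scoped hash table: a stack of maps. Entries
// added in a scope disappear when that scope is popped.
class ScopedTable {
    std::vector<std::map<std::string, std::string>> Scopes;
public:
    void push() { Scopes.emplace_back(); }
    void pop() { assert(!Scopes.empty()); Scopes.pop_back(); }
    void insert(const std::string &K, const std::string &V) {
        assert(!Scopes.empty());
        Scopes.back()[K] = V;
    }
    // Searches innermost scope first; returns nullptr if K is not visible.
    const std::string *lookup(const std::string &K) const {
        for (auto It = Scopes.rbegin(); It != Scopes.rend(); ++It) {
            auto F = It->find(K);
            if (F != It->end()) return &F->second;
        }
        return nullptr;
    }
};

struct Inst {
    std::string def;   // unique virtual register defined by this instruction
    std::string expr;  // opcode and operands, used as the hash key
};

struct Block {
    std::vector<Inst> insts;
    std::vector<int> domChildren;  // children in the dominator tree
};

// Visit block B and everything it dominates. An expression already visible
// in the table was computed in a dominating position, so the later copy is
// dropped and recorded in Replaced (removed vreg -> earlier vreg).
// A real pass would also rewrite uses of the removed register.
void cseWalk(std::vector<Block> &Blocks, int B, ScopedTable &Avail,
             std::map<std::string, std::string> &Replaced) {
    Avail.push();
    std::vector<Inst> Kept;
    for (const Inst &I : Blocks[B].insts) {
        if (const std::string *Prev = Avail.lookup(I.expr)) {
            Replaced[I.def] = *Prev;
        } else {
            Avail.insert(I.expr, I.def);
            Kept.push_back(I);
        }
    }
    Blocks[B].insts = Kept;
    for (int C : Blocks[B].domChildren)
        cseWalk(Blocks, C, Avail, Replaced);
    Avail.pop();  // expressions from B are not visible to B's siblings
}

// Block 0 dominates 1 and 3; block 1 dominates 2.
std::vector<Block> dominatorExample() {
    Block Entry     = {{{"%v1", "and %v0, 3"}}, {1, 3}};
    Block Nz        = {{{"%v2", "cmp %v1, 2"}}, {2}};
    Block NonMiddle = {{{"%v3", "cmp %v1, 2"}}, {}};
    Block Sibling   = {{{"%v4", "cmp %v1, 2"}}, {}};
    return {Entry, Nz, NonMiddle, Sibling};
}
```

Walking dominatorExample() from block 0 removes the compare in block 2, since block 1 dominates it and holds the same expression. The identical compare in block 3 is kept: block 1 does not dominate block 3, and its scope has already been popped by the time block 3 is visited.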