Folding zext from i1 into PHI nodes with only zwo incoming values.

Hi,

AFAICT there are two places where zext instructions may get folded into PHI nodes. One is FoldPHIArgZextsIntoPHI and the other is the more generic FoldPHIArgOpIntoPHI. Now, the former only handles PHIs with more than 2 incoming values, while the latter only handles casts where the source type is legal.

This means that for an PHI node with two incoming i8 values, both resulting from zext i1 * to i8 instructions, both of these functions will refuse to actually fold the zext into the PHI, while the same operation would be performed if there were three or more arms. We noticed this because we saw a optimization regression when a function got specialized and the PHI node only had two incoming values left.

Since I’m not fully aware of any implications this might have, I wonder what is the right way to fix this? Looking at FoldPHIArgZextsIntoPHI, it seems that making the check for ShouldChangeType in FoldPHIArgOpIntoPHI conditional on the cast instruction not being a zext instruction. Does that sound right, or am I missing something here?

Thanks

Björn

I’m looking at a similar problem in:
https://reviews.llvm.org/D28625

Does that patch make any difference on the cases that you are looking at?

Instead of avoiding ShouldChangeType with zext as a special-case opcode, it might be better to treat i1 as a special-case type. There’s no way to avoid i1 in IR, so we might as well allow transforming to that type?

I’m not sure yet, but there’s a chance that change might induce problems (infinite loops) with this:
https://github.com/llvm-mirror/llvm/blob/master/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp#L374

The foldPhiOpArgIntoPhi looks like a few special cases of the general transform (that is applicable as a rewrite rule):

phi(F(1,2,…), F(A,B,…),…) == F(phi(1,A,…), phi(1,B,…), …)

This follows directly from phis being conditional selects.

in code:

a = b + c
b = d + e
result = phi(a, b)

is equivalent to

tmp1 = phi(b, d)

tmp2 = phi(c, e)
result = tmp1 + tmp2

this is true for any number of operators and operations.

The downside is fixpointing this rule (and even probably the one being used in foldPhiOpArgIntoPhi) is that it may require exponential applications of the rule.

Hi Sanjay,

unfortunately that patch does not help in my case. Here’s the IR that fails to get fully optimized:

target datalayout = “e-m:e-i64:64-f80:128-n8:16:32:64-S128”
target triple = “x86_64-unknown-linux-gnu”

define fastcc zeroext i1 @testfunc(i8** noalias nocapture readonly dereferenceable(8)) unnamed_addr {
entry-block:
%1 = load i8*, i8** %0, align 8
%2 = icmp ne i8* %1, null
%.mux = zext i1 %2 to i8
br i1 %2, label %bb10, label %bb15

bb10: ; preds = %entry-block
%3 = load i8, i8* %1, align 1
%4 = icmp eq i8 %3, 42
%.1 = zext i1 %4 to i8
br label %bb15

bb15: ; preds = %entry-block, %bb10
%_0.1 = phi i8 [ %.mux, %entry-block ], [ %.1, %bb10 ]
%5 = icmp ne i8 %_0.1, 0
ret i1 %5
}

The zext instructions should be folded into the phi, and then the new zext gets removed along with the icmp instruction at the end.

Björn

Hi Björn and Daniel,

FoldPHIArgOpIntoPHI() is for unary ops (casts), but it calls FoldPHIArgBinOpIntoPHI() for binops which takes care of almost everything else?

My minimal patch idea is to ease the restriction in ShouldChangeType because i1 is special. I tried the patch below, and it works on the example…and nothing in ‘make check’ failed. :slight_smile:

Index: lib/Transforms/InstCombine/InstructionCombining.cpp

Hi Sanjay,

FWIW: It seems like it tries to restrict itself to certain cases (single
use) but i can still pretty easily produce a testcase where it would
require an exponential number of applications.

That said, hasn't triggered so far for folks, so \_(ツ)_/¯

Hi Sanjay,

My minimal patch idea is to ease the restriction in ShouldChangeType
because i1 is special. I tried the patch below, and it works on the
example...and nothing in 'make check' failed. :slight_smile:

Yeah, that would work for me as well, I just wasn't sure about the
implications that has. Simply making FoldPHIArgOpIntoPHI act like
FoldPHIArgZextsIntoPHI seemed like the safer option to me, but I wanted
feedback on it before creating a PR. Do you want to go ahead with that
minimal approach and create a PR yourself?

Your idea probably has the minimal impact while still fixing your problem
case...although the relationships between FoldPHIArgOpIntoPHI /
FoldPHIArgZextsIntoPHI / FoldOpIntoPhi are confusing.

I think a patch for ShouldChangeType makes sense on its own (and makes any
direct changes to the FoldPHI* functions unnecessary for your example).
I'll try some more experiments with that patch to look for unintended
consequences.

When you say "PR", do you mean a bugzilla report or a patch proposal on
Phab?

Hi Sanjay,

My minimal patch idea is to ease the restriction in ShouldChangeType
because i1 is special. I tried the patch below, and it works on the
example...and nothing in 'make check' failed. :slight_smile:

Yeah, that would work for me as well, I just wasn't sure about the
implications that has. Simply making FoldPHIArgOpIntoPHI act like
FoldPHIArgZextsIntoPHI seemed like the safer option to me, but I wanted
feedback on it before creating a PR. Do you want to go ahead with that
minimal approach and create a PR yourself?

Your idea probably has the minimal impact while still fixing your problem
case...although the relationships between FoldPHIArgOpIntoPHI /
FoldPHIArgZextsIntoPHI / FoldOpIntoPhi are confusing.

I think a patch for ShouldChangeType makes sense on its own (and makes any
direct changes to the FoldPHI* functions unnecessary for your example).
I'll try some more experiments with that patch to look for unintended
consequences.

OK, I'll wait for your results then

When you say "PR", do you mean a bugzilla report or a patch proposal on
Phab?

I meant a patch proposal on Phab, I'm too used to GitHub terminology by
now, sorry.

Cheers,
Björn

I don’t see any downside for the other users of shouldChangeType(), so I’ve cleaned up the patch and posted it for review:
https://reviews.llvm.org/D29336