missing optimization for icmps in induction variables?

Hi all,

I'm trying to get llvm to optimize away the %cmp to true in

define i32 @foo(i32* %array, i32* %length_ptr, i32 %init) {
entry:
  %length = load i32* %length_ptr, !range !0
  %len.sub.1 = sub i32 %length, 1
  %upper = icmp slt i32 %init, %len.sub.1
  br i1 %upper, label %loop, label %exit

loop:
  %civ = phi i32 [ %init, %entry ], [ %civ.inc, %latch ]
  %civ.inc = add i32 %civ, 1
  %cmp = icmp slt i32 %civ.inc, %length
  br i1 %cmp, label %latch, label %break

latch:
  store i32 0, i32* %array
  %check = icmp slt i32 %civ.inc, %len.sub.1
  br i1 %check, label %loop, label %break

break:
  ret i32 %civ.inc

exit:
  ret i32 42
}

!0 = !{i32 0, i32 2147483647}

One way to prove "%cmp == true" is in two steps:

1. notice that on both the entry and the backedge, %civ is known to
    be less than %len.sub.1, which is not i32_signed_max. This means
    %civ.inc is an "add nsw".

2. on both the entry and backedge, we know "%civ `slt` %len.sub.1".
    This implies "(%civ nsw+ 1) `slt` (%len.sub.1 nsw+ 1)" ==>
    "%civ.inc `slt` %length".

Currently neither of these happens (i.e. even if I make transformation
(1) manually, (2) does not kick in).
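For what it's worth, the two-step argument can be checked exhaustively at a smaller bit width. Here is a quick brute-force sketch in Python (i8 standing in for i32, so 127 plays the role of i32_signed_max; the `wrap()` helper, which models two's-complement wraparound, is mine, not anything in LLVM):

```python
# Brute-force check of the two-step argument at i8 width
# (127 stands in for i32_signed_max, so the search is small).
BITS = 8
SMIN, SMAX = -(1 << (BITS - 1)), (1 << (BITS - 1)) - 1

def wrap(x):
    """Reduce x to a signed BITS-bit two's-complement value."""
    return ((x - SMIN) % (1 << BITS)) + SMIN

for length in range(0, SMAX + 1):       # !range !0 says %length is in [0, SMAX)
    len_sub_1 = wrap(length - 1)        # %len.sub.1
    for civ in range(SMIN, SMAX + 1):
        if civ < len_sub_1:             # known on entry and on the backedge
            # step 1: civ < len.sub.1 <= SMAX - 1, so civ + 1 cannot wrap (nsw)
            assert wrap(civ + 1) == civ + 1
            # step 2: civ + 1 < len.sub.1 + 1 == length, i.e. %cmp is true
            assert wrap(civ + 1) < length

print("two-step argument holds for all", BITS, "bit values")
```

No assertion fires, which at least rules out a corner-case mistake (e.g. %length == 0, where %len.sub.1 is -1) in the reasoning at this width.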

Is the above reasoning correct? If it is, where is the right place to
add this logic? I'm thinking ScalarEvolution (so that it gets
picked up by SimplifyIndVar), but maybe there is a more idiomatic
place? The case above is a simplified form of my real workload, which
involves smin and smax expressions, so the implementation has to be
easily generalizable to such cases.

Thanks,
-- Sanjoy

Sanjoy Das wrote:

Before reading your two steps I was going to suggest jump threading. Jump threading is where we optimize redundant tests across blocks that feed into branches (block A tests property X and branches to block B, which also tests property X). However, jump threading is powered by LazyValueInfo, which I don't think is suited to the sort of reasoning in your two steps.
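As a toy model of that pattern, the following Python sketch (a hypothetical tuple-based CFG and `thread_jumps()` helper, nothing like LLVM's actual JumpThreading pass) rewires the A->B edge when B re-tests the very condition A already established:

```python
# Minimal illustration of the jump-threading idea: if block A tests X
# and branches to B, which tests the same X, the A->B true edge can be
# rewired past B's test.  Toy CFG encoding, not LLVM's implementation.
cfg = {
    "A": ("br", "x < n", "B", "exit"),     # A tests X
    "B": ("br", "x < n", "body", "exit"),  # B re-tests the same X
}

def thread_jumps(cfg):
    out = dict(cfg)
    for blk, (kind, cond, t, f) in cfg.items():
        succ = cfg.get(t)
        # Along the true edge blk->t, cond is known true; if t branches
        # on the same condition, jump straight to t's true successor.
        if succ and succ[0] == "br" and succ[1] == cond:
            out[blk] = (kind, cond, succ[2], f)
    return out

threaded = thread_jumps(cfg)
assert threaded["A"] == ("br", "x < n", "body", "exit")
```

The sketch only threads syntactically identical conditions; the point is that Sanjoy's case needs the *implication* between two different conditions, which is exactly what this kind of matching cannot see.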

One option is GVN. GVN does have x < y expressions but it doesn't try to deduce nuw/nsw bits. It might be possible to add that, but it isn't immediately obvious to me how. GVN also does path-sensitive expression commoning.

Nick

Hi Nick,

I checked in something towards (1) yesterday -- http://reviews.llvm.org/D6748

I was under the impression that (2) is exactly the kind of predicate
ScalarEvolution::isKnownPredicate is designed to solve (using
isImpliedCondXXX or something like that). Is there a reason to prefer
GVN over that?

Sanjoy Das wrote:

> Hi Nick,
>
> I checked in something towards (1) yesterday -- http://reviews.llvm.org/D6748

Ah, I missed that. Catching up on email.

> I was under the impression that (2) is exactly the kind of predicate
> ScalarEvolution::isKnownPredicate is designed to solve (using
> isImpliedCondXXX or something like that). Is there a reason to prefer
> GVN over that?

If you think of the problem as a loop problem then yes ScalarEvolution::isKnownPredicate makes sense. If you think of it as a redundancy elimination problem then GVN makes sense. Does this sort of problem occur outside of loops?

> If you think of the problem as a loop problem then yes
> ScalarEvolution::isKnownPredicate makes sense. If you think of it as a
> redundancy elimination problem then GVN makes sense. Does this sort of
> problem occur outside of loops?

I've only observed them happen inside loops, but that is only because
I've been looking at loops. :) At least in theory, if LLVM's GVN can
be made to work in a path-sensitive manner, then there is no need to do
such tricks using SCEV.

I've only cursorily glanced at GVN.cpp, but given that we want LLVM
to conclude that '(x nsw+ 5) SLT L' implies 'x SLT (L nsw- 5)', will
the value numbering for cmp expressions have to become richer? The
operands to the icmps themselves won't value-number to the same value,
because they're not the same; but maybe there is some canonical form
we want to reduce the icmp expressions to before numbering them?
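To make the question concrete, here is a toy Python sketch of the canonicalization I have in mind (the tuple encoding and the `canonicalize()` helper are hypothetical, not GVN.cpp's actual expression representation): rewrite '(x nsw+ C) slt L' into 'x slt (L nsw- C)' before numbering, so the two icmps end up with the same key:

```python
# Toy sketch of canonicalizing signed comparisons before value
# numbering.  Expressions are tuples: ("addnsw", x, c) stands for an
# "add nsw" of a constant c, and an icmp is ("slt", lhs, rhs).
# This is NOT how GVN.cpp represents expressions.

def canonicalize(icmp):
    """Move an nsw constant addend from the LHS to the RHS:
    (x nsw+ c) slt L  ==>  x slt (L nsw- c)."""
    op, lhs, rhs = icmp
    if op == "slt" and isinstance(lhs, tuple) and lhs[0] == "addnsw":
        _, x, c = lhs
        return ("slt", x, ("subnsw", rhs, c))
    return icmp

# Two syntactically different tests of the same property...
a = canonicalize(("slt", ("addnsw", "x", 5), "L"))
b = ("slt", "x", ("subnsw", "L", 5))
# ...now value-number to the same key.
assert a == b
```

The open question is whether such a rewrite is always sound to apply eagerly (it leans on the nsw flags from step (1)), and whether GVN's leader tables can host it without disturbing the rest of the numbering.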

-- Sanjoy