Handling post-inc users in LSR

Hello,

For a very simple loop where all IV users are post-inc users, I observed redundant add instructions in AArch64.

From the LSR debug output, I can see that the initial formula for the icmp is the one that was transformed to post-inc form in OptimizeLoopTermCond() and later expanded in post-inc mode. Since the icmp is already a post-inc user, I hacked LSR to prevent the icmp from being transformed to post-inc form in OptimizeLoopTermCond() before the initial formulae are determined. With this hack I was able to remove the redundant add instructions, but I doubt whether it makes sense to prevent a loop terminating condition from being changed to post-inc form when it is already a post-inc user.

# Input IR :

define void @foo(i32 %n, i32* %P) {
entry:
   %cmp7 = icmp sgt i32 %n, 1
   br i1 %cmp7, label %for.body.preheader, label %for.end

for.body.preheader: ; preds = %entry
   %n_sext = sext i32 %n to i64
   br label %for.body

for.body:
   %K.in = phi i64 [ %n_sext, %for.body.preheader ], [ %K, %for.body ]
   %K = add i64 %K.in, 1

   %StoredAddr = getelementptr i32, i32* %P, i64 %K
   %StoredValue = trunc i64 %K to i32
   store volatile i32 %StoredValue, i32* %StoredAddr
   %cmp = icmp sgt i64 %K, 1
   br i1 %cmp, label %for.body, label %for.end

for.end:
   ret void
}

# Output in AArch64, where you can see redundant add instructions for the stored value, the store address, and the cmp :

foo:
  .cfi_startproc
// BB#0:
  cmp w0, #2
  b.lt .LBB0_3
// BB#1:
  sxtw x9, w0
  add w8, w0, #1
.LBB0_2:
  add x10, x1, x9, lsl #2
  add x9, x9, #1
  str w8, [x10, #4]
  add w8, w8, #1
  cmp x9, #1
  b.gt .LBB0_2
.LBB0_3:
  ret
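
To make the redundancy easier to see, here is a rough C sketch of the two loop shapes. It is only an illustration, not LSR output: the function names and the `limit` parameter are hypothetical, and the exit test is changed to `k < limit` (the original compares K against 1) so that the loop terminates and can be run. The first function mirrors the three adds in the listing above; the second is the shape one would hope for when every user reads the incremented value.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the emitted loop: the address, the stored value
 * and the compare each get their own add. Assumes start < limit. */
static void emitted_shape(int32_t *P, int64_t start, int64_t limit) {
    assert(start < limit);
    int64_t k_pre = start;              /* x9: pre-increment IV           */
    int32_t val = (int32_t)start + 1;   /* w8: separate IV for the value  */
    do {
        int32_t *addr = P + k_pre;      /* add x10, x1, x9, lsl #2        */
        k_pre += 1;                     /* add x9, x9, #1                 */
        addr[1] = val;                  /* str w8, [x10, #4]              */
        val += 1;                       /* add w8, w8, #1                 */
    } while (k_pre < limit);            /* bounded stand-in for cmp/b.gt  */
}

/* Hypothetical post-inc shape: one add, and the address, the stored
 * value and the compare all read the incremented k. */
static void post_inc_shape(int32_t *P, int64_t start, int64_t limit) {
    assert(start < limit);
    int64_t k = start;
    do {
        k += 1;
        P[k] = (int32_t)k;
    } while (k < limit);
}
```

Both functions write the same elements, so the extra induction variables in the first one carry no additional information; they only cost instructions inside the loop.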

I agree, but don’t have a better suggestion. You could file a bug. Anyone have time to try out some fixes?

Andy

Thanks Andy for your response. A related bug is already open at https://llvm.org/bugs/show_bug.cgi?id=26913 .
I would be happy to prepare a fix for it. However, as I don’t have much experience with LSR, I first need to understand some of the fundamentals.

It seems to me that LSR tries to handle a loop terminating condition in post-inc form, while handling other IV users as pre-inc. If this is true, what is the reasoning behind the use of post-inc? Is there any assumption about using post-inc or pre-inc form in the cost model?

Thanks,
Jun