[Perf Regressions] r299379 - [CodeGenPrep] move aarch64-type-promotion to CGP

Hi Jun,

Your commit caused performance regressions on AArch64 Cortex-A53: http://llvm.org/perf/db_default/v4/nts/110757

MultiSource/Benchmarks/TSVC/Packing-flt/Packing-flt: -16.61%
MultiSource/Benchmarks/TSVC/Packing-dbl/Packing-dbl: -14.02%

Other regressions on the page are noise.

I see a difference in the hot part of the generated code:

===== r299379 =====

14.19% 40f910 ldr s0, [x9]
21.37% 40f914 fcmp s0, #0.0
7.14% 40f918 b.le 40f930 <s342+0xd0>
  40f91c sxtw x10, w10
7.05% 40f920 add x10, x10, #0x1
14.02% 40f924 add x11, x19, x10, lsl #2
21.16% 40f928 ldr w11, [x11,x26]
7.79% 40f92c str w11, [x9]
  40f930 sub x8, x8, #0x1
7.25% 40f934 add x9, x9, #0x4

Hi Evgeny,

Let me take a closer look at this and get back to you soon.

Thanks,
Jun

The IR below should show the issue with my recent commit (r299379) in CodeGenPrepare:

%struct.GlobalData = type { [32000 x float], [3 x i32] }
@global_data = common global %struct.GlobalData zeroinitializer, align 16

define i32 @s341(i8 %c, i32 %j, i32 %j2, float %s) {

if.then: ; preds = %for.body4
  %inc = add nsw i32 %j, 1
  %sext1 = sext i32 %inc to i64
  %arrayidx9 = getelementptr inbounds %struct.GlobalData, %struct.GlobalData* @global_data, i64 0, i32 0, i64 %sext1
  store float %s, float* %arrayidx9, align 4
  %j3 = sdiv i32 %inc, %j2
  br label %return
return:
  ret i32 %j3
}

In the original AArch64 address type promotion pass, %sext1 was not promoted because its operand %inc is also used by %j3.

With my commit (r299379), %sext1 can now be promoted, and a trunc instruction is inserted so that %j3 still receives the i32 value of %inc:

  %sext1 = sext i32 %j to i64
  %inc = add nsw i64 %sext1, 1
  %promoted = trunc i64 %inc to i32
  %arrayidx9 = getelementptr inbounds %struct.GlobalData, %struct.GlobalData* @global_data, i64 0, i32 0, i64 %inc
  store float %s, float* %arrayidx9, align 4
  %j3 = sdiv i32 %promoted, %j2
  ret i32 %j3
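
For readers who prefer C, here is a rough model of the two forms. It is only a sketch: the function names s341_before and s341_after and the field names are made up for illustration, this is not the TSVC source, and it assumes j + 1 does not overflow (matching the nsw flag) and that the index stays inside the array.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical C model of %struct.GlobalData; field names are illustrative. */
struct GlobalData { float data[32000]; int32_t pad[3]; };
struct GlobalData global_data;

/* Before promotion: the add is done in 32 bits and the result is
 * sign-extended only for the array index. */
int32_t s341_before(int32_t j, int32_t j2, float s) {
    int32_t inc = j + 1;              /* add nsw i32 %j, 1      */
    int64_t idx = (int64_t)inc;       /* sext i32 %inc to i64   */
    global_data.data[idx] = s;        /* store through the GEP  */
    return inc / j2;                  /* sdiv i32 %inc, %j2     */
}

/* After promotion: the sext moves up to %j, the add becomes 64-bit,
 * and a trunc recreates the 32-bit value for the second user. */
int32_t s341_after(int32_t j, int32_t j2, float s) {
    int64_t ext = (int64_t)j;         /* sext i32 %j to i64     */
    int64_t inc = ext + 1;            /* add nsw i64 %sext1, 1  */
    int32_t promoted = (int32_t)inc;  /* trunc i64 %inc to i32  */
    global_data.data[inc] = s;        /* index is now the 64-bit add */
    return promoted / j2;             /* sdiv i32 %promoted, %j2 */
}
```

Both functions compute the same result under the stated assumptions. The difference is only where the sign extension sits: in the first form it feeds the index directly, in the second form the 64-bit add sits between the extension and the index.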

This transformation prevents ISel from folding the sext into the store. Let me first try to fold this in ISel. If that turns out to be unreasonable, I will change CodeGenPrepare so that it does not allow sext promotion when the operand has multiple users. Please let me know if you have any comments.

Thanks,
Jun