[Polly] Question about Polly's speed up on huffbench.c without optimization and code generation

Hi all,

It seems that Polly can still speed up test-suite/SingleSource/Benchmarks/CoyoteBench/huffbench.c even with both its optimizer and its code generator disabled. Our evaluation shows that when huffbench is compiled with “clang -Xclang -load -Xclang LLVMPolly.so -mllvm -polly -mllvm -polly-optimizer=none -mllvm -polly-code-generator=none”, its execution time drops from 19 seconds (without Polly) to 15 seconds.

After investigating Polly’s canonicalization passes, I found that the speedup mainly comes from “createIndVarSimplifyPass()”, which is guarded by the variable SCEVCodegen:

    if (!SCEVCodegen)
      PM.add(polly::createIndVarSimplifyPass());

If we remove this canonicalization pass, the performance improvement disappears.

Could anyone give me some hints on why Polly needs this canonicalization pass in the normal case but skips it in the SCEVCodegen case? Would it be possible to remove this canonicalization pass altogether?

Thanks,
Star Tan

Hi Star,

polly::createIndVarSimplifyPass() is used in Polly to create canonical induction variables when we do not use the SCEV-based code generation. For SCEV-based code generation this pass is no longer needed, and one motivation for writing the SCEV-based code generation was in fact to remove the need for this pass. It still exists because we have not yet fully tested the SCEV-based code generation, and the classical code generation needs canonical induction variables.
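
For readers less familiar with the term, here is a minimal source-level sketch of what "canonical induction variable" means. It is an illustration only: the real pass rewrites LLVM IR, not C++ source, and the function names sum_strided and sum_canonical, as well as the start value 7 and step 3, are made up for this example. The canonical form uses a single counter that starts at 0 and advances by 1, and recomputes the original index as an affine expression of that counter.

```cpp
// Original loop: the induction variable j starts at 7 and advances by 3.
long sum_strided(const int *a, int n) {
    long sum = 0;
    for (int j = 7; j < n; j += 3)
        sum += a[j];
    return sum;
}

// Canonical form: a single induction variable i that starts at 0 and
// advances by 1. The original index is recomputed as 7 + 3 * i.
long sum_canonical(const int *a, int n) {
    long sum = 0;
    // Number of iterations of the strided loop above: ceil((n - 7) / 3).
    int trip_count = (n > 7) ? (n - 7 + 2) / 3 : 0;
    for (int i = 0; i < trip_count; ++i)
        sum += a[7 + 3 * i];
    return sum;
}
```

Both functions visit the same elements, so they return the same sum for any array of at least n elements. Whether the rewritten form ends up faster or slower depends on what later passes make of it.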

Regarding the speedup due to Polly: it seems the rewrites introduced by createIndVarSimplifyPass happen to yield faster code. If you can easily produce a reduced test case that shows a missed optimization, it would be great to get a bug report for it. On the other hand, I remember that the induction variable canonicalization was removed because it introduced unpredictable performance regressions (and possibly improvements?). Hence, I would not spend too much time tracking this down if there is no obvious missed optimization.

Cheers,
Tobi

I see. Thanks for your explanation.
I think we could remove the induction variable canonicalization as a next step.

Best,
Star Tan