Interprocedural DSE for -ftrivial-auto-var-init

Vitaly_Buka · April 15, 2019, 8:51pm

Hi JF,

I’ve heard that you are interested in DSE improvements, so maybe we should sync up.

So far I have experimented with the following DSE improvements:

Cross-block DSE: it eliminates an additional 7% of stores compared to the existing DSE, but the effect is not visible on benchmarks.
Cross-block + interprocedural analysis to annotate each function argument with:

can read before write
will always write

These annotations get me an additional 20% of stores deleted on top of the current DSE.

This is on the LLVM codebase with -ftrivial-auto-var-init=pattern.
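
To make the two annotations concrete, here is a minimal sketch of how such a per-argument summary could be computed and used. It is a toy model, not the code from the experiment: the names `Access`, `ArgSummary`, `summarize`, and `storeBeforeCallIsDead` are hypothetical, each callee path is reduced to a list of whole-object reads and writes, and the callee is assumed to return normally.

```cpp
#include <cassert>
#include <vector>

// Hypothetical model: one access by the callee to the whole pointed-to object.
enum class Access { Read, Write };

// The two per-argument facts named in the post.
struct ArgSummary {
    bool canReadBeforeWrite = false;  // some path reads before any write
    bool willAlwaysWrite = true;      // every path contains a write
};

// Each inner vector is one control-flow path through the callee.
ArgSummary summarize(const std::vector<std::vector<Access>>& paths) {
    ArgSummary s;
    if (paths.empty()) s.willAlwaysWrite = false;  // nothing known, stay conservative
    for (const auto& path : paths) {
        bool written = false;
        for (Access a : path) {
            if (a == Access::Write) {
                written = true;
            } else if (!written) {
                s.canReadBeforeWrite = true;
            }
        }
        if (!written) s.willAlwaysWrite = false;
    }
    return s;
}

// A store made just before the call is dead only if the callee overwrites the
// object on every path and never reads the old value first.
bool storeBeforeCallIsDead(const ArgSummary& s) {
    assert(!(s.canReadBeforeWrite && false));  // no-op guard; keeps the summary read-only
    return s.willAlwaysWrite && !s.canReadBeforeWrite;
}
```

With this model, a callee whose only path is write-then-read lets the caller's preceding store be dropped, while a callee that reads first, or that skips the write on some path, does not.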

As-is, it’s less than I expected, so I would like to find a good benchmark to decide whether we should work on turning my experiment into production code.

So now I am also planning to try to extend that to whole-program analysis.

I will clean up my code and upload it during this week, if anyone wants to try it.

Vitaly.

JF_Bastien2 · April 15, 2019, 8:55pm

This is great! I’ll try out the patches when you post them, and see if they resolve the issues I’d been seeing. I don’t think we need benchmark gains for this to be worthwhile, since variable auto-init adds slightly unusual code. I think it’s aggravating cases where current DSE was failing in innocuous ways.

Amara_Emerson1 · April 15, 2019, 9:02pm

> Hi JF,
>
> I've heard that you are interested DSE improvements and maybe we need to be in sync.
> So far I experimented with following DSE improvements:
>
> * Cross-block DSE, it eliminates additional 7% stores comparing to existing DSE. But it's not visible on benchmarks.

I take it you couldn’t see any runtime impact? If there are code size improvements, that could also be useful. CTMark in the LLVM test suite is a useful subset of benchmarks to check this on (use -Os as a baseline to compare code size).

Thanks,
Amara

Alexander_Potapenko · April 16, 2019, 2:11pm

> > Hi JF,
> >
> > I've heard that you are interested DSE improvements and maybe we need to be in sync.
> > So far I experimented with following DSE improvements:
> >
> > * Cross-block DSE, it eliminates additional 7% stores comparing to existing DSE. But it's not visible on benchmarks.
> I take it you couldn’t see any runtime impact? If there’s code size improvements that could also be useful, CTMark in the llvm test suite is a useful subset of benchmarks to check this on (as a baseline use -Os to compare code size).
>
> Thanks,
> Amara
> >
> > * Cross-block + Interprocedural analysis to annotate each function argument with:
> > - can read before write
> > - will always write
> > This annotations gets me 20% stores deleted additional to the current DSE.

I believe we can only benefit from removing extra stores.
Hot functions in existing benchmarks are probably optimized well
enough already, but speeding up the long tail is also important.
Also, at least the repro in
40527 – Missed opportunity to remove a dead store was extracted from a
real kernel benchmark (hackbench), where this extra store cost us
0.45%.

Vitaly_Buka · April 16, 2019, 6:45pm

I tried -Os, and the effect of the new approach increases significantly.
I run regular DSE and then myDSE immediately after it. With -Os, myDSE removes more than 50% as many stores as regular DSE does.
This is expected, as -Os inlines less and regular DSE can’t remove stores across a function call.
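
For readers who want to see the shape of the code in question, here is a small sketch of a store that only becomes removable when the optimizer can look through a call. The functions `fill` and `caller` are hypothetical, and the `noinline` attribute (a GCC/Clang extension) merely stands in for a call that -Os decided not to inline.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical callee, kept out of line to mimic a call that was not inlined.
// It writes every byte of `out` and never reads the previous contents.
__attribute__((noinline)) void fill(char* out, std::size_t n) {
    assert(out != nullptr);
    std::memset(out, 'x', n);
}

char caller() {
    // Under -ftrivial-auto-var-init=pattern the compiler stores a fill
    // pattern into buf at this point.
    char buf[16];
    // fill() overwrites all 16 bytes before anything reads them, so the
    // pattern store above is dead. Proving that requires knowing what
    // fill() does with its argument, which an intraprocedural pass does
    // not see once the call stays out of line.
    fill(buf, sizeof buf);
    return buf[0];
}
```

If `fill` were inlined, the `memset` would sit next to the initialization and a function-local DSE would have a chance to pair them up; out of line, some summary of the callee's behavior is needed.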

Amara_Emerson1 · April 16, 2019, 7:10pm

Can you post numbers for how many stores get eliminated from CTMark?

Vitaly_Buka · May 11, 2019, 3:59am

Sorry for the delay; I was busy with other stuff.

CTMark results.

dse is the current DSE.
dsem is my experimental module-level DSE.
dsem runs after dse, so its numbers are stores deleted in addition to those dse already removed.

-O3
dse - Number of stores deleted 3033
dsem - Number of deleted writes 3148

-O3 -ftrivial-auto-var-init=pattern
dse - Number of stores deleted 5618
dsem - Number of deleted writes 3840

-O3 -flto
dse - Number of stores deleted 3985
dsem - Number of deleted writes 3838

-O3 -flto -ftrivial-auto-var-init=pattern
dse - Number of stores deleted 6461
dsem - Number of deleted writes 4215

-Os
dse - Number of stores deleted 1443
dsem - Number of deleted writes 1517

-Os -ftrivial-auto-var-init=pattern
dse - Number of stores deleted 3951
dsem - Number of deleted writes 2259

-Oz
dse - Number of stores deleted 1072
dsem - Number of deleted writes 574

-Oz -ftrivial-auto-var-init=pattern
dse - Number of stores deleted 3420
dsem - Number of deleted writes 1637
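
The counts above are easier to compare as ratios. The following sketch only does that arithmetic: it prints the dsem deletions as a percentage of the dse deletions for each configuration. The numbers are copied from the list above; the names `Row`, `ctmarkRows`, `extraPercent`, and `printTable` are made up for this illustration.

```cpp
#include <cassert>
#include <cstdio>
#include <vector>

// One configuration from the CTMark results.
struct Row {
    const char* flags;
    int dse;   // stores deleted by the current DSE
    int dsem;  // additional writes deleted by the experimental module-level DSE
};

// Counts copied verbatim from the results above.
std::vector<Row> ctmarkRows() {
    return {
        {"-O3", 3033, 3148},
        {"-O3 -ftrivial-auto-var-init=pattern", 5618, 3840},
        {"-O3 -flto", 3985, 3838},
        {"-O3 -flto -ftrivial-auto-var-init=pattern", 6461, 4215},
        {"-Os", 1443, 1517},
        {"-Os -ftrivial-auto-var-init=pattern", 3951, 2259},
        {"-Oz", 1072, 574},
        {"-Oz -ftrivial-auto-var-init=pattern", 3420, 1637},
    };
}

// dsem deletions expressed as a percentage of dse deletions.
double extraPercent(const Row& r) {
    assert(r.dse > 0);
    return 100.0 * r.dsem / r.dse;
}

void printTable() {
    for (const Row& r : ctmarkRows()) {
        std::printf("%-42s dse=%5d dsem=%5d dsem/dse=%6.1f%%\n",
                    r.flags, r.dse, r.dsem, extraPercent(r));
    }
}
```

Calling `printTable()` gives one line per configuration, which makes it easier to see where the module-level pass adds the most relative to the existing one.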

JF_Bastien2 · May 13, 2019, 4:55pm

This looks great! Do you have a patch ready to go?

Vitaly_Buka · May 13, 2019, 5:23pm

I have a dirty proof-of-concept patch. I am going to rewrite pieces of it over the course of May, starting now.

Today it’s a new pass which does cross-block DSE, module DSE, and global DSE.
So far the module DSE is the most useful, and it is probably easy to integrate into the existing DSE.

Vitaly_Buka · May 14, 2019, 2:32am

https://reviews.llvm.org/D61879

JF_Bastien2 · May 14, 2019, 2:46am

Great, thank you for getting this started!
