Which pass can do the following optimization? gvn-sink?

Dear all,

Imagine we have the following code:

    #define ny 10
    #define Batch_Size 10

    typedef float data_t;

    void foo(data_t out[ny][Batch_Size], data_t max[Batch_Size]);

    void Softmax_Activation(data_t l_Z2[ny][Batch_Size],
                            data_t out[ny][Batch_Size]) {

      data_t max[Batch_Size];

    SA_MAX2:
      for (int i = 0; i < Batch_Size; i++) {
        max[i] = 0;
      SA_MAX1:
        for (int j = 0; j < ny; j++) {
          if (l_Z2[j][i] > max[i])
            max[i] = l_Z2[j][i];
        }
      }
      foo(out, max);
    }

We can see that ‘max[i]’ refers to the same memory location throughout loop ‘SA_MAX1’ (its address is loop-invariant), so I want to know which pass can perform the following transformation/optimization:

    #define ny 10
    #define Batch_Size 10

    typedef float data_t;

    void foo(data_t out[ny][Batch_Size], data_t max[Batch_Size]);

    void Softmax_Activation(data_t l_Z2[ny][Batch_Size],
                            data_t out[ny][Batch_Size]) {

      data_t max[Batch_Size];

    SA_MAX2:
      for (int i = 0; i < Batch_Size; i++) {
        data_t Max = 0;
      SA_MAX1:
        for (int j = 0; j < ny; j++) {
          if (l_Z2[j][i] > Max)
            Max = l_Z2[j][i];
        }
        max[i] = Max;
      }
      foo(out, max);
    }

This uses a local scalar ‘Max’ in place of the original ‘max[i]’ and sinks the original store out of loop ‘SA_MAX1’.

I did some experiments on godbolt, and it looks like we currently don’t have this kind of optimization:
https://godbolt.org/z/9PK3hYvPs

Do you know which pass can do this? Or is it not necessary for CPUs?

Thanks,
Fangqing
Xilinx Inc.

This kind of optimization is done by the LICM pass. Look for
promoteLoopAccessesToScalars in LICM.cpp. However, it requires the
loop code to be executed unconditionally (or
isSafeToExecuteUnconditionally). See the justification in the comment
for promoteLoopAccessesToScalars.
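
To make the "executed unconditionally" point concrete, here is a minimal
sketch in C, not taken from the LLVM sources. In the original loop the store
to max[i] sits under the if, so it is not guaranteed to run on every
iteration. The second function stores on every iteration, and the third is
the hand-promoted form from the question. All function names (and the NY and
BATCH_SIZE macros) are made up for illustration, and whether LICM actually
promotes the second form depends on the LLVM version and on its alias and
safety analysis, so treat that part as an assumption to verify.

```c
#include <assert.h>

#define NY 10
#define BATCH_SIZE 10

typedef float data_t;

/* Original form: the store to max[i] happens only when the branch is taken. */
void column_max_conditional(data_t in[NY][BATCH_SIZE], data_t max[BATCH_SIZE]) {
    for (int i = 0; i < BATCH_SIZE; i++) {
        max[i] = 0;
        for (int j = 0; j < NY; j++) {
            if (in[j][i] > max[i])
                max[i] = in[j][i];
        }
    }
}

/* Variant: max[i] is loaded and stored on every inner iteration, so the
 * store is unconditional within the loop body. */
void column_max_unconditional(data_t in[NY][BATCH_SIZE], data_t max[BATCH_SIZE]) {
    for (int i = 0; i < BATCH_SIZE; i++) {
        max[i] = 0;
        for (int j = 0; j < NY; j++) {
            data_t v = in[j][i];
            max[i] = (v > max[i]) ? v : max[i];
        }
    }
}

/* Hand-promoted form: a local scalar replaces max[i] inside the inner loop
 * and a single store is sunk to the loop exit. */
void column_max_promoted(data_t in[NY][BATCH_SIZE], data_t max[BATCH_SIZE]) {
    for (int i = 0; i < BATCH_SIZE; i++) {
        data_t m = 0;
        for (int j = 0; j < NY; j++) {
            if (in[j][i] > m)
                m = in[j][i];
        }
        max[i] = m;
    }
}

/* Sanity check: all three forms compute the same result on sample data. */
void check_equivalence(void) {
    data_t in[NY][BATCH_SIZE];
    data_t a[BATCH_SIZE], b[BATCH_SIZE], c[BATCH_SIZE];

    /* Small integer-valued inputs, some negative, so comparisons are exact. */
    for (int j = 0; j < NY; j++)
        for (int i = 0; i < BATCH_SIZE; i++)
            in[j][i] = (data_t)((j * 7 + i * 3) % 11) - 5.0f;

    column_max_conditional(in, a);
    column_max_unconditional(in, b);
    column_max_promoted(in, c);

    for (int i = 0; i < BATCH_SIZE; i++) {
        assert(a[i] == b[i]);
        assert(a[i] == c[i]);
    }
}
```

Calling check_equivalence() only confirms that the three forms agree on the
sample input. To see what LICM does with each one, compare the optimized IR
or assembly for the inner loop, for example on godbolt.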

Michael

Thank you Michael!
This info is very useful!

Fangqing