create a copy of the original variable outside the parallel region and then use this for firstprivatisation in the region?
This sounds more reasonable to me.
insert a barrier after privatisation (which generally is the first thing to happen in the parallel region)
I am concerned that it would add overhead when copying a large array.
-
We don’t need to do this for all data types. For Fortran, we only need to do it for pointers (and allocatables?), variables in an EQUIVALENCE where the equivalenced variable is defined in the parallel region, and variables in an ASSOCIATE construct. There may be more scenarios. For a large array without any of these uses, the copy only adds unnecessary overhead. Since the decision depends on language-specific usage, it seems better to do this in FIR for Fortran.
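To make the pointer case concrete, here is a minimal sketch of the kind of aliasing I have in mind. It is a hypothetical illustration only: the program and variable names are made up, and it assumes the concern is a write to the original variable through an alias while other threads may still be initialising their firstprivate copies.

```fortran
! Hypothetical example; names are invented for illustration.
program firstprivate_alias
  implicit none
  integer, target  :: a(4)
  integer, pointer :: p(:)

  a = 1
  p => a            ! p aliases the original a

  !$omp parallel firstprivate(a)
  ! Inside the region "a" names the thread's private copy,
  ! but the shared pointer p still refers to the original a.
  !$omp single
  p = 2             ! writes the original; other threads may still be copying it
  !$omp end single
  print *, a(1)     ! value seen depends on whether the copy preceded the write
  !$omp end parallel
end program firstprivate_alias
```

A plain large array with no such alias cannot be modified through another name inside the region, which is why the extra copy (or barrier) looks avoidable there.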
-
Another issue I noticed is that the private copies are currently allocated on the stack. What if the stack is not large enough? Maybe use fir.allocmem to allocate them on the heap, and add an option similar to -fstack-arrays to use the stack for better performance?