Because we are running into difficulties when trying to distinguish the address of a box from the address it wraps (its data) (PR #87723), I would like to propose a more formal model of data and non-data in the FIR alias analysis. This should clear up current ambiguities, help clean up our tests, and simplify the code base.
To recap the issue: while following two memory references back through their use-def chains, it is possible to arrive at the same origin even though the starting points are known not to alias.
Example
fir.global @_QMtopEa : !fir.box<!fir.ptr<!fir.array<?xf32>>>
func.func @_QPtest() {
%c1 = arith.constant 1 : index
%cst = arith.constant 1.000000e+00 : f32
%0 = fir.address_of(@_QMtopEa) : !fir.ref<!fir.box<!fir.ptr<!fir.array<?xf32>>>>
%1 = fir.declare %0 {fortran_attrs = #fir.var_attrs<pointer>, uniq_name = "_QMtopEa"} : (!fir.ref<!fir.box<!fir.ptr<!fir.array<?xf32>>>>) -> !fir.ref<!fir.box<!fir.ptr<!fir.array<?xf32>>>>
%2 = fir.load %1 : !fir.ref<!fir.box<!fir.ptr<!fir.array<?xf32>>>>
...
%5 = fir.array_coor %2 %c1 : (!fir.box<!fir.ptr<!fir.array<?xf32>>>, !fir.shift<1>, index) -> !fir.ref<f32>
fir.store %cst to %5 : !fir.ref<f32>
return
}
With high-level operations such as fir.array_coor, it is possible to reach into the data wrapped by the box (the descriptor). Therefore, when asking about the memory source of %5, we are really asking about the source of the data of box %2.
When asking about the source of %0, which is the address of the box, we reach the same source as in the first case: the global @_QMtopEa. Yet one source refers to the data while the other refers to the address of the box itself.
Currently, to distinguish between the two, a new source kind was introduced: SourceKind::Direct. This leads to issues when handling box function arguments, which can be passed by value or by reference, because we would like to retain that they are of kind SourceKind::Argument.
I propose that we encode in fir::AliasAnalysis::Source both the MLIR object and a flag indicating whether the source was reached from data or from a box reference. As hinted above, data is defined as any memory reference that is not a box reference. Additionally, because HLFIR lowering relies on it, we allow querying on a box SSA value, which is interpreted as querying on its data.
So in the first example, !fir.ref<f32> and !fir.box<!fir.ptr<!fir.array<?xf32>>> are data, while !fir.ref<!fir.box<!fir.ptr<!fir.array<?xf32>>>> is not data.
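To make the proposed encoding more concrete, here is a minimal C++ sketch of a source that carries both an origin and a data flag. All names in it (ToySource, ToyKind, ToyAlias, toyAlias) are hypothetical stand-ins, not the actual fir::AliasAnalysis declarations, and the aliasing rule is an assumption made only for illustration.

```cpp
#include <string>

// Hypothetical, simplified stand-in for the proposed source encoding.
// None of these names are the actual fir::AliasAnalysis declarations.
enum class ToyKind { Global, Argument };

struct ToySource {
  std::string origin; // stand-in for the MLIR object (symbol or SSA value)
  ToyKind kind;       // where the use-def walk ended
  bool isData;        // true: the wrapped data; false: the box reference itself
};

enum class ToyAlias { NoAlias, MayAlias };

// Assumed rule for this sketch only: the same origin reached once as a box
// reference and once as data denotes two distinct memory locations.
// Every other case stays conservative and reports MayAlias.
ToyAlias toyAlias(const ToySource &a, const ToySource &b) {
  if (a.origin == b.origin && a.kind == b.kind && a.isData != b.isData)
    return ToyAlias::NoAlias;
  return ToyAlias::MayAlias;
}
```

With this shape, a query on %0 would yield {"@_QMtopEa", Global, isData = false} and a query on %5 would yield {"@_QMtopEa", Global, isData = true}. The two are told apart by the flag alone, so the kind can stay Global (or Argument for box arguments) without a separate Direct kind.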
The same classification applies to function arguments. In the example below, %arg0 is data; %arg1 is not data, but a load of %arg1 is.
func.func @_QFPtest2(%arg0: !fir.ref<f32>, %arg1: !fir.ref<!fir.box<!fir.ptr<f32>>> ) {
%0 = fir.load %arg1 : !fir.ref<!fir.box<!fir.ptr<f32>>>
...
}
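The data/non-data rule used in both examples can be sketched as a small predicate over type strings. This is only an illustration: isBoxReference and isData are hypothetical helpers, matching on the textual type is a shortcut for inspecting the real MLIR type, and only the !fir.ref<!fir.box<...>> spelling is handled.

```cpp
#include <string>

// Hypothetical helper: true when the type is a reference to a box
// (descriptor). Only the textual prefix "!fir.ref<!fir.box<" is recognized;
// a real implementation would inspect the MLIR type instead of a string.
bool isBoxReference(const std::string &type) {
  const std::string prefix = "!fir.ref<!fir.box<";
  return type.compare(0, prefix.size(), prefix) == 0;
}

// Data is any memory reference that is not a box reference. A box SSA value
// (type starting with "!fir.box<") is treated as a query on its data, so it
// also counts as data here.
bool isData(const std::string &type) { return !isBoxReference(type); }
```

Applied to the examples, isData returns true for !fir.ref<f32> (%arg0) and for !fir.box<!fir.ptr<f32>> (the load of %arg1), and false for !fir.ref<!fir.box<!fir.ptr<f32>>> (%arg1 itself).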
The proposed changes can be seen in llvm/llvm-project Pull Request #91020, "[flang] AliasAnalysis: More formally define and distinguish between data and non-data" by Renaud-K.