[RFC] Rationale for Flang AliasAnalysis pointer component logic

jdenny-ornl · May 29, 2024, 5:21pm

I am trying to learn the rationale for the pointer component logic used in Flang’s AliasAnalysis implementation. It appears that others find that logic confusing as well: the recent PR #91020, despite a thorough review, introduced a test that appears to understand this logic differently than an existing test. The goal of this RFC is to clear up this confusion and decide whether to change the implementation or just improve the documentation and tests.

@szakharin @jeanPerier @Renaud-K @tblah

Conflicting Tests

Introduced more than a year ago, test alias-analysis-3.fir line 106 checks that “Two dummy arguments of composite type with a pointer component may alias each other”. Here is a reduced version, where it expects MayAlias for %arg0 vs. %arg1:

// module m
//   type t
//     real, pointer :: pointer_component
//   end type t
// contains
//   subroutine test(a, b)
//     type(t) :: a, b
//   end subroutine test
// end module m

func.func @_QMmPtest(%arg0: !fir.ref<!fir.type<_QMmTt{pointer_component:!fir.box<!fir.ptr<f32>>}>> {fir.bindc_name = "a"}, %arg1: !fir.ref<!fir.type<_QMmTt{pointer_component:!fir.box<!fir.ptr<f32>>}>> {fir.bindc_name = "b"}) {
  // ...
}

Introduced by the recent PR #91020, test alias-analysis-9.fir line 21 has nearly the same check as above but with a todo saying, “x and y are non pointer, non target argument and therefore do not alias.” Here is a reduced version, where the todo expects NoAlias for x vs. y, which are labels for hlfir.declare ops on %arg0 and %arg1:

// module m
//   type t
//     type(t), pointer :: next
//   end type
// contains
//   subroutine foo(x, y)
//     type(t) :: x, y
//   end subroutine
// end module

func.func @_QMmPfoo(%arg0: !fir.ref<!fir.type<_QMmTt{next:!fir.box<!fir.ptr<!fir.type<_QMmTt>>>,i:i32}>> {fir.bindc_name = "x"}, %arg1: !fir.ref<!fir.type<_QMmTt{next:!fir.box<!fir.ptr<!fir.type<_QMmTt>>>,i:i32}>> {fir.bindc_name = "y"}) {
  // ...
  %4:2 = hlfir.declare %arg0 {uniq_name = "_QMmFfooEx", test.ptr = "x"} : (!fir.ref<!fir.type<_QMmTt{next:!fir.box<!fir.ptr<!fir.type<_QMmTt>>>,i:i32}>>) -> (!fir.ref<!fir.type<_QMmTt{next:!fir.box<!fir.ptr<!fir.type<_QMmTt>>>,i:i32}>>, !fir.ref<!fir.type<_QMmTt{next:!fir.box<!fir.ptr<!fir.type<_QMmTt>>>,i:i32}>>)
  %5:2 = hlfir.declare %arg1 {uniq_name = "_QMmFfooEy", test.ptr = "y"} : (!fir.ref<!fir.type<_QMmTt{next:!fir.box<!fir.ptr<!fir.type<_QMmTt>>>,i:i32}>>) -> (!fir.ref<!fir.type<_QMmTt{next:!fir.box<!fir.ptr<!fir.type<_QMmTt>>>,i:i32}>>, !fir.ref<!fir.type<_QMmTt{next:!fir.box<!fir.ptr<!fir.type<_QMmTt>>>,i:i32}>>)
  // ...
}

Of course, AliasAnalysis sometimes returns MayAlias for cases where it is not powerful enough yet to return a better result. Is alias-analysis-3.fir just missing associated todo comments? Are alias-analysis-9.fir’s comments incorrect? Or is there some difference between these cases that should produce different results, once AliasAnalysis is powerful enough?

Alternative Logic

In alias-analysis-9.fir, I think I understand why MayAlias makes sense for x vs. xnext1, which is the address of the pointer x%next (not the address in the pointer). To me, that seems like a special case of how the address of a composite may alias the address of any of its components. My understanding of the rationale is that a store to either one can affect the result of a load from the other. But I do not understand why there might be aliasing between the addresses of different composites (e.g., x and y) just because they have pointer components.
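As a rough illustration of the distinction above, here is a minimal C++ analogy, not Flang code. The struct `T` and the helpers `liesWithin`, `pointerSlotIsInsideComposite`, and `targetIsInsideComposite` are hypothetical names invented for this sketch. It assumes a raw C++ pointer member is an acceptable stand-in for a Fortran pointer component, and it only shows that the address of the pointer lies inside the composite's storage while the address in the pointer does not have to.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical C++ stand-in for a Fortran derived type with a pointer
// component (an assumption for illustration, not how FIR represents it).
struct T {
  float *pointer_component; // the pointer itself is stored inside the composite
  int i;
};

// True if the byte range [p, p + size) lies inside the storage of `composite`.
bool liesWithin(const T &composite, const void *p, std::size_t size) {
  auto base = reinterpret_cast<std::uintptr_t>(&composite);
  auto addr = reinterpret_cast<std::uintptr_t>(p);
  return addr >= base && addr + size <= base + sizeof(T);
}

// The address OF the pointer: always part of the composite's storage.
bool pointerSlotIsInsideComposite(const T &x) {
  return liesWithin(x, &x.pointer_component, sizeof(x.pointer_component));
}

// The address IN the pointer: its target usually lives somewhere else.
bool targetIsInsideComposite(const T &x) {
  assert(x.pointer_component != nullptr && "pointer must be associated");
  return liesWithin(x, x.pointer_component, sizeof(float));
}
```

With two separate objects `x` and `y` of type `T`, the pointer slot of `x` overlaps the storage of `x` but not the storage of `y`, which is the intuition behind expecting MayAlias for x vs. xnext1 but not for x vs. y.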

To investigate those points further, I wrote a small patch [edit: that’s now part of this draft PR] that rewrites the pointer component logic accordingly. First, it removes the old pointer component logic to avoid cases (e.g., MayAlias for x vs. y) that don’t seem reasonable to me. Second, it assumes the address of a pointer component should be treated just like the address of any other component. Logic for other components is already implemented without this patch, but it misses pointer components because the current representation treats pointers as not “data”, so this patch adds similar handling for them.

This patch adjusts both of the aforementioned tests to have what I think are more reasonable expected results. That includes resolving some fixme/todo comments in those tests. Other tests do not require changes, and check-flang still behaves.

Does that patch mishandle any use cases?

klausler · May 29, 2024, 5:58pm

The two dummy arguments a and b here could be aliases for the same derived type object only if that object is not modified during the call by any name. See Fortran 2023, 15.5.2.14 p1, items (3)-(4). Their pointer components can of course alias each other and any other POINTER or TARGET objects.

jdenny-ornl · May 29, 2024, 7:28pm

Thanks for the reference. If I am reading that correctly, there is no suggestion that MayAlias should be computed for two dummy arguments simply because they have pointer components. That is consistent with the RFC’s suggestion that the existing AliasAnalysis logic casts too wide of a net.

In the RFC examples, we are discussing the addresses of the pointers, not the addresses stored in the pointers. These are distinct in the FIR representation. The aliasing you describe applies to the latter.

Please correct me if I’ve misunderstood your comment.

klausler · May 29, 2024, 9:15pm

Correct. Dummy argument aliasing in Fortran may depend on dummy argument attributes (viz. POINTER and TARGET) but not on type, kind, rank, shape, corank, coshape, &c., other than the aliasing restrictions on subobjects, which are relevant of course only when the dummy argument has a derived type.

Be advised, the Fortran standard defines “subobject” in 9.4.2 such that the contents of ALLOCATABLE components, possibly nested, are subobjects of the base object – but the targets of POINTER components, possibly nested, are not. The POINTER components, if not nested within any target of a POINTER component, are subobjects only for the purposes of pointer assignment and argument association.

So if

module m
  type t1
    type(t2), allocatable :: c1
  end type
  type t2
    real, allocatable :: c2a
    real, pointer :: c2b
  end type
  type(t1), target :: u
  type(t1), pointer :: v
 contains
  subroutine s(x,y,z)
    type(t1) x, y
    type(t1), pointer :: z
  end
end

the only possible aliasing is between the targets of x%c1%c2b, y%c1%c2b, & z%c1%c2b, and of course between the target of the pointer z and both u and the target of v.

jeanPerier · May 30, 2024, 8:06am

I think the comment is incorrect. a%pointer_component and b%pointer_component targets may alias, but not a and b themselves. The initial implementation was likely too conservative with pointer components.

In general, extra care is needed with pointer/allocatable components because they introduce the possibility that some descriptor aliases with some data, while FIR alias analysis can assume this is not possible in other cases.

I agree it would be better to be less conservative with “a” and “b”, but your patch also seems to now assume that a descriptor cannot alias with data containing a descriptor, and I think this is wrong. See WIP: [flang] AliasAnalysis: Fix pointer component logic · jdenny-ornl/llvm-project@484cb90 · GitHub

jdenny-ornl · May 30, 2024, 5:58pm

This is the example Jean posted there:

module m
  type t
     real, pointer :: p
  end type
  type(t) :: a
  type(t) :: b
contains
subroutine test(p)
  real, pointer :: p
  p = 42
  a = b
  print *, p
end subroutine
end module

  use m
  real, target :: x1 = 1
  real, target :: x2 = 2
  a%p => x1
  b%p => x2
  call test(a%p)
end

The program should print 2.

With your change, if FIR were doing load-store forwarding, it could now mistakenly print 42, because it would conclude that the write effect on a in “a = b” cannot affect the read effect in “print *, p”: the change treats “a” data and the “p” descriptor as not aliasing. That is incorrect, since the “p” descriptor is part of “a” data.
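For readers more at home in C++, the same hazard can be sketched as a loose analogy. This is not Flang code, and it assumes a reference-to-pointer parameter is a fair model of a Fortran POINTER dummy passed by reference. The names `T`, `test`, and `run` simply mirror the Fortran example.

```cpp
#include <cassert>

// Hypothetical C++ stand-in for: type t; real, pointer :: p; end type
struct T {
  float *p;
};

T a, b;                     // module variables a and b
float x1 = 1.0f, x2 = 2.0f; // the two TARGET variables

// Models subroutine test(p) with a POINTER dummy: the reference parameter
// stands for passing the pointer itself (not just its target) by reference.
float test(float *&p) {
  assert(p != nullptr && "pointer must be associated");
  *p = 42.0f; // p = 42: store to the current target (x1)
  a = b;      // a = b: overwrites a.p, and p refers to that same pointer
  return *p;  // print *, p: the pointer must be reloaded after the store to a
}

// Models the main program: associate the components, then call test(a%p).
float run() {
  a.p = &x1;
  b.p = &x2;
  return test(a.p);
}
```

Here `run()` returns 2, not 42, because the assignment to `a` retargets the pointer that `p` refers to. An optimizer that forwarded the stored 42 to the final load would be treating the store to `a` and the load of `p` as not aliasing.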

Thanks, that’s exactly the kind of case I was looking for. I would have guessed that changing the pointer association for a dummy argument (like test’s p) would be permitted only via operating directly on that dummy argument, but I’m still learning these nuances of Fortran.

The following addition appears to fix that case:

if (src1->kind == SourceKind::Argument &&
    src1->attributes.test(Attribute::Pointer) &&
    src2->kind == SourceKind::Global &&
    src2->isRecordWithPointerComponent()) {
  LLVM_DEBUG(llvm::dbgs()
             << "  aliasing because of pointer arg and global composite with "
             << "pointer component\n");
  return AliasResult::MayAlias;
}

However, is it general enough? For example, is a dummy pointer argument the only way to end up with an alias of a pointer itself (not just its target)?

[edit: For convenience, I pushed another commit with that fix: WIP: Fix case of ptr dummy arg vs. ptr component · jdenny-ornl/llvm-project@9f6d2a7 · GitHub][edit: that’s now part of this draft PR]

klausler · May 30, 2024, 6:03pm

Fortran’s type system doesn’t include pointers, arrays, procedures, or allocatables – these are attributes of entities, not parts of types. So you can have a pointer to an object type or a pointer to a procedure interface, but you can’t have a pointer to a pointer. The closest you can come is a pointer to an object type that is a derived type with a pointer component.

jdenny-ornl · May 30, 2024, 6:46pm

AliasAnalysis assumes two pointers MayAlias regardless of their target types. That wide net includes the case you mention: a pointer to a composite with a pointer component.

How aliasing can happen between a pointer and a non-pointer (a composite with a pointer component) was the surprise for me from Jean’s example. In some sense, the pointer dummy argument there is like a pointer to a pointer because it’s passed by reference.

jeanPerier · May 31, 2024, 7:37am

The relevant section in the F2023 standard is “15.5.2.14 Restrictions on entities associated with dummy arguments” as far as I know.

My example would be invalid with allocatables, I think, because of “(1) Action that affects the allocation status […] shall be taken through the dummy argument”, but there is no equivalent for association status, so it falls through to “(4) Action that affects the value of the entity […] shall be taken only through the dummy argument unless (a) the dummy argument has the POINTER attribute”.

So I think Fortran does not prevent one from changing the association status of a dummy pointer by means other than the dummy pointer itself.

jeanPerier · May 31, 2024, 3:05pm

I do not think this is general enough as far as FIR is concerned. The only safe case I see is when you manage to track both the descriptor storage and the data storage of the derived type with pointer components to distinct alloca/allocmem/global sources.

The data could be a dummy TARGET or POINTER for instance I think.

klausler · May 31, 2024, 3:06pm

Correct.

jdenny-ornl · June 3, 2024, 4:28pm

The patches discussed in this RFC are now part of draft PR #94242 so we can more easily track comments there. Sorry, I should have done that from the beginning.

jdenny-ornl · June 3, 2024, 5:15pm

Thanks for your comments. I’m not sure I’m following this one.

At least for me, a recap at this point would help: As far as I know, other code already handles the case that both values examined by AliasAnalysis are pointers/targets (even if also composite). The focus of this RFC is the separate pointer component logic, which I believe focuses on non-pointer, non-target composites that have pointer components. For that, I think we have identified only the following aliasing cases:

1. The source values/symbols are the same (lhsSrc.origin.u == rhsSrc.origin.u). One value (lhs or rhs) is the address of a composite. The other value is statically the address of a pointer component of that composite.
2. The source values/symbols are not the same. One value is the address of a composite. The other value is the address of a pointer that might dynamically be a component of that composite.

Handling of 1 appears in the original patch I posted for this RFC (and I’m replying to Renaud’s concerns about its implementation), and I later added another patch to handle 2. We’re currently discussing 2.

For 2, so far I haven’t heard of a case where the pointer can be anything but a dummy argument (pass-by-reference creates an alias for the address of the pointer). I believe a shortcoming of my implementation for 2 is that I assumed the composite had to be a global. Could it also be another dummy argument? Maybe it’s safest to assume the composite might have any SourceKind, as follows:

if ((src1->isRecordWithPointerComponent() &&
     src2->kind == SourceKind::Argument &&
     src2->attributes.test(Attribute::Pointer)) ||
    (src2->isRecordWithPointerComponent() &&
     src1->kind == SourceKind::Argument &&
     src1->attributes.test(Attribute::Pointer))) {
  LLVM_DEBUG(llvm::dbgs()
             << "  aliasing because of pointer arg and composite with "
             << "pointer component\n");
  return AliasResult::MayAlias;
}

If that doesn’t address your concern, would you please provide an example?
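To make the intent of that symmetric check easier to experiment with outside of Flang, here is a self-contained sketch. Everything in it is a hypothetical, simplified stand-in: `Source`, `SourceKind`, `isPointerDummy`, `pointerArgVsCompositeMayAlias`, and `checkExamples` are invented for this illustration and are not Flang's actual classes or API. It models only this one rule, not the rest of the analysis.

```cpp
#include <cassert>

// Hypothetical, simplified stand-ins; NOT Flang's actual classes.
enum class SourceKind { Allocate, Global, Argument, Unknown };

struct Source {
  SourceKind kind;
  bool isPointer;                  // models attributes.test(Attribute::Pointer)
  bool recordWithPointerComponent; // models isRecordWithPointerComponent()
};

// A dummy argument that has the POINTER attribute.
bool isPointerDummy(const Source &s) {
  return s.kind == SourceKind::Argument && s.isPointer;
}

// True if this one rule would report MayAlias: one side is a composite with a
// pointer component (of any SourceKind), the other is a pointer dummy argument.
bool pointerArgVsCompositeMayAlias(const Source &s1, const Source &s2) {
  return (s1.recordWithPointerComponent && isPointerDummy(s2)) ||
         (s2.recordWithPointerComponent && isPointerDummy(s1));
}

// Usage example covering the cases discussed in this thread.
void checkExamples() {
  Source globalComposite{SourceKind::Global, false, true};  // module variable a
  Source dummyComposite{SourceKind::Argument, false, true}; // dummy x or y
  Source pointerDummy{SourceKind::Argument, true, false};   // dummy pointer p

  // Pointer dummy vs. composite with a pointer component: rule fires,
  // in either argument order and for either kind of composite.
  assert(pointerArgVsCompositeMayAlias(pointerDummy, globalComposite));
  assert(pointerArgVsCompositeMayAlias(globalComposite, pointerDummy));
  assert(pointerArgVsCompositeMayAlias(pointerDummy, dummyComposite));

  // Two non-pointer composites (x vs. y): this rule does not fire.
  assert(!pointerArgVsCompositeMayAlias(dummyComposite, dummyComposite));
}
```

The sketch deliberately returns a plain bool rather than an alias result, since a false result here only means that this rule does not apply, not that the two values are known not to alias.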

jdenny-ornl · June 17, 2024, 5:27pm

PR #94242, which addresses the issues in this RFC, is no longer in a draft state and has received some accepts. I plan to land it tomorrow. Please let me know if anyone needs more time to review. Thanks for the discussion so far.
