[RFC] Don't merge memory locations in AliasSetTracker

Abstract

I want to propose to change the AliasSetTracker (AST) analysis (used most visibly by LICM) to track memory locations instead of pointers in its alias sets. The current implementation stores the equivalent of a single MemoryLocation per pointer value (or per must-alias set) but merging the information of memory locations is difficult to implement (and likely to lose information) for alias analyses that track extra information in memory locations, e.g. TypeBasedAA and ScopedNoAliasAA.

Since I think it may help the discussion to see what the change entails in practice, the code changes to implement this have been posted as a draft GitHub PR:

Motivating test case

The primary motivation for this change are missed optimizations such as the following:

inline void my_store(int *p, int v, int *&p_next) {
    *p = v;
    p_next = p + 1;
}

struct S {
    int *p;
};

void test(S *s) {
    for (int i = 0; i < 100; ++i) {
        my_store(s->p, i, s->p);
    }
}

The test function contains a loop doing integer stores using (and updating) a pointer from a struct. LICM does not promote the (loop-invariant) accesses of the pointer s->p out of the loop (see Godbolt result) because it considers the integer stores blocking (while TBAA knows they are not).

In the Godbolt LLVM IR output, you can observe that load and store of s->p have slightly different TBAA metadata (!6 versus !13). While both TBAA tags individually indicate that the access does not alias with integer access, the two TBAA tags are merged in the alias set (because they apply to accesses with the same pointer value &s->p), and the merged TBAA metadata no longer indicates no aliasing with integer access.

The missed optimization is particularly unsatisfying because (1) other LLVM optimizations (not using AST) have no problem seeing the correct alias relation between s->p (pointer access) and *s->p (integer access), so this gives confusion over what the LLVM optimizer can and cannot analyze and (2) it produces counterintuitive situations where adding less alias information leads to better optimization (e.g. with -fno-struct-path-tbaa the access on s->p does get promoted out of the loop).

Further motivation

While it may be possible to improve the merging of TBAA tags to not lose information in this case, I doubt this is the way forward. It requires additional effort to implement merging of MemoryLocation objects, and it will be impossible to do so without information loss as the memory location information gets more complex. For example, the full restrict work adds a pointer provenance argument to MemoryLocation. We have experienced memory location merging issues in AST with it too.

Furthermore, the current AST implementation is deeply integration with the MemoryLocation contents; any extension to memory locations will require updates to AliasSetTracker, both trivial (unpacking and packing MemoryLocations) and non-trivial (merging of information in method updateSizeAndAAInfo). In contrast, when the AST tracks memory locations as such, they can be treated as opaque objects, used just as arguments for alias analysis queries.

Note that it is no longer necessary for the AST to handle copied or deleted pointer values since D138014, which also makes this change easier.

MustAlias sets, saturation

Besides merging memory location information for identical pointer values, the current AST implementation also resolves alias queries for MustAlias sets through a single representative (the first pointer in the set, as returned by getSomePointer). This implies that the memory location information for the other pointer values in the set must be merged into that pointer (although merging e.g. AAInfo and certainly provenance from another pointer - albeit a must-alias - is dubious IMO).

In line with the idea to not merge memory locations, I propose to resolve alias queries for MustAlias sets in the same way as MayAlias sets, checking all elements in the set and not using a representative. Consequently, the saturation mechanism - that protects against excessive time building alias sets and resolving queries - should count elements from all sets, not just MayAlias sets. I suspect MustAlias sets with distinct pointer values have become less frequent since switching to opaque pointers, as it allows to have load/stores with different types from the same pointer value directly, without bitcasts. Hence my intuition that it makes less sense to specifically optimize MustAlias sets anymore.

Compile-time impact

One can assume that the merging of memory locations in AST (for identical pointer values or for pointers in a must alias relation) was a tradeoff to save on compile-time (or at least that to stop merging will have a negative impact on compile-time). To have a first assessment of this, the change was tested with LLVM Compile-Time Tracker:

https://llvm-compile-time-tracker.com/compare.php?from=fab25949688d7b038ee047d4e8b44a23803db3bd&to=aa7ccadde93f78148026103a03c282e23acbf0b3&stat=instructions:u

No significant effects seem to appear here.


Thanks to @dobbelaj for his help.

1 Like