RFC: A plan for stateful alias analysis in LLVM

Front matter

First, I want to emphasize that this is not a short-term plan. This is a long-term plan. When I was looking at refactoring the existing Alias Analysis infrastructure in LLVM in ways that would break stateful AA, Hal Finkel asked me to figure out how things would work with the new pass manager and make sure that was sane.

This plan assumes the new pass manager. There is nothing deeply or intrinsically tied to it, but the pieces will fit together naturally there, and I don’t see how to make them fit naturally in the existing pass manager. I’ve not invested a lot of time in figuring that out as it seems a sunk cost.

Core ideas of what stateful AA should look like:

  1. An alias analysis is an analysis pass, and if it needs to build up state, it must do so for a “unit” of IR at a time (using the new PM’s terminology).
  2. Per-IR-unit invalidation from the pass manager is the primary mechanism to clear out stale (and thus invalid) state from any alias analysis. That means that most passes will not preserve AA by default (unlike today).
  3. To be reasonably scalable, most alias analyses which retain state will need to handle these invalidation events in a way that gracefully degrades their results rather than wiping out all state.

I think everything in-tree but GlobalsModRef fits these requirements already. Notably, this will work well for CFL, Andersen, or generally any “intersection” based AA. A “union” based AA could work here, but would have to do considerable work to be able to tolerate updates that “invalidate” its state without simply forced recomputation. I think this is a reasonable tradeoff as it preserves our ability to experiment, while simplifying the update model.

The rest of this email contains much more detailed musing on this. Feel free to stop reading when necessary.

The last section touches on the core functionality that would be needed to support something like GlobalsModRef in its current form. I’m much less confident in this part, so it’s not part of my core requirements above.

Details about analysis passes and pass manager invalidation

My thought is that an alias analysis pass in the new model would be an analysis pass. It would produce an analysis result that contained any state necessary to vend AA queries. This result would be cached just like all the other analysis results are in the pass manager. It would have the same invalidation rules too.

A really important consequence of this is that analysis results in the new scheme get notified of invalidation events and can attempt to update their internal state to avoid becoming invalid. AAs can leverage this to do fast incremental updates to their internal structures if appropriate.
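To make that concrete, here is a minimal, self-contained C++ sketch of the kind of result object I have in mind. The class and the shape of the invalidate hook (MyAAResult, the TouchedValues parameter) are purely illustrative, not the API I'm proposing; the point is only that invalidation can degrade the state rather than discard it:

```cpp
#include <set>
#include <utility>

// Stand-in for an IR entity; in LLVM this would be llvm::Value.
struct Value {};

// Hypothetical result object produced by a stateful AA analysis pass.
class MyAAResult {
  // Pairs of values proven not to alias within the analyzed function.
  std::set<std::pair<const Value *, const Value *>> NoAliasPairs;

public:
  void recordNoAlias(const Value *A, const Value *B) {
    NoAliasPairs.insert({A, B});
    NoAliasPairs.insert({B, A});
  }

  bool provenNoAlias(const Value *A, const Value *B) const {
    return NoAliasPairs.count({A, B}) != 0;
  }

  // Called when a transformation did not preserve this analysis. Returning
  // false means "I updated myself; keep me cached". We degrade gracefully:
  // drop facts about the values the transformation touched instead of
  // wiping the whole table.
  bool invalidate(const std::set<const Value *> &TouchedValues) {
    for (auto It = NoAliasPairs.begin(); It != NoAliasPairs.end();) {
      if (TouchedValues.count(It->first) || TouchedValues.count(It->second))
        It = NoAliasPairs.erase(It);
      else
        ++It;
    }
    return false; // The remaining facts are still valid; stay cached.
  }
};
```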

An AA could even use ValueHandles now to automatically stay up-to-date with a conservatively correct data structure, and just drop all the invalidation requests. In fact, I suspect many stateful AAs will take exactly this approach.
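Sketching that value-handle approach with stand-in types (this is not the real ValueHandle machinery, just the shape of the idea): the result only ever removes facts in response to IR changes, so every remaining answer stays conservatively correct and the pass manager's invalidation events can simply be dropped.

```cpp
#include <functional>
#include <map>
#include <utility>
#include <vector>

struct Value {};

// Stand-in for a CallbackVH-style mechanism: invokes callbacks when a value
// is deleted, so caches can drop their entries automatically.
class ValueEvents {
  std::vector<std::function<void(const Value *)>> DeletedCallbacks;

public:
  void onDeleted(std::function<void(const Value *)> CB) {
    DeletedCallbacks.push_back(std::move(CB));
  }
  void notifyDeleted(const Value *V) {
    for (auto &CB : DeletedCallbacks)
      CB(V);
  }
};

// A conservatively correct cache: facts are only ever removed in response to
// IR changes, so ignoring pass-manager invalidation is safe.
class SelfUpdatingAAResult {
  std::map<const Value *, int> PointsToSetID; // hypothetical per-value summary

public:
  explicit SelfUpdatingAAResult(ValueEvents &Events) {
    Events.onDeleted([this](const Value *V) { PointsToSetID.erase(V); });
  }

  void record(const Value *V, int SetID) { PointsToSetID[V] = SetID; }

  bool mayAlias(const Value *A, const Value *B) const {
    auto IA = PointsToSetID.find(A);
    auto IB = PointsToSetID.find(B);
    if (IA == PointsToSetID.end() || IB == PointsToSetID.end())
      return true; // Unknown values: answer conservatively.
    return IA->second == IB->second;
  }

  // Invalidation handler: nothing to do, the callbacks already keep us
  // conservatively correct.
  bool invalidate() { return false; }
};
```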

How will users of AA information query it?

This will work much like the normal analysis queries. I plan to provide an alias analysis manager at each IR unit that will provide a single query interface for code to use. This layer can then be configured to query any AnalysisManager<T> for an IR unit T for the relevant actual AliasAnalysis implementations.

The AA manager layer will be responsible for the delegation that is currently done through chaining of AA implementations. It will also be responsible for enforcing that queries at a particular layer don't violate IR unit abstraction boundaries. For example, you cannot query a FunctionAAManager for alias information of two Values from two different functions.
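As a rough standalone sketch of that layer (FunctionAAManager and FunctionAAImpl are placeholder names, not the API I'll actually propose): it walks the registered implementations, returns the first definitive answer, and asserts that both values belong to the function it serves.

```cpp
#include <cassert>
#include <vector>

struct Function {};
struct Value {
  const Function *Parent = nullptr; // the function this value lives in
};

enum class AliasResult { NoAlias, MayAlias, MustAlias };

// Interface each function-level AA implementation would expose.
struct FunctionAAImpl {
  virtual ~FunctionAAImpl() = default;
  virtual AliasResult alias(const Value &A, const Value &B) = 0;
};

// Hypothetical FunctionAAManager: the single query interface clients use.
class FunctionAAManager {
  const Function &F;
  std::vector<FunctionAAImpl *> Impls; // the configured set of AAs to consult

public:
  explicit FunctionAAManager(const Function &F) : F(F) {}

  void registerAA(FunctionAAImpl &AA) { Impls.push_back(&AA); }

  AliasResult alias(const Value &A, const Value &B) {
    // Enforce the IR-unit abstraction boundary: both values must come from
    // the function this manager serves.
    assert(A.Parent == &F && B.Parent == &F &&
           "cannot mix values from different functions in a function query");
    // Delegation replaces today's explicit chaining: ask each AA in turn
    // and take the first definitive answer.
    for (FunctionAAImpl *AA : Impls) {
      AliasResult R = AA->alias(A, B);
      if (R != AliasResult::MayAlias)
        return R;
    }
    return AliasResult::MayAlias;
  }
};
```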

Why GlobalsModRef doesn’t fit, and what we could do about it

There is a fundamental construct in GlobalsModRef that exists today, and seems somewhat useful, but which doesn’t fit the above: using a global analysis of which memory locations are not escaped by any function, in order to conclude that within a given function, arguments, returns, and loads from globals cannot produce an aliasing memory location. Thus, if you always directly index or manipulate a global within a function, and it never escapes that function (including through integer hacks), you know that some other function can’t produce an aliasing location by loading a pointer from a global or receiving a pointer in an argument.

In practice this is actually totally reasonable. Passes don’t run around escaping pointers just because they feel like it. However, we don’t have a good way to really formally reason about such things.

My suggested way to reason about it is to define a new constraint on transformation passes over an IR unit T (i.e., Function, Module, etc.). This constraint is that a pass may not observably escape a pointer value P from the unit being transformed. Now, "observably" is the key word here. The goal is to still allow fundamental transformations like Reg2Mem and instrumentation passes. However, the guarantee must be that the passes do not introduce new ways to observe an escape of a pointer. Because any actual escapes cannot be observed, it is safe for them to happen and for some other function pass (for example) to transform based on the non-escaping property computed prior to the transformation. I think this is a reasonable model, but I've not really had enough time thinking about it to be 100% certain.

With such a model, we could regain most of the power of GlobalsModRef w.r.t. non-escaping pointers and interprocedural aliasing inference. We would change it to be a two-pronged analysis. On one hand, a local analysis would have to prove that one location came from a module-level memory location (global) and the other location came from some dynamic input to the function (argument, return, or a load of memory that is def’ed outside the function), and then the other hand would provide the interprocedural information about what global memory locations were non-escaping.
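A standalone sketch of that two-pronged query (isDynamicInput and the non-escaping set are stubs standing in for the local and interprocedural prongs): when one location is a non-escaping global and the other provably arrived through an argument, return value, or load of externally defined memory, they cannot alias.

```cpp
#include <set>

struct Value {};

// Prong 1 (interprocedural): globals whose address never escapes any
// function. In the real analysis this set would be computed module-wide.
struct NonEscapingGlobals {
  std::set<const Value *> Globals;
  bool contains(const Value *V) const { return Globals.count(V) != 0; }
};

// Prong 2 (local): does this pointer come from outside the current function
// (argument, call return, or a load of memory defined elsewhere)? This is a
// stub; the real check would walk the pointer's def chain.
bool isDynamicInput(const Value *V, const std::set<const Value *> &DynInputs) {
  return DynInputs.count(V) != 0;
}

// Combining the prongs: a non-escaping global can never be reached through a
// dynamic input, so such a pair of locations cannot alias.
bool provablyNoAlias(const Value *A, const Value *B,
                     const NonEscapingGlobals &NEG,
                     const std::set<const Value *> &DynInputs) {
  if (NEG.contains(A) && isDynamicInput(B, DynInputs))
    return true;
  if (NEG.contains(B) && isDynamicInput(A, DynInputs))
    return true;
  return false; // No conclusion either way; fall back to MayAlias.
}
```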

Ok, those are all the thoughts I have currently about AA, and where I’m headed. This should hopefully provide context for some of the upcoming patches.

I’m also going to write a separate email about specific tactical problems I’ve found with GlobalsModRef today (nothing to do with the above theoretical problems) and how I plan to approach them.

Thanks,
-Chandler

Firstly, thanks for the detailed update on this…

Ok, those are all the thoughts I have currently about AA, and where I’m headed. This should hopefully provide context for some of the upcoming patches.

‘Headed’ w.r.t. the new pass manager specifically, or AA in both the old and new pass managers? ‘Upcoming patches’ makes it sound like this is coming soon, so it will likely land before the new PM is the only PM.

TBH I’m worried about the surface area we have to debug for correctness/performance issues when it comes time to adopt the new pass manager.

The new pass manager by design is going to lead to different behavior. Analyses will be preserved and invalidated in different orders from old to new PMs leading to differences in behavior of transform passes. Bugs will be found in transforms themselves as a result of the new PM ultimately (indirectly) giving them different code. All of this is going to take time to debug and block adoption of the new PM.

I’d like to see us try to get the new PM enabled before increasing the surface area any more. Removing the bad AA updating code was fine, as we still have the same AA updating behavior between old/new (i.e., no AA updating). But I think adding new AA API should be done either in both PMs or in neither (for now). For one thing, we can’t have enough confidence that the code is working correctly if it only works in the new PM, which isn’t yet being exposed to as much code as the old PM. Unit tests can only do so much: we can have confidence that the AA API is good with a unit test, but it’s harder to have confidence that a transform is calling it correctly without lots of code going through the transform pass.

Bugs in transforms related to AA updating will likely be extremely subtle, and it’s better to have a blame list that is as short as possible. That can only happen if the AA code is running on the bots, or else we may not catch it until the new PM is adopted.

Thanks,
Pete

Firstly, thanks for the detailed update on this…

Ok, those are all the thoughts I have currently about AA, and where I’m headed. This should hopefully provide context for some of the upcoming patches.

‘Headed’ w.r.t. the new pass manager specifically, or AA in both the old and new pass managers? ‘Upcoming patches’ makes it sound like this is coming soon, so it will likely land before the new PM is the only PM.

TBH I’m worried about the surface area we have to debug for correctness/performance issues when it comes time to adopt the new pass manager.

The new pass manager by design is going to lead to different behavior. Analyses will be preserved and invalidated in different orders from old to new PMs leading to differences in behavior of transform passes. Bugs will be found in transforms themselves as a result of the new PM ultimately (indirectly) giving them different code. All of this is going to take time to debug and block adoption of the new PM.

Yep. Trust me, I’m not happy about this, and it’s something I think about a lot.

I’d like to see us try to get the new PM enabled before increasing the surface area any more.

Just to clarify, I’m not working on AA because I want to. =] I mean, I’m enjoying it, but I’m only doing what is on my critical path of getting the new PM working. For example, I wrote this email specifically because Hal encouraged me to write it up prior to proceeding with refactorings to remove the update API and establish the core APIs that will work with the new pass manager.

Removing the bad AA updating code was fine, as we still have the same AA updating behavior between old/new (i.e., no AA updating). But I think adding new AA API should be done either in both PMs or in neither (for now). For one thing, we can’t have enough confidence that the code is working correctly if it only works in the new PM, which isn’t yet being exposed to as much code as the old PM. Unit tests can only do so much: we can have confidence that the AA API is good with a unit test, but it’s harder to have confidence that a transform is calling it correctly without lots of code going through the transform pass.

Bugs in transforms related to AA updating will likely be extremely subtle, and it’s better to have a blame list that is as short as possible. That can only happen if the AA code is running on the bots, or else we may not catch it until the new PM is adopted.

Preaching to the choir. I tried to make my opening clarify this. I’m not planning on adding any special functionality for updating until the new pass manager is firmly in place, precisely because the greater the divergence between the two, the worse the transition will be.

Here, I’m just trying to give an idea of the structure I’m imagining that will allow us to eventually do the update thing. The APIs I’ll need to introduce to get minimal functionality with the new pass manager will actually go a long way down this path (we’ll need FunctionAAManager, etc.), so I think Hal was right that I need to lay out this context before doing stuff. But I plan to do exactly as little as I can get away with until there is a reasonable baseline entirely based on the new PM.

-Chandler

Firstly, thanks for the detailed update on this…

Ok, those are all the thoughts I have currently about AA, and where I’m headed. This should hopefully provide context for some of the upcoming patches.

‘Headed’ w.r.t. the new pass manager specifically, or AA in both the old and new pass managers? ‘Upcoming patches’ makes it sound like this is coming soon, so it will likely land before the new PM is the only PM.

TBH I’m worried about the surface area we have to debug for correctness/performance issues when it comes time to adopt the new pass manager.

The new pass manager by design is going to lead to different behavior. Analyses will be preserved and invalidated in different orders from old to new PMs leading to differences in behavior of transform passes. Bugs will be found in transforms themselves as a result of the new PM ultimately (indirectly) giving them different code. All of this is going to take time to debug and block adoption of the new PM.

Yep. Trust me, I’m not happy about this, and it’s something I think about a lot.

I’d like to see us try to get the new PM enabled before increasing the surface area any more.

Just to clarify, I’m not working on AA because I want to. =]

I wouldn’t blame you; AA looks like fun :)

I mean, I’m enjoying it, but I’m only doing what is on my critical path of getting the new PM working. For example, I wrote this email specifically because Hal encouraged me to write it up prior to proceeding with refactorings to remove the update API and establish the core APIs that will work with the new pass manager.

Thanks. I do appreciate the write-up. I didn’t think you were immediately taking on the task of stateful AA but thought it best to understand exactly what is being proposed short/long-term.

Removing the bad AA updating code was fine, as we still have the same AA updating behavior between old/new (i.e., no AA updating). But I think adding new AA API should be done either in both PMs or in neither (for now). For one thing, we can’t have enough confidence that the code is working correctly if it only works in the new PM, which isn’t yet being exposed to as much code as the old PM. Unit tests can only do so much: we can have confidence that the AA API is good with a unit test, but it’s harder to have confidence that a transform is calling it correctly without lots of code going through the transform pass.

Bugs in transforms related to AA updating will likely be extremely subtle, and it’s better to have a blame list that is as short as possible. That can only happen if the AA code is running on the bots, or else we may not catch it until the new PM is adopted.

Preaching to the choir. I tried to make my opening clarify this. I’m not planning on adding any special functionality for updating until the new pass manager is firmly in place, precisely because the greater the divergence between the two, the worse the transition will be.

Excellent. Thanks.

Here, I’m just trying to give an idea of the structure I’m imagining that will allow us to eventually do the update thing. The APIs I’ll need to introduce to get minimal functionality with the new pass manager will actually go a long way down this path (we’ll need FunctionAAManager, etc.), so I think Hal was right that I need to lay out this context before doing stuff. But I plan to do exactly as little as I can get away with until there is a reasonable baseline entirely based on the new PM.

I haven’t looked into this at all, so feel free to shoot it down. How extensively is AA, let alone stateful AA, used at -O0/-O1 or in some backends via llc? My thinking is: would it be possible to move any of those to the new PM without doing any more AA work? -O0, for example, really just needs to hook into NoAA, which is stateless. That would give us adoption of the new PM much sooner on those workloads, and then we can incrementally add what is needed for -O2, etc.

Cheers,
Pete

From: "Chandler Carruth" <chandlerc@google.com>
To: "LLVM Developers Mailing List" <llvmdev@cs.uiuc.edu>, "Hal Finkel" <hfinkel@anl.gov>, "Daniel Berlin"
<dannyb@google.com>
Sent: Monday, July 13, 2015 9:59:10 PM
Subject: RFC: A plan for stateful alias analysis in LLVM

# Front matter # First, I want to emphasize that this is not a
short-term plan. This is a long-term plan. When I was looking at
refactoring the existing Alias Analysis infrastructure in LLVM in
ways that would break stateful AA, Hal Finkel asked me to figure out
how things would work with the new pass manager and make sure that
was sane.

This plan assumes the new pass manager. There is nothing deeply or
intrinsically tied to it, but the pieces will fit together naturally
there, and I don't see how to make them fit naturally in the
existing pass manager. I've not invested a lot of time in figuring
that out as it seems a sunk cost.

# Core ideas of what stateful AA should look like: #
1) An alias analysis *is* an analysis pass, and if it needs to build
up state, it must do so for a "unit" of IR at a time (using the new
PM's terminology).
2) Per-IR-unit invalidation from the pass manager is the primary
mechanism to clear out stale (and thus invalid) state from any alias
analysis. That means that most passes will not preserve AA by
default (unlike today).
3) To be reasonably scalable, most alias analyses which retain state
will need to handle these invalidation events in a way that
gracefully degrades their results rather than wiping out all state.

I think everything in-tree but GlobalsModRef fits these requirements
already. Notably, this will work well for CFL, Andersen, or
generally any "intersection" based AA. A "union" based AA could work
here, but would have to do considerable work to be able to tolerate
updates that "invalidate" its state without simply forced
recomputation. I think this is a reasonable tradeoff as it preserves
our ability to experiment, while simplifying the update model.

The rest of this email contains much more detailed musing on this.
Feel free to stop reading when necessary.

The last section touches on the core functionality that would be
needed to support something like GlobalsModRef in its current form.
I'm much less confident in this part, so it's not part of my core
requirements above.

# Details about analysis passes and pass manager invalidation #
My thought is that an alias analysis pass in the new model would be
an analysis pass. It would produce an analysis result that contained
any state necessary to vend AA queries. This result would be cached
just like all the other analysis results are in the pass manager. It
would have the same invalidation rules too.

A really important consequence of this is that analysis results in
the new scheme get notified of invalidation events and can attempt
to update their internal state to *avoid* becoming invalid. AAs can
leverage this to do fast incremental updates to their internal
structures if appropriate.

An AA could even use ValueHandles now to automatically stay
up-to-date with a conservatively correct data structure, and just
drop all the invalidation requests. In fact, I suspect many stateful
AAs will take exactly this approach.

# How will users of AA information query it? #
This will work much like the normal analysis queries. I plan to
provide an alias analysis manager at each IR unit that will provide
a single query interface for code to use. This layer can then be
configured to query any AnalysisManager<T> for an IR unit T for the
relevant actual AliasAnalysis implementations.

The AA manager layer will be responsible for the delegation that is
currently done through chaining of AA implementations. It will also
be responsible for enforcing that queries at a particular layer
don't violate IR unit abstraction boundaries. For example, you
cannot query a FuntionAAManager for alias information of two Values
from two different functions.

One thing I'd like to clarify is what is responsible for 'requiring' the various alias-analysis passes, so that they'll be recomputed if they're invalidated. So if I have an AliasManager, and I've requested that it manage a BasicAA and an SCEVAA (for example), and something happens such that SCEVAA is invalidated, then before future AA queries are processed, the AliasManager is responsible for triggering the recomputation of SCEVAA as part of the chaining process (or before the first AA query). That's my expectation anyway; does that match this model?

Thanks again,
Hal

Sounds great! I hope the interprocedural alias analysis can be added in the next version, so next time I can call it directly :)

From: "Chandler Carruth" <chandlerc@google.com>
To: "LLVM Developers Mailing List" <llvmdev@cs.uiuc.edu>, "Hal Finkel" <hfinkel@anl.gov>, "Daniel Berlin"
<dannyb@google.com>
Sent: Monday, July 13, 2015 9:59:10 PM
Subject: RFC: A plan for stateful alias analysis in LLVM

# Front matter # First, I want to emphasize that this is not a
short-term plan. This is a long-term plan. When I was looking at
refactoring the existing Alias Analysis infrastructure in LLVM in
ways that would break stateful AA, Hal Finkel asked me to figure out
how things would work with the new pass manager and make sure that
was sane.

This plan assumes the new pass manager. There is nothing deeply or
intrinsically tied to it, but the pieces will fit together naturally
there, and I don't see how to make them fit naturally in the
existing pass manager. I've not invested a lot of time in figuring
that out as it seems a sunk cost.

# Core ideas of what stateful AA should look like: #
1) An alias analysis *is* an analysis pass, and if it needs to build
up state, it must do so for a "unit" of IR at a time (using the new
PM's terminology).
2) Per-IR-unit invalidation from the pass manager is the primary
mechanism to clear out stale (and thus invalid) state from any alias
analysis. That means that most passes will not preserve AA by
default (unlike today).
3) To be reasonably scalable, most alias analyses which retain state
will need to handle these invalidation events in a way that
gracefully degrades their results rather than wiping out all state.

I think everything in-tree but GlobalsModRef fits these requirements
already. Notably, this will work well for CFL, Andersen, or
generally any "intersection" based AA. A "union" based AA could work
here, but would have to do considerable work to be able to tolerate
updates that "invalidate" its state without simply forced
recomputation. I think this is a reasonable tradeoff as it preserves
our ability to experiment, while simplifying the update model.

The rest of this email contains much more detailed musing on this.
Feel free to stop reading when necessary.

The last section touches on the core functionality that would be
needed to support something like GlobalsModRef in its current form.
I'm much less confident in this part, so it's not part of my core
requirements above.

# Details about analysis passes and pass manager invalidation #
My thought is that an alias analysis pass in the new model would be
an analysis pass. It would produce an analysis result that contained
any state necessary to vend AA queries. This result would be cached
just like all the other analysis results are in the pass manager. It
would have the same invalidation rules too.

A really important consequence of this is that analysis results in
the new scheme get notified of invalidation events and can attempt
to update their internal state to *avoid* becoming invalid. AAs can
leverage this to do fast incremental updates to their internal
structures if appropriate.

An AA could even use ValueHandles now to automatically stay
up-to-date with a conservatively correct data structure, and just
drop all the invalidation requests. In fact, I suspect many stateful
AAs will take exactly this approach.

# How will users of AA information query it? #
This will work much like the normal analysis queries. I plan to
provide an alias analysis manager at each IR unit that will provide
a single query interface for code to use. This layer can then be
configured to query any AnalysisManager<T> for an IR unit T for the
relevant actual AliasAnalysis implementations.

The AA manager layer will be responsible for the delegation that is
currently done through chaining of AA implementations. It will also
be responsible for enforcing that queries at a particular layer
don't violate IR unit abstraction boundaries. For example, you
cannot query a FuntionAAManager for alias information of two Values
from two different functions.

One thing I'd like to clarify is what is responsible for 'requiring' the various alias-analysis passes, so that they'll be recomputed if they're invalidated. So if I have an AliasManager, and I've requested that it manage a BasicAA and an SCEVAA (for example), and something happens such that SCEVAA is invalidated, then before future AA queries are processed, the AliasManager is responsible for triggering the recomputation of SCEVAA as part of the chaining process (or before the first AA query). That's my expectation anyway; does that match this model?

That could get expensive if we keep recomputing AA results we end up not needing. We might need to try to find a way to be lazier about it.

Take TBAA, for example: if it required any kind of computation, it would be better to delay it until we actually query AA on a load/store which has the metadata. I’m pretty sure TBAA doesn’t have state to recompute, but imagine it did, and hopefully what I said makes sense.
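To make the laziness concrete, here is a rough standalone sketch (hypothetical types, not a concrete API proposal): invalidation just clears the cached state, and the expensive recomputation is deferred until the first query that actually needs it.

```cpp
#include <functional>
#include <optional>
#include <utility>

struct Function {};

// Hypothetical expensive AA state for one function.
struct ExpensiveAAResult {
  // ... whatever tables the analysis builds ...
};

// A lazily recomputed slot: invalidation just clears the cache; the
// recomputation happens on the first query that needs the result, not
// eagerly when the invalidation event arrives.
class LazyAASlot {
  std::function<ExpensiveAAResult(const Function &)> Compute;
  std::optional<ExpensiveAAResult> Cached;

public:
  explicit LazyAASlot(std::function<ExpensiveAAResult(const Function &)> C)
      : Compute(std::move(C)) {}

  void invalidate() { Cached.reset(); } // cheap: just forget the result

  const ExpensiveAAResult &get(const Function &F) {
    if (!Cached)
      Cached = Compute(F); // pay the cost only when a query actually arrives
    return *Cached;
  }
};
```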

Cheers,
Pete

From: "Pete Cooper" <peter_cooper@apple.com>
To: "Hal Finkel" <hfinkel@anl.gov>
Cc: "Chandler Carruth" <chandlerc@google.com>, "Daniel Berlin" <dannyb@google.com>, "LLVM Developers Mailing List"
<llvmdev@cs.uiuc.edu>
Sent: Thursday, July 16, 2015 3:32:43 PM
Subject: Re: [LLVMdev] RFC: A plan for stateful alias analysis in LLVM

>
>> From: "Chandler Carruth" <chandlerc@google.com>
>> To: "LLVM Developers Mailing List" <llvmdev@cs.uiuc.edu>, "Hal
>> Finkel" <hfinkel@anl.gov>, "Daniel Berlin"
>> <dannyb@google.com>
>> Sent: Monday, July 13, 2015 9:59:10 PM
>> Subject: RFC: A plan for stateful alias analysis in LLVM
>>
>>
>>
>> # Front matter # First, I want to emphasize that this is not a
>> short-term plan. This is a long-term plan. When I was looking at
>> refactoring the existing Alias Analysis infrastructure in LLVM in
>> ways that would break stateful AA, Hal Finkel asked me to figure
>> out
>> how things would work with the new pass manager and make sure that
>> was sane.
>>
>>
>> This plan assumes the new pass manager. There is nothing deeply or
>> intrinsically tied to it, but the pieces will fit together
>> naturally
>> there, and I don't see how to make them fit naturally in the
>> existing pass manager. I've not invested a lot of time in figuring
>> that out as it seems a sunk cost.
>>
>>
>> # Core ideas of what stateful AA should look like: #
>> 1) An alias analysis *is* an analysis pass, and if it needs to
>> build
>> up state, it must do so for a "unit" of IR at a time (using the
>> new
>> PM's terminology).
>> 2) Per-IR-unit invalidation from the pass manager is the primary
>> mechanism to clear out stale (and thus invalid) state from any
>> alias
>> analysis. That means that most passes will not preserve AA by
>> default (unlike today).
>> 3) To be reasonably scalable, most alias analyses which retain
>> state
>> will need to handle these invalidation events in a way that
>> gracefully degrades their results rather than wiping out all
>> state.
>>
>>
>> I think everything in-tree but GlobalsModRef fits these
>> requirements
>> already. Notably, this will work well for CFL, Andersen, or
>> generally any "intersection" based AA. A "union" based AA could
>> work
>> here, but would have to do considerable work to be able to
>> tolerate
>> updates that "invalidate" its state without simply forced
>> recomputation. I think this is a reasonable tradeoff as it
>> preserves
>> our ability to experiment, while simplifying the update model.
>>
>>
>> The rest of this email contains much more detailed musing on this.
>> Feel free to stop reading when necessary.
>>
>>
>> The last section touches on the core functionality that would be
>> needed to support something like GlobalsModRef in its current
>> form.
>> I'm much less confident in this part, so it's not part of my core
>> requirements above.
>>
>>
>> # Details about analysis passes and pass manager invalidation #
>> My thought is that an alias analysis pass in the new model would
>> be
>> an analysis pass. It would produce an analysis result that
>> contained
>> any state necessary to vend AA queries. This result would be
>> cached
>> just like all the other analysis results are in the pass manager.
>> It
>> would have the same invalidation rules too.
>>
>>
>> A really important consequence of this is that analysis results in
>> the new scheme get notified of invalidation events and can attempt
>> to update their internal state to *avoid* becoming invalid. AAs
>> can
>> leverage this to do fast incremental updates to their internal
>> structures if appropriate.
>>
>>
>> An AA could even use ValueHandles now to automatically stay
>> up-to-date with a conservatively correct data structure, and just
>> drop all the invalidation requests. In fact, I suspect many
>> stateful
>> AAs will take exactly this approach.
>>
>>
>> # How will users of AA information query it? #
>> This will work much like the normal analysis queries. I plan to
>> provide an alias analysis manager at each IR unit that will
>> provide
>> a single query interface for code to use. This layer can then be
>> configured to query any AnalysisManager<T> for an IR unit T for
>> the
>> relevant actual AliasAnalysis implementations.
>>
>>
>> The AA manager layer will be responsible for the delegation that
>> is
>> currently done through chaining of AA implementations. It will
>> also
>> be responsible for enforcing that queries at a particular layer
>> don't violate IR unit abstraction boundaries. For example, you
>> cannot query a FuntionAAManager for alias information of two
>> Values
>> from two different functions.
>
> One thing I'd like to clarify is what is responsible for
> 'requiring' the various alias-analysis pass, so that they'll be
> recomputed if they're invalidated. So if I have an AliasManager,
> and I've requested that it manage a BasicAA and an SCEVAA (for
> example), and something happens such that SCEVAA is invalidated,
> then before future AA queries are processed, the AliasManager is
> responsible for triggering the re-computation of SCEVAA as part of
> the chaining process (or before the first AA query). That's my
> expectation anyway; does that match this model?
That could get expensive if we keep recomputing AA results we end up not needing. We might need to try to find a way to be lazier about it.

I don't disagree, but to be clear, I'm envisioning some version of SCEVAA that is otherwise efficient enough to be used within the standard pipeline.

-Hal