AliasAnalysis update interface - a tale of sorrow and woe

Greetings folks,

As I’m working on refactoring the AA interfaces in LLVM to prepare for the new pass manager, I keep hitting issues. Some of the complexity that is hitting this stems from the update API, which consists of three virtual functions:

deleteValue(Value *V)
copyValue(Value *From, Value *To)
addEscapingUse(Use &U)

These interfaces are very rarely called. Here are the only passes that ever bother to use these:

  • MemoryDependenceAnalysis.cpp
  • BasicBlockUtils.cpp
  • LoopSimplify.cpp
  • MergedLoadStoreMotion.cpp
  • GVN.cpp

That’s it. This is not a complete set of code that can delete a load or store, nor a complete list of things which can fold something into a Phi node or otherwise trigger an “escaping” use. Maybe these interfaces used to be much more widely used prior to the introduction of AliasSetTracker? Now that utility is used instead. Either way, basic things like CSE, store->load forwarding in instcombine, etc., all seem like they should be updating AA, but they aren’t…

So, if these update APIs are really important, I’m pretty sure that LLVM is completely broken already…

But in fact, almost nothing overrides these APIs. The only pass in tree that uses them at all is GlobalsModRef, which might explain why the world hasn’t melted due to us not calling these update routines reliably.

So I looked into GlobalsModRef to understand why it is overriding these. It doesn’t override copyValue at all. That API point appears to be completely dead. So my first question is: can I remove copyValue? Is there some out of tree user that desperately needs this? If so, I’d like to understand why.

The APIs it does override are deleteValue to nix things in its cache, and addEscapingUse, which it implements by just deleting the used value from its cache. This last one seems the most interesting. Only GVN calls it, but GVN does call it. So I commented out the code in addEscapingUse so that it became a no-op. No test failed. =/ So I added an assert to its addEscapingUse method. No test failed. So I added a report_fatal_error to the addEscapingUse method and did an LTO run over clang’s bitcode which finally did reach this code path.

addEscapingUse was added in 2011 by r122787 without any test case. There is no mention of this fixing a bug. It looks like it may have been intended to support something thata was never added to the tree.

So I’d like to remove addEscapingUse since we used to not have it, and we’ve never bothered to test it and I can’t get anything to fail without it, and GlobalsModRef is the only user and that is only reached during LTO. Thoughts?

My final question is deleteValue. Again, only GlobalsModRef implements this extension point. This was added in 2006 in r30684, also without any changes to any test. It claims to implement a test file, but that test isn’t changed in the commit. Indeed, commenting out the body of deleteValue in GlobalsModRef causes no test to fail. =/ Fortunately, putting an assert here does trip in the regression test suite, so the code is reached, just not exercised.

Either way, the use case for the deleteValue at least makes perfect sense. But there is (IMO) a much better way to accomplish the same task: use a ValueHandle to trigger the update on deletion. This will prevent widespread failure to use the deleteValue API to update alias analysis. So I would like to replace deleteValue with a ValueHandle approach. But I’m more than a little nervous implementing this, as no test actually uses the behavior! At least the code is reached…

I think generally, the update API in the AliasAnalysis interface isn’t working well. It is poorly and incompletely used. It isn’t even clear that it is necessary in the face of tools like value handles. Thoughts about just removing the entire thing and falling back to value handles everywhere?

-Chandler

My opinion:

The interface is essentially broken and unused.
We know it doesn't work.
We should remove it.

Given that we need to sit down and think about stateful alias analysis
due to the new pass manager, it seems like a good time to do that.

From: "Daniel Berlin" <dberlin@dberlin.org>
To: "Chandler Carruth" <chandlerc@gmail.com>
Cc: "LLVM Developers Mailing List" <llvmdev@cs.uiuc.edu>, "Hal Finkel" <hfinkel@anl.gov>, "Daniel Berlin"
<dannyb@google.com>
Sent: Wednesday, July 1, 2015 11:31:49 AM
Subject: Re: [LLVMdev] AliasAnalysis update interface - a tale of sorrow and woe

My opinion:

The interface is essentially broken and unused.
We know it doesn't work.
We should remove it.

Given that we need to sit down and think about stateful alias
analysis
due to the new pass manager, it seems like a good time to do that.

+1

-Hal

+2 to Hal and Daniel. We’ve looked at utilizing/improving this interface for our efforts. If its broken, lets fix it correctly.

John D. Leidel
Software Compiler Development Manager
Micron Technology, Inc.
jleidel@micron.com
office: 972-521-5271
cell: 214-578-8510

From: "Daniel Berlin" <dberlin@dberlin.org>
To: "Chandler Carruth" <chandlerc@gmail.com>
Cc: "LLVM Developers Mailing List" <llvmdev@cs.uiuc.edu>, "Hal Finkel" <hfinkel@anl.gov>, "Daniel Berlin"
<dannyb@google.com>
Sent: Wednesday, July 1, 2015 11:31:49 AM
Subject: Re: [LLVMdev] AliasAnalysis update interface - a tale of sorrow and woe

My opinion:

The interface is essentially broken and unused.
We know it doesn't work.
We should remove it.

Given that we need to sit down and think about stateful alias
analysis
due to the new pass manager, it seems like a good time to do that.

+1

+1

From: "Chandler Carruth" <chandlerc@gmail.com>
To: "LLVM Developers Mailing List" <llvmdev@cs.uiuc.edu>, "Hal Finkel" <hfinkel@anl.gov>, "Daniel Berlin"
<dannyb@google.com>
Sent: Wednesday, July 1, 2015 3:18:33 AM
Subject: AliasAnalysis update interface - a tale of sorrow and woe

Greetings folks,

As I'm working on refactoring the AA interfaces in LLVM to prepare
for the new pass manager, I keep hitting issues. Some of the
complexity that is hitting this stems from the update API, which
consists of three virtual functions:

deleteValue(Value *V)
copyValue(Value *From, Value *To)
addEscapingUse(Use &U)

These interfaces are *very* rarely called. Here are the only passes
that ever bother to use these:
- MemoryDependenceAnalysis.cpp
- BasicBlockUtils.cpp
- LoopSimplify.cpp
- MergedLoadStoreMotion.cpp
- GVN.cpp

That's it. This is not a complete set of code that can delete a load
or store, nor a complete list of things which can fold something
into a Phi node or otherwise trigger an "escaping" use. Maybe these
interfaces used to be much more widely used prior to the
introduction of AliasSetTracker? Now that utility is used instead.
Either way, basic things like CSE, store->load forwarding in
instcombine, etc., all seem like they *should* be updating AA, but
they aren't....

So, if these update APIs are really important, I'm pretty sure that
LLVM is completely broken already...

But in fact, almost nothing overrides these APIs. The only pass in
tree that uses them at all is GlobalsModRef, which might explain why
the world hasn't melted due to us not calling these update routines
reliably.

So I looked into GlobalsModRef to understand why it is overriding
these. It doesn't override copyValue at all. That API point appears
to be completely dead. So my first question is: can I remove
copyValue? Is there some out of tree user that desperately needs
this? If so, I'd like to understand why.

The APIs it does override are deleteValue to nix things in its cache,
and addEscapingUse, which it implements by just deleting the used
value from its cache. This last one seems the most interesting. Only
GVN calls it, but GVN does call it. So I commented out the code in
addEscapingUse so that it became a no-op. No test failed. =/ So I
added an assert to its addEscapingUse method. No test failed. So I
added a report_fatal_error to the addEscapingUse method and did an
LTO run over clang's bitcode which finally did reach this code path.

addEscapingUse was added in 2011 by r122787 without any test case.
There is no mention of this fixing a bug. It looks like it may have
been intended to support something thata was never added to the
tree.

So I'd like to remove addEscapingUse since we used to not have it,
and we've never bothered to test it and I can't get anything to fail
without it, and GlobalsModRef is the only user and that is only
reached during LTO. Thoughts?

My final question is deleteValue. Again, only GlobalsModRef
implements this extension point. This was added in 2006 in r30684,
also without any changes to any test. It claims to implement a test
file, but that test isn't changed in the commit. Indeed, commenting
out the body of deleteValue in GlobalsModRef causes no test to fail.
=/ Fortunately, putting an assert here *does* trip in the regression
test suite, so the code is reached, just not exercised.

Either way, the use case for the deleteValue at least makes perfect
sense. But there is (IMO) a much better way to accomplish the same
task: use a ValueHandle to trigger the update on deletion. This will
prevent widespread failure to use the deleteValue API to update
alias analysis. So I would like to replace deleteValue with a
ValueHandle approach. But I'm more than a little nervous
implementing this, *as no test actually uses the behavior*! At least
the code is reached...

I think generally, the update API in the AliasAnalysis interface
isn't working well. It is poorly and incompletely used. It isn't
even clear that it is necessary in the face of tools like value
handles. Thoughts about just removing the entire thing and falling
back to value handles everywhere?

As a side point, I don't think we currently have a ValueHandle type that can detect when a new use of a value is added. Do we? This would certainly be useful for some kind of AA result caching, etc.

-Hal

From: "Hal Finkel" <hfinkel@anl.gov>
To: "Chandler Carruth" <chandlerc@gmail.com>
Cc: "LLVM Developers Mailing List" <llvmdev@cs.uiuc.edu>, "Daniel Berlin" <dannyb@google.com>
Sent: Thursday, July 2, 2015 3:14:38 PM
Subject: Re: AliasAnalysis update interface - a tale of sorrow and woe

> From: "Chandler Carruth" <chandlerc@gmail.com>
> To: "LLVM Developers Mailing List" <llvmdev@cs.uiuc.edu>, "Hal
> Finkel" <hfinkel@anl.gov>, "Daniel Berlin"
> <dannyb@google.com>
> Sent: Wednesday, July 1, 2015 3:18:33 AM
> Subject: AliasAnalysis update interface - a tale of sorrow and woe
>
>
> Greetings folks,
>
>
> As I'm working on refactoring the AA interfaces in LLVM to prepare
> for the new pass manager, I keep hitting issues. Some of the
> complexity that is hitting this stems from the update API, which
> consists of three virtual functions:
>
>
> deleteValue(Value *V)
> copyValue(Value *From, Value *To)
> addEscapingUse(Use &U)
>
>
> These interfaces are *very* rarely called. Here are the only passes
> that ever bother to use these:
> - MemoryDependenceAnalysis.cpp
> - BasicBlockUtils.cpp
> - LoopSimplify.cpp
> - MergedLoadStoreMotion.cpp
> - GVN.cpp
>
>
> That's it. This is not a complete set of code that can delete a
> load
> or store, nor a complete list of things which can fold something
> into a Phi node or otherwise trigger an "escaping" use. Maybe these
> interfaces used to be much more widely used prior to the
> introduction of AliasSetTracker? Now that utility is used instead.
> Either way, basic things like CSE, store->load forwarding in
> instcombine, etc., all seem like they *should* be updating AA, but
> they aren't....
>
>
> So, if these update APIs are really important, I'm pretty sure that
> LLVM is completely broken already...
>
>
> But in fact, almost nothing overrides these APIs. The only pass in
> tree that uses them at all is GlobalsModRef, which might explain
> why
> the world hasn't melted due to us not calling these update routines
> reliably.
>
>
> So I looked into GlobalsModRef to understand why it is overriding
> these. It doesn't override copyValue at all. That API point appears
> to be completely dead. So my first question is: can I remove
> copyValue? Is there some out of tree user that desperately needs
> this? If so, I'd like to understand why.
>
>
> The APIs it does override are deleteValue to nix things in its
> cache,
> and addEscapingUse, which it implements by just deleting the used
> value from its cache. This last one seems the most interesting.
> Only
> GVN calls it, but GVN does call it. So I commented out the code in
> addEscapingUse so that it became a no-op. No test failed. =/ So I
> added an assert to its addEscapingUse method. No test failed. So I
> added a report_fatal_error to the addEscapingUse method and did an
> LTO run over clang's bitcode which finally did reach this code
> path.
>
>
> addEscapingUse was added in 2011 by r122787 without any test case.
> There is no mention of this fixing a bug. It looks like it may have
> been intended to support something thata was never added to the
> tree.
>
>
> So I'd like to remove addEscapingUse since we used to not have it,
> and we've never bothered to test it and I can't get anything to
> fail
> without it, and GlobalsModRef is the only user and that is only
> reached during LTO. Thoughts?
>
>
> My final question is deleteValue. Again, only GlobalsModRef
> implements this extension point. This was added in 2006 in r30684,
> also without any changes to any test. It claims to implement a test
> file, but that test isn't changed in the commit. Indeed, commenting
> out the body of deleteValue in GlobalsModRef causes no test to
> fail.
> =/ Fortunately, putting an assert here *does* trip in the
> regression
> test suite, so the code is reached, just not exercised.
>
>
> Either way, the use case for the deleteValue at least makes perfect
> sense. But there is (IMO) a much better way to accomplish the same
> task: use a ValueHandle to trigger the update on deletion. This
> will
> prevent widespread failure to use the deleteValue API to update
> alias analysis. So I would like to replace deleteValue with a
> ValueHandle approach. But I'm more than a little nervous
> implementing this, *as no test actually uses the behavior*! At
> least
> the code is reached...
>
>
> I think generally, the update API in the AliasAnalysis interface
> isn't working well. It is poorly and incompletely used. It isn't
> even clear that it is necessary in the face of tools like value
> handles. Thoughts about just removing the entire thing and falling
> back to value handles everywhere?

As a side point, I don't think we currently have a ValueHandle type
that can detect when a new use of a value is added. Do we? This
would certainly be useful for some kind of AA result caching, etc.

I should add that I'm somewhat afraid, however, of the compile-time implications of adding the necessary hook.

-Hal

I was just going to say this myself.

In fact, even if you don’t add a hook for tracking new values, just using value handles in AA could prove to be expensive.

A typical optimized piece of code has more load instructions than binary instructions. If you only cached loads with value handles, you’re already creating a huge number of handles, and these are currently entries in a map (pImpl->ValueHandles[V] if you’re curious).

TL;DR: If we are going to create this many value handles, we should strongly consider finding a way to represent them more cheaply.

Cheers,
Pete

The vast majority of stateful AA probably doesn't/won't care about the
vast majority of updates.

For example, unless globalsmodref is going to recompute the transitive
closure, it can't actually give better than "i don't know" answers in
most cases, which is equivalent to telling it "you should delete this
pointer and pretend you know nothing about it". Humorously, you can
see that's what it did with addEscapingUse. It deleted anything it
knew about the pointer.

Further, once that delete has happened once, there is no point in ever
tracking anything about that value ever again for globalsmodref. It
will never regain that info.

Even for those that do care, there are other issues:
The AA may never be queried again. Or the IR may change 6 or 7 times
before it gets queried again. The only thing any AA needs to know in
general is "what is the net result of those changes" (IE added uses,
etc). For example, if you were to add and then remove a use before
querying AA, no AA will care. Telling it an add and then remove is not
only pointless and expensive, it is probably worse than telling it
"nothing happened" (which is the net effect).

In any case, i think we need to sit down and think "which of these
AA's do we want to cache/make stateful, and how".
Because the original use case for building a stateful API, from what i
can tell, is globalsmodref. This is not a great use case, since
incremental updating may require a complete callgraph walk. That
seems like a non-starter :slight_smile:

Nowadays, i expect the stateful use cases are more like "caching
CFL-AA or caching expensive parts of BasicAA" (both of which are
O(pointer chain) to update incrementally).
Even andersen's can be incrementally computed quicker in practice (For
additional uses, delete variable from sets, add new constraints,
fixpoint. In most cases, you can prove the new constraint doesn't
change the results, so you do nothing. Or it takes one iteration of
propagation. Worst case is N^3 like the callgraph walk, but in
practice, it's probably closer to O(pointer chain))