clarification on the semantics of AliasAnalysis queries

Hello,

I’m trying to improve my understanding of the meaning of AliasAnalysis::alias queries and their results. There was a conversation on this topic about a year ago: http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-August/076068.html

Gerolf Hoflehner wrote:

I think I see the fundamental issue you are looking at. It is mentioned implicitly in the discussions, but not called out. In your CFG example there is no backedge (head always dominates tail), only retreat edges. So your graph is irreducible. For such graphs “simultaneous live” means that there be can be two dynamic execution paths where the variables (memory locations, objects etc) are simultaneously live, but they may not be live at the same time on all execution paths. Since LLVM uses SSA, all variables (memory locations, objects, … ) are strictly defined, so there cannot be irreducible dependence graphs. I believe this assumption is built into the alias queries. So to catch cases like in your scenario I think you need to base your queries on classical dataflow analysis.

I was wondering if anyone knows…

(1) Was it intentional that AliasAnalysis allows queries in which neither address dominates the other? Since AA queries don’t sound well-defined for such cases, I’m surprised they’re allowed to go unreported.

(2) When such AA queries are made, do we have an ideal answer is to the query? Empirically it seems to be “may-alias”.

Thanks,
Christian

From: "Christian Convey" <christian.convey@gmail.com>
To: llvmdev@cs.uiuc.edu
Sent: Sunday, June 7, 2015 6:41:05 PM
Subject: [LLVMdev] clarification on the semantics of AliasAnalysis queries

Hello,

I'm trying to improve my understanding of the meaning of
AliasAnalysis::alias queries and their results. There was a
conversation on this topic about a year ago:
http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-August/076068.html

Gerolf Hoflehner wrote:

I think I see the fundamental issue you are looking at. It is
mentioned implicitly in the discussions, but not called out. In your
CFG example there is no backedge (head always dominates tail), only
retreat edges. So your graph is irreducible. For such graphs
“simultaneous live” means that there be can be two dynamic execution
paths where the variables (memory locations, objects etc) are
simultaneously live, but they may not be live at the same time on
all execution paths. Since LLVM uses SSA, all variables (memory
locations, objects, … ) are strictly defined, so there cannot be
irreducible dependence graphs. I believe this assumption is built
into the alias queries. So to catch cases like in your scenario I
think you need to base your queries on classical dataflow analysis.

I was wondering if anyone knows....

(1) Was it intentional that AliasAnalysis allows queries in which
neither address dominates the other? Since AA queries don't sound
well-defined for such cases, I'm surprised they're allowed to go
unreported.

That's correct, the query results are only meant to be meaningful along control-flow paths where both accesses (as represented by their 'Location's) are reached.

There is added expense in verifying this, however, and we don't even necessarily have a dominator tree available when AA is run. Adding DT queries as part of every AA query could get expensive. We do have an "expensive checks" mode, and maybe we could add something there.

(2) When such AA queries are made, do we have an ideal answer is to
the query? Empirically it seems to be "may-alias".

If you mean a conservatively-correct answer, yes, MayAlias.

-Hal