Alias analysis missing obvious control flow information?

Hi, I’m investigating alias analysis techniques and am starting with getting a base line for the types of cases LLVM can and can’t catch. I started with a very simple program and found interesting results already that I’d like to have confirmed, and make sure I’m not just missing something obvious as to why this doesn’t work.

I have the following code:

int *alias_test(int *a, int i, int j)	{
	if (i != j)	{
		a[i] = a[i] + a[j];
	}
	return a;
}

It is obvious that a[i] and a[j] cannot alias, but running opt -aa-eval -print-alias-sets input.ll -S -disable-output -stats gives the following output:

Alias sets for function 'alias_test':
Alias Set Tracker: 1 alias sets for 3 pointer values.
  AliasSet[0x55898c6367d0, 3] may alias, Mod/Ref   Pointers: (i32* %7, LocationSize::precise(4)), (i32* %10, LocationSize::precise(4)), (i32* %14, LocationSize::precise(4))

===== Alias Analysis Evaluator Report =====
  6 Total Alias Queries Performed
  0 no alias responses (0.0%)
  5 may alias responses (83.3%)
  0 partial alias responses (0.0%)
  1 must alias responses (16.6%)
  Alias Analysis Evaluator Pointer Alias Summary: 0%/83%/0%/16%
  Alias Analysis Mod/Ref Evaluator Summary: no mod/ref!

indicating the regular alias analysis pass does not catch the fact these pointers can never alias.

To be sure I wasn’t missing anything I also tried a modified version of the code using literals:

int *alias_test(int *a, int i, int j)	{
	if (i == 1 && j == 2)	{
		a[i] = a[i] + a[j];
	}
	return a;
}

and for this, alias analysis gives the result that a[i] and a[j] do not alias:

Alias sets for function 'alias_test':
Alias Set Tracker: 1 alias sets for 3 pointer values.
  AliasSet[0x55e2131624c0, 3] may alias, Mod/Ref   Pointers: (i32* %9, LocationSize::precise(4)), (i32* %12, LocationSize::precise(4)), (i32* %16, LocationSize::precise(4))

===== Alias Analysis Evaluator Report =====
  6 Total Alias Queries Performed
  3 no alias responses (50.0%)
  2 may alias responses (33.3%)
  0 partial alias responses (0.0%)
  1 must alias responses (16.6%)
  Alias Analysis Evaluator Pointer Alias Summary: 50%/33%/0%/16%
  Alias Analysis Mod/Ref Evaluator Summary: no mod/ref!

My question then is is this really what it looks like, that the default alias analysis in LLVM makes very little effort to understand control flow and how this might change aliasing behavior, outside of simple cases like constant propagation? Or am I missing something else here, like for instance could my first code example really still alias in a way I’m missing?

Thanks.

1 Like

When you’re giving examples about alias analysis, please use LLVM IR, not C code. I can guess what the IR looks like for your examples, but in general it can be tricky.

None of the alias analysis passes in LLVM can analyze constructs like “if (i != j)”. The AliasAnalysis::alias() API doesn’t provide a way to represent the fact that you’re inside the if statement, vs. outside of it.

More generally, for array indexing, transforms like the loop vectorizer don’t just depend on alias analysis. There are other analysis passes like LoopAccessAnalysis which do more analysis on indexing to prove various properties.

1 Like

Ah ok, I’ll use IR instead of C in the future, I just thought the latter might be more readable!

So what you’re saying is alias analysis is more or less blind to control flow altogether? I think this is really interesting - I appreciate you may have no idea about this, but don’t you think there could a lot of potential to be gained in finding no alias cases in extending it to consider this? As I understand a lot of these loop orientated passes focus on memory dependency, rather than aliasing, which can both result in different optimisation gains to be made.

1 Like

Yes, alias analysis doesn’t really use control flow.

I wouldn’t say there’s nothing to be gained by analyzing control flow, but I don’t think it’s high on anyone’s priority list. Analyzing control flow is expensive, and there’s a lot of room for improvement without adding control flow sensitivity.

If you asked me what the highest impact improvement to alias analysis would be at the moment, I’d say some form of interprocedural alias analysis. There’s a lot of academic research in the area, but LLVM hasn’t really taken advantage of it. (The new pass manager probably makes this easier; in the past, the legacy pass manager made it very hard to write that sort of analysis pass.)

3 Likes

Is there an existing control-flow pass that would compute the path-information that could be used to determine that the array accesses in the original example don’t alias?