Meaning of NoAlias in alias analysis

I have a question about the meaning of NoAlias in LLVM’s alias analysis. The documentation clearly states that

"The NoAlias response may be used when there is never an immediate dependence between any memory reference based on one pointer and any memory reference based the other […] Another is when the two pointers are only ever used for reading memory." from LLVM Alias Analysis Infrastructure — LLVM 15.0.0git documentation

Now consider this C program

   int f ( int* p, int* q ) {
       return p[1] + q[1];
   }

which clearly only reads through its pointers. Compiling it with

   clang -O1 -Xclang -disable-llvm-passes -S -emit-llvm
   opt -S -mem2reg

I get the IR below, at the bottom of this post, and when I then print the alias analysis information with

   opt -aa-eval -aa -scev-aa -print-alias-sets -print-all-alias-modref-info

I get the following alias analysis

Function: f: 4 pointers, 0 call sites
  MayAlias:     i32* %0, i32* %1
  NoAlias:      i32* %0, i32* %3
  MayAlias:     i32* %1, i32* %3
  MayAlias:     i32* %0, i32* %5
  NoAlias:      i32* %1, i32* %5
  MayAlias:     i32* %3, i32* %5
Alias sets for function 'f':
Alias Set Tracker: 1 alias sets for 2 pointer values.
  AliasSet[0x6000000cb200, 2] may alias, Ref       Pointers: (i32* %3, LocationSize::precise(4)), (i32* %5, LocationSize::precise(4))

===== Alias Analysis Evaluator Report =====
  6 Total Alias Queries Performed
  2 no alias responses (33.3%)
  4 may alias responses (66.6%)
  0 partial alias responses (0.0%)
  0 must alias responses (0.0%)
  Alias Analysis Evaluator Pointer Alias Summary: 33%/66%/0%/0%
  Alias Analysis Mod/Ref Evaluator Summary: no mod/ref!

In light of the quote above, which says that read-only use implies NoAlias, I don’t understand
why I get 4 MayAlias responses at all. Should they not all be NoAlias?

; ModuleID = 'main.c.ll'
source_filename = "main.c"
target datalayout = "e-m:o-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S1\
target triple = "x86_64-apple-macosx12.0.0"

; Function Attrs: nounwind ssp uwtable
define i32 @f(i32* %0, i32* %1) #0 {
  %3 = getelementptr inbounds i32, i32* %0, i64 1
  %4 = load i32, i32* %3, align 4, !tbaa !6
  %5 = getelementptr inbounds i32, i32* %1, i64 1
  %6 = load i32, i32* %5, align 4, !tbaa !6
  %7 = add nsw i32 %4, %6
  ret i32 %7
}

attributes #0 = { nounwind ssp uwtable "darwin-stkchk-strong-link" "frame-pointer"="all" "min-legal-vector-width"="0" "no-trapping-math"="true" "probe-stack"="___chkstk_darwin" "stack-protector-buffer-size"="8" "target-cpu"="penryn" "target-features"="+cx16,+cx8,+fxsr,+mmx,+sahf,+sse,+sse2,+sse3,+sse4.1,+ssse3,+x87" "tune-cpu"="generic" }

!llvm.module.flags = !{!0, !1, !2, !3, !4}
!llvm.ident = !{!5}

!0 = !{i32 2, !"SDK Version", [2 x i32] [i32 12, i32 3]}
!1 = !{i32 1, !"wchar_size", i32 4}
!2 = !{i32 7, !"PIC Level", i32 2}
!3 = !{i32 7, !"uwtable", i32 1}
!4 = !{i32 7, !"frame-pointer", i32 2}
!5 = !{!"Apple clang version 13.1.6 (clang-1316."}
!6 = !{!7, !7, i64 0}
!7 = !{!"int", !8, i64 0}
!8 = !{!"omnipotent char", !9, i64 0}
!9 = !{!"Simple C/C++ TBAA"}

The AA alias() API does not know whether the passed pointers are only going to be used in read operations. You need to use getModRefInfo() if you want to determine whether an instruction may read or write a certain location; for simple (non-atomic, non-volatile) loads it will always return Ref.

The primary case where AA will return NoAlias for pointers that point to the same address is in the presence of noalias attributes. AA will report that p and q do not alias in that case, because this would be UB as long as one of the pointers is stored to.
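As a rough illustration of where such an attribute can come from: in C, the `restrict` qualifier on pointer parameters is, as far as I know, what clang typically lowers to `noalias` on the IR arguments. The sketch below is a hypothetical variant of `f` from the question (the names `f_restrict` and `demo_restrict` are made up for this example), not something taken from the thread.

```c
#include <assert.h>

/* Hypothetical variant of f from the question. `restrict` promises that an
   object modified through one pointer is not accessed through the other.
   Since this body only reads, passing overlapping pointers is still fine. */
int f_restrict(const int *restrict p, const int *restrict q) {
    return p[1] + q[1];
}

/* Small self-check with two distinct arrays. */
void demo_restrict(void) {
    int a[2] = {1, 2};
    int b[2] = {10, 20};
    assert(f_restrict(a, b) == 22);
}
```

Compiling this the same way as in the question and rerunning the alias evaluation should be one way to see NoAlias reported for the two arguments, assuming the attribute is emitted as expected.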

This part of the docs is out of date (patches welcome!). As @nikic mentioned, AliasAnalysis nowadays works on MemoryLocations and does not know whether a pointer is used to read or write memory.

Some clients may not care about aliasing read accesses, but it’s up to the clients to skip such accesses.

@nikic @fhahn What I don’t understand is why I need to care about what happens outside of the function. The two pointers p and q are local to the function and don’t exist elsewhere.

@fhahn I would be happy to update the documentation, where do I submit patches?

I’m not sure what you mean here. -print-all-alias-modref-info will print alias info for all pairs of pointers in the function. This is for information/debugging only.

Great! Please see Contributing to LLVM — LLVM 15.0.0git documentation for more info.

What I mean is that p and q only exist in the function, so throughout the entire existence of those pointers they only get read (except when initialised of course).

Right, but they can still alias (if the same pointer is passed to p and q). You are asking opt to print alias information between all pairs of pointers in the function. Whether the pointer gets read or written is irrelevant with respect to aliasing.
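To make the distinction concrete, here is a small sketch, assuming two-element arrays so that index 1 is in bounds. `read_both` has the same shape as `f` in the question; `write_both` is a hypothetical variant with stores, added only to show where aliasing starts to affect the result. All names are invented for this example.

```c
#include <assert.h>

/* Same shape as f in the question: only loads, so calling it with the same
   pointer twice is harmless. */
int read_both(const int *p, const int *q) {
    return p[1] + q[1];
}

/* Hypothetical variant with stores: the result depends on whether p and q
   point into the same array, so p[1] cannot be assumed to still hold 1. */
int write_both(int *p, int *q) {
    p[1] = 1;
    q[1] = 2;
    return p[1];
}

void demo_aliasing(void) {
    int a[2] = {3, 17};
    int b[2] = {5, 7};
    assert(read_both(a, a) == 34);  /* p and q alias; the reads are unaffected */
    assert(write_both(a, b) == 1);  /* distinct arrays */
    assert(write_both(a, a) == 2);  /* same array: the second store wins */
}
```

In both functions the pointers may alias in exactly the same way; what differs is whether a client of the analysis has any reason to care.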

Yes, I know that the procedure f can be called with the same pointer twice, e.g.

int n[2] = {17, 17};
f(n, n);

but still the pointers are only read in the body of f, so according to the documentation, which I now know is out of date, this should be counted as NoAlias. I’m basically trying to determine the exact meaning of NoAlias.

AA is allowed to return NoAlias for the pointers, but it is not required. MayAlias is always a valid conservative result. See Compiler Explorer for an example where AA does return NoAlias.

A sufficiently smart AA implementation could look at the function body and see that it only contains reads and then return NoAlias for all queries inside that function, but this would not be a useful thing to do.

OK, thank you. So the AA is just not ‘smart enough’ out of the box to return the precise result NoAlias, and errs on the side of caution with MayAlias.

Hi, so following from the point that AliasAnalysis doesn’t know if a pointer is used for reads or writes, does this mean the ModRef parts of the alias analysis documentation (LLVM Alias Analysis Infrastructure — LLVM 15.0.0git documentation) are outdated too? Are methods like getModRefInfo no longer provided?

The ::alias API specifically only takes MemoryLocations. The ::getModRefInfo API takes an instruction and a MemoryLocation. It will return the ModRef behavior of the instruction for the given MemoryLocation.
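The split between the two queries can be sketched with a toy model in C. This is not LLVM's API: every name below (`ToyLoc`, `toy_alias`, `toy_mod_ref`, and so on) is hypothetical, and the toy compares concrete runtime addresses, whereas the real analysis reasons statically about IR values. It also assumes that an instruction which cannot touch the queried location reports no effect. It only shows the shape: the alias query sees two locations and nothing else, while the mod/ref query combines an instruction with a location.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef enum { TOY_NO_ALIAS, TOY_MAY_ALIAS, TOY_MUST_ALIAS } ToyAliasResult;
typedef enum { TOY_NO_MOD_REF, TOY_REF, TOY_MOD } ToyModRef;

typedef struct { const void *ptr; size_t size; } ToyLoc;   /* pointer + size */
typedef enum { TOY_LOAD, TOY_STORE } ToyInstKind;
typedef struct { ToyInstKind kind; ToyLoc loc; } ToyInst;

/* Location-only query: nothing here says whether memory is read or written. */
ToyAliasResult toy_alias(ToyLoc a, ToyLoc b) {
    uintptr_t a0 = (uintptr_t)a.ptr, b0 = (uintptr_t)b.ptr;
    if (a0 == b0 && a.size == b.size) return TOY_MUST_ALIAS;
    if (a0 + a.size <= b0 || b0 + b.size <= a0) return TOY_NO_ALIAS;
    return TOY_MAY_ALIAS;  /* overlapping but not identical: stay conservative */
}

/* Instruction + location query: this is where read vs. write enters. */
ToyModRef toy_mod_ref(ToyInst inst, ToyLoc loc) {
    if (toy_alias(inst.loc, loc) == TOY_NO_ALIAS) return TOY_NO_MOD_REF;
    return inst.kind == TOY_LOAD ? TOY_REF : TOY_MOD;
}

void demo_toy(void) {
    int a[2] = {0, 0};
    ToyLoc a0 = {&a[0], sizeof a[0]};
    ToyLoc a1 = {&a[1], sizeof a[1]};
    ToyInst load_a1 = {TOY_LOAD, a1};
    ToyInst store_a1 = {TOY_STORE, a1};
    assert(toy_alias(a0, a1) == TOY_NO_ALIAS);
    assert(toy_alias(a1, a1) == TOY_MUST_ALIAS);
    assert(toy_mod_ref(load_a1, a1) == TOY_REF);
    assert(toy_mod_ref(store_a1, a1) == TOY_MOD);
    assert(toy_mod_ref(load_a1, a0) == TOY_NO_MOD_REF);
}
```

Note that `toy_alias` gives the same answer for the load and the store on `a1`; only `toy_mod_ref` distinguishes them, which mirrors the division of labour described above.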