TBAA metadata

Hi

I do not really understand why frontend-generated TBAA metadata is
needed for the TBAA pass to work. It seems to me that we can always
walk up the IR chain and find the base type from which the pointer is
derived. Take the following example.

I know that %0 = load i32, i32* %a, align 4, !tbaa !1 and store i32
%i.02, i32* %b, align 4, !tbaa !6 do not alias, because their metadata
!1 = !{!2, !3, i64 0} and !6 = !{!7, !3, i64 0} tell me that they are
derived from different (incompatible) base types. However, I can also
walk up the IR chain and find out that %a points to a base object of
type struct A and %b points to a base object of type struct B.

Maybe the metadata just simplifies the TBAA pass?

Thanks,
-Trent

; ModuleID = 'tbaa.ll'
target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

%struct.A = type { i32 }
%struct.B = type { i32 }

; Function Attrs: nounwind uwtable
define i32 @foo(%struct.A* nocapture readonly %sa, %struct.B* nocapture %sb) #0 {
entry:
  br label %for.body

for.body: ; preds = %entry, %for.body
  %i.02 = phi i32 [ 0, %entry ], [ %inc, %for.body ]
  %sum.01 = phi i32 [ 0, %entry ], [ %add, %for.body ]
  %a = getelementptr inbounds %struct.A, %struct.A* %sa, i64 0, i32 0
  %0 = load i32, i32* %a, align 4, !tbaa !1
  %b = getelementptr inbounds %struct.B, %struct.B* %sb, i64 0, i32 0
  store i32 %i.02, i32* %b, align 4, !tbaa !6
  %add = add nsw i32 %0, %sum.01
  %inc = add nsw i32 %i.02, 1
  %cmp = icmp slt i32 %inc, 1024
  br i1 %cmp, label %for.body, label %for.end, !llvm.loop !8

for.end: ; preds = %for.body
  %sum.0.lcssa = phi i32 [ %add, %for.body ]
  ret i32 %sum.0.lcssa
}

attributes #0 = { nounwind uwtable "less-precise-fpmad"="false"
"no-frame-pointer-elim"="false" "no-infs-fp-math"="false"
"no-nans-fp-math"="false" "stack-protector-buffer-size"="8"
"unsafe-fp-math"="false" "use-soft-float"="false" }

!llvm.ident = !{!0}

!0 = !{!"clang version 3.7.0 (trunk 524)"}
!1 = !{!2, !3, i64 0}
!2 = !{!"A", !3, i64 0}
!3 = !{!"int", !4, i64 0}
!4 = !{!"omnipotent char", !5, i64 0}
!5 = !{!"Simple C/C++ TBAA"}
!6 = !{!7, !3, i64 0}
!7 = !{!"B", !3, i64 0}
!8 = distinct !{!8, !9}
!9 = !{!"llvm.loop.unroll.disable"}
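
For reference, the IR above looks like what clang would emit for a C function along the following lines. This is a reconstruction from the IR, not the original source: the field names a and b and the local variable names are guesses based on the value names (%a, %b, %i.02, %sum.01).

```c
/* Reconstructed guess at the C source behind the IR above. */
struct A { int a; };
struct B { int b; };

int foo(struct A *sa, struct B *sb) {
    int sum = 0;
    for (int i = 0; i < 1024; ++i) {
        sum += sa->a;  /* the load tagged !1: an "int" at offset 0 of "A" */
        sb->b = i;     /* the store tagged !6: an "int" at offset 0 of "B" */
    }
    return sum;
}
```

Under C's strict aliasing rules the store to sb->b cannot modify sa->a, and that is the fact the two tags carry down to the optimizer.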

Hi

I do not really understand why frontend-generated TBAA metadata is
needed for the TBAA pass to work. It seems to me that we can always
walk up the IR chain and find the base type from which the pointer is
derived. Take the following example.

LLVM Types != C types

I know that %0 = load i32, i32* %a, align 4, !tbaa !1 and store i32
%i.02, i32* %b, align 4, !tbaa !6 do not alias, because their metadata
!1 = !{!2, !3, i64 0} and !6 = !{!7, !3, i64 0} tell me that they are
derived from different (incompatible) base types.
However, I can also walk up the IR chain and find out that %a points
to a base object of type struct A and %b points to a base object of
type struct B.

Which tells you precisely nothing, since LLVM has no inherent language
and thus no inherent rules about aliasing :)

It is perfectly legal to inttoptr, cast, whatever.
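
A small C sketch of why the pointer type alone cannot answer the aliasing question. The byte pointer below is derived from a struct A *, yet C explicitly allows the byte store to modify the struct, which is what rooting "int" under "omnipotent char" in the metadata above expresses. The function name is made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

struct A { int a; };

/* Hypothetical helper: returns nonzero if a store through an
   unsigned char * changed the int field of *sa. */
int byte_store_changes_field(struct A *sa) {
    assert(sa != NULL);
    unsigned char *bytes = (unsigned char *)sa;  /* same address, different pointee type */
    sa->a = 0;
    bytes[0] = 1;        /* legal in C: character types may alias any object */
    return sa->a != 0;   /* the load must observe the byte store */
}
```

Whether two accesses like these may alias is decided by the source language's rules, so the frontend has to say so; the IR pointer types do not.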

I see, but should users not be discouraged (or prohibited) from
writing an LLVM function in which they cast a struct A* to a struct B*
or vice versa, then linking it with LLVM IR generated from a C
program, and still claiming the result complies with C's strict
aliasing rules?

I guess I am not getting how this is possible. The only ways I can
think of are to write something in LLVM IR directly, or to compile some
C programs (with casts) with -fno-strict-aliasing and then link them
with a C program compiled with strict aliasing.

I apologize if I am not making much sense.

-Trent

I see, but should users not be discouraged (or prohibited) from
writing an LLVM function in which they cast a struct A* to a struct B*
or vice versa, then linking it with LLVM IR generated from a C
program, and still claiming the result complies with C's strict
aliasing rules?

???
The right thing will happen here.
The parts it can optimize, it will optimize; the parts it can't, it won't.
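
To make that concrete, here is a toy model in C of the kind of lookup a TBAA query performs. It is a sketch under simplifying assumptions, not LLVM's implementation: every type node has exactly one member (as in the metadata of the example), and all struct, field, and function names are invented here. The point is the first check: an access with no tag at all, such as one from hand-written IR or from code built with -fno-strict-aliasing, is answered conservatively.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy type node. As in the thread's metadata, each node has one member:
   a struct's field, or a scalar's parent type. */
struct tbaa_type {
    const char *name;
    const struct tbaa_type *member;  /* NULL only for the root node */
    long member_offset;
};

/* Access tag {base type, access type, offset}, shaped like !1 and !6. */
struct tbaa_tag {
    const struct tbaa_type *base;
    const struct tbaa_type *access;  /* kept for fidelity; unused by this toy lookup */
    long offset;
};

/* Walk down from outer's base type; true if we arrive at inner's base
   type with the same remaining offset. */
static bool reaches(const struct tbaa_tag *outer, const struct tbaa_tag *inner) {
    const struct tbaa_type *t = outer->base;
    long off = outer->offset;
    while (t != NULL) {
        if (t == inner->base)
            return off == inner->offset;
        if (off < t->member_offset)
            return false;            /* offset lies before the only member */
        off -= t->member_offset;
        t = t->member;
    }
    return false;
}

/* A NULL tag stands for "no !tbaa metadata on this access". */
bool tbaa_may_alias(const struct tbaa_tag *x, const struct tbaa_tag *y) {
    if (x == NULL || y == NULL)
        return true;                 /* nothing known: stay conservative */
    return reaches(x, y) || reaches(y, x);
}
```

With nodes mirroring !2 through !7, the tags for A and B never reach each other, so the model reports no alias; pairing either tag with a missing one reports may-alias.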

I guess I am not getting how this is possible. The only ways I can
think of are to write something in LLVM IR directly, or to compile some
C programs (with casts) with -fno-strict-aliasing and then link them
with a C program compiled with strict aliasing.

Or, you know, don't use C?