tbaa

Hi,

Could anyone tell me exactly how to use "Type Based Alias Analysis"?

I compiled the C program with Clang, and verified that there is tbaa
metadata in the IR code.

But then when I use "opt -tbaa input.c.bc -aa-eval" to check the results,
it always gives 100% may aliasing no matter what input.

Am I using "tbaa" correctly?

Thanks.
Yi

Can you post the source code of your test case?

Gan

Why not try some simple ones like:

void foo(int);

int main()
{
  int x=0;
  int* p=&x;
  int* q=&x;

  float z=0;
  float* t=&z;

  return *p;
}

-tbaa gives me:
Alias Set Tracker: 1 alias sets for 7 pointer values.
  AliasSet[0x207f860, 7] may alias, Mod/Ref Pointers: (i32* %1, 4),
(i32* %x, 4), (i32** %p, 8), (i32** %q, 8), (float* %z, 4), (float** %t,
8), (i32* %2, 4)

===== Alias Analysis Evaluator Report =====
  21 Total Alias Queries Performed
  0 no alias responses (0.0%)
  21 may alias responses (100.0%)
  0 partial alias responses (0.0%)
  0 must alias responses (0.0%)
  Alias Analysis Evaluator Pointer Alias Summary: 0%/100%/0%/0%
  Alias Analysis Mod/Ref Evaluator Summary: no mod/ref!

-basicaa gives me:
Alias Set Tracker: 6 alias sets for 7 pointer values.
  AliasSet[0x27a0020, 1] must alias, Mod Pointers: (i32* %1, 4)
  AliasSet[0x27a0080, 2] may alias, Mod/Ref Pointers: (i32* %x, 4),
(i32* %2, 4)
  AliasSet[0x27a00e0, 1] must alias, Mod/Ref Pointers: (i32** %p, 8)
  AliasSet[0x27a3f60, 1] must alias, Mod Pointers: (i32** %q, 8)
  AliasSet[0x27a4000, 1] must alias, Mod Pointers: (float* %z, 4)
  AliasSet[0x27acdf0, 1] must alias, Mod Pointers: (float** %t, 8)

===== Alias Analysis Evaluator Report =====
  21 Total Alias Queries Performed
  19 no alias responses (90.4%)
  2 may alias responses (9.5%)
  0 partial alias responses (0.0%)
  0 must alias responses (0.0%)
  Alias Analysis Evaluator Pointer Alias Summary: 90%/9%/0%/0%
  Alias Analysis Mod/Ref Evaluator Summary: no mod/ref!

So I suspect -tbaa is not working in this case.

Hi Yi,

> Could anyone tell me how exactly do I use "Type Based Alias Analysis"?
>
> I compiled the C program with Clang, and verified that there is tbaa
> metadata in the IR code.
>
> But then when I use "opt -tbaa input.c.bc -aa-eval" to check the results,
> it always gives 100% may aliasing no matter what input.

you need to run some optimizations on your bitcode, at least mem2reg, to get
it in a form where alias analysis will do something useful.

Ciao, Duncan.

Thanks for the advice, but I tried -mem2reg and it gives me the same results:
everything may alias, while -basicaa gives more meaningful results.

Have you gotten -tbaa working before? Could you show me the optimizations you used?

Thank you.
Yi

Hi Yi,

I didn't get a chance to run your code. But from the debug information you
posted about tbaa alias analysis:

Alias Set Tracker: 1 alias sets for 7 pointer values.
AliasSet[0x207f860, 7] may alias, Mod/Ref Pointers: (i32* %1, 4),
(i32* %x, 4), (i32** %p, 8), (i32** %q, 8), (float* %z, 4), (float** %t,
8), (i32* %2, 4)

I guess it is because of the way TBAA treats pointers.
See tools/clang/lib/CodeGen/CodeGenTBAA.cpp, lines 147 and 148:

    147 if (Ty->isPointerType())
    148 return MetadataCache[Ty] = getTBAAInfoForNamedType("any pointer",
    149 getChar());

For any pointer, no matter which object it points to, TBAA will generate
the same type of tbaa metadata, which is called "any pointer". Since the
elements in the alias set you posted are all pointers, TBAA will think
that they all "may alias" each other.

The BasicAA pass analyzes in a different way, so it is not surprising that
they give you different results.

Can you post the IR of the code? It would be much easier to explain this
if we can see the IR.

Gan

Err, no.
This would, in fact, defeat the entire purpose of TBAA, which is to
make it so pointers to one type are not considered to alias pointed-to
variables of other types.

I.e., int * does not point to a float.

The above TBAA info generated by clang is "conservatively correct" (it
misses that float ** can't point to int *, and there is a FIXME about
this in the source, which correctly states the real issue in making
this work), but still does not explain his may-alias situation,
because when clang compiled it, the tbaa tag assigned to float would
not match the tbaa tag assigned to the int.

For a test program,
void foo(float *);
void bar(int *);
int main()
{
  int x=0;
  int* p=&x;
  int* q=&x;
  bar(p);
  float z=0;
  float* t=&z;
  foo(t);
  return *p;
}

I get

define i32 @main() nounwind uwtable ssp {
  %x = alloca i32, align 4
  %z = alloca float, align 4
  store i32 0, i32* %x, align 4, !tbaa !0
  call void @bar(i32* %x) nounwind
  store float 0.000000e+00, float* %z, align 4, !tbaa !3
  call void @foo(float* %z) nounwind
  %1 = load i32* %x, align 4, !tbaa !0
  ret i32 %1
}

declare void @bar(i32*)

declare void @foo(float*)

!0 = metadata !{metadata !"int", metadata !1}
!1 = metadata !{metadata !"omnipotent char", metadata !2}
!2 = metadata !{metadata !"Simple C/C++ TBAA", null}
!3 = metadata !{metadata !"float", metadata !1}

You can see int and float were assigned different tbaa tags (both
children of the omnipotent char type, which is "conservatively
correct"), and that it should know that these two don't alias.
The TBAA analysis code is correct
(http://llvm.org/svn/llvm-project/llvm/trunk/lib/Analysis/TypeBasedAliasAnalysis.cpp):
it walks up the tree and checks whether either tag is an ancestor of the
other.
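
The ancestor walk described here could be sketched roughly as follows. This is a simplified model, not the code in TypeBasedAliasAnalysis.cpp: the struct and function names are made up for illustration, and only the tag names are taken from the IR above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified model of a scalar TBAA tag: a name plus a parent. */
struct tbaa_node {
    const char *name;
    const struct tbaa_node *parent; /* NULL for the root */
};

/* True if `anc` is `node` itself or one of its ancestors. */
bool tbaa_is_ancestor(const struct tbaa_node *anc,
                      const struct tbaa_node *node)
{
    for (; node != NULL; node = node->parent)
        if (node == anc)
            return true;
    return false;
}

/* Two tagged accesses may alias only if one tag is an ancestor of the other. */
bool tbaa_may_alias(const struct tbaa_node *a, const struct tbaa_node *b)
{
    assert(a != NULL && b != NULL); /* this sketch only handles tagged accesses */
    return tbaa_is_ancestor(a, b) || tbaa_is_ancestor(b, a);
}

/* The tree from the IR above: root <- "omnipotent char" <- {"int", "float"}. */
const struct tbaa_node tbaa_root  = { "Simple C/C++ TBAA", NULL };
const struct tbaa_node tbaa_char  = { "omnipotent char", &tbaa_root };
const struct tbaa_node tbaa_int   = { "int", &tbaa_char };
const struct tbaa_node tbaa_float = { "float", &tbaa_char };
```

In this model, tbaa_may_alias(&tbaa_int, &tbaa_float) is false because the two tags are siblings, while either of them paired with tbaa_char is true.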

(Note: in deep TBAA trees, this can be done significantly faster using
the same thing I did for dominators: DFS-number the TBAA tree once
you have all the initial sets, and renumber if invalidated.)
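
A rough sketch of that DFS-numbering idea, under assumptions: the tree is stored as a flat parent-index array, and every name below is hypothetical. Each node gets an entry and an exit number, and a node is an ancestor of another exactly when its interval encloses the other's, so the query needs no walk up the tree.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_NODES 16

/* Hypothetical flat tree: parent[i] is the index of node i's parent,
   or -1 for the root. */
struct tbaa_tree {
    int n;
    int parent[MAX_NODES];
    int in[MAX_NODES];  /* DFS entry number */
    int out[MAX_NODES]; /* DFS exit number */
};

/* Numbers `node` and its subtree. The child scan is linear per node, which
   keeps the sketch short; a real implementation would keep child lists. */
static void number_subtree(struct tbaa_tree *t, int node, int *clock)
{
    t->in[node] = (*clock)++;
    for (int child = 0; child < t->n; child++)
        if (t->parent[child] == node)
            number_subtree(t, child, clock);
    t->out[node] = (*clock)++;
}

/* Run once after the tree is built, and again if the tree changes. */
void tbaa_number(struct tbaa_tree *t)
{
    assert(t->n > 0 && t->n <= MAX_NODES);
    int clock = 0;
    for (int i = 0; i < t->n; i++)
        if (t->parent[i] < 0)
            number_subtree(t, i, &clock);
}

/* Constant-time test: anc's DFS interval encloses node's interval. */
bool tbaa_is_ancestor(const struct tbaa_tree *t, int anc, int node)
{
    return t->in[anc] <= t->in[node] && t->out[node] <= t->out[anc];
}

bool tbaa_may_alias(const struct tbaa_tree *t, int a, int b)
{
    return tbaa_is_ancestor(t, a, b) || tbaa_is_ancestor(t, b, a);
}
```

For the four-node tree from the IR (root 0, "omnipotent char" 1, "int" 2, "float" 3), setting parent to { -1, 0, 1, 1 } and calling tbaa_number once makes tbaa_may_alias(&t, 2, 3) false and tbaa_may_alias(&t, 1, 2) true.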

Yet, aa-eval still says otherwise.

Function: main: 2 pointers, 2 call sites
  MayAlias: float* %z, i32* %x
  Both ModRef: Ptr: i32* %x <-> call void @bar(i32* %x) nounwind
  Both ModRef: Ptr: float* %z <-> call void @bar(i32* %x) nounwind
  Both ModRef: Ptr: i32* %x <-> call void @foo(float* %z) nounwind
  Both ModRef: Ptr: float* %z <-> call void @foo(float* %z) nounwind
  Both ModRef: call void @bar(i32* %x) nounwind <-> call void
@foo(float* %z) nounwind
  Both ModRef: call void @foo(float* %z) nounwind <-> call void
@bar(i32* %x) nounwind

I'm too busy to debug further, and I don't have a debug binary of opt
handy, but these results are fairly clearly wrong :)

The problem is with aa-eval. It collects all the pointer values in a
function, and then just makes a bunch of raw pointer queries, rather than
considering dereferences. TBAA tags are only attached to dereferences.
So TBAA always has to say MayAlias for every aa-eval query.
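
To make this concrete, here is a small model of that query pattern. It is only a sketch: the names are hypothetical, and a NULL tag stands in for "a raw pointer value with no dereference attached". Each unordered pair of pointer values is queried once with no tags, so a tag-based analysis can only answer MayAlias.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified TBAA tag: a name plus a parent. */
struct tbaa_node {
    const char *name;
    const struct tbaa_node *parent;
};

bool tbaa_is_ancestor(const struct tbaa_node *anc,
                      const struct tbaa_node *node)
{
    for (; node != NULL; node = node->parent)
        if (node == anc)
            return true;
    return false;
}

/* NULL means "no tag": there is nothing to reason about, so the only
   safe answer is MayAlias. */
bool tbaa_may_alias(const struct tbaa_node *a, const struct tbaa_node *b)
{
    if (a == NULL || b == NULL)
        return true;
    return tbaa_is_ancestor(a, b) || tbaa_is_ancestor(b, a);
}

/* Models the raw-pointer queries: one untagged query per unordered pair.
   Returns the number of MayAlias answers; *total_queries gets the count
   of queries made. */
int count_untagged_pair_queries(int num_pointers, int *total_queries)
{
    assert(num_pointers >= 0 && total_queries != NULL);
    int may_alias = 0;
    *total_queries = 0;
    for (int i = 0; i < num_pointers; i++) {
        for (int j = i + 1; j < num_pointers; j++) {
            (*total_queries)++;
            if (tbaa_may_alias(NULL, NULL))
                may_alias++;
        }
    }
    return may_alias;
}
```

With 7 pointer values this makes 21 queries and gets 21 MayAlias answers, which matches the evaluator report posted earlier in the thread.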

Dan

Hi Dan,

So do you know how to get the correct results from tbaa? It seems
"print-alias-sets" also has all the pointers in one set.

Yi

Makes sense. In that case, it would give the expected answers if
clang were enhanced to properly deal with similar/dissimilar pointer
types, instead of giving all pointer types the "everything" tag :)

This requires implementing the "pointer to first member is allowed as
pointer to struct"/etc rules, however.

If you want it to say that float * and int * don't alias, you will
have to enhance clang to generate TBAA tags for pointer types
properly.
It should already say that float * and int don't alias, and int * and
float don't alias.

You can see equivalent hand-written examples in
test/Analysis/TypeBasedAliasAnalysis/
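
As a small C illustration of the case that should already work, consider a store through an int lvalue followed by a store through a float lvalue. The function name is made up for this sketch, and the tag names in the comments are the ones from the IR earlier in the thread. The pointer values p and q carry no tag themselves; only the loads and stores do.

```c
#include <assert.h>

/* Hypothetical helper: one int access and one float access. */
int store_int_then_float(int *p, float *q)
{
    /* Precondition of this sketch: p and q refer to distinct objects. */
    assert((void *)p != (void *)q);

    *p = 1;     /* int access: would carry the "int" tag */
    *q = 2.0f;  /* float access: would carry the "float" tag */

    /* "int" and "float" are siblings in the tag tree, so the store to *q
       is not assumed to modify *p, and this load may be folded to 1. */
    return *p;
}
```

Called with the address of an int and the address of a separate float, this returns 1 whether or not the optimization is applied.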

Hi Daniel,

> If you want it to say that float * and int * don't alias, you will
> have to enhance clang to generate TBAA tags for pointer types
> properly.
> It should already say that float * and int don't alias, and int * and
> float don't alias.

From what you said above, do you mean that the current clang cannot
disambiguate a "float pointer" (float *) and an "int pointer" (int *)?

Gan

-print-alias-sets is going to be pretty useless, because any time there's
even a single "omnipotent char" anywhere, it'll MayAlias everything,
so everything will be transitively MayAliased to everything else, and
you'll end up with one big set.
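
The transitive merging can be modelled with a small union-find over tagged accesses. This is a sketch with hypothetical names, not how LLVM's alias set tracker is implemented: any two accesses that may alias are put in the same set, and sets that share a member are merged.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_ACCESSES 32

/* Hypothetical, simplified TBAA tag: a name plus a parent. */
struct tbaa_node {
    const char *name;
    const struct tbaa_node *parent;
};

const struct tbaa_node tbaa_root  = { "Simple C/C++ TBAA", NULL };
const struct tbaa_node tbaa_char  = { "omnipotent char", &tbaa_root };
const struct tbaa_node tbaa_int   = { "int", &tbaa_char };
const struct tbaa_node tbaa_float = { "float", &tbaa_char };

bool tbaa_is_ancestor(const struct tbaa_node *anc,
                      const struct tbaa_node *node)
{
    for (; node != NULL; node = node->parent)
        if (node == anc)
            return true;
    return false;
}

bool tbaa_may_alias(const struct tbaa_node *a, const struct tbaa_node *b)
{
    return tbaa_is_ancestor(a, b) || tbaa_is_ancestor(b, a);
}

/* Follows set links until reaching a representative. */
static int find_root(const int *set, int i)
{
    while (set[i] != i)
        i = set[i];
    return i;
}

/* Merges accesses that may alias and returns how many sets remain. */
int count_alias_sets(const struct tbaa_node *tags[], int n)
{
    assert(n >= 0 && n <= MAX_ACCESSES);
    int set[MAX_ACCESSES];
    for (int i = 0; i < n; i++)
        set[i] = i;
    for (int i = 0; i < n; i++) {
        for (int j = i + 1; j < n; j++) {
            if (tbaa_may_alias(tags[i], tags[j])) {
                int ri = find_root(set, i);
                int rj = find_root(set, j);
                set[ri] = rj; /* union the two sets */
            }
        }
    }
    int count = 0;
    for (int i = 0; i < n; i++)
        if (set[i] == i)
            count++;
    return count;
}
```

With accesses tagged int, float, int, this yields two sets. Adding a single "omnipotent char" access collapses them into one, since char may alias both.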

I don't think there are any other tools that do anything similar in LLVM.

Dan

No, you've confused pointer types with pointee types. That "any pointer"
tag is for describing when the in-memory objects themselves have pointer
types.

The real issue here is that -aa-eval is just a limited debugging tool that
tends to get misinterpreted as a general-purpose AA benchmark due to
there being no easy alternatives.

Dan

Right, and when I later debugged it, some of the queries were about
whether int** could alias a float *, which would properly be fixed ...

Dan,

Thanks for clarifying. This makes perfect sense.

Gan

Hi,

Does LLVM have any documentation on TBAA? I suppose the source code
counts as one source, but I cannot find any discussion of TBAA in
http://llvm.org/docs/AliasAnalysis.html
Thanks.