Alias Analysis Problem in LICM

Hi,

I met an alias analysis problem in the LICM phase. I am using the following
piece of code to present this problem.

      1 int size;
      2 int ** AAA;
      3 void * xalloc(int);
      4
      5 void foo(void)
      6 {
      7 int i;
      8 AAA = (int**) xalloc(size * sizeof(int*));
      9
     10 for (i=0; i<size; i++)
     11 AAA[i] = 0;
     12 }

This code tries to initialize an array of pointers. The array is
allocated from heap.
"AAA" points to the beginning of the array and it is a global variable.

I got the following IR dump after LICM:

    82 *** IR Dump After Loop Invariant Code Motion ***
    83 for.body: ; preds =
%for.body.lr.ph, %for.body
    84 %i.02 = phi i32 [ 0, %for.body.lr.ph ], [ %inc, %for.body ]
    85 %4 = load i32*** @AAA, align 4, !tbaa !3
    86 %arrayidx = getelementptr inbounds i32** %4, i32 %i.02
    87 store i32* null, i32** %arrayidx, align 4, !tbaa !3
    88 %inc = add nsw i32 %i.02, 1
    89 %cmp = icmp slt i32 %inc, %3
    90 br i1 %cmp, label %for.body, label %for.cond.for.end_crit_edge

I was expecting that, line 85: "%4 = load i32*** @AAA", would be
treated as loop
invariant code and be moved out of the loop body. However, it didn't.
The reason is
that, LLVM thinks that pointer "@AAA" would alias with pointer
"&AAA[i]". I found GCC
has no problem to move this load out of the loop body.

Then I went to read the current LLVM (v2.9) alias analysis code, i.e.
BasicAA and TBAA.
I found that TBAA does not differentiate pointers. Any pointer will be
given the same tbaa meta-data
named "any pointer". After reading the TBAA code, I don't believe that
my problem can be
easily fixed in its framework.

Since the "Anderson" alias analysis code is already moved out from
LLVM, we can only count
on BasicAA and TBAA for all alias analysis problems at this moment.
But these two piece of code
can only do very limited alias analysis and would become a big
performance bottleneck to
other optimization phases, e.g. instruction scheduling, LICM. Can
anybody tell me:

1. How to solve the above alias analysis problem in an acceptable short time?
2. Is there any existing project that tries to enhance LLVM alias analysis?

Thank you!

Hi,

[snip]

Since the "Anderson" alias analysis code is already moved out from
LLVM, we can only count
on BasicAA and TBAA for all alias analysis problems at this moment.
But these two piece of code
can only do very limited alias analysis and would become a big
performance bottleneck to
other optimization phases, e.g. instruction scheduling, LICM. Can
anybody tell me:

1. How to solve the above alias analysis problem in an acceptable short time?
2. Is there any existing project that tries to enhance LLVM alias analysis?

The poolalloc project (directions for downloading it are on the SAFECode web page at http://sva.cs.illinois.edu/docs/Install.html) contains a points-to analysis called DSA.

DSA is a unification-based points-to analysis. The interface to use it as an AliasAnalysis pass was removed since no one was using it and it was probably buggy. There are currently two routes for re-adding it. One is to review and fix up a patch that was submitted earlier this year. The other approach is to rewrite the code from scratch. The former approach may be easier, but I don't know how well the code would work in practice. The latter approach could be easily done using some code that we wrote (and could probably distribute if someone wants to tackle this problem).

Bug 11130 (http://llvm.org/bugs/show_bug.cgi?id=11130) is tracking interest in this feature so that we can give it an appropriate priority. Until one of us gets time to implement it or we get a contributor that provides a clean patch that works with LLVM 3.0, though, it's like to remain unfixed.

You can, of course, always examine the points-to graph directly.

Another alternative is to look at the analyses by Ben Hardekopf (http://www.cs.ucsb.edu/~benh/downloads.html). I'm not sure if his code offers an AliasAnalysis interface, though.

-- John T.

Hi John,

Thank you for your reply! Please see my other questions below:

DSA is a unification-based points-to analysis. The interface to use it as
an AliasAnalysis pass was removed since no one was using it and it was
probably buggy.

- When you say "it was probably buggy", what is buggy? the
"interface", or the AliasAnalysis?

I'm so surprise that no one is using it.

There are currently two routes for re-adding it. One is to
review and fix up a patch that was submitted earlier this year. The other
approach is to rewrite the code from scratch. The former approach may be
easier, but I don't know how well the code would work in practice. The
latter approach could be easily done using some code that we wrote (and
could probably distribute if someone wants to tackle this problem).

I can see that you like the latter approach. Let me think whether I
have time to do it.

Do you have any plan to integrate DSA into LLVM mailine, say the 3.0 release?
I learned from your previous emails that you are working on this goal.
What is the
current status?

Thank you again for your reply!

As far as I know, there are some academic works on pointer analysis (alias analysis) implementations open source on LLVM. Some are flow-insensitive and some are flow-sensitive.
For example: http://www.cs.ucsb.edu/~benh/downloads.html

As I remembered, there exists Andersens alias analysis in LLVM-2.5 lib/Analysis/IPA/Andersens.cpp, which has been removed in later LLVM versions for some buggy problems. You may have a try the old versions.

Lei Shang

2011/11/4 Gan <george.gan@gmail.com>

Gan wrote:

Hi,

I met an alias analysis problem in the LICM phase. I am using the following
piece of code to present this problem.

       1 int size;
       2 int ** AAA;
       3 void * xalloc(int);
       4
       5 void foo(void)
       6 {
       7 int i;
       8 AAA = (int**) xalloc(size * sizeof(int*));
       9
      10 for (i=0; i<size; i++)
      11 AAA[i] = 0;
      12 }

This code tries to initialize an array of pointers. The array is
allocated from heap.
"AAA" points to the beginning of the array and it is a global variable.

I got the following IR dump after LICM:

     82 *** IR Dump After Loop Invariant Code Motion ***
     83 for.body: ; preds =
%for.body.lr.ph, %for.body
     84 %i.02 = phi i32 [ 0, %for.body.lr.ph ], [ %inc, %for.body ]
     85 %4 = load i32*** @AAA, align 4, !tbaa !3
     86 %arrayidx = getelementptr inbounds i32** %4, i32 %i.02
     87 store i32* null, i32** %arrayidx, align 4, !tbaa !3
     88 %inc = add nsw i32 %i.02, 1
     89 %cmp = icmp slt i32 %inc, %3
     90 br i1 %cmp, label %for.body, label %for.cond.for.end_crit_edge

I was expecting that, line 85: "%4 = load i32*** @AAA", would be
treated as loop
invariant code and be moved out of the loop body. However, it didn't.
The reason is
that, LLVM thinks that pointer "@AAA" would alias with pointer
"&AAA[i]". I found GCC
has no problem to move this load out of the loop body.

Then I went to read the current LLVM (v2.9) alias analysis code, i.e.
BasicAA and TBAA.
I found that TBAA does not differentiate pointers. Any pointer will be
given the same tbaa meta-data
named "any pointer". After reading the TBAA code, I don't believe that
my problem can be
easily fixed in its framework.

Since the "Anderson" alias analysis code is already moved out from
LLVM, we can only count
on BasicAA and TBAA for all alias analysis problems at this moment.
But these two piece of code
can only do very limited alias analysis and would become a big
performance bottleneck to
other optimization phases, e.g. instruction scheduling, LICM. Can
anybody tell me:

1. How to solve the above alias analysis problem in an acceptable short time?
2. Is there any existing project that tries to enhance LLVM alias analysis?

Thanks for this testcase, I think this is a serious problem. I see two possible approaches, hopefully there are more.

   1. Use the fact that there's only one store and then make a select out of it. Pretending for a moment that partially overlapping writes don't exist over an i32*, the value loaded in %.pre is either %1 because we didn't overwrite it, or null because we did. Thus, we can transform it into:

     %cmp = icmp eq i32** %arrayidx2, bitcast (i32*** @AAA to i32**)
     %.pre = select i1 %cmp, i32** null, i32** %1

   which saves a load at the cost of a compare. Sadly, LLVM fails to optimize out the compare afterward. (You'd think that the prospect of storing through a null pointer would be enough for llvm to realize that one arm of the select is unreachable. That optimization would not be hard to add.)

   2. Find the least fixed point. Currently, your "xalloc" method could potentially return &AAA itself so the first step is to mark it as only returning pointers which do not alias any existing pointers, as such:

     void * xalloc(int) __attribute__((malloc));

   Then we're starting the loop with two pointers (&AAA and AAA) which we know do not alias, and we'll continue to assume that pointers can't alias until there is evidence to the contrary. Because we know they don't alias when entering the loop, we'll never encounter evidence that %arrayidx could alias @AAA, then GVN would eliminate the load %.pre against the store before the start of the loop.

   Unfortunately, this would require writing a new AA pass, as it's too expensive to add to -basicaa.

I suspect that GCC is doing option 2, or something else I haven't thought of (TBAA? if so, why doesn't our TBAA do as well?).

Nick

Nick,

The problem is that LLVM's implementation of TBAA does not distinguish between these two types: i32*** @AAA and i32** %arrayidx. We believe that type based alias analysis should, in general, be able to figure out that those two variables cannot meaningfully alias.

-Anshu

Yes. The current TBAA implementation is conservative, with the idea that it
can become more aggressive (and with TBAA, this fundamentally means
"more dangerous") with incremental steps.

It's interesting to note that clang's own source code is known to violate the TBAA
rules for pointers (it's thought to be unlikely to cause trouble in practice). There
are reasons for caution in this area.

Dan

Yes. The current TBAA implementation is conservative, with the idea that it
can become more aggressive (and with TBAA, this fundamentally means
"more dangerous") with incremental steps.

Dan,

Could you disclose more details about how to implement the "incremental steps"
to handle more complicated alias cases? For example, differentiate
different pointers
that point to different types. and use this information to improve
alias analysis accuracy?

It's interesting to note that clang's own source code is known to violate the TBAA
rules for pointers (it's thought to be unlikely to cause trouble in practice). There
are reasons for caution in this area.

I don't get your point here. What "TBAA rules" are violated by
"clang's own source code"?
Where does this violation happen? and what caution we should have in this area?

Thanks for your reply!

Yes. The current TBAA implementation is conservative, with the idea that it
can become more aggressive (and with TBAA, this fundamentally means
"more dangerous") with incremental steps.

Dan,

Could you disclose more details about how to implement the "incremental steps"
to handle more complicated alias cases? For example, differentiate
different pointers
that point to different types. and use this information to improve
alias analysis accuracy?

Yes. It's almost all up to the front-end. Find the place in clang
where it emits the "any pointer" metadata, and implement something
better.

It's interesting to note that clang's own source code is known to violate the TBAA
rules for pointers (it's thought to be unlikely to cause trouble in practice). There
are reasons for caution in this area.

I don't get your point here. What "TBAA rules" are violated by
"clang's own source code"?
Where does this violation happen? and what caution we should have in this area?

Clang frequently casts the addresses of Stmt* objects to Expr**
before dereferencing them. C++'s TBAA rules don't permit this.

Dan

Yes. It's almost all up to the front-end. Find the place in clang
where it emits the "any pointer" metadata, and implement something
better.

Dan,

Thanks for replying! I have read the TBAA code in the front-end of clang.
I did consider to extend it to handle more complicated pointer cases. For
example, assigning different TBAA names to pointers pointing to different
types. According to the current design idea of TBAA, all pointers share
the same name. Thus there is no need to calculate points-to set for each
pointer, because all pointers are in the same points-to set. If we try to
split this "all-in-one" points-to set based on the object types that are
pointed to by each pointer, we also need to consider pointer assignment
and other pointer operations (e.g. address taken) to accurately calculate
points-to set for each pointer. Then, the job would become implementing an
alias analysis algorithm similar to Steensgaard's or Anderson's. Please
correct me if you think I'm wrong.

Since both algorithms were already implemented in LLVM, and then remove
from LLVM for this or that reason, it is really not necessary to repeat
the same work. Again, please correct me if I'm wrong.

Thanks again for your reply!

Gan

It's not clear to me what you're trying to do. If you're trying to speed
up a particular benchmark, there are probably other ways to achieve your
objectives with less effort. If you're interested in doing general
points-to set algorithm work in LLVM, I don't think there are any quick
shortcuts.

Dan