Testing CFL alias analysis

Hello everyone,

If you’ve read through my previous introduction email (), you can safely ignore this message.

The short story is: CFL-AA does not seem to be broken anymore. Please try it out and help us find more bugs and performance issues if switching to it in the future sounds interesting to you.

Here is some more background: I have been working on a GSoC project whose first step is to fix cfl-aa. After a bug was patched in r268269, bootstrapping llvm+clang with standalone cfl-aa, as well as with cfl-aa+basicaa, breaks nothing in the llvm test-suite. This suggests that cfl-aa is in pretty good shape today and is almost ready to be turned on by default. But before we can do that, we’d like to gather enough evidence that it is a safe move. A more thorough description of the current status can be found in the link provided at the beginning of this message.

To compile your code with cfl-aa turned on, simply add the options "-mllvm -use-cfl-aa -mllvm -use-cfl-aa-in-codegen" to the clang command line.

Hi Jia,

We did some testing with CFL-AA enabled on an aarch64 OoO target on the llvm test-suite and SPEC (with and without LTO). We didn’t observe any correctness issues. We also didn’t observe any meaningful performance differences, positive or negative, other than a single llvm test (llvm-test-suite/SingleSource/Benchmarks/Shootout/lists) that improved by ~3%. I also looked over some of the generated code differences: only a handful of tests changed at all (9 in llvm test-suite, 5 in SPEC2006), and in most of these only a few functions changed, usually with a small number of static instruction differences. We didn’t collect any compile time data.

-Geoff

Hi Geoff,

Thank you so much for the effort!

It's good to hear that cfl-aa didn't break anything. However, the fact that it barely affects code generation is also concerning. I'll definitely look into the issue.

Thanks for running CFLAA and sharing the numbers :)

FWIW, when I measured CFLAA bootstrapping clang/LLVM at the end of my internship, the rough numbers were*:

Regular alias stack (BasicAA/…), no CFLAA: ~80% NoAlias responses
CFLAA only: ~20-25% NoAlias responses
CFLAA behind the regular alias stack: ~82% NoAlias responses.

So, CFLAA did help accuracy a bit.

With this in mind, I’d like to note that there’s tons of low-hanging fruit in CFLAA. For instance, we currently treat things like malloc as opaque function calls, rather than as memory allocation functions. Additionally, due to fun corner cases, we do super-conservative things like treating ‘a’ and ‘b’ as aliases below:

int a[10], b[10];
int N = getN();
a[N] = 1;
b[N] = 2;

…Because we end up emitting two GEPs that use N. I’m not arguing that this is sane behavior, nor am I saying that it’s a fundamental problem with CFLAA; it’s just that CFLAA is currently very bare-bones, and very little time has been spent making it less so. Hopefully, by the end of this GSoC project, CFLAA will be much fancier, and will give significantly better results. :)
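For anyone who wants to poke at this case, the fragment above can be wrapped into a small compilable C sketch. The function names here (getN's body, store_both) are made up for illustration and are not taken from any LLVM test; the volatile read is just one way to keep N unknown at compile time.

```c
#include <assert.h>

/* Hypothetical stand-in for getN(): the volatile read keeps the compiler
   from folding the index to a constant. */
int getN(void) {
    volatile int n = 3;
    return n;
}

/* Stores to two distinct local arrays through the same runtime index.
   a and b are separate objects, so the store to b[n] cannot change a[n];
   an alias analysis that reports them as aliasing is being conservative. */
int store_both(int n) {
    int a[10] = {0}, b[10] = {0};
    assert(n >= 0 && n < 10); /* keep the index in bounds */
    a[n] = 1;
    b[n] = 2;
    return a[n] + b[n];
}
```

Calling store_both(getN()) from a small main and comparing the optimized output with and without the cfl-aa options mentioned earlier in the thread is one way to see whether the two stores are treated as independent.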

* These numbers were taken by throwing logging code into the AA infrastructure. So, “N% NoAlias responses” means that N% of all alias queries issued throughout compilation were answered with NoAlias. The numbers were scraped from the logs that I got from running ninja clean; ninja check-all for a release+asserts build of llvm/clang/compiler-rt. Also, these numbers are purely from my memory of mid-2014. So, they may be different now.
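To make the "N% NoAlias responses" metric concrete, here is a minimal sketch of the kind of tally such logging could feed. It is not the logging code that was actually used, and every name in it (alias_result, alias_tally, tally_record, tally_no_alias_percent) is hypothetical rather than part of LLVM.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical result kinds an alias query might return. */
enum alias_result { NO_ALIAS, MAY_ALIAS, PARTIAL_ALIAS, MUST_ALIAS };

/* Running counts over all alias queries seen so far. */
struct alias_tally {
    size_t total;    /* every query, whatever the answer */
    size_t no_alias; /* queries answered with NO_ALIAS */
};

/* Record the answer to one alias query. */
void tally_record(struct alias_tally *t, enum alias_result r) {
    assert(t != NULL);
    t->total++;
    if (r == NO_ALIAS)
        t->no_alias++;
}

/* Percentage of recorded queries answered with NO_ALIAS (0 if none yet). */
double tally_no_alias_percent(const struct alias_tally *t) {
    assert(t != NULL);
    if (t->total == 0)
        return 0.0;
    return 100.0 * (double)t->no_alias / (double)t->total;
}
```

With this definition, four NO_ALIAS answers out of five queries give 80%, which is how the percentages quoted above should be read.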