Field sensitive alias analysis?

Hi,

I’m trying to optimize a simple C code and came across a situation where invariant code is not being moved out:

On an -O3 compilation, I noticed that the “load” for the loop bounds (which remain invariant throughout) happens on each iteration of both the loops, even though it is not modified anywhere in the function “bigLoop”. It seems that alias analysis is not able to say that the writes to one field in the structure does not impact the other field, leading to LICM being ineffective.

Do any of the alias analyses currently have some kind of field sensitivity that can help in this case?

------------------------- test case ------------------------------------

#include <stdlib.h>
#include <stdio.h>

#define SIZE 100

struct AS {
int a[SIZE+4];
int size;
} A;

void bigLoop(void)
{
unsigned i, j;

for (i = 0; i < A.size; i++) {
A.a[i+2] += A.a[i];
}
for (i = 0; i < A.size; i++) {
A.a[i+2] *= A.a[i];
}
}

int main()
{
A.size = random()%SIZE;
for (unsigned i = 0; i < A.size; i++) {
A.a[i] = random()%23;
}
bigLoop();
return 0;
}

Thanks,

As far as I can see it is specifics of arrays inside structs. Current TBAA does distinguish non-array members with path sensitive TBAA (see !1 and !6 in my example below, TBAA has reference to struct !2 and offset). As for arrays information that it was member of some struct get lost completely (!7 has nothing about struct !2).

struct S {
int a;
int b;
int c[3];
};

void foo(struct S* p) {
p->a = 1;
p->b = 2;
p->c[0] = 3;
p->c[1] = 4;
}

define void @foo(%struct.S* nocapture %p) #0 {
entry:
%a = getelementptr inbounds %struct.S, %struct.S* %p, i64 0, i32 0
store i32 1, i32* %a, align 4, !tbaa !1
%b = getelementptr inbounds %struct.S, %struct.S* %p, i64 0, i32 1
store i32 2, i32* %b, align 4, !tbaa !6
%arrayidx = getelementptr inbounds %struct.S, %struct.S* %p, i64 0, i32 2, i64 0
store i32 3, i32* %arrayidx, align 4, !tbaa !7
%arrayidx2 = getelementptr inbounds %struct.S, %struct.S* %p, i64 0, i32 2, i64 1
store i32 4, i32* %arrayidx2, align 4, !tbaa !7
ret void
}

!0 = !{!"clang version 3.8.0 "}

!1 = !{!2, !3, i64 0}
!2 = !{!“S”, !3, i64 0, !3, i64 4, !4, i64 8}
!3 = !{!“int”, !4, i64 0}
!4 = !{!“omnipotent char”, !5, i64 0}
!5 = !{!“Simple C/C++ TBAA”}
!6 = !{!2, !3, i64 4}
!7 = !{!3, !3, i64 0}

I’m just start learning how TBAA in clang works so I don’t know why it was implemented this way.

BTW, I have found why it doesn’t work for arrays. TBAA information propagation is not implemented in CodeGenFunction::EmitArraySubscriptExpr with “TODO: Preserve/extend path TBAA metadata?”.

Hi Vaivaswatha, Dmitry,

Some more analysis of what goes wrong (both in missed optimizations and wrong code) in the current tbaa representation can be found in following thread:

http://lists.llvm.org/pipermail/cfe-dev/2015-March/042015.html

unfortunately, there is (as far as I am aware) no solution yet.

(I still want to dive deeper into this, but this is currently not on the top of my stack :frowning: )

Greetings,

Jeroen Dobbealere

TBAA is about types, not accesses.
TBAA “struct path data” is about access paths and types that are the end of access paths, for the most part.

It has no notion of access size, etc, only offset.

It is possible to extend it and handle very basic cases in the frontend (IE structs not containing unions anywhere, with constant accesses, etc)
But it would degrade quite quickly (you would likely need a sane way to say “this is an access to an unknown offset into this type”)

Generally, rather than just try to produce metadata in the frontend, most compilers perform field-sensitive points-to or something similar for fields, and then rely on data dependence for differentiating array subscripts.

(This is what GCC does, LLVM has CFL-AA, but it’s not field sensitive yet)

So handling .size vs .a is probably possible in the frontend.

Doing array subscript analysis in general, probably not something you want in the frontend.
Handling tricky cases of what pointers to structs point to, probably the domain of field-sensitive points-to

Jeroen, thank you for very useful link with the context. Indeed union cases are very complicated and I see places in code when TBAA gives up.

Daniel, I completely agree that TBAA has limited power and can solve relatively simple cases only. So anything more complicated that involves intermediate variables that points to struct or array elements cannot be solved by TBAA alone. Differentiating array subscripts also seems like out of the scope of TBAA but adding type information that this memory assess accessing array not a random object of given type could be useful. Therefore TBAA can compliment points-to analysis and sometimes help in cases when points-to may not have enough information.

Just remember that to LLVM, TBAA info doesn’t tell it about types.

So when you say " Differentiating array subscripts also seems like out of the scope of TBAA but adding type information that this memory assess accessing array not a random object of given type could be useful."

Remember that LLVM has no notion of C/C++ types.

It doesn’t know about them, it doesn’t care about them.

What TBAA is telling it is that certain memory locations are disjoint.
It will never understand that what is being accessed is “a C++ struct containing an array of shorts”, because the type system doesn’t know what a “C struct” is, or a “C array” or a “C short”.

Thus, while helpful, TBAA info in LLVM tells it nothing about the types, it only tells it that “if i am accessing this offset, it is disjoint with accessing this other offset”.

Hi Daniel,

I see your point about LLVM and C/C++ type agnostic. I think TBAA was invented to partially cover this gap and give optimization opportunities when LLVM types are not sufficient but C/C++ types have required information.

What do you think about following example:

struct S {
int a[10];
int b;
};

int foo(struct S *ps, int i) {
ps->a[i] = 1;
ps->b = 2;
return ps->a[0];
}

define i32 @foo(%struct.S* nocapture %ps, i32 %i) #0 {
entry:
%idxprom = sext i32 %i to i64
%arrayidx = getelementptr inbounds %struct.S, %struct.S* %ps, i64 0, i32 0, i64 %idxprom
store i32 1, i32* %arrayidx, align 4, !tbaa !1
%b = getelementptr inbounds %struct.S, %struct.S* %ps, i64 0, i32 1
store i32 2, i32* %b, align 4, !tbaa !5
%arrayidx2 = getelementptr inbounds %struct.S, %struct.S* %ps, i64 0, i32 0, i64 0
%0 = load i32, i32* %arrayidx2, align 4, !tbaa !1
ret i32 %0
}

!1 = !{!2, !2, i64 0}

!2 = !{!“int”, !3, i64 0}
!3 = !{!“omnipotent char”, !4, i64 0}
!4 = !{!“Simple C/C++ TBAA”}
!5 = !{!6, !2, i64 40}
!6 = !{!“S”, !3, i64 0, !2, i64 40}

Missing information here is the range inside struct S that could be accessed. Also as you can see array member of struct in TBAA is presented as omnipotent char not as an array of int.

Arrays in struct in TBAA can be represented something like this:
!6 = !{!“S”, !7, i64 0, !2, i64 40}

!7 = !{!“<unique id of int[10]>”, !2, i64 0}

And ‘ps->a[i]’ could have TBAA like this:
!8 = !{!6, !7, i64 0}

As far as I can see if struct is enclosed in another struct, information about inner struct get lost only offset present. But I think for arrays it is better to keep array type in TBAA for the struct and element accesses.

Dmitry

Hi Daniel,

I see your point about LLVM and C/C++ type agnostic. I think TBAA was
invented to partially cover this gap and give optimization opportunities
when LLVM types are not sufficient but C/C++ types have required
information.

What do you think about following example:
struct S {
  int a[10];
  int b;
};

int foo(struct S *ps, int i) {
  ps->a[i] = 1;
  ps->b = 2;
  return ps->a[0];
}

define i32 @foo(%struct.S* nocapture %ps, i32 %i) #0 {
entry:
  %idxprom = sext i32 %i to i64
  %arrayidx = getelementptr inbounds %struct.S, %struct.S* %ps, i64 0, i32
0, i64 %idxprom
  store i32 1, i32* %arrayidx, align 4, !tbaa !1
  %b = getelementptr inbounds %struct.S, %struct.S* %ps, i64 0, i32 1
  store i32 2, i32* %b, align 4, !tbaa !5
  %arrayidx2 = getelementptr inbounds %struct.S, %struct.S* %ps, i64 0,
i32 0, i64 0
  %0 = load i32, i32* %arrayidx2, align 4, !tbaa !1

I'm not entirely sure why TBAA is necessary to disambiguate ps->a from
ps->b, it looks like basicaa should already be able to say they don't
overlap.
Does this not happen?

  ret i32 %0
}

!1 = !{!2, !2, i64 0}
!2 = !{!"int", !3, i64 0}
!3 = !{!"omnipotent char", !4, i64 0}
!4 = !{!"Simple C/C++ TBAA"}
!5 = !{!6, !2, i64 40}
!6 = !{!"S", !3, i64 0, !2, i64 40}

Missing information here is the range inside struct S that could be
accessed.

What do you mean by "could be accessed". Do you mean "valid to access in
C"?

Also as you can see array member of struct in TBAA is presented as
omnipotent char not as an array of int.

Agreed.

Arrays in struct in TBAA can be represented something like this:
!6 = !{!"S", !7, i64 0, !2, i64 40}
!7 = !{!"<unique id of int[10]>", !2, i64 0}

And 'ps->a[i]' could have TBAA like this:
!8 = !{!6, !7, i64 0}

Yes. This should likely work. Note that size, while nice, is harder.

One thing that is sadly still common (at least in C) is to do this:

struct S {
  int b;
  int a[0]; // or 1
};

and malloc it at (sizeof S + 40 * sizeof (int)), then write into a[1...39].

If we want to break that, it is likely a lot of stuff gets broken (at one
point when we did it in gcc, we broke 80% of all the packages in a given
linux distro ....)

As far as I can see if struct is enclosed in another struct, information
about inner struct get lost only offset present. But I think for arrays it
is better to keep array type in TBAA for the struct and element accesses.

Don't get me wrong, i think that it would be nice to have offset and size,
and gcc does indeed track this info on it's own.

I'm just trying to understand where you think it will provide better info.

Because once you get into cases like:

struct S {
  int a[10];
  int b;
};

int foo(struct S *ps, int *i) {
  ps->a[i] = 1;
  *i = 3;
  return ps->b;
}

You have no guarantee, for example, that *i and *(ps->b) are not the same
memory.

Please see inline.

Opps, you are right in my example basicaaa could do it potentially.

Correct example is slightly different:
int foo(struct S *ps, int i) {
  ps->a[i] = 1;
  ps->b = 2;
  return ps->a[i];
}
Here basicaa cannot make sure that 'ps->a[i]' doesn't change after 'ps->b
= 2' because if 'i == 10' all 3 memory accesses will read/write the same
memory. And type information about S::a is required to disambiguate. With
current TBAA 'ps->a[i]' is about random 'int' read.

Yes, and without more info, in LLVM that read can legally touch ps->b.
So that makes sense.

Missing information here is the range inside struct S that could be

accessed.

What do you mean by "could be accessed". Do you mean "valid to access in
C"?

By access I meant read/write memory i.e. that size of S::a inside the
struct or at least information that only S::a is accessed in this place
i.e. not S::b.

Okay.

So what you want sounds reasonable to me :wink:

I thought the inbounds on the GEPs would have told us that the a accesses and b access are both in bounds of their respective fields of the struct.

Or does inbounds only tell us that the GEPs are in bounds of ‘ps’ itself?

The second.

langref says:

“The in bounds addresses for an allocated object are all the addresses that point into the object, plus the address one byte past the end. In cases where the base is a vector of pointers the inbounds keyword applies to each of the computations element-wise.”

Presumably, the allocated object here is ps, not the fields.

Interestingly, the GEP faq says:“A common example of how this is used is arrays where the size is not known. It’s common to use array types with zero length to represent these. The fact that the static type says there are zero elements is irrelevant; it’s perfectly valid to compute arbitrary element indices, as the computation only depends on the size of the array element, not the number of elements. Note that zero-sized arrays are not a special case here. This sense is unconnected with inbounds keyword. The inbounds keyword is designed to describe low-level pointer arithmetic overflow conditions, rather than high-level array indexing rules.”

At least this leads me to believe you can’t use the “llvm type” info much at all in conjunction with inbounds, only the size of allocas that you see.

–Dan

Danny’s description matches my understanding of inbounds as well. In generally, the LLVM type (particular first class aggregates) has essentially no meaning. Our operations are typed, our operands generally are not. The only real exception to that is that we have a strong distinction between pointers (due to address spaces) and everything else.

Philip