Alias Analysis with inbound GEPs

Hi,

I’m checking aliasing of two pointers:

%GEP1 = getelementptr inbounds %struct.s, %struct.s* %0, i64 0, i32 1, i64 %indvars.iv41, i64 %indvars.iv39
%GEP2 = getelementptr inbounds %struct.s, %struct.s* %0, i64 0, i32 16

The result I got is “PartialAlias” because the indices of the GEP1 are variable. Shouldn’t the “inbounds” keyword mean that the access to sub-array is also in-bounds?
I’m trying to reach “NoAlias” consensus between GEP1 and GEP2.

Thanks.

  • Elena

From: "Elena via llvm-dev Demikhovsky" <llvm-dev@lists.llvm.org>
To: "llvm-dev" <llvm-dev@lists.llvm.org>
Sent: Monday, July 25, 2016 9:45:55 AM
Subject: [llvm-dev] Alias Analysis with inbound GEPs

Hi,

I’m checking aliasing of two pointers:

%GEP1 = getelementptr inbounds %struct.s, %struct.s* %0, i64 0, i32
1, i64 %indvars.iv41, i64 %indvars.iv39
%GEP2 = getelementptr inbounds %struct.s, %struct.s* %0, i64 0, i32
16

The result I got is “PartialAlias” because the indices of the GEP1
are variable.

That seems like a bug. PartialAlias should only be returned when we can prove a partial overlap. Otherwise, MayAlias should be returned.

Shouldn’t the “inbounds” keyword mean that the access to sub-array is
also in-bounds?

No. inbounds applies only to the whole object.

I’m trying to reach “NoAlias” consensus between GEP1 and GEP2.

Did the original code come from C or C++? What are we modeling here?

-Hal

I’m checking aliasing of two pointers:

  %GEP1 = getelementptr inbounds %struct.s, %struct.s* %0, i64 0, i32 1, i64 %indvars.iv41, i64 %indvars.iv39
  %GEP2 = getelementptr inbounds %struct.s, %struct.s* %0, i64 0, i32 16

The result I got is “PartialAlias” because the indices of the GEP1 are variable.
That seems like a bug. PartialAlias should only be returned when we can prove a partial overlap. Otherwise, MayAlias should be returned.
[Demikhovsky, Elena] There are some comments inside:
  // Statically, we can see that the base objects are the same, but the
  // pointers have dynamic offsets which we can't resolve. And none of our
  // little tricks above worked.
  //
  // TODO: Returning PartialAlias instead of MayAlias is a mild hack; the
  // practical effect of this is protecting TBAA in the case of dynamic
// indices into arrays of unions or malloc'd memory.
  return PartialAlias;

Shouldn’t the “inbounds” keyword mean that the access to sub-array is also in-bounds?
No. inbounds applies only to the whole object.
I’m trying to reach “NoAlias” consensus between GEP1 and GEP2.
Did the original code come from C or C++? What are we modeling here?
[Demikhovsky, Elena] C-code:
                           for (m=0; m < params->num; m++) {
                                           params->a[i][m] = expr;
                              }
%GEP1 is the store for params->a[i][m]
%GEP2 is the load for params->num.
The loop is not vectorized due to a possible collision between params->num and params->a[i][m]. If I take loading of params->num outside the loop, it is vectorized.
Bounds of array “a” are known at compile time. Limit of “i” and “m” are runtime variables.

-Hal

Thanks.

· Elena

From: "Elena Demikhovsky" <elena.demikhovsky@intel.com>
To: "Hal Finkel" <hfinkel@anl.gov>
Cc: "llvm-dev" <llvm-dev@lists.llvm.org>
Sent: Monday, July 25, 2016 2:46:34 PM
Subject: RE: [llvm-dev] Alias Analysis with inbound GEPs

> I’m checking aliasing of two pointers:

> %GEP1 = getelementptr inbounds %struct.s, %struct.s* %0, i64 0, i32
> 1, i64 %indvars.iv41, i64 %indvars.iv39

> %GEP2 = getelementptr inbounds %struct.s, %struct.s* %0, i64 0, i32
> 16

> The result I got is “PartialAlias” because the indices of the GEP1
> are variable.

That seems like a bug. PartialAlias should only be returned when we
can prove a partial overlap. Otherwise, MayAlias should be returned.
[Demikhovsky, Elena] There are some comments inside:
// Statically, we can see that the base objects are the same, but the
// pointers have dynamic offsets which we can't resolve. And none of
our
// little tricks above worked.
//
// TODO: Returning PartialAlias instead of MayAlias is a mild hack;
the
// practical effect of this is protecting TBAA in the case of dynamic
// indices into arrays of unions or malloc'd memory.
return PartialAlias ;

Ah, thanks! That, unfortunately, makes sense.

> Shouldn’t the “inbounds” keyword mean that the access to sub-array
> is
> also in-bounds?

No. inbounds applies only to the whole object.
> I’m trying to reach “NoAlias” consensus between GEP1 and GEP2.

Did the original code come from C or C++? What are we modeling here?
[Demikhovsky, Elena] C-code:
for (m=0; m < params->num; m++) {
params->a[i][m] = expr;
}
%GEP1 is the store for params->a[i][m]
%GEP2 is the load for params->num.
The loop is not vectorized due to a possible collision between
params->num and params->a[i][m]. If I take loading of params->num
outside the loop, it is vectorized.
Bounds of array “a” are known at compile time. Limit of “i” and “m”
are runtime variables.

The problem is, IIRC, it is not undefined behavior to access one structure field by over-indexing an earlier array member. C++ has rules for "safely-derived pointers", and I think they include all pointer arithmetic on addresses from subobjects. If array access is just pointer arithmetic, I'm not sure that helps you as much as you'd like. cc'ing Richard to correct me if necessary.

Out of curiosity, does SCEVAA get this if you add it to the pipeline before the vectorizer?

-Hal

It is actually undefined behavior. This is explicitly called out in Annex
J.2: "An array subscript is out of range, even if an object is apparently
accessible with the given subscript (as in the lvalue expression a[1][7]
given the declaration int a[4][5]) ". If you break it apart into separate
steps, the interesting bit is that the implicit array-to-pointer conversion
is not equivalent to a cast; "int* b = (int*)a;" is not equivalent to "int*
b = *a;", even though the expressions have the same type and value.

There currently isn't any way to model the aliasing behavior of the
address-of operator or array-to-pointer decay in LLVM IR. See also
http://lists.llvm.org/pipermail/llvm-dev/2016-July/102472.html .

-Eli

Would this proposal help: https://reviews.llvm.org/D22793 ?

From: "Mehdi Amini" <mehdi.amini@apple.com>
To: "Hal Finkel" <hfinkel@anl.gov>
Cc: "Elena Demikhovsky" <elena.demikhovsky@intel.com>, "llvm-dev"
<llvm-dev@lists.llvm.org>
Sent: Tuesday, July 26, 2016 1:37:28 PM
Subject: Re: [llvm-dev] Alias Analysis with inbound GEPs

> > From: "Elena via llvm-dev Demikhovsky" < llvm-dev@lists.llvm.org
> > >
>

> > To: "llvm-dev" < llvm-dev@lists.llvm.org >
>

> > Sent: Monday, July 25, 2016 9:45:55 AM
>

> > Subject: [llvm-dev] Alias Analysis with inbound GEPs
>

> > Hi,
>

> > I’m checking aliasing of two pointers:
>

> > %GEP1 = getelementptr inbounds %struct.s, %struct.s* %0, i64 0,
> > i32
> > 1, i64 %indvars.iv41, i64 %indvars.iv39
>

> > %GEP2 = getelementptr inbounds %struct.s, %struct.s* %0, i64 0,
> > i32
> > 16
>

> > The result I got is “PartialAlias” because the indices of the
> > GEP1
> > are variable.
>

> That seems like a bug. PartialAlias should only be returned when we
> can prove a partial overlap. Otherwise, MayAlias should be
> returned.

> > Shouldn’t the “inbounds” keyword mean that the access to
> > sub-array
> > is
> > also in-bounds?
>

> No. inbounds applies only to the whole object.

Would this proposal help: https://reviews.llvm.org/D22793 ?

Yes, it looks like that would help.

-Hal

Would this proposal help: https://reviews.llvm.org/D22793 ?

Yes, it looks like that would help.

Thank you. I’ll try to get more info from the author.

Btw, Intel compiler vectorizes this loop and, probably, it is the right thing to do.