InstCombine GEP

Hi,

I have a doubt with GEP transformation in the instruction-combiner.

Consider below test-case:

struct ABC {

int A;

int B[100];

struct XYZ {

int X;

int Y[100];

} OBJ;

};

void Setup(struct ABC *);

int foo(int offset) {

struct ABC *Ptr = malloc(sizeof(struct ABC));

Setup(Ptr);

return Ptr->OBJ.X + Ptr->OBJ.Y[33];

}

Generated IR for the test-case:

define i32 @foo(i32 %offset) local_unnamed_addr #0 {

entry:

%call = tail call i8* @malloc(i64 808)

%0 = bitcast i8* %call to %struct.ABC*

tail call void @Setup(%struct.ABC* %0) #3

%OBJ = getelementptr inbounds i8, i8* %call, i64 404

%X = bitcast i8* %OBJ to i32*

%1 = load i32, i32* %X, align 4, !tbaa !2

%arrayidx = getelementptr inbounds i8, i8* %call, i64 540

%2 = bitcast i8* %arrayidx to i32*

%3 = load i32, i32* %2, align 4, !tbaa !8

%add = add nsw i32 %3, %1

ret i32 %add

}

Instruction combiner transforms GEPs to i8 type, because of this the GEP offset looks weird and the actual type information is missing on GEP.

I expected the GEPs to use the actual type offsetting for which the memory is allocated.

Expected IR:

; Function Attrs: nounwind uwtable

define i32 @foo(i32 %offset) local_unnamed_addr #0 {

entry:

%call = tail call i8* @malloc(i64 808)

%0 = bitcast i8* %call to %struct.ABC*

tail call void @Setup(%struct.ABC* %0) #3

%X = getelementptr inbounds %struct.ABC, %struct.ABC* %0, i64 0, i32 2, i32 0

%1 = load i32, i32* %X, align 4, !tbaa !2

%arrayidx = getelementptr inbounds %struct.ABC, %struct.ABC* %0, i64 0, i32 2, i32 1, i64 33

%2 = load i32, i32* %arrayidx, align 4, !tbaa !8

%add = add nsw i32 %2, %1

ret i32 %add

}

In the above IR the GEP offsetting looks very explicit for the type struct.ABC.

Looking at the InstCombiner source found the below code is responsible for it:

1914 /// See if we can simplify:

1915 /// X = bitcast A* to B*

1916 /// Y = gep X, <…constant indices…>

1917 /// into a gep of the original struct. This is important for SROA and alias

1918 /// analysis of unions. If “A” is also a bitcast, wait for A/X to be merged.

1919 if (BitCastInst *BCI = dyn_cast(PtrOp)) {

1920 …

I’m not sure how transforming GEP offset to i8 type will help alias analysis & SROA for the mentioned test case.

Regards,

Ashutosh

I agree that’s a bit strange.
I dunno about SROA, but BasicAA certainly knows about structs and unions. I don’t think right now it has any problem handling those. (and if there’s some case it doesn’t, shouldn’t be hard to fix)
Also, BasicAA has special rules for arrays of structs that can conclude no-alias even if some of the indexes aren’t constants. So you could even turn those no-alias results into may-alias; instcombine is losing information here.

I guess you could lookup the history to see why/when that transformation was introduced if noone remembers.

Nuno

Hi Ashutosh,

I’m not sure how transforming GEP offset to i8 type will help alias analysis
& SROA for the mentioned test case.

It should neither help nor hinder AA or SROA -- the two GEPs (the
complex one and the simple one) are equivalent. Since memory isn't
typed in LLVM, having the GEP in terms of %struct.ABC does not provide
any extra information.

-- Sanjoy

Memory is somewhat typed, since if you store something with a type and load the same location with a different type that's not valid (let's call it poison).

Also, BasicAA has the following rule, with constants c1 and c2, and arbitrary values x, y:
a[c1] no-alias a[y][c2] if:
the distance between c1 and c2 is sufficient to guarantee that the accesses will be disjoint due to ending up in different array slots.
For this rule it's important to know what's the size of each array element. This information is lost if GEPs are flattened.

But I agree that LLVM itself doesn't exploit types for AA extensively. For example, a pointer based in a struct field may alias another field of the same struct, even if at C/C++ level that's probably not allowed.

Nuno

Thanks Nuno & Sanjoy for the inputs.

As you mentioned the flattened GEPs should neither help nor hinder AA & SROA.
It's good to keep type based GEPs. I'll make the change and submit for review.

Regards,
Ashutosh

Hi,

I’m not sure how transforming GEP offset to i8 type will help alias
analysis & SROA for the mentioned test case.

It should neither help nor hinder AA or SROA -- the two GEPs (the complex one and the simple one) are equivalent.

Since memory isn't typed in LLVM, having the GEP in terms of %struct.ABC does not provide any extra information.

Memory is somewhat typed, since if you store something with a type and load the same location with a different type that's not valid (let's call it poison).

That may be true in C++, but I'm not sure if we want that to be true
in LLVM IR. We would not be able to inline memcpy's if that were
true, for one thing (e.g. Compiler Explorer). Unless
you're talking about TBAA metadata?

Also, BasicAA has the following rule, with constants c1 and c2, and arbitrary values x, y:
a[c1] no-alias a[y][c2] if:
the distance between c1 and c2 is sufficient to guarantee that the accesses will be disjoint due to ending up in different array slots.
For this rule it's important to know what's the size of each array element. This information is lost if GEPs are flattened.

Do you mean to say that in LLVM IR we will conclude ptr0 and ptr1 don't alias:

  int a[4][4];
  ptr0 = &a[3];
  ptr1 = &a[y][7];

If so, that doesn't match my understanding -- I was under the
impression that in LLVM IR x = 2, y = 1 will give us must-alias
between ptr0 and ptr1.

-- Sanjoy

I’m not sure how transforming GEP offset to i8 type will help alias
analysis & SROA for the mentioned test case.

It should neither help nor hinder AA or SROA -- the two GEPs (the complex one and the simple one) are equivalent.

Since memory isn't typed in LLVM, having the GEP in terms of %struct.ABC does not provide any extra information.

Memory is somewhat typed, since if you store something with a type and load the same location with a different type that's not valid (let's call it poison).

That may be true in C++, but I'm not sure if we want that to be true
in LLVM IR. We would not be able to inline memcpy's if that were
true, for one thing (e.g. Compiler Explorer). Unless
you're talking about TBAA metadata?

Ah, that's a very good point. This is a simplified version of your example: Compiler Explorer
memcpy is transformed into a store of an int, which is then loaded as float.

Well, at least according to LLVM semantics, memory records the last stored type size, such that it's invalid to store an i12 and load an i13. Not sure why this restriction in the semantics is actually needed, though. If you read a smaller/larger type than what was stored, you may end up with some padding bits (poison). That's it.

Also, BasicAA has the following rule, with constants c1 and c2, and arbitrary values x, y:
a[c1] no-alias a[y][c2] if:
the distance between c1 and c2 is sufficient to guarantee that the accesses will be disjoint due to ending up in different array slots.
For this rule it's important to know what's the size of each array element. This information is lost if GEPs are flattened.

Do you mean to say that in LLVM IR we will conclude ptr0 and ptr1 don't alias:

  int a[4][4];
  ptr0 = &a[3];
  ptr1 = &a[y][7];

If so, that doesn't match my understanding -- I was under the
impression that in LLVM IR x = 2, y = 1 will give us must-alias
between ptr0 and ptr1.

No, in this case it won't conclude no-alias, since 3 % 4 == 7 % 4. LLVM is not that aggressive in exploiting UB. Anyway, concluding no-alias here was only possible if the GEP index had the inrange attribute.

The example is more like this:
  int a[4][5];
  p = &a[0];
  q = &a[y][1];

With access sizes sp, sq, respectively:
If the access size through p ends before q (q >= sp) and the access through q doesn't go beyond the array limit (sq <= 5*sizeof(int) - 1*sizeof(int)), then it's no-alias.

By flattening a GEP, you lose the information of the size of the each of array/struct constituents. Hence this proof rule doesn't apply and you would get may-alias for the example above.
Another interesting conclusion is that LLVM is being quite nice by allowing accesses to multiple array/struct fields through the address of one of them.

The code is here: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/BasicAliasAnalysis.cpp?revision=310766&view=markup#l1349
(you may need to scroll back to line 1294 or even to the beginning of that function to see where all the data comes from)

Nuno

Also, BasicAA has the following rule, with constants c1 and c2, and arbitrary values x, y:
a[c1] no-alias a[y][c2] if:
the distance between c1 and c2 is sufficient to guarantee that the accesses will be disjoint due to ending up in different array slots.
For this rule it's important to know what's the size of each array element. This information is lost if GEPs are flattened.

Do you mean to say that in LLVM IR we will conclude ptr0 and ptr1 don't alias:

  int a[4][4];
  ptr0 = &a[3];
  ptr1 = &a[y][7];

If so, that doesn't match my understanding -- I was under the
impression that in LLVM IR x = 2, y = 1 will give us must-alias
between ptr0 and ptr1.

No, in this case it won't conclude no-alias, since 3 % 4 == 7 % 4. LLVM is not that aggressive in exploiting UB. Anyway, concluding no-alias here was only possible if the GEP index had the inrange attribute.

The example is more like this:
  int a[4][5];
  p = &a[0];
  q = &a[y][1];

With access sizes sp, sq, respectively:
If the access size through p ends before q (q >= sp) and the access through q doesn't go beyond the array limit (sq <= 5*sizeof(int) - 1*sizeof(int)), then it's no-alias.

By flattening a GEP, you lose the information of the size of the each of array/struct constituents. Hence this proof rule doesn't apply and you would get may-alias for the example above.

Let me rephrase this last bit: I agree with you that if the GEP has no inrage attributes then there's no logical loss of information: a GEP is simply a bunch of arithmetic operations and nothing else can be inferred from the types.
However, the BasicAA code that I mentioned above wouldn't kick in (as far as I understand). It would be possible to extend it to handle the flattened GEP case, though. There's already a bit of code that analyses linear expressions, but I don't think it go as far as needed to handle the case above.

Nuno

FWIW: Having memory be untyped strongly hinders field sensitive analysis :slight_smile:
It's okay for non, obviously.

I also get the desire to avoid bitcasts.
But field-sensitive analysis of any sort requires knowing that memory has
structures and arrays, and we've said "no it doesn't :slight_smile:

So either you spend all your time trying to recover that based on the
indexing operations you see(at precision loss), provide metadata and base
your analysis on the metadata (which must stay semantically correct, since
you are effectively ignoring the addressing operation), or give up.
(Analyses like DSA attempt to recover it).

We already miss pretty basic field aliasing cases in C++ due to the above :frowning: