Is it ok to allocate > half of address space?

Hi,

I was looking into the semantics of GEP inbounds and some BasicAA rules and I'm wondering if it's valid in LLVM IR to allocate more than half of the address space with a global variable or an alloca.
If that's a scenario we want to consider, then we have problems :)

Consider this C code (32 bits):
#include <string.h>

char obj[0x80000008];

char f() {
   char *p = obj + 0x79999999;
   char *q = obj + 0x80000000;
   *q = 1;
   memcpy(p, "abcd", 4);
   return *q;
}

Clearly the stores alias, and the memcpy should override the value written by "*q = 1".

I dunno if this is legal in C or not, but the IR produced by clang looks like (32 bits):

@obj = common global [2147483656 x i8] zeroinitializer, align 1

define signext i8 @f() {
   store i8 1, i8* getelementptr inbounds (i8, i8* getelementptr inbounds ([2147483656 x i8], [2147483656 x i8]* @obj, i32 0, i32 0), i32 -2147483648), align 1
   call void @llvm.memcpy.p0i8.p0i8.i32(i8* getelementptr inbounds ([2147483656 x i8], [2147483656 x i8]* @obj, i32 0, i32 2040109465), i8* getelementptr inbounds ([5 x i8], [5 x i8]* @.str, i32 0, i32 0), i32 4, i32 1, i1 false)
   %1 = load i8, i8* getelementptr inbounds (i8, i8* getelementptr inbounds ([2147483656 x i8], [2147483656 x i8]* @obj, i32 0, i32 0), i32 -2147483648), align 1
   ret i8 %1
}

With -O2, the store to q gets forwarded, and so we get "ret i8 1".
So, BasicAA concluded that p and q don't alias. The culprit is an overflow in BasicAAResult::isGEPBaseAtNegativeOffset().
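
A minimal C sketch of the kind of wraparound involved. This is an illustration only, not the actual logic of BasicAAResult::isGEPBaseAtNegativeOffset(); the helper names are hypothetical. It compares a byte-range overlap test on unsigned offsets with the same test after each offset has been reinterpreted as a signed 32-bit value, which is how the i32 index -2147483648 in the IR above comes about.

```c
#include <stdint.h>

/* Value of a 32-bit offset when its bits are read as a signed i32. */
int64_t as_signed32(uint32_t off) {
    return off < 0x80000000u ? (int64_t)off : (int64_t)off - 4294967296LL;
}

/* Overlap test on half-open byte ranges [a, a+alen) and [b, b+blen). */
int ranges_overlap(int64_t a, int64_t alen, int64_t b, int64_t blen) {
    return a < b + blen && b < a + alen;
}

/* Ground truth: offsets are unsigned distances from the start of the object. */
int overlap_unsigned(uint32_t a, uint32_t alen, uint32_t b, uint32_t blen) {
    return ranges_overlap((int64_t)a, alen, (int64_t)b, blen);
}

/* Hypothetical analysis that reads the same offsets as signed values. */
int overlap_signed(uint32_t a, uint32_t alen, uint32_t b, uint32_t blen) {
    return ranges_overlap(as_signed32(a), alen, as_signed32(b), blen);
}
```

For a 4-byte access starting at offset 0x7fffffff and a 1-byte access at offset 0x80000000, overlap_unsigned reports an overlap while overlap_signed does not, because 0x80000000 is read as -2147483648 and appears to lie far below the other access.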

So my question is do we care about this use case where a single allocation can take more than half of the address space?

Thanks,
Nuno

Hi Nuno.
I can't answer your question, but I know that Mikael Holmén wrote a trouble report about problems in GVN related to objects larger than half of address space:
  https://bugs.llvm.org/show_bug.cgi?id=34344

It ended up in a long discussion with Eli Friedman, and then I think we just left it as an open trouble report.

/Björn

Many thanks for the pointer! I missed that bug report since the title was about GVN.

If there's interest in supporting this feature I can help since we've formalized most of BasicAA. I can easily verify if proposed changes are correct. (I'll release the code soon).

Nuno

Quoting Björn Pettersson A <bjorn.a.pettersson@ericsson.com>:

The example in https://bugs.llvm.org/show_bug.cgi?id=34344 is of course a reduced test case.
The problem was detected as runtime failures in one of our regression tests for large memcpy from one address space to another.

I'd welcome a correction for this (or at least that the compiler reports an error rather than producing incorrect code, assuming that it isn't undefined behavior to have such large allocations).

/Björn

I don’t think it is reasonable because pointer subtraction stops working with objects that large (i.e. taking the difference and adding it back to the base is nonsensical).
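
To make the pointer-subtraction point concrete, here is a small C model of a 32-bit target. It is a sketch under the assumption that ptrdiff_t is 32 bits wide; it works on byte offsets rather than real pointers (where such a subtraction would be undefined behavior in C), and the function names are hypothetical.

```c
#include <stdint.h>

/* Mathematically exact distance between two byte offsets. */
int64_t exact_diff(uint32_t a, uint32_t b) {
    return (int64_t)a - (int64_t)b;
}

/* The same distance as a 32-bit ptrdiff_t would hold it:
   wrap to 32 bits, then read the bits as a signed value. */
int64_t diff_as_ptrdiff32(uint32_t a, uint32_t b) {
    uint32_t wrapped = (uint32_t)(a - b);
    return wrapped < 0x80000000u ? (int64_t)wrapped
                                 : (int64_t)wrapped - 4294967296LL;
}
```

For offsets 0x80000000 and 0 the exact distance is 2147483648, but the 32-bit signed result is -2147483648, so the difference has the wrong sign as soon as two positions in the object are 2 GiB or more apart.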

char *p = obj + 0x79999999;

I guess you mean 0x7fffffff here.

So my question is do we care about this use case where a single allocation can take more than half of the address space?

Yeah, I'm curious about it too. One complication is that the compiler doesn't control every situation: the size of the allocation could be read by the program from outside, and the allocation could be done by libc (glibc will happily allocate more than half the address space).
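
As a sketch of what this means on the program side: code that takes an allocation size from outside and wants to stay in the range where pointer differences are representable could reject larger requests itself. The wrapper name below is hypothetical, and this is one possible policy rather than something libc or the compiler does for you.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Refuse any request larger than PTRDIFF_MAX bytes, so that the
   difference of two pointers into the block always fits in ptrdiff_t. */
void *alloc_within_ptrdiff(size_t n) {
    if (n > (size_t)PTRDIFF_MAX)
        return NULL;
    return malloc(n);
}
```

A caller would treat a NULL result the same way as an ordinary allocation failure.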

There is a good discussion of various related topics in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67999 .

According to LangRef, your IR currently has undefined behavior: the rules for "inbounds" GEPs say that indexes are treated as signed values. And solving that would involve changing the way we represent GEPs in IR, so I think you can consider that out of scope.

Assuming we're not dealing with inbounds GEPs (e.g. you pass -fwrapv to clang), I don't see any particular reason to disallow allocations larger than half the address space.

-Eli

According to LangRef, your IR currently has undefined behavior: the rules for "inbounds" GEPs say that indexes are treated as signed values. And solving that would involve changing the way we represent GEPs in IR, so I think you can consider that out of scope.

Sorry, that was a typo. The test case was supposed to not have inbounds (it should work without as well).
The current definition of GEP inbounds is complicated, though. It disallows the following:
%a = gep %p, 0x88888888
%b = gep inbounds %a, 1

If %a is within bounds, the "gep inbounds" gives a signed overflow even though it's just a +1 (since 0x88888888 + 1 overflows).
So GEP inbounds disables large objects outright.

BTW I've always wondered why EmitGEPOffset (http://llvm.org/doxygen/Local_8h_source.html#l00247) doesn't use 'add nsw' if the semantics of GEP inbounds allows that (if my reading of LangRef is correct).

Assuming we're not dealing with inbounds GEPs (e.g. you pass -fwrapv to clang), I don't see any particular reason to disallow allocations larger than half the address space.

Ok, I can file bug reports for the cases I'm seeing. I can verify correctness of fixes as well. But only starting in a week from now; I'm quite busy at the moment.

Nuno

This blog post contains additional examples that may shed light on this topic:

   https://trust-in-soft.com/objects-larger-than-ptrdiff_max-bytes/

John

The signed cmov emitted in this example (mentioned in the blog post below) seems to indicate that LLVM really has baked in the assumption that objects are not larger than half the address space:

   https://godbolt.org/g/8zhrZ1

John