LLVM & Large memory 64-bit systems

Hello Chris,

many thanks for your quick fix.

No problem.

I'm currently trying to compile llvm under AMD64 - after applying the trivial
fix for Burg/zalloc.c attached below I do get pretty far - compilation stops
somewhere in gcc when compiling libgcc, probably related to a varargs problem.
I still have to further investigate.

Thanks, applied:
http://mail.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20041213/022366.html

More problematic is the use of unsigned instead of size_t in many llvm places
- how does the following file big_array.c compile on your current 64-bit
targets:

Hrm, we definitely SUPPORT 64-bit targets, but I KNOW there are still some places where we probably do the wrong thing for huge sizes like this.

/* big_array.c */

This is not a good example, because it's a front-end test, but there ARE
known problems. For example the "malloc" and "alloca" instructions can only take 'uint' size parameters: they should obviously be generalized to take uint or ulong parameters so you can say:

   %X = malloc int, ulong <something big>

Other instructions are already fine: obviously we handle different sized pointers without a problem, and the getelementptr instruction takes 32- or 64-bit indices without a problem. Actually, at inception, getelementptr ONLY took 64-bit indices. :)

If you're interested in working on this, I would love to get this fixed. After that, there may be places we incorrectly use unsigned instead of a "target size_t". Because we build in cross compiler situations, using size_t isn't safe either: we should really be using the constant folding interfaces where it makes sense. I suspect that the number of problem areas really is small, but again, they should definitely be fixed.
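To illustrate the cross-compiler point: the host's size_t says nothing about the target's, so one option is to carry sizes in a fixed 64-bit type and check them against the target's pointer width explicitly. The following is only a minimal sketch under that assumption; `fits_target_size_t` and its `target_bits` parameter are hypothetical names, not part of LLVM.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper (not an LLVM API): reports whether 'value' is
 * representable in a target size_t that is 'target_bits' wide.
 * The value is held in uint64_t so the answer does not depend on the
 * width of size_t on the host doing the compilation. */
static bool fits_target_size_t(uint64_t value, unsigned target_bits)
{
    if (target_bits >= 64)
        return true;                      /* every uint64_t value fits */
    return (value >> target_bits) == 0;   /* no bits above the target width */
}
```

With this, a size of 2**33 would be accepted for a 64-bit target and rejected for a 32-bit one, whatever the host is.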

p.s. which mailing list would be a better place to discuss such issues?

llvmdev is the right list for this, I've cc'd it.

-Chris

#include <stddef.h>
#include <stdlib.h>

#define COMPILE_TIME_ASSERT(e) {typedef int __cta_fail[1-2*!(e)];}

/* B is a valid size_t constant on 64-bit machines */
#define B ((size_t)1 << 33)

struct Foo {
    char big[B];
    int dummy;
};

extern struct Foo foo;

char* p(size_t n)
{
    return &foo.big[n];
}

/* info: gcc-3.4 -O3 on AMD64 translates this into a "return 1;" */
int y(void)
{
    ptrdiff_t d1;
    ptrdiff_t d2;
    d1 = offsetof(struct Foo, dummy);
    d2 = p(B - 1) - p(0);
    COMPILE_TIME_ASSERT(B > 0xffffffff)
    COMPILE_TIME_ASSERT(d1 > 0xffffffff)
    COMPILE_TIME_ASSERT(d2 > 0xffffffff)
    COMPILE_TIME_ASSERT(d1 >= d2)
    return d1 >= d2;
}

Regards,
Markus

-Chris

More problematic is the use of unsigned instead of size_t in many llvm
places - how does the following file big_array.c compile on your
current 64-bit targets:

Hrm, we definitely SUPPORT 64-bit targets, but I KNOW there are still some
places where we probably do the wrong thing for huge sizes like this.

/* big_array.c */

This is not a good example, because it's a front-end test, but there ARE
known problems. For example the "malloc" and "alloca" instructions can
only take 'uint' size parameters: they should obviously be generalized to
take uint or ulong parameters so you can say:

Did you actually try the big_array.c example - it typedefs both an array and a
struct > 2**32 bytes, and I expected it to overflow e.g.
unsigned ArrayType::getNumElements() in include/llvm/DerivedTypes.h.

Yes, it does, but for this particular testcase it does not cause a problem. Compiled on SparcV9, I get this:

target endian = big
target pointersize = 64
target triple = "sparcv9-sun-solaris2.8"
deplibs = [ "c", "crtend" ]
%struct.Foo = type { [0 x sbyte], int }
%foo = external global %struct.Foo ; <%struct.Foo*> [#uses=3]

implementation ; Functions:

sbyte* %p(ulong %n) {
entry:
         %tmp.3 = getelementptr %struct.Foo* %foo, long 0, uint 0, ulong %n
         ret sbyte* %tmp.3
}

int %y() {
entry:
         ret int cast (bool setlt (long sub (long cast (sbyte* getelementptr (%struct.Foo* %foo, long 0, uint 0, ulong 8589934591) to long), long cast (%struct.Foo* %foo to long)), long 8589934593) to int)
}

... The type of "foo" is clearly wrong and going to be miscompiled, but 'y' is correctly compiled (even if it should be constant folded).
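The `[0 x sbyte]` is what one would expect if the array length 2**33 passes through a 32-bit field such as the `unsigned` returned by `ArrayType::getNumElements()`: only the low 32 bits survive, and those are all zero. A minimal C sketch of that truncation, assuming `unsigned` is 32 bits wide (modelled here with `uint32_t`); `truncate_count` is a hypothetical name, not code from LLVM:

```c
#include <stdint.h>

/* Hypothetical illustration: what remains of a 64-bit element count
 * once it is stored in a 32-bit unsigned field. Conversion to an
 * unsigned type is well defined in C and reduces the value
 * modulo 2**32. */
static uint32_t truncate_count(uint64_t num_elements)
{
    return (uint32_t)num_elements;
}
```

For the big_array.c case, `truncate_count((uint64_t)1 << 33)` yields 0, matching the zero-length array in the printed type.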

   %X = malloc int, ulong <something big>

Other instructions are already fine: obviously we handle different sized
pointers without a problem, and the getelementptr instruction takes 32 or
64-bit indices without problem. Actually, at inception, getelementptr
ONLY took 64-bit indices. :)

If you're interested in working on this, I would love to get this fixed.
After that, there may be places we incorrectly use unsigned instead of
a "target size_t".

Yes, gcc calls the greater of size_t and target_size_t "HOST_WIDE_INT".

Sure.

Because we build in cross compiler situations, using
size_t isn't safe either: we should really be using the constant folding
interfaces where it makes sense. I suspect that the number of problem
areas really is small, but again, they should definitely be fixed.

Working cross compilation would be _really_ nice, but looking at the code
there seem to be a lot of "unsigned" variables each of which must be
categorized into unsigned vs size_t vs target_size_t. And int vs ssize_t vs
target_ssize_t might also be an issue...

For array size, for example, using uint64_t would always be fine (and would fix the problem above). Care to submit a patch to fix this problem? :)
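As a rough sketch of the uint64_t idea, assuming a simplified array descriptor rather than LLVM's actual `ArrayType` class: the element count is stored as `uint64_t`, and the byte size is computed with an explicit overflow check. The names `array_desc` and `array_byte_size` are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified array type descriptor (not LLVM's ArrayType). */
struct array_desc {
    uint64_t num_elements;   /* 64 bits on every host, unlike unsigned */
    uint64_t element_size;   /* size of one element in bytes */
};

/* Stores the total size in bytes in *out and returns true, or returns
 * false without touching *out if the product overflows 64 bits. */
static bool array_byte_size(const struct array_desc *a, uint64_t *out)
{
    if (a->element_size != 0 &&
        a->num_elements > UINT64_MAX / a->element_size)
        return false;
    *out = a->num_elements * a->element_size;
    return true;
}
```

For the `char big[B]` member of big_array.c, a descriptor with `num_elements = 2**33` and `element_size = 1` gives a byte size of 8589934592 instead of wrapping to 0.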

-Chris