Unexpected behaviour for constant expression initializer

Hi All,

The following reduced C testcase:

$ cat initializer.c

unsigned long ok = 8509568; // 0x81D880

unsigned long bogus = 0x800000 + 2*3780*16; // 0x81D880

when compiled with clang on my host (x86_64), gives:

$ clang -O0 -S -emit-llvm -o initializer.ll initializer.c

$ cat initializer.ll

@ok = global i64 8509568, align 8 ; <i64*> [#uses=0]

@bogus = global i64 8509568, align 8 ; <i64*> [#uses=0]

but when compiled for the MSP430, it gives:

$ clang -ccc-host-triple msp430-unknown-unknown -O0 -S -emit-llvm -o initializer.ll initializer.c

$ cat initializer.ll

@ok = global i32 8509568, align 2 ; <i32*> [#uses=0]

@bogus = global i32 8378496, align 2 ; <i32*> [#uses=0]

llvm-2.7, llvm-2.8 and trunk all give the same results.

unsigned long is at least 32 bits wide on both platforms, and the final value, as well as every intermediate value, fits in 32 bits. The problem seems to be that the 2*3780*16 expression is evaluated using 16-bit arithmetic, and therefore overflows.
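The arithmetic behind that guess can be checked on any host. The following is only a sketch: the helper names wrap_to_int16 and bogus_with_16bit_int are made up for illustration, and the 16-bit wrap is simulated by hand, since signed overflow is undefined in C and wrapping is merely what the MSP430 output above is consistent with.

```c
#include <stdint.h>

/* Hypothetical helper: reduce a value modulo 2^16 and reinterpret the
   result as a signed 16-bit quantity, i.e. what a wrapping 16-bit int
   would end up holding. */
long wrap_to_int16(long v)
{
    uint16_t low = (uint16_t)((unsigned long)v & 0xFFFFUL);
    return (low >= 0x8000u) ? (long)low - 0x10000L : (long)low;
}

/* The initializer as it appears to be evaluated when int is 16 bits wide:
   the product is computed in int first, then widened and added. */
unsigned long bogus_with_16bit_int(void)
{
    long product = wrap_to_int16(2L * 3780L * 16L); /* 120960 wraps to -10112 */
    return (unsigned long)(0x800000L + product);    /* 8388608 - 10112 */
}
```

bogus_with_16bit_int() returns 8378496, which matches the value emitted for @bogus on the MSP430.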

I am not a language lawyer, but this behaviour looks surprising at best to me.

Should I file this as a bug?

Best regards,

I'm pretty sure the small integers in those initializers are of type
int until they're "assigned" to the unsigned long global and/or used
in arithmetic with an integer constant with a larger[1] type. So if
'int' is something like i16 (which I'm guessing is the case for
MSP430?), the intermediate value 2*3780*16 will overflow before being
added to 0x800000 (which doesn't fit in i16, so it's probably
automatically a long int).
Try appending 'L' (or 'UL') to the 2, 3780 and/or 16.

[1] (Where long ranks above int, even if both are 32 bits wide)
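As a sketch of that suggestion (the variable names below are made up for illustration), any of these forms should make the multiplication happen in a type at least 32 bits wide. Multiplication groups left to right, so widening the leftmost operand is enough to widen the whole product.

```c
/* Suffix on the leftmost operand: (2UL * 3780) * 16 is all unsigned long. */
unsigned long fixed_ul = 0x800000 + 2UL * 3780 * 16;

/* A plain L suffix also works: long is at least 32 bits wide. */
unsigned long fixed_l = 0x800000 + 2L * 3780 * 16;

/* A cast on the leftmost operand has the same effect as the suffix. */
unsigned long fixed_cast = 0x800000 + (unsigned long)2 * 3780 * 16;
```

All three initializers evaluate to 8509568 (0x81D880) regardless of the width of int, because no intermediate result is ever held in an int.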

Yes, int is i16 on the MSP430.

I was not aware that the expression is evaluated in the target's "int" type unless specified otherwise (using L/UL/...).

Thanks for your help,