libclang - inconsistent signs for enumeration constants

Given the source

enum ABC {
    V1 = 1,
    V2 = 0x90000100
};
enum DEF {
    V3 = 1,
    V4 = -100
};
enum GHI {
    V5 = 1 << 28,
    V6 = 12 << 28
};

The libclang API returns inconsistent signs for the different
enumeration constants. On a 32-bit machine, the EnumConstantDecls for
the various values have the following types:

V1 - Int - as expected
V2 - UInt - seems reasonable, though it's negative if viewed as a 32-bit int
V3 - Int - as expected
V4 - Int - as expected
V5 - Int - seems reasonable
V6 - Int - not consistent with the interpretation of V2.

Both V2 and V6 have bit 31 set, yet one is interpreted as unsigned and the
other as signed.

Is this a known issue? I just thought I'd mention it here before opening an
issue on the tracker (I didn't find any existing issue covering this).

It seems reasonable to interpret any value given as a literal negative
number (or any value that fits in a signed int) as signed, but to
interpret as unsigned any value written in hex or built with shift and
bitwise operators such as &, | and ^.

Thanks,

Vinay Sajip

As far as I can tell, the type checking for the given enums is working
as expected (i.e. in a gcc-compatible manner) for your example.

-Eli

This is dictated by C.

The initializer in V2 is unsigned because the integer literal 0x90000100 falls outside of the range of a signed int (C99 6.4.4.1p5).
The initializer in V6 is signed because 12 does not fall outside of the range of a signed int (6.4.4.1p5 again), and left-shifting a signed int doesn't change its type (6.5.7p3). Also, technically, left-shifting into the sign bit like this is undefined behavior (6.5.7p4).

John.

John McCall <rjmccall@...> writes:

This is dictated by C.

The initializer in V2 is unsigned because the integer literal 0x90000100 falls outside of the range of a signed int (C99 6.4.4.1p5). The initializer in V6 is signed because 12 does not fall outside of the range of a signed int (6.4.4.1p5 again), and left-shifting a signed int doesn't change its type (6.5.7p3). Also, technically, left-shifting into the sign bit like this is undefined behavior (6.5.7p4).

Thanks for the explanation. It's my bad luck that those examples are from real
production code:

The 0x90000100 is an encoding enumeration value (NSUTF16BigEndianStringEncoding)
from the OS X Foundation framework, and the 12 << 28 value is a font class
enumeration (NSFontSymbolicClass) from the OS X AppKit framework :-(

Regards,

Vinay Sajip