MS 128-bit literals don't always have the correct type

Hello,

Currently Clang chooses the type of MS 128-bit literals (<digits>i128,
<digits>ui128) based on their value, as if there was no suffix, but
also allows an extended 128-bit type.

For example, on x86_64:

1i128 is equivalent to 1,
0x100000000i128 is the same as 0x100000000L,
and finally 0x10000000000000000i128 is indeed a 128-bit literal.

I don't know if it is intended, but i128 is definitely treated
differently from the other MS literal suffixes we accept (for example,
i64 is essentially an alias for LL).
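To make the value-based typing concrete, here is a minimal sketch using only standard, unsuffixed literals (the i128 suffix itself is not portable, so it is not used). The helper name bits_of is hypothetical and not part of Clang or the thread; it only reports the width of whatever type the compiler picked for a literal.

```cpp
#include <climits>
#include <cstddef>

// Hypothetical helper (not from the thread): width in bits of the type
// the compiler deduced for an expression.
template <typename T>
constexpr std::size_t bits_of(T) {
  return sizeof(T) * CHAR_BIT;
}

// An unsuffixed literal gets the first type in the standard's list that
// can represent its value, so a small value stays int.
static_assert(bits_of(1) == sizeof(int) * CHAR_BIT, "1 has type int");

// This value does not fit in 32 bits, so the literal gets a wider type
// (long on typical x86_64 Linux targets).
static_assert(bits_of(0x100000000) > 32, "needs more than 32 bits");
```

The behavior described above amounts to the i128 suffix going through this same selection, rather than forcing a 128-bit type.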

I don't have Visual Studio, so I cannot check how it handles these literals.

@João: CCing you since it seems like you are interested in MS extensions.

Dmitri

VC++ does not support 128-bit literals.

I'm confused. So why does Clang implement them?

Dmitri

There's an __int128 supplied by gcc (see "__int128" in Using the GNU
Compiler Collection (GCC)), but there's not one for Visual Studio (at
least not one officially documented).

~Aaron

That is a 128-bit integer type. I was asking about 128-bit literals,
like 1i128 (which gcc does not support).

Dmitri

I’ve never heard of this extension before. I tried to compile a simple example:

int main()
{
  auto i = 1i128;
  return 0;
}

cl -nologo test.cpp
test.cpp
test.cpp(3) : error C2059: syntax error : 'bad suffix on number'
test.cpp(3) : error C2146: syntax error : missing ';' before identifier 'i128'
test.cpp(3) : error C2065: 'i128' : undeclared identifier

I also couldn’t find any mention of these literals in MSDN.

When I did the work to add support for literals wider than 64b, I simply took the existing implementation in Sema indicating that they were a MS extension at face value (I don't myself have an MS system available to test on). Regardless of whether or not MS supports them, it would be quite nice if we can continue to support them (and improve on the current state) -- I have a few personal projects that depend on them, which is why I originally implemented this.

Perhaps support shouldn't be toggled via -fms-extensions, if VC++ doesn't actually support them?

The issue with "short 128b literals" not having the correct type is something that I noticed a couple weeks ago and have been meaning to take a look at, as it is causing bugs for me.

- Steve

Following up…

What seems to be the “MS extension” is the UiN and iN style suffixes for N-bit integers (which is nice, as making up progressively longer suffixes “ULLL…” is clearly untenable in the long term). Unless someone can suggest a better name for __uint128_t and __int128_t literals, it seems like as good an option as any.

Looking at the GCC docs, it seems that the only support for 128b literals in GCC is if you happen to be on a platform where “long long” is 128b.

- Steve

When I did the work to add support for literals wider than 64b, I simply took the existing implementation in Sema indicating that they were a MS extension at face value (I don't myself have an MS system available to test on). Regardless of whether or not MS supports them, it would be quite nice if we can continue to support them (and improve on the current state) -- I have a few personal projects that depend on them, which is why I originally implemented this.

No objections to the feature itself -- it is definitely useful to have
128-bit literals if one has a 128-bit type. What might be confusing
is the choice of the suffix, though -- it is more like i64 suffix
which is an MS extension than standard L, LL suffixes.

Perhaps support shouldn't be toggled via -fms-extensions, if VC++ doesn't actually support them?

Yes, currently it is misleading and just bad to turn on
-fms-extensions if one wants 128-bit literals. Since this is not an
MS extension, code using this feature does not need other MS
extensions.

The issue with "short 128b literals" not having the correct type is something that I noticed a couple weeks ago and have been meaning to take a look at, as it is causing bugs for me.

Looking forward to it!

Dmitri

Should support be enabled by default as a clang extension? Or is there a more appropriate flag to use to control support? Do we need a new flag just for this? I have no experience with this corner of policy, and defer to the experts (and/or people who care – I don’t care what flag I need to pass, I just want to have support).

- Steve

I would say yes, but since a new kind of integer literal is such a big
thing, we definitely need a comment from our honorable C++ language
lawyers who are working on Clang.

Dmitri

There are three interactions with language law that seem relevant here:

  1. C99 and C++11 have a notion of “extended integer types”, which allows for an integer literal (with or without a suffix) to be given a type larger than any standard integer type, if it would be too large to fit in any relevant standard integer type. We are not required to give __int128 this treatment, but are permitted to do so. If we treated __int128 as an extended integer type, we wouldn’t need a suffix for this, except for cases where the programmer explicitly wants to get an __int128 result for a number that would fit in 64 bits, and the i128 suffix does not provide us with that behavior.

  2. We support a GNU extension where a decimal integer literal which doesn’t fit in the largest standard signed type is given the largest standard unsigned type instead, and that directly conflicts with the newer language standards. This is not a conforming extension in C++11, where an implementation must either give the constant a signed type large enough to hold it, or produce a diagnostic (we get by here on a technicality, since we produce a warning for this case, but we currently fail to reject it in -pedantic-errors mode).

  3. The new i128 suffix conflicts with C++11’s user-defined literals. However, that suffix (and all others not starting with an underscore) is reserved (“for future standardization”), so using this suffix for our own purposes is defying the intent of the committee, but still conforming.

My recommendation is: we drop the GNU extension, implement the C99 / C++11 extended integer type rules instead, and drop the i128 suffix.

It seems unlikely that anyone was relying on the suffix, since it is undocumented, only works in -fms-extensions mode but isn’t an MS extension, and relies on the GNU __int128 type but isn’t implemented in GCC… and in any case the new rules would make it redundant.

Dropping the GNU extension could possibly impact the meaning of some (C++) code which cares about the type of an overlarge integer literal, so we should produce a diagnostic under -Wgcc-compat in the cases whose meanings change. These are the cases where we currently produce an “integer constant is so large that it is unsigned” warning (by default, with no flag to disable it!).
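As a rough illustration of the standard rule that the GNU extension departs from, the sketch below checks that an unsuffixed decimal literal only ever takes a signed type, while a hexadecimal literal may take an unsigned one. The helper name has_signed_type is hypothetical, and the second assertion assumes a 32-bit int.

```cpp
#include <type_traits>

// Hypothetical helper (not from the thread): reports whether the type
// deduced for a literal is signed.
template <typename T>
constexpr bool has_signed_type(T) {
  return std::is_signed<T>::value;
}

// A decimal literal without a suffix may only take a signed type, so a
// value too large for int moves to long or long long.
static_assert(has_signed_type(4294967295), "decimal literals stay signed");

// A hexadecimal literal may also take an unsigned type. Assuming a
// 32-bit int, 0xFFFFFFFF does not fit in int and becomes unsigned int.
static_assert(!has_signed_type(0xFFFFFFFF), "hex literal became unsigned");
```

Code that overloads on signedness is the kind that could change meaning if an overlarge decimal literal stopped being given an unsigned type.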

– Richard

Incidentally, we currently treat (for instance) 1000000000000000000i32 as a 64-bit integer. How does MSVC behave here? (Does i32 mean “this shall be an int32_t”, or does it mean “this shall be at least an int32_t”?)

If we choose to keep the i128 suffix, it should match i32 and i64, and always give an __int128 (since we have no larger integer types).

As the only current user(?) of 128b literals, I like this approach. I’m a little worried about the interaction with what you say about C++11, however: “an implementation must either give the constant a signed type large enough to hold it, or produce a diagnostic”; how does the user specify a 128b unsigned literal that is too large for 128b signed?
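One possible workaround that needs no literal suffix at all, sketched here as an assumption and not something proposed in the thread, is to assemble the value from two 64-bit halves. The names make_u128 and kMaxU128 are hypothetical; the code relies on the unsigned __int128 extension that GCC and Clang provide on 64-bit targets.

```cpp
#include <cstdint>

// Hypothetical helper (not from the thread): assembles a 128-bit
// unsigned value from two 64-bit halves. Relies on the unsigned
// __int128 extension available in GCC and Clang on 64-bit targets.
constexpr unsigned __int128 make_u128(std::uint64_t high, std::uint64_t low) {
  return (static_cast<unsigned __int128>(high) << 64) | low;
}

// The largest 128-bit unsigned value, which no signed 128-bit type holds.
constexpr unsigned __int128 kMaxU128 =
    make_u128(~std::uint64_t{0}, ~std::uint64_t{0});

static_assert(kMaxU128 + 1 == 0, "wraps around at 2^128");
static_assert(static_cast<std::uint64_t>(kMaxU128 >> 64) == ~std::uint64_t{0},
              "high half is preserved");
```

This sidesteps the question of literal typing entirely, at the cost of readability for large constants.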

- Steve

It seems to mean that it shall be an int32_t. In both 32- and 64-bit compiles.

// 32-bit
; 2 : auto i = 1000000000000000000i32;
  mov DWORD PTR _i$[ebp], -1486618624 ; a7640000H

// 64-bit
; 2 : auto i = 1000000000000000000i32;
  mov DWORD PTR i$[rsp], -1486618624 ; ffffffffa7640000H

However, I can't find documentation on MSDN to suggest that this is
intended behavior or not. The only integer type suffix documentation
they have lists i64, ll and LL (plus the u variants), but not i32.
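The constant in the listings above is what you get by keeping only the low 32 bits of 10^18. The sketch below reproduces that arithmetic in standard C++; the helper name truncate_to_i32 is hypothetical, and it assumes the usual two's-complement conversion.

```cpp
#include <cstdint>

// Hypothetical helper (not from the thread): keeps only the low 32 bits
// of a value and reinterprets them as a signed 32-bit integer.
// Assumes the usual two's-complement conversion; before C++20 the
// result of the final cast is implementation-defined.
constexpr std::int32_t truncate_to_i32(std::int64_t value) {
  return static_cast<std::int32_t>(static_cast<std::uint32_t>(value));
}

// 10^18 is 0x0DE0B6B3A7640000; its low 32 bits are 0xA7640000.
static_assert(truncate_to_i32(1000000000000000000LL) == -1486618624,
              "matches the constant in the MSVC listing");
```

This is consistent with i32 meaning "this shall be an int32_t", with the excess high bits silently discarded.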

~Aaron

It would be more precise to ask the compiler itself, like:

#include <type_traits>
#include <iostream>
#include <cstdint>

template<typename T>
void f(const T &t) {
  std::cout << std::is_same<int, T>::value << " "
            << std::is_same<long, T>::value << " "
            << std::is_same<long long, T>::value << " "
            << std::is_same<int32_t, T>::value << " "
            << std::is_same<int64_t, T>::value << std::endl;
  // also add 128-bit type here
}

int main() {
  f(1i32);
  f(0x100000000i32);
  f(0x10000000000000000i32);
}

Dmitri

1) VS does not accept 0x10000000000000000i32 as it claims the constant
size is too big. It will only accept it as a decimal value.

2) All three tests return results consistent with the assembly posted:
1 0 0 1 0 (on both 32- and 64-bit builds)

~Aaron

(+cfe-dev)

Hi Aaron,

I tried with VS 2012, but am happy to try 2010 or 2008 if it's worth it.

~Aaron