Patch for arbitrary fixed-width integer types

Patch attached. The reason I'm sending this to cfe-dev rather than
just committing it is that it's a substantial new chunk of code, and I
want to make sure it's something we really want; the potential users
are rather limited.

The attached patch adds one user for the new fixed-width types: the
implementation of the mode attribute. That said, this is probably
overkill for that particular use, and it might confuse code that uses
function overloading.

Another possible use is that there was a request on the list a while
ago for a way to declare integer types of arbitrary width; this patch
would make implementing that relatively easy.

Also, we could use this for gcc compatibility for testcases like the
following, although I'm not sure it's worthwhile:
struct {unsigned long long x : 33;} x;
unsigned long long a(void) {return x.x+1;}

-Eli

x.txt (11.9 KB)

Patch attached. The reason I'm sending this to cfe-dev rather than
just committing it is that it's a substantial new chunk of code, and I
want to make sure it's something we really want; the potential users
are rather limited.

Nifty.

In lib/Sema/SemaDeclAttr.cpp, it seems that you could simplify the code by doing something like this before the switch:

if (IntegerMode) {
   NewTy = S.Context.getFixedWidthIntType(size, OldTy- >isSignedIntegerType());
} else {
   switch ()...

The attached patch adds one user for the new fixed-width types: the
implementation of the mode attribute. That said, this is probably
overkill for that particular use, and it might confuse code that uses
function overloading.

This sounds like a good use for it. I'm not too concerned with C++ overloading for attribute(mode). I have never seen attribute mode used with C++ code.

Another possible use is that there was a request on the list a while
ago for a way to declare integer types of arbitrary width; this patch
would make implementing that relatively easy.

Yes, it would. attribute(bitwidth(42)) would be nice to support. This would also give you something to pretty print to with:

+void FixedWidthIntType::getAsStringInternal(std::string &S) const {
+ if (S.empty()) {
+ S = llvm::utostr_32(Width);
+ } else {
+ // Prefix the basic type, e.g. 'int X'.
+ S = ' ' + S;
+ S = llvm::utostr_32(Width) + S;
+ }
+}

Also, we could use this for gcc compatibility for testcases like the
following, although I'm not sure it's worthwhile:
struct {unsigned long long x : 33;} x;
unsigned long long a(void) {return x.x+1;}

This would also be very nice. We really should support this case at the very least, because this is a silent miscompilation otherwise. How does "x" act w.r.t to overloading in this case with gcc?

+++ lib/AST/ASTContext.cpp (working copy)
@@ -373,6 +373,13 @@
        break;
      }
      break;
+ case Type::FixedWidthInt:
+ // FIXME: This isn't precisely correct; the width/alignment should depend
+ // on the available types for the target
+ Width = cast<FixedWidthIntType>(T)->getWidth();
+ Width = std::max(llvm::NextPowerOf2(Width - 1), 8ULL);
+ Align = Width;
+ break;

This should probably follow the same logic that TargetData has on the LLVM IR side of things.

+QualType ASTContext::getFixedWidthIntType(unsigned Width, bool Signed) {
+ if (Signed) {
+ if (!SignedFixedWidthIntTypes.count(Width)) {
+ SignedFixedWidthIntTypes[Width] = new FixedWidthIntType(Width, true);
+ }
+ return QualType(SignedFixedWidthIntTypes[Width], 0);
+ }
+ if (!UnsignedFixedWidthIntTypes.count(Width)) {
+ UnsignedFixedWidthIntTypes[Width] = new FixedWidthIntType(Width, false);
+ }
+ return QualType(UnsignedFixedWidthIntTypes[Width], 0);
+}

Instead of doing two/three lookups (one for count one for return and one for the insert), please just do one:

   llvm::DenseMap<unsigned, FixedWidthIntType*> &Map = Signed ? SignedFixedWidthIntTypes : UnsignedFixedWidthIntTypes;

   FixedWidthIntType *&Entry = Map[Width];
   if (Entry) return Entry;
   return Entry = new ...
}

I'm not sure how this fits in with ranking and C++ overload, but this is very cool.

-Chris

I suggest that we just call all of these conversions "Integral conversions" so that they get conversion rank. It won't let anyone do fancy overloading with them, but they'll behave sanely in C++.

  - Doug

The attached patch adds one user for the new fixed-width types: the
implementation of the mode attribute. That said, this is probably
overkill for that particular use, and it might confuse code that uses
function overloading.

This sounds like a good use for it. I'm not too concerned with C++
overloading for attribute(mode). I have never seen attribute mode used with
C++ code.

Ah, I remembered the reason why using the fixed-width types for this
is bad while trying some compiling: the glibc sys/types.h redefines
int8_t and friends using the mode attribute, and that has to be
consistent with the way stdint.h defines it; therefore, we have to
follow the same logic as InitializePredefinedMacros.

Another possible use is that there was a request on the list a while
ago for a way to declare integer types of arbitrary width; this patch
would make implementing that relatively easy.

Yes, it would. attribute(bitwidth(42)) would be nice to support. This
would also give you something to pretty print to with:

Is bitwidth a good name? Is there any precedent here?

Also, we could use this for gcc compatibility for testcases like the
following, although I'm not sure it's worthwhile:
struct {unsigned long long x : 33;} x;
unsigned long long a(void) {return x.x+1;}

This would also be very nice. We really should support this case at the
very least, because this is a silent miscompilation otherwise. How does "x"
act w.r.t to overloading in this case with gcc?

gcc compiles this differently in C and C++ modes. In C++ mode, x.x is
a 64-bit "unsigned long long"; in C mode, it's a 33-bit unsigned
integer. The behavior difference is rather evil :(.

+++ lib/AST/ASTContext.cpp (working copy)
@@ -373,6 +373,13 @@
      break;
    }
    break;
+ case Type::FixedWidthInt:
+ // FIXME: This isn't precisely correct; the width/alignment should
depend
+ // on the available types for the target
+ Width = cast<FixedWidthIntType>(T)->getWidth();
+ Width = std::max(llvm::NextPowerOf2(Width - 1), 8ULL);
+ Align = Width;
+ break;

This should probably follow the same logic that TargetData has on the LLVM
IR side of things.

I don't think the relevant information is exposed by the targets
outside of getTargetDescription(), and I don't think we want the AST
to depend on that. I'll file a PR when I commit this.

Instead of doing two/three lookups (one for count one for return and one for
the insert), please just do one:

Okay.

-Eli