llvm::PointerIntPair -- is this by design or a bug?

llvm::PointerIntPair<double*, 3, signed> P;

P.setInt(-4);

Ideally, the value range for a 3-bit signed integer should be [-4,3], but the above call to setInt will fail. Essentially, the signed int field in PointerIntPair behaves the same as a 3-bit unsigned field, which has the legal value range [0,7]. Is this by design? Are negative values not allowed in PointerIntPair?

/Riyaz

Hi,

That doesn’t sound right. On any computer made in the last few decades, the representation of -3 will be 1111…1111101. Storing the low bits will yield 101, which is a 3-bit negative three. When you then sign extend this to any other signed type, you will get -3 in that representation. It sounds as if the signed specialisation of PointerIntPair is simply not doing the sign extension.

David
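To make the arithmetic above concrete, here is a minimal sketch of keeping only the low bits of a value and sign-extending them back. The helper name signExtendLowBits is hypothetical and is not part of PointerIntPair; it only illustrates the two's-complement reasoning.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not an LLVM API): keep the low Bits bits of V,
// then sign-extend that Bits-wide field back to intptr_t.
template <unsigned Bits>
std::intptr_t signExtendLowBits(std::intptr_t V) {
  const std::uintptr_t Mask = (std::uintptr_t(1) << Bits) - 1;
  const std::uintptr_t SignBit = std::uintptr_t(1) << (Bits - 1);
  // Truncate to the field, as storing into a Bits-wide slot would.
  const std::uintptr_t Low = static_cast<std::uintptr_t>(V) & Mask;
  // Flipping the sign bit and subtracting it maps [0, 2^Bits) onto
  // [-2^(Bits-1), 2^(Bits-1)).
  return static_cast<std::intptr_t>(Low ^ SignBit) -
         static_cast<std::intptr_t>(SignBit);
}
```

With Bits = 3, the value -3 is stored as 101 and reads back as -3, while 7 is stored as 111 and reads back as -1.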

Yep, I meant it looks like it currently does not do a sign extension; it expects only the available bits to be set, and no others. In any case, it is probably worth documenting the behaviour.

Cheers,
Florian

It won’t move the sign bit, so negative values won’t fit, unless you have a 3 bit signed type :wink:

Note that if you assign negative values to and then read from a signed bit-field, you would do sign extension. So 3-bit signed types do exist in C++.
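A small sketch of the bit-field behaviour described above, for comparison. The struct Packed and the function roundTripThroughBitField are made-up names for illustration only.

```cpp
#include <cassert>

// Illustrative only: a 3-bit field declared explicitly as signed,
// so its value range is [-4, 3].
struct Packed {
  signed int Small : 3;
};

// Store V in the 3-bit field and read it back. Reading a signed bit-field
// sign-extends it to int. For V outside [-4, 3] the stored value is
// implementation-defined before C++20, so only in-range inputs are
// portable here.
int roundTripThroughBitField(int V) {
  Packed P{};
  P.Small = V;
  return P.Small;
}
```

In-range negative values such as -4 and -1 survive the round trip unchanged.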

It begs the question why PointerIntPair supports signed int types if it always loses the sign. Is it just to avoid signed/unsigned comparison when comparing the return value of getInt with signed types? Or to use enums that default to a signed type? In any case, this should be clearly documented if there is no intention to fix it.

/ Riyaz

I’d suggest someone try fixing this & see if it breaks anything that can’t reasonably be fixed (before we go assuming this is by design/shouldn’t be fixed just because it’s the way it is today).

Rather than “fixing” it, it might be better to support a separate method for sign extension. My reasoning is as follows:

int x = 7;

llvm::PointerIntPair<double*, 3, int> pip;

pip.setInt(x);

There could be code out there that expects pip.getInt() to return 7 and not -1.

So if you really want to set and read back a negative value, separate methods setSignedInt and getSignedInt may be OK. Further, sign extension would need two shift instructions on x86, whereas without sign extension a single ‘and’ with the mask is enough to retrieve the int.
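The cost difference mentioned above can be sketched as follows. This assumes, for simplicity, that the 3-bit field sits in the lowest bits of the word, which is not necessarily how PointerIntPair lays it out; readUnsigned and readSigned are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// Assumption for this sketch: the int field occupies the lowest 3 bits.
constexpr unsigned IntBits = 3;

// Unsigned read: a single AND with the field mask.
inline std::intptr_t readUnsigned(std::intptr_t Word) {
  return Word & ((std::intptr_t(1) << IntBits) - 1);
}

// Signed read: shift left so the field's top bit becomes the word's sign
// bit, then shift right to bring it back with sign extension. The signed
// right shift is arithmetic on mainstream compilers and guaranteed to be
// so from C++20 onwards.
inline std::intptr_t readSigned(std::intptr_t Word) {
  constexpr unsigned WordBits = sizeof(std::intptr_t) * 8;
  const std::uintptr_t Up = static_cast<std::uintptr_t>(Word)
                            << (WordBits - IntBits);
  return static_cast<std::intptr_t>(Up) >> (WordBits - IntBits);
}
```

For a word whose low bits are 111, readUnsigned yields 7 and readSigned yields -1, which is exactly the behaviour change under discussion.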

The more likely scenario in existing code would perhaps be with enums:

enum Code { Value = 7 };

Code x = Value;

llvm::PointerIntPair<double*, 3, Code> pip;

pip.setInt(x);

assert(pip.getInt() == Value);

/Riyaz

I think it’d be reasonable to model this on the same behavior as int to short to int round-tripping & not to speculate that there might be code relying on the existing behavior until there’s evidence of it.

I’d suggest changing the behavior & testing to see if anything breaks - and if nothing does, moving to the new behavior rather than supporting both.

I’d argue that bitfield sign extensions are surprising and are usually a source of bugs. It would be much more explicit and less error prone for the user to write the sign extension if they want it.

By extension, it seems good that PointerIntPair doesn’t do sign extension when the type happens to be signed.

The sign extension is correct. Otherwise setInt(-1) won’t work. If you don’t want sign extension, then use ‘unsigned’ and not ‘int’ in the template arguments.

I do agree that sign-extension is the right thing to do. Unlike bit-fields, llvm::PointerIntPair has asserts checking that the int value is within range. So if you assign an out of range value, it should fail with an assertion:

llvm::PointerIntPair<SomeType*, 3, int> pip;

pip.setInt(7); // can be made to fail, as the valid range of signed 3-bit values is [-4:3]

The above code does not currently fail. Instead, the assertion fails for pip.setInt arguments with values in [-4:-1], which is unexpected and is the reason I started this email thread.

/Riyaz
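The two range checks being contrasted in this post can be written out as predicates. This is only a sketch with hypothetical names (fitsSignedField, fitsMaskOnly); the second one mirrors a check of the form (IntWord & ~IntMask) == 0.

```cpp
#include <cassert>

// Hypothetical predicate: does V fit a signed field of Bits bits,
// i.e. is it in [-2^(Bits-1), 2^(Bits-1) - 1]?
template <unsigned Bits>
constexpr bool fitsSignedField(long long V) {
  return V >= -(1LL << (Bits - 1)) && V <= (1LL << (Bits - 1)) - 1;
}

// Hypothetical predicate: are all bits outside the low Bits bits clear?
// This accepts [0, 2^Bits - 1] and rejects every negative value.
template <unsigned Bits>
constexpr bool fitsMaskOnly(long long V) {
  return (V & ~((1LL << Bits) - 1)) == 0;
}
```

For 3 bits, the mask-only check accepts 7 and rejects -4, while the signed check does the opposite.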

Ah, my favorite way to check for bitfield errors like this is to just assert that the bits that come out of the bitfield match the bits that went into the bitfield.

This should fix it:

diff --git i/include/llvm/ADT/PointerIntPair.h w/include/llvm/ADT/PointerIntPair.h
index 884d05155bf..f64155e4ee6 100644
--- i/include/llvm/ADT/PointerIntPair.h
+++ w/include/llvm/ADT/PointerIntPair.h
@@ -63,6 +63,7 @@ public:

   void setInt(IntType IntVal) {
     Value = Info::updateInt(Value, static_cast<intptr_t>(IntVal));
+    assert(IntVal == getInt() && "Integer out of range for field");
   }

   void initWithPointer(PointerTy PtrVal) {
@@ -72,6 +73,7 @@ public:
   void setPointerAndInt(PointerTy PtrVal, IntType IntVal) {
     Value = Info::updateInt(Info::updatePointer(0, PtrVal),
                             static_cast<intptr_t>(IntVal));
+    assert(IntVal == getInt() && "Integer out of range for field");
   }

   PointerTy const *getAddrOfPointer() const {
@@ -167,8 +169,7 @@ struct PointerIntPairInfo {

   static intptr_t updateInt(intptr_t OrigValue, intptr_t Int) {
     intptr_t IntWord = static_cast<intptr_t>(Int);
-    assert((IntWord & ~IntMask) == 0 && "Integer too large for field");