Behaviour of APInt

I’m having trouble understanding how APInts should be used.

The APInt documentation states that it ‘is a functional replacement for common
case unsigned integer type’, but I’m not seeing this because the internal logic
is that the value is always treated as negative if the most significant bit is
set.

I’m interested in an add or sub that could be using a negative value. I have the
following snippet of code to demonstrate the issue:

APInt Positive(8, 128, false);
APInt Negative(8, -128, true);
LLVM_DEBUG(dbgs() << "Positive: " << Positive << "\n");
LLVM_DEBUG(dbgs() << "Negative: " << Negative << "\n");
LLVM_DEBUG(dbgs() << "0 + Positive = " << 0 + Positive << "\n");
LLVM_DEBUG(dbgs() << "0 - Positive = " << 0 - Positive << "\n");
LLVM_DEBUG(dbgs() << "0 + Negative = " << 0 + Negative << "\n");
LLVM_DEBUG(dbgs() << "0 - Negative = " << 0 - Negative << "\n");

The output is:

Positive: -128
Negative: -128
0 + Positive = -128
0 - Positive = -128
0 + Negative = -128
0 - Negative = -128

I know there are operators for when the sign matters, but from my example,
either my understanding or the functionality is broken. If an abstract
structure exists, why does the MSB still represent the sign? Especially
when it’s supposed to be an unsigned type!

Thanks,

Sam

> The APInt documentation states that it 'is a functional replacement for common
> case unsigned integer type', but I'm not seeing this because the internal logic
> is that the value is always treated as negative if the most significant bit is
> set.

I take that as saying it's a 2s-complement type rather than overflow
being UB, but the statement may still be misleading.

> I know there are operators for when the sign matters, but from my example,
> either my understanding or the functionality is broken.

It's definitely quirky that it's always printed as a signed integer.
My guess would be it stems from a very early decision about the
friendliest ways to print IR's iN types, which was probably its first
use-case (i.e. most people would prefer to see i64 -1 over i64
18446744073709551615). But I haven't done the archaeology to confirm
it.
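
To make the printing point concrete, here is a minimal sketch in plain C++ (deliberately not using LLVM, so it stands alone) of how one fixed-width bit pattern yields two different numbers depending on how it is read. The helper names truncateToWidth, readUnsigned and readSigned are made up for illustration. As far as I know, APInt offers the same two readings through getZExtValue() and getSExtValue(), and operator<< simply picks the signed one.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: keep only the low `width` bits (1 <= width <= 64).
std::uint64_t truncateToWidth(std::uint64_t bits, unsigned width) {
  assert(width >= 1 && width <= 64);
  return width == 64 ? bits : bits & ((std::uint64_t{1} << width) - 1);
}

// Read the pattern as an unsigned number (zero-extension).
std::uint64_t readUnsigned(std::uint64_t bits, unsigned width) {
  return truncateToWidth(bits, width);
}

// Read the same pattern as a two's-complement number (sign-extension).
std::int64_t readSigned(std::uint64_t bits, unsigned width) {
  std::uint64_t value = truncateToWidth(bits, width);
  std::uint64_t signBit = std::uint64_t{1} << (width - 1);
  // Flipping the sign bit and then subtracting it leaves values below
  // 2^(width-1) unchanged and maps the rest to value - 2^width.
  // The final cast assumes a two's-complement int64_t (guaranteed in C++20).
  return static_cast<std::int64_t>((value ^ signBit) - signBit);
}
```

With these helpers, the 8-bit pattern 0x80 reads as 128 unsigned and -128 signed, which matches the "Positive: -128" line in the original output.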

> If an abstract
> structure exists, why does the MSB still represent the sign? Especially
> when it's supposed to be an unsigned type!

I think it'd be more correct to say it's an arbitrary precision type
that could be either signed or unsigned (again, much like LLVM's iN). There's a
separate APSInt for a type that genuinely is either signed or unsigned
in all cases.
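
As a rough illustration of what carrying the signedness with the value buys you, here is a sketch in plain C++. TaggedInt and lessThan are hypothetical names invented for this example; this is not APSInt's actual interface, only the general idea of a bit pattern plus a signedness flag.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in: a bit pattern, its width, and how to read it.
struct TaggedInt {
  std::uint64_t bits;
  unsigned width;   // 1..64
  bool isUnsigned;  // the flag a bare bit pattern does not carry
};

// Order two values using the interpretation stored in the flag.
bool lessThan(const TaggedInt &a, const TaggedInt &b) {
  assert(a.width == b.width && a.isUnsigned == b.isUnsigned);
  assert(a.width >= 1 && a.width <= 64);
  std::uint64_t mask =
      a.width == 64 ? ~std::uint64_t{0} : (std::uint64_t{1} << a.width) - 1;
  std::uint64_t x = a.bits & mask, y = b.bits & mask;
  if (a.isUnsigned)
    return x < y;
  // Signed order: flipping the sign bit turns two's-complement order
  // into plain unsigned order.
  std::uint64_t signBit = std::uint64_t{1} << (a.width - 1);
  return (x ^ signBit) < (y ^ signBit);
}
```

The same 8-bit pattern 128 is greater than 1 when both are tagged unsigned and less than 1 when both are tagged signed, so the caller no longer has to remember which comparison to ask for.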

Cheers.

Tim.

Cheers Tim,

The real problem is not just in the printing, though: any code can misinterpret the true value if it queries isNegative(), which is true whenever the most significant bit is set. For a value like 128 in 8 bits, negate() will also produce the original value.
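
To show what I mean without pulling in LLVM, here is a small plain C++ sketch of the two operations at a fixed width. The names maskFor, topBitSet and negateBits are made up for the example; I am assuming they mirror what isNegative() and negate() do on the underlying bits, namely a test of the top bit and a two's-complement negation.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: all-ones mask for the low `width` bits (1..64).
std::uint64_t maskFor(unsigned width) {
  assert(width >= 1 && width <= 64);
  return width == 64 ? ~std::uint64_t{0} : (std::uint64_t{1} << width) - 1;
}

// True when the most significant bit of the `width`-bit pattern is set.
bool topBitSet(std::uint64_t bits, unsigned width) {
  return (((bits & maskFor(width)) >> (width - 1)) & 1) != 0;
}

// Two's-complement negation: flip every bit, add one, truncate to `width`.
std::uint64_t negateBits(std::uint64_t bits, unsigned width) {
  return (~bits + 1) & maskFor(width);
}
```

For the 8-bit pattern 128, topBitSet returns true and negateBits returns 128 again, so a caller who meant "unsigned 128" gets told the value is negative and cannot negate its way out of it.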

I didn’t know about APSInt. It seems I have been misled, and I think I will have to go back to some of my past patches… I know I’m not the only one to be caught out by this behaviour, though. APSInt looks like a safer type to use.

Thanks again,