i1 Values

I've been debugging some strange happenings over here and I put an
assert in APInt to catch what I think is the source of the problem:

  int64_t getSExtValue() const {
    // An i1 -1 is unrepresentable.
    assert(BitWidth != 1 && "Signed i1 value is not representable!");

To me an i1 -1 makes no sense whatsoever. It is not representable in
two's-complement form. It cannot be distinguished from unsigned i1 1.

It turns out this assert triggers all over the place. Before I dive
into this rat hole I'd like to confirm with others my sense that we
shouldn't be creating i1 -1 values. Is there some legitimate reason to
do so?

                             -David

<dag@cray.com> writes:

I should clarify this. The problem isn't with i1 -1 per se, it's the
fact that we need to also represent an unsigned i1 1 value.

The problem appears to be that somewhere we request a sign-extended i1 1
constant value when we really want the zero-extended i1 1 value. That's
probably actually the real error but I'm still wondering whether APInt
should be used to represent both signed and unsigned i1 1.

                                -David

    I've been debugging some strange happenings over here and I put an
    assert in APInt to catch what I think is the source of the problem:

      int64_t getSExtValue() const {
        // An i1 -1 is unrepresentable.
        assert(BitWidth != 1 && "Signed i1 value is not representable!");

Please don't do this. Asking for an i1 APInt to be sign-extended does not
mean that we have encountered a bug. How would you write a parser for the
constant expression "i32 sext(i1 true)"? It grabs the i1 and sign extends
it.

    To me an i1 -1 makes no sense whatsoever. It is not representable in
    two's-complement form. It cannot be distinguished from unsigned i1 1.

    It turns out this assert triggers all over the place. Before I dive
    into this rat hole I'd like to confirm with others my sense that we
    shouldn't be creating i1 -1 values. Is there some legitimate reason to
    do so?

We don't create i1 -1 values, nor do we create i1 1 values. APInt is
neither signed nor unsigned, it is signless. The interpretation of the bits
(or in this case, the bit) is a matter for the caller to decide.

Nick
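
As an illustration of the signless point above, here is a minimal C++
sketch. `Bit1`, `zext` and `sext` are hypothetical names invented for
this example and are not part of llvm::APInt; the sketch only shows that
one stored bit yields 1 or -1 depending on which extension the caller
requests.

```cpp
#include <cstdint>

// Hypothetical stand-in for a signless 1-bit integer; this is NOT llvm::APInt.
struct Bit1 {
  bool bit;  // the single stored bit, with no signedness attached

  // Zero-extend: the bit becomes the low bit of a wider value (0 or 1).
  constexpr std::uint64_t zext() const { return bit ? 1u : 0u; }

  // Sign-extend: the bit is replicated into every higher position (0 or -1).
  constexpr std::int64_t sext() const { return bit ? -1 : 0; }
};

// The same stored bit gives two different answers; the caller picks one.
constexpr Bit1 kTrue{true};
static_assert(kTrue.zext() == 1, "unsigned reading of the set bit");
static_assert(kTrue.sext() == -1, "signed reading of the same bit");
```

Under this reading, a constant folder handling "i32 sext(i1 true)" would
ask for the sign-extended view and get -1, while a zext of the same
constant would get 1.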

This is in no way specific to i1. Given an i8, you can have anything from
i8 -128 up to i8 255, inclusive, depending on how the caller reads the
bits. That's a larger range than 8 bits can represent, so the bits alone
cannot say which value was meant.

Nick
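
The i8 arithmetic can be checked with a small C++ sketch. The helper
names `asUnsigned8` and `asSigned8` are made up for this example, and
the signed reading assumes two's complement.

```cpp
// Read an 8-bit pattern (given as 0..255) as unsigned: the value is the
// pattern itself.
constexpr int asUnsigned8(int pattern) { return pattern; }

// Read the same pattern as two's-complement signed: patterns with the
// top bit set denote negative values.
constexpr int asSigned8(int pattern) {
  return pattern >= 128 ? pattern - 256 : pattern;
}

// One bit pattern, two values.
static_assert(asUnsigned8(0xFF) == 255, "0xFF read as unsigned");
static_assert(asSigned8(0xFF) == -1, "0xFF read as signed");
static_assert(asSigned8(0x80) == -128, "0x80 read as signed");

// Every integer from -128 to 255 is reachable under one reading or the
// other: 384 values, but there are only 256 distinct bit patterns.
constexpr int kDistinctValues = asUnsigned8(0xFF) - asSigned8(0x80) + 1;
static_assert(kDistinctValues == 384, "more values than patterns");
```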

If the type is interpreted as signed, then -1 is a proper value.

For a signed integer type of n bits, the representable values are in
the range -2^(n-1)...2^(n-1)-1, e.g., -128...127 for i8. For i1, the
range is -1...0, so -1 and 0 are the two values that can be represented.
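
The range formula can be written out as a short C++ sketch. `signedMin`
and `signedMax` are names chosen for this example, and the functions
assume 1 <= n <= 63 so the shifts stay within a 64-bit integer.

```cpp
#include <cstdint>

// Smallest value of an n-bit two's-complement integer: -2^(n-1).
// Assumes 1 <= n <= 63.
constexpr std::int64_t signedMin(unsigned n) {
  return -(std::int64_t{1} << (n - 1));
}

// Largest value of an n-bit two's-complement integer: 2^(n-1) - 1.
// Assumes 1 <= n <= 63.
constexpr std::int64_t signedMax(unsigned n) {
  return (std::int64_t{1} << (n - 1)) - 1;
}

static_assert(signedMin(8) == -128 && signedMax(8) == 127, "i8 range");
static_assert(signedMin(1) == -1 && signedMax(1) == 0, "i1 range");
```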

I can see two reasons for it:

1) An integer way to represent -0 and +0 from the floating point domain.
2) Unsigned i1 represents 0 and 1 (unsigned values are in the range
   0 -> (2^N) - 1), but signed i1 represents [-]0 and -1 (signed values
   are in the range -2^(N-1) -> 2^(N-1) - 1). This could be important
   when promoting to larger integers and determining whether sign or
   zero extension is needed.

Nick Lewycky <nlewycky@google.com> writes:

    I've been debugging some strange happenings over here and I put an
    assert in APInt to catch what I think is the source of the
    problem:
    
    int64_t getSExtValue() const {
      // An i1 -1 is unrepresentable.
      assert(BitWidth != 1 && "Signed i1 value is not representable!");

Please don't do this. Asking for an i1 APInt to be sign-extended does
not mean that we have encountered a bug. How would you write a parser
for the constant expression "i32 sext(i1 true)"? It grabs the i1 and
sign extends it.

I simply put the assert there to track down a bug; I wasn't proposing to
commit it. :)

We don't create i1 -1 values, nor do we create i1 1 values. APInt is
neither signed nor unsigned, it is signless. The interpretation of the
bits (or in this case, the bit) is a matter for the caller to decide.

Yes, I clarified this in my follow-up. Somewhere a caller is
interpreting the bit with the wrong signedness. It's deep in codegen
somewhere so it is difficult to root out. :(

I agree that APInt is fine. Thanks for your helpful note!

                           -David

Micah Villmow <micah.villmow@softmachines.com> writes:

I can see two reasons for it:

1) An integer way to represent -0 and +0 from the floating point domain.
2) Unsigned i1 represents 0 and 1 (unsigned values are in the range
   0 -> (2^N) - 1), but signed i1 represents [-]0 and -1 (signed values
   are in the range -2^(N-1) -> 2^(N-1) - 1). This could be important
   when promoting to larger integers and determining whether sign or
   zero extension is needed.

Right. As Nick noted, APInt doesn't provide any information to a user
to know whether sign- or zero-extension is appropriate. This makes
debugging hard as callers can't simply assert that the value is signed
or unsigned.

Not that I want to add state to APInt but it could be helpful in some
cases to carry information about an APInt's intended signedness for
debugging purposes. As it is now I have to check every user of APInt to
make sure it's doing the right thing.

Probably there's no good solution for this but it's something to ponder.

                            -David
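
One way to picture the debugging aid described above is a wrapper that
stores the intended signedness next to the bits and asserts on a
mismatched extension. Everything here (`TaggedInt`, `Signedness`, the
field names) is hypothetical and written only for illustration; it
borrows the accessor names getZExtValue/getSExtValue but is not APInt.
(LLVM also has an APSInt class that pairs an APInt with an unsigned
flag; this sketch does not use it.)

```cpp
#include <cassert>
#include <cstdint>

enum class Signedness { Signed, Unsigned };

// Hypothetical wrapper: raw bits plus the signedness the producer intended.
struct TaggedInt {
  std::uint64_t bits;  // the low `width` bits hold the value
  unsigned width;      // assumed to be in 1..64
  Signedness intent;   // recorded purely so that accessors can check it

  // Keep only the low `width` bits.
  std::uint64_t masked() const {
    return width == 64 ? bits : bits & ((std::uint64_t{1} << width) - 1);
  }

  std::uint64_t getZExtValue() const {
    assert(intent == Signedness::Unsigned && "zero-extending a signed value");
    return masked();
  }

  std::int64_t getSExtValue() const {
    assert(intent == Signedness::Signed && "sign-extending an unsigned value");
    const std::uint64_t signBit = std::uint64_t{1} << (width - 1);
    // Flip the sign bit, then subtract it: this replicates it upward.
    return static_cast<std::int64_t>((masked() ^ signBit) - signBit);
  }
};
```

With such a tag, requesting the sign-extended value of an i1 that was
created as unsigned would trip the assert at the offending call site
instead of silently producing -1. The cost is the extra state that the
message above is reluctant to add.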

This reminds me of something I always wanted when doing language
frontend work: statically typed Value*'s. The idea is that in lots of
places where Value*'s are used, you know the type of the instruction
you're pointing at, and you want to tell the C++ compiler about that so
it can help you keep the types straight at compiler-run-time. I imagine
such a thing could be implemented the same way strong typedefs work. The
same strategy could probably be used to help with keeping track of the
signedness of APInts, and probably lots of other things.

Unfortunately I never got around to implementing that, so I don't know
exactly what the pros and cons of it are.

Jon
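
A rough sketch of how the strong-typedef idea might look when applied
to signedness. All names here (`SignedBits`, `UnsignedBits`, `lowMask`,
`zext`, `sext`) are invented for this example; nothing like this is
claimed to exist in LLVM, and the widths are assumed to be in 1..64.

```cpp
#include <cstdint>

// Hypothetical strong types: the signedness lives in the C++ type, so a
// mismatched use fails to compile instead of misbehaving later.
struct SignedBits   { std::uint64_t bits; unsigned width; };
struct UnsignedBits { std::uint64_t bits; unsigned width; };

// Mask selecting the low `width` bits.
constexpr std::uint64_t lowMask(unsigned width) {
  return width >= 64 ? ~std::uint64_t{0} : (std::uint64_t{1} << width) - 1;
}

// Zero extension is only offered for UnsignedBits.
constexpr std::uint64_t zext(UnsignedBits v) {
  return v.bits & lowMask(v.width);
}

// Sign extension is only offered for SignedBits. Flipping and then
// subtracting the sign bit replicates it into the upper bits.
constexpr std::int64_t sext(SignedBits v) {
  return static_cast<std::int64_t>(
      ((v.bits & lowMask(v.width)) ^ (std::uint64_t{1} << (v.width - 1))) -
      (std::uint64_t{1} << (v.width - 1)));
}

static_assert(zext(UnsignedBits{1, 1}) == 1, "i1 true read as unsigned");
static_assert(sext(SignedBits{1, 1}) == -1, "i1 true read as signed");
// sext(UnsignedBits{1, 1});  // would not compile: no matching overload
```

The trade-off compared with a run-time tag is that every producer has to
commit to a signedness in its type, which may not suit code that
is deliberately signless.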