Predicate registers/condition codes question

Hey folks,

We are having some difficulty with how we have been representing our predicate registers, and wanted some advice from the list. First, we had been representing our predicate registers as 1 bit (i1). The truth, however, is that they are 8 bits. The reason for this is that they serve as predicates for conditional execution of instructions, branch condition codes, and also as vector mask registers for conditional selection of vector elements.

We have run into problems with type mismatches with intrinsics for some of our vector operations. We decided to try to solve it by representing the predicate registers as what they really are, namely i8. We changed our intrinsic and instruction definitions accordingly, changed the data type of the predicate registers to be i8, and changed getSetCCResultType() to return i8. After doing this, the compiler builds just fine but dies at runtime trying to match some target independent operations (e.g. setcc/brcond) that appear to want an i1 for the condition code.

So, my question is this: is it even possible to represent our predicate registers (and our condition codes) as i8, and if so, what hook are we missing?

Thanks in advance for any help you might be able to provide.

Tony

Making getSetCCResultType return i8 is definitely supported, and
brcond should be okay with that. It's not obvious what is going
wrong; are you sure there isn't anything in your target still
expecting an i1?

-Eli

Thanks, Eli. We'll take another look at our target dependent information to see if some i1's are still lurking about. It's good to know that this should work.

Tony

Hi Eli,

I have specified that Hexagon has an i8 predicate register that
represents the true predicate as -1 with a sign extend like this:

    addRegisterClass(MVT::i8, &Hexagon::PredRegsRegClass);
    setBooleanContents(ZeroOrNegativeOneBooleanContent);

and I'm calling this code just before computeRegisterProperties, that
builds the TransformToType table specifying the type promotions:

i1 -> i8
i8 -> i8 (legal)
i16 -> i32
i32 -> i32 (legal)

This would be fine if the register for i8 could be used for any
integer operation (as in x86 for instance), but on Hexagon, predicate
registers can only be used in a few logical operations.

So my question is how do we specify that for most of the operations i8
should be promoted to i32 and that only a few logical operations are
legal on i8?
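A minimal sketch of the promotion rule behind that table, written as a toy model rather than LLVM code: types are identified by bit width only, and the function names are hypothetical. It only illustrates "promote to the next larger legal type" and why i8 stays at i8 once a register class is registered for it.

```cpp
#include <cassert>
#include <map>
#include <set>

// Toy model, not LLVM code: integer types are identified by bit width only.
// For each width, pick the smallest legal width that is >= it, mimicking the
// "promote to the next larger legal type" rule behind the table above.
inline std::map<unsigned, unsigned>
buildPromotionTable(const std::set<unsigned> &legalWidths,
                    const std::set<unsigned> &allWidths) {
  std::map<unsigned, unsigned> table;
  for (unsigned width : allWidths) {
    auto it = legalWidths.lower_bound(width); // first legal width >= width
    if (it != legalWidths.end())
      table[width] = *it; // widths larger than every legal width are skipped
  }
  return table;
}

// Assumed Hexagon-like setup from the message: i8 (predicates) and i32 legal.
inline std::map<unsigned, unsigned> hexagonLikeTable() {
  return buildPromotionTable({8, 32}, {1, 8, 16, 32});
}
```

In this model i8 maps to itself regardless of which operation uses it, which is exactly the situation the question is about.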

Thanks,
Sebastian

Hi Sebastian,

I think the combination of TargetLowering::isTypeDesirableForOp() and IsDesirableToPromoteOp() may help you here. X86 does something similar.

Ivan

Hi Ivan,

I just tried these functions, and it seems they only modify the
behavior of type promotion for a small subset of operations
(PromoteIntBinOp, PromoteIntShiftOp, PromoteExtend, PromoteLoad,
SimplifyBinOpWithSameOpcodeHands, visitSRL, visitTRUNCATE), namely
those that matter to the performance of i16 on X86.

I don't like the "desirable" in the name of these functions: in the
case of Hexagon it is illegal to use an i8 predicate register for
anything other than setcc, brcond, and the logical ops, so doing the
conversion is a matter of correctness, not of desirability.

Should I add a call to IsDesirableToPromoteOp in every other operation
that is currently missing this check for type promotion, or do we want
a new hook?

Thanks,
Sebastian

Hi,

I found it pretty difficult to modify the existing DAG combiner to add the
missing calls to isDesirableToPromoteOp, so I abandoned this path.

I found it easier to work with a new integer type p8 for the 8 bit
predicates, such that I can promote i1 into p8 and avoid the confusion
of integer and predicate registers that I had when using the same i8 type.

Would a patch adding the p8 type be ok to commit to llvm?

Thanks,
Sebastian

Sebastian,

First, it might be useful to look at what is done in the PowerPC
backend. PPC also has condition registers that are larger than the
1-bit conditional results, and it defines 1-bit subregisters in
addition to the larger condition registers. The spill-restore code ends
up being more complicated, but that, perhaps, is a separate issue. [To
be clear, I am not advocating for (or against) this solution even if it
would work for you].

Second, generically speaking, the problem that you
have seems much more general than the solution you propose. Correct
me if I'm wrong, but your fundamental issue is that you have a type, i8,
that can exist in different register classes, and the operations that
are legal on that type depend on the current register class. The reason
this is a problem is that legalization happens before register-class
assignment.

Currently, isTypeLegal does not take an opcode parameter, but maybe
changing it to depend on the type of operation (like getTypeToPromoteTo
does) and the opcode of the node's inputs would help?
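To make the idea concrete, here is a rough sketch of what an opcode-aware legality query could look like. Everything in it is hypothetical: the `Op` enum, `isPredicateOp` and `legalWidthFor` are illustration names, not existing LLVM hooks, and the real isTypeLegal takes only a type.

```cpp
#include <cassert>

// Toy opcode set; these are stand-ins, not ISD opcodes.
enum class Op { And, Or, Xor, SetCC, Add, Mul };

// Operations assumed to be executable on 8-bit predicate registers.
inline bool isPredicateOp(Op op) {
  return op == Op::And || op == Op::Or || op == Op::Xor || op == Op::SetCC;
}

// Hypothetical opcode-aware query: returns the bit width at which the
// operation should be performed on a Hexagon-like target.
inline unsigned legalWidthFor(Op op, unsigned width) {
  if (width > 8)
    return 32; // i16 and i32 live in the 32-bit integer registers
  return isPredicateOp(op) ? 8 : 32; // i1/i8: predicate ops stay at 8 bits
}
```

With such a query, an i8 `add` would be sent to i32 while an i8 `and` would stay on the predicate registers.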

-Hal

Sebastian,

First, it might be useful to look at what is done in the PowerPC
backend. PPC also has condition registers that are larger than the
1-bit conditional results, and it defines 1-bit subregisters in
addition to the larger condition registers. The spill-restore code ends
up being more complicated, but that, perhaps, is a separate issue. [To
be clear, I am not advocating for (or against) this solution even if it
would work for you].

Ok, thanks for the pointer, I'll go read in the PPC bits.

Second, generically speaking, the problem that you
have seems much more general than the solution you propose. Correct
me if I'm wrong, but your fundamental issue is that you have a type, i8,
that can exist in different register classes, and the operations that
are legal on that type depend on the current register class. The reason
this is a problem is that legalization happens before register-class
assignment.

Yes, that's correct.

Currently, isTypeLegal does not take an opcode parameter, but maybe
changing it to depend on the type of operation (like getTypeToPromoteTo
does) and the opcode of the node's inputs would help?

I will try to see if I can fix isTypeLegal.
Thanks for your helpful comments.

Sebastian

Just an idea: you may know that it's possible to custom expand operations with illegal types, and it might be useful in this case (considering i1 as illegal). The TypeLegalizer will call back into your lowering function at the very beginning of the Combining/Legalization phases. If you add HexagonISD nodes in the process while promoting operands/results, you will be able to precisely match them later with their associated regclass (PReg?).
Unfortunately, it will not resolve your problem with non-allowed ops for i8 types, and I think I'm missing something regarding this matter. Why don't you mark everything but the logical ops for promotion? Are copies between pred regs and IntRegs not allowed?

Ivan

Hi Ivan,

Just an idea, you may know that it's possible to custom expand
operations with illegal types and it might be useful in this case
(considering i1 as illegal). The TypeLegalizer will callback to your
lowering function at the very beginning of the Combining/Legalization
phases. If you add HexagonISD nodes in the process while promoting
operands/result, you will be able to precisely match them later with its
associated regclass (PReg?).
Unfortunately, it will not resolve your problem with non-allowed ops for
i8 types and I think I'm missing something regarding this matter. Why
don't you mark for promotion everything but logical ops ?

I will try this, although I think it will be painful to maintain an
up-to-date list of ops marked for promotion.

I was hoping to find a way to implement the opposite: specify that the
few logical ops are legal on i8, and the default action on the rest of
opcodes would be to promote to i32.

Another way that would be practical is to "remember" that an i8 type
is the result of a first promotion from i1: that would be a legal i8, and
an i8 type that has not been already promoted is illegal and has to
be promoted to i32. The only way I found to implement this is with a
different type: p8.
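A small sketch of that "remember the origin" idea, as a toy model only: the `TypedValue` struct and its `fromI1` flag are hypothetical and simply play the role that a separate p8 type would play.

```cpp
#include <cassert>

// Toy value descriptor: 'fromI1' records that an 8-bit value was produced
// by promoting an i1, which is the information a distinct p8 type would carry.
struct TypedValue {
  unsigned width;
  bool fromI1;
};

// Toy legalization step for a Hexagon-like target.
inline TypedValue legalize(TypedValue v) {
  if (v.width == 1)
    return TypedValue{8, true};   // i1 -> 8-bit predicate
  if (v.width <= 16 && !v.fromI1)
    return TypedValue{32, false}; // ordinary i8/i16 -> i32
  return v;                       // promoted predicates and i32 are kept
}
```

An i8 that came from an i1 is left alone, while an i8 that came straight from the front end is widened to i32.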

Are copies between pred regs and IntRegs not allowed ?

Copies between pred and int registers are allowed in Hexagon.

Sebastian

I see that PPC has its condition registers CRRC as i32, and that PPC
also has general purpose i32 registers GPRC, so the situation is slightly
different than on Hexagon, where there are no general purpose registers
of the same size as the predicate registers: i8.

So on PPC it is "safe" to promote from i1 to i32 and to "allow confusion"
between the promoted i32 and the existing operations that were using i32:
as we can always select between a CR and a GPR following the op type.

On Hexagon, if type legalization promotes i1 into i8, that would create
this confusion between the i8 ops existing before legalization and the
newly promoted ones. Then as Ivan was suggesting, we will have to
provide custom expansion to promote the "illegal" ops on i8 on almost
all the operations, except logical ops.

Sebastian

For reference, I just found out that adding a p8 type creates too many
problems in the generic code of LLVM: for instance, here is the latest
failure:

include/llvm/Target/TargetLowering.h:473:
llvm::TargetLowering::LegalizeAction
llvm::TargetLowering::getCondCodeAction(llvm::ISD::CondCode,
llvm::EVT) const: Assertion `(unsigned)CC <
array_lengthof(CondCodeActions) && (unsigned)VT.getSimpleVT().SimpleTy
< sizeof(CondCodeActions[0])*4 && "Table isn't big enough!"' failed.

that looks like we would have to add some more space in a cond code
table, and we would have to specify how to handle p8 types in conditions.
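The bound in that assertion, `sizeof(CondCodeActions[0])*4`, is consistent with a table that packs one 2-bit action per value type into a 64-bit word, which leaves room for 32 types. The sketch below is a toy model of such a packed table under that assumption; the names are hypothetical and the layout is inferred from the assertion text, not copied from LLVM.

```cpp
#include <cassert>
#include <cstdint>

// Toy model: each value type gets a 2-bit action inside one 64-bit word,
// so a word holds sizeof(uint64_t) * 4 == 32 entries.
enum Action : uint64_t { Legal = 0, Promote = 1, Expand = 2, Custom = 3 };

constexpr unsigned kEntriesPerWord = sizeof(uint64_t) * 4;

// True if a value type with this index has a slot in the packed word.
inline bool fitsInTable(unsigned typeIndex) {
  return typeIndex < kEntriesPerWord;
}

inline void setAction(uint64_t &word, unsigned typeIndex, Action a) {
  assert(fitsInTable(typeIndex) && "Table isn't big enough!");
  word &= ~(uint64_t{3} << (typeIndex * 2));           // clear the old 2 bits
  word |= static_cast<uint64_t>(a) << (typeIndex * 2); // store the new action
}

inline Action getAction(uint64_t word, unsigned typeIndex) {
  assert(fitsInTable(typeIndex) && "Table isn't big enough!");
  return static_cast<Action>((word >> (typeIndex * 2)) & 3);
}
```

In this model, a newly added type whose index reaches 32 no longer has a slot, which is the kind of overflow the assertion guards against.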

I will thus use the i8 type for predicates in Hexagon, and I will deal with
the difficulties I mentioned in my previous email.

Sebastian

Hi Sebastian,

I think there is also another (and cleaner) workaround, a kind of operation-based type promotion of __illegal__ types.
This can be done by simply setting the operation with illegal type result to have a custom expander, for example:

setOperationAction(ISD::AND, MVT::i1, Custom)

See LowerOperationWrapper() & ReplaceNodeResults() hooks in TargetLowering. If you make this work only on logical ops, the rest will get automatically promoted by setting the promotion of i1 to be i32 by default. The latter will require a little hack though...
I hope this helps.

Ivan

Salut Ivan,

I think there is also another (and cleaner) workaround, a kind of
operation-based type promotion of __illegal__ types.
This can be done by simply setting the operation with illegal type
result to have a custom expander, for example:

setOperationAction(ISD::AND, MVT::i1, Custom)

I was exploring something similar using exactly this function.

See LowerOperationWrapper() & ReplaceNodeResults() hooks in
TargetLowering. If you make this work only on logical ops, the rest will
get automatically promoted by setting the promotion of i1 to be i32 by

I think I was not clear enough in my past emails, so let me try again:

As I am specifying that predicate registers are i8, LLVM considers i8
to be a legal type. i1 is then automatically promoted to the next
larger legal type, that is i8: this is the correct behavior.

The problem is that the existing integer arithmetic operations on i8
cannot legally be executed on the predicate registers. For example,
clang generates an i8 expression for the addition of two char
variables, but Hexagon cannot do integer arithmetic using the
predicate registers. The addition of two char variables has to be
promoted to the next available integer arithmetic register: that is
i32. Because LLVM treats i8 as a legal type, it considers all
operations to be legal on i8 (both integer and boolean arithmetic).

So the solution that I was investigating looks like this:

    for (unsigned int i = 0; i < ISD::BUILTIN_OP_END; ++i) {
      switch (i) {
      // By default all operations on i8 have to be promoted to i32.
      default:
        setOperationAction(i, MVT::i8, Custom);
        break;

      // Only the following operations are legal on i8 predicates.
      case ISD::AND:
      case ISD::OR:
      case ISD::XOR:
      case ISD::SETCC:
      case ISD::SIGN_EXTEND:
        break;
      }
    }

and promote all i8 to i32 in HexagonTargetLowering::LowerOperation
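For readers who want to experiment with the whitelist idea outside of LLVM, here is a self-contained toy version of the loop. The opcode and action enums are small stand-ins for the ISD and LegalizeAction enums, and `widthAfterLowering` is a hypothetical placeholder for whatever a custom lowering hook would decide.

```cpp
#include <array>
#include <cassert>

// Toy stand-ins for ISD opcodes and legalize actions; not the LLVM enums.
enum Opcode : unsigned {
  ADD, SUB, MUL, AND, OR, XOR, SETCC, SIGN_EXTEND, BUILTIN_OP_END
};
enum class Action { Legal, Custom };

using ActionTable = std::array<Action, BUILTIN_OP_END>;

// Builds the per-opcode action table for i8 using the same
// "default is Custom, whitelist stays Legal" structure as the loop above.
inline ActionTable buildI8Actions() {
  ActionTable table;
  table.fill(Action::Legal); // a legal type starts with every op Legal
  for (unsigned op = 0; op < BUILTIN_OP_END; ++op) {
    switch (op) {
    case AND: case OR: case XOR: case SETCC: case SIGN_EXTEND:
      break; // stays Legal on the predicate registers
    default:
      table[op] = Action::Custom; // to be rewritten at i32 by a lowering hook
      break;
    }
  }
  return table;
}

// Width at which the op ends up in this model: Legal ops stay at 8 bits,
// Custom ops are assumed to be redone at 32 bits.
inline unsigned widthAfterLowering(const ActionTable &t, Opcode op) {
  return t[op] == Action::Legal ? 8 : 32;
}
```

The sketch shows only the bookkeeping; the real cost of this approach is writing the i32 rewrite for every opcode that lands in the default case.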

default. The latter will require a little hack though...
I hope this helps.

Thanks again for your ideas and guidance: very much appreciated.

Sebastian

Salut Sebastian!

I think I was not clear enough in my past emails, so let me try again:

As I am specifying that predicate registers are i8, LLVM considers i8
to be a legal type. i1 is then automatically promoted to the next
larger legal type, that is i8: this is the correct behavior.

Right, I mixed things up. I thought you wanted to 'track' i1->i8 promotions in order to distinguish the i8 operations coming directly from clang from the promoted ones.

That's hard work! Why don't you call it with "Promote" instead of "Custom" and let the Legalizer do the job? Does it not work?

Ivan

Hi,

The problem is that the existing integer arithmetic operations on i8
are not legal to be executed on the predicate registers (i.e., clang
would generate an i8 expression for the addition of two char
variables.) Hexagon cannot do integer arithmetic operations using the
predicate registers.

so what can you actually do with predicate registers?

Ciao, Duncan.

That's hard work!

Indeed, that was my concern as well: that's why I tried to avoid using
i8 for predicates and use p8, but now I know that is a dead-end.

Why don't you call it with "Promote" instead of
"Custom" and let the Legalizer do the job? Does it not work?

I tried this, and the legalizer will happily say that i8 is a legal type
and just return the exact same node: this is because we declared
that Hexagon has a register for i8, that makes i8 legal for all
promotions.

Sebastian

Salut Duncan,

As Hal mentioned, the problem is linked to the fact that type legalization
happens before register class assignments.

One way to solve this problem would be to teach type legalization about
the predicate register class: if a processor can perform only boolean
arithmetic it would declare a type to be in the PredRegs class, whereas
a processor that can do both integer and boolean arithmetic on a type
would declare the register to be in both the IntRegs and PredRegs class,
or just in the IntRegs class.
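A rough sketch of what register-class-aware legalization could mean, as a toy model only. The `RegClass` struct, the `Kind` enum and `legalizeWidth` are hypothetical names for illustration; nothing here corresponds to an existing LLVM interface.

```cpp
#include <cassert>
#include <vector>

// Kind of arithmetic an operation needs.
enum class Kind { Boolean, Integer };

// Toy register class: a width plus the kinds of arithmetic it supports.
struct RegClass {
  unsigned width;
  bool supportsBoolean;
  bool supportsInteger;
};

inline bool supports(const RegClass &rc, Kind k) {
  return k == Kind::Boolean ? rc.supportsBoolean : rc.supportsInteger;
}

// Smallest class width >= 'width' whose class supports the requested kind
// of operation; returns 0 if no class qualifies.
inline unsigned legalizeWidth(const std::vector<RegClass> &classes,
                              unsigned width, Kind k) {
  unsigned best = 0;
  for (const RegClass &rc : classes)
    if (rc.width >= width && supports(rc, k) && (best == 0 || rc.width < best))
      best = rc.width;
  return best;
}

// Assumed Hexagon-like target: 8-bit PredRegs (boolean only), 32-bit IntRegs.
inline std::vector<RegClass> hexagonLike() {
  return {{8, true, false}, {32, true, true}};
}

// Assumed x86-like target: 8-bit registers usable for integer arithmetic too.
inline std::vector<RegClass> x86Like() {
  return {{8, true, true}, {32, true, true}};
}
```

Under these assumptions an i8 integer operation is widened to 32 bits on the Hexagon-like target but stays at 8 bits on the x86-like one, while boolean operations stay at 8 bits on both.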

Opinions? How hard is it to teach type legalization about register classes?

Thanks,
Sebastian