Problems with DAG Combiner

Hi all,

i’m writing an back-end for a new research processor architecture and have problems with the DAG Combiner. The processor architecture supports i1 and i32 registers. 1-bit registers are mainly used as comparison result but basic operations like OR are not possible between i1 registers. So I wrote custom lowering for i1 OR operations and replaced it by (trunc (or (aext x), (aext y))). Now the problem is that the DAG Combiner optimizes it back to an i1 OR operations that is not supported by the architecture.

What is the best way to solve this problem? I take a look at the DAG Optimizer and for this OR operation it calls DAGCombiner::SimplifyBinOpWithSameOpcodeHands that folds (OP (aext x), (aext y)) → (aext (OP x, y)). No check if the new operation is legal is performed.

Kind regards

Timo Stripf

Hi all,

i’m writing an back-end for a new research processor architecture and have problems with the DAG Combiner. The processor architecture supports i1 and i32 registers. 1-bit registers are mainly used as comparison result but basic operations like OR are not possible between i1 registers. So I wrote custom lowering for i1 OR operations and replaced it by (trunc (or (aext x), (aext y))). Now the problem is that the DAG Combiner optimizes it back to an i1 OR operations that is not supported by the architecture.

The IA64 target has just been removed from the tree. It was the only target with legal i1 values, so there could be some problems.

The Blackfin DSP can do simple i1 operations with the CC flag and status bits. Initially I also marked i1 as a legal type, but it caused a lot of problems. Now I pretend that the CC register can hold an i32. It just happens to always hold the values 0 and 1. The i1 logical operations are rarely needed, and they can be custom inserted when necessary, see BlackfinTargetLowering::LowerADDE().

I don't think you have to write custom lowering code to get the behaviour you want. Have you tried this:

   setOperationAction(ISD::OR, MVT::i1, Promote);

If you can get your target to work with a legal i1 type, it would be great. The Blackfin target could use that as well.

What is the best way to solve this problem? I take a look at the DAG Optimizer and for this OR operation it calls DAGCombiner::SimplifyBinOpWithSameOpcodeHands that folds (OP (aext x), (aext y)) -> (aext (OP x, y)). No check if the new operation is legal is performed.

When LegalOperations is set, the DAG combiner must not create illegal operations. It is a bug if it does. I recently fixed this in the first if statement in SimplifyBinOpWithSameOpcodeHands(). Perhaps you could add a check to the second if statement and submit a patch?

/jakob

Sorry, the second 'if' is fine, I think. Do you have the patch from r78497? Are you still getting illegal ORs?

Sounds like a bug; the DAGCombiner isn't supposed to introduce illegal
operations after operation legalization.

-Eli

Hi Jakob,

I forget to mention that I'm working atm on the old 2.5 release code base and not on the svn. So I don't know if the problem still exists. I'm going to test it now.

The Blackfin DSP can do simple i1 operations with the CC flag and
status bits. Initially I also marked i1 as a legal type, but it caused
a lot of problems. Now I pretend that the CC register can hold an i32.
It just happens to always hold the values 0 and 1. The i1 logical
operations are rarely needed, and they can be custom inserted when
necessary, see BlackfinTargetLowering::LowerADDE().

I had also a lot of problems to get the i1 operations working. E.g. I had to override the getSetCCResultType to get is working and for ADDE/ADDC the i1 target registers are hardcoded.

I'm writing the back-end to research the influence of several ISA characteristics on the processor performance. Large parts of my back-end are automatically generated by a general description. So I'm very interested in keeping the DAG as realistic as possible.

I don't think you have to write custom lowering code to get the
behaviour you want. Have you tried this:

   setOperationAction(ISD::OR, MVT::i1, Promote);

I tried Promote and Expand but on the 2.5 code base it is not implemented.

If you can get your target to work with a legal i1 type, it would be
great. The Blackfin target could use that as well.

Alright, I'll inform you if i get it running. I successfully compiled and simulated jpeg 2000 encoder/decoder before I ran into this problem.

When LegalOperations is set, the DAG combiner must not create illegal
operations. It is a bug if it does. I recently fixed this in the first
if statement in SimplifyBinOpWithSameOpcodeHands(). Perhaps you could
add a check to the second if statement and submit a patch?

That sounds nice. I'll try out the svn code tomorrow. I think it'll take some days to converted the back-end to the new code base.

-Timo

I had also a lot of problems to get the i1 operations working. E.g. I had to override the getSetCCResultType to get is working and for ADDE/ADDC the i1 target registers are hardcoded.

What is your SetCCResultType now?

Can you compile the CodeGen/Blackfin/basic-i1.ll test case? I never got that one working with legal i1. The IA64 back-end couldn't compile it either.

I'm writing the back-end to research the influence of several ISA characteristics on the processor performance. Large parts of my back-end are automatically generated by a general description. So I'm very interested in keeping the DAG as realistic as possible.

I hate to scope-creep your research, but I would be very interested in an analysis of native i1 operations versus promotions. Are you planning an ISA version with built-in i1 AND/OR/XOR?

I don't think you have to write custom lowering code to get the
behaviour you want. Have you tried this:

  setOperationAction(ISD::OR, MVT::i1, Promote);

I tried Promote and Expand but on the 2.5 code base it is not implemented.

Yeah, you definitely need 2.6. Blackfin has setOperationAction(ISD::OR, MVT::i16, Promote), and it took a few bug fixes to get that working.

/jakob

I converted now my back-end with legal i1 lowering to the 2.6 branch and my original problem with the DAG combiner didn't occur any more and seems to be fixed. setOperationAction(ISD::OR, MVT::i1, Promote) also works fine for logical operations.

What is your SetCCResultType now?

I changed SetCCResultType to return MVT::i1 type.

Can you compile the CodeGen/Blackfin/basic-i1.ll test case? I never
got that one working with legal i1. The IA64 back-end couldn't compile
it either.

No, I was not able to compile your basic-i1.ll test case but I think it is no big deal to add the missing Promote or Expand code for these operations. "add i1" / "sub i1" can be changed to "xor i1", "mul i1" to "and i1". I think div/rem operations must be promoted.

I hate to scope-creep your research, but I would be very interested in
an analysis of native i1 operations versus promotions. Are you
planning an ISA version with built-in i1 AND/OR/XOR?

I've generated a back-end with built-in i1 AND/OR/XOR operations and compared the results against a back-end without these operations. Atm I don't have many applications I can test. I've one application that runs only the CABAC part of H.264 and thereby i1 AND/OR/XOR operations decrease the number of executed operations by 0.2% on a behavioural instruction-set simulator. I used gcc front-end, maybe clang uses more i1 instructions.

Also the AND/OR/XOR promotion is not perfect. It produces the following code:

  ZEXT %r4, %e0
  ZEXT %r0, %e1
  OR %r0, %r0, %r4
  AND %r2, %r0, 1

for i1 or operations when the trunc is eliminated. The last AND is needless. The lowering doesn't know that any_extend is selected to a zero_extend opcode later.

Another problem came up when I tried to promote constant i1 values to i32. It required a custom legalization but later the instruction selector found i1 constants. So I suspect that DAG combiner reintroduced i1 constants. I gave up and added a related instruction to the architecture.

Kind regards
Timo Stripf

Stripf, Timo wrote:

I converted now my back-end with legal i1 lowering to the 2.6 branch
and my original problem with the DAG combiner didn't occur any more
and seems to be fixed. setOperationAction(ISD::OR, MVT::i1, Promote)
also works fine for logical operations.

Excellent.

What is your SetCCResultType now?

I changed SetCCResultType to return MVT::i1 type.

Can you compile the CodeGen/Blackfin/basic-i1.ll test case? I never
got that one working with legal i1. The IA64 back-end couldn't
compile it either.

No, I was not able to compile your basic-i1.ll test case but I think
it is no big deal to add the missing Promote or Expand code for these
operations. "add i1" / "sub i1" can be changed to "xor i1", "mul i1"
to "and i1". I think div/rem operations must be promoted.

Yeah. div/rem can probably just return the first argument since division
by zero is undefined.

I hate to scope-creep your research, but I would be very interested
in an analysis of native i1 operations versus promotions. Are you
planning an ISA version with built-in i1 AND/OR/XOR?

I've generated a back-end with built-in i1 AND/OR/XOR operations and
compared the results against a back-end without these operations. Atm
I don't have many applications I can test. I've one application that
runs only the CABAC part of H.264 and thereby i1 AND/OR/XOR
operations decrease the number of executed operations by 0.2% on a
behavioural instruction-set simulator. I used gcc front-end, maybe
clang uses more i1 instructions.

Great, thanks for doing this.

Since C arithmetic is always promoted to at least int, I wouldn't expect
a C frontend to produce any i1 arithmetic. It could appear as a result
of optimizations and in the DAG combiner, though.

It looks like i1 operations are not that useful, but I can think of a
few cases where they vcould help performance. For instance "if (a<b &&
b<c)" would benefit from using only one conditional branch on a platform
with weak branch prediction.

Another problem came up when I tried to promote constant i1 values to
i32. It required a custom legalization but later the instruction
selector found i1 constants. So I suspect that DAG combiner
reintroduced i1 constants. I gave up and added a related instruction
to the architecture.

Hey, that's cheating! The rest of us don't get to do that, you know. :slight_smile:

/jakob