LLVM commit 410f38e01597120b41e406ec1cea69127463f9e5

Hi,

I’m working on a target which have a variable size for CC (the same size as the arguments). As a result getSetCCResultType, return a variable size.

In this commit, at the line DAG.getSExtOrTrunc(SetCC, DL, SelectVT), on my target, you end up generating the Node you are replacing, and so creating a loop in the DAG, which give a whole new meaning to the A in the acronym. Subsequent code manipulating the DAG to not like it at all.

Can you explain me what you were trying to do in that commit ? I know it is several month old, so the answer is likely not in cache, but that is capital to me to understand what is the correct fix.

Thank,

Amaury SECHET

I was fixing creating a setcc with the wrong type for the operands for a target with the same problem, which would then hit a selection failure later.

It was using getSetCCResultType on the result type of the SIGN_EXTEND node, rather than the types being compared in the setcc. The new setcc needs to have the right type, and then the result needs to be converted to the type of the sign_extend.

I think in that case, it was something like (i64 sext (setcc (i32 x) (i32 y)). getSetCCResultType is i64 for i64, and i32 for i32, so using the type of the sext created i64 (setcc (i32 x) (i32 y)) which doesn’t work

-Matt

OK, so in you case, you want DAG.getSExtOrTrunc(SetCC, DL, SelectVT) to tunc the result from i64 to i32 on 64 bits targets, if I understand correctly.

2 questions:

  • Why not generating a selectcc node directly ? It avoid having to mess up with intermediate values.

  • Why calling getSetCCResultType(VT) ? VT is not the type of a parameter of setcc, and this looks incorrect to me.

OK, so in you case, you want DAG.getSExtOrTrunc(SetCC, DL, SelectVT) to tunc the result from i64 to i32 on 64 bits targets, if I understand correctly.

2 questions:
- Why not generating a selectcc node directly ? It avoid having to mess up with intermediate values.

Well first, that’s what it did originally and wasn’t what I was changing. selectcc might not be legal, and I’m not sure why it even exists. I don’t see how it’s any harder to match select + setcc vs. selectcc, and selectcc just gives you another case to worry about.

- Why calling getSetCCResultType(VT) ? VT is not the type of a parameter of setcc, and this looks incorrect to me.

That’s what it did originally, and what I fixed. It now checks getSetCCResultType of the operand’s operand’s value type, the setcc’s operand

OK from what I understand, the DAG.getSExtOrTrunc(SetCC, DL, SelectVT) is unecessary and the SelectVT is nto the right type (as it is called with incorrect parameter).

Here is a patch so it won’t generate a loop. I ran make check and it doesn’t look like anything is broken.

0001-Remove-loop-generation-in-DAGCombiner.patch (1.04 KB)

No, it is necessary and is the fundamental change in that commit. SelectVT is not necessarily the same as VT, so it needs to be converted. This patch breaks the testcases I have that this fixed. If it helps, this is the testcase I was fixing. define i32 @test_sext_icmp_i64(i64 %arg) { %x = icmp eq i64 %arg, 0 %y = sext i1 %x to i32 ret i32 %y } The setcc type is i64 for the i64 icmp, so it then needs to be truncated to 32-bits in this situation

OK from what I understand, the DAG.getSExtOrTrunc(SetCC, DL, SelectVT)
is unecessary and the SelectVT is nto the right type (as it is called with
incorrect parameter).

Here is a patch so it won't generate a loop. I ran make check and it
doesn't look like anything is broken.

No, it is necessary and is the fundamental change in that commit. SelectVT
is not necessarily the same as VT, so it needs to be converted. This patch
breaks the testcases I have that this fixed. If it helps, this is the
testcase I was fixing.

define i32 @test_sext_icmp_i64(i64 %arg) {
  %x = icmp eq i64 %arg, 0
  %y = sext i1 %x to i32
  ret i32 %y
}

So this is not possible ?
(select (i64 setcc ...), (i32 -1), (i32 0))

The setcc type is i64 for the i64 icmp, so it then needs to be truncated
to 32-bits in this situation

I understand this? Why does the setcc needs to be truncated ? It doesn't
look like it usually is.

I have a test case which is exactly the other way around :
define i32 @test_sext_icmp_i32(i32 %arg) {
  %x = icmp eq i32 %arg, 0
  %y = sext i1 %x to i64
  ret i64 %y
}

The DAG.getSExtOrTrunc(SetCC, DL, SelectVT) is returning the node
DAGCombiner is working on, which create a loop after the substitution. If I
have X = (i64 (sext (i32 (setcc A B)))) to be "combined" as (select X, -1,
0) and then, after substitution occurs, the select become its own
condition, creating a loop in the DAG.

I think this will work if we pull the conversion from out of the select, so instead select (getSExtOrTrunc) we do getSExtOrTrunc (select). However, my target would much rather do the 32-bit select than the 64-bit select (although arguably i32 trunc (i64 select) → select (i32 trunc i64) should be a separate combine. Alternatively maybe this should only be done if the setcc type is the same as the sext result? Does this patch work for you?

0001-Move-sext-trunc-out-of-select.patch (1.3 KB)

I think we should actually do this. If you need to convert the setcc result after, you aren’t really gaining anything by doing this transformation

The proposed patch would work for me. The combine make sense for me 100% of the time as my target can have different type for the setcc and the selected values. So if it is not in DAGCombiner, I’m going to end up doing it one way or another in the backend.

Resurrecting this thread.

After quite a lot of discussion with Matt on IRC, here are some conclusion and some experiment I came up with.

  • getSetCCResultType use should probably limited to find the setcc result type based on setcc argument type. Using it to find out the expect type of a select’s predicate based on the selected value type is an incorrect use. The use case here is isolated enough so I’m not sure it really make sense is backend agnostic code.

  • Removing the optimization altogether break a bunch of tests. Backends expect it to work.

  • Not truncating or extending the SetCC value seems to work. select node with different type for predicate and value is already supported, and we take advantage of this. Note: this used to break some tests on x86, but it doesn’t anymore, so I suppose x86 target is now capable of handling this properly.

  • I tried to Truncate if SetCC is larger than the selected values but keep different types if it is extended. This also seems to work properly.

  • Using another type of extend (zero extend for instance) produce idiotic results.

  • doing the sext or trunc after the select (sext_or_trunc (select, setcc, 0, -1)) work as well but will make x86 generate 64 bit select when 32 bits one would be sufficient. This either need to be handled in the target (but in this case, going for a more general solution is preferable IMO) so this seems to be the wrong balance.

Basically, the deeper question here is do we want to support predicate of different type that selected value. I think we want considering the various option explored here.