Question about 'DAGTypeLegalizer::SplitVecOp_EXTRACT_VECTOR_ELT'

Hi All,

I have a question about splitting 'EXTRACT_VECTOR_ELT' when the vector type is 'v2i1'. I have the following LLVM IR code snippet:

for.body: ; preds = %entry, %for.cond
   %i.022 = phi i32 [ 0, %entry ], [ %inc, %for.cond ]
   %0 = icmp ne <2 x i32> %vecinit1, <i32 0, i32 -23>
   %1 = extractelement <2 x i1> %0, i32 %i.022
   %vecext4 = extractelement <2 x i32> %vecinit1, i32 %i.022
   %vecext5 = extractelement <2 x i32> <i32 0, i32 -23>, i32 %i.022
   %cmp6 = icmp ne i32 %vecext4, %vecext5
   %cmp7 = xor i1 %1, %cmp6

...

The SelectionDAG before the type legalizer runs looks like this:

   t0: ch = EntryToken
   t2: i32,ch = CopyFromReg t0, Register:i32 %vreg0
   t3: ch = ValueType:i32
       t5: i32,ch = CopyFromReg t2:1, Register:i32 %vreg1
     t7: i32 = AssertZext t5, ValueType:ch:i1
   t8: v2i32 = BUILD_VECTOR t2, t7
   t11: v2i32 = BUILD_VECTOR Constant:i32<0>, Constant:i32<-23>
   t15: i32,ch = CopyFromReg t0, Register:i32 %vreg2
           t22: i32 = add t15, Constant:i32<1>
         t24: ch = CopyToReg t0, Register:i32 %vreg3, t22
         t27: ch = CopyToReg t0, Register:i32 %vreg8, Constant:i32<-1>
       t31: ch = TokenFactor t24, t27
             t13: v2i1 = setcc t8, t11, setne:ch
           t16: i1 = extract_vector_elt t13, t15
             t17: i32 = extract_vector_elt t8, t15
             t18: i32 = extract_vector_elt t11, t15
           t19: i1 = setcc t17, t18, setne:ch
         t20: i1 = xor t16, t19

...

I have not added any vector register classes, so 'DAGTypeLegalizer' tries to split "t16: i1 = extract_vector_elt t13, t15" because its vector operand t13 has type 'v2i1'. If the vector element is smaller than 8 bits, 'DAGTypeLegalizer::SplitVecOp_EXTRACT_VECTOR_ELT()' extends the elements to 8 bits and stores them on the stack. Finally, the function generates an 'ExtLoad' to load the requested element. But when the element is smaller than 8 bits, I think this could be wrong. It looks like it needs just a 'Load', or a "Load and Truncate", to match the result type of 'EXTRACT_VECTOR_ELT'. What do you think? If I have missed something, please let me know.
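
To make the sequence described above concrete, here is a minimal standalone model of it. It is only a sketch and not LLVM code: the names V2I1 and spillAndLoadByte are hypothetical, and a std::array stands in for the stack slot. It shows that after the elements are widened and spilled, the value read back for a variable index is a full byte, while the node being legalized produces i1.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical model type (not an LLVM type): a v2i1 value as two booleans.
using V2I1 = std::array<bool, 2>;

// Models the spill path: each i1 element is widened to one byte and written
// to a byte-addressable slot, then the byte at `index` is read back.
// The return type is deliberately 8 bits wide: that is what memory holds.
std::uint8_t spillAndLoadByte(const V2I1 &vec, std::size_t index) {
  assert(index < vec.size() && "index must be in range in this model");
  std::array<std::uint8_t, 2> slot{};   // stands in for the stack slot
  for (std::size_t i = 0; i < vec.size(); ++i)
    slot[i] = vec[i] ? 1 : 0;           // zero-extending store of each element
  return slot[index];                   // i8 load at the variable index
}
```

In this model the caller still has to narrow the returned byte to get an i1 value, which is the mismatch the question is about.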

Thanks,

JinGu Kang

Could someone comment on this, please?

Thanks,

JinGu Kang

extends the elements to 8bit and stores them on stack.

The store is responsible for the zero-extension. This is the policy...

- Elena

Hi Elena,

Thanks for your response.

The store is OK, but the extending load triggers an assertion after the store, because MemVT is i8 and VT is i1 at the following line:

assert(MemVT.getScalarType().bitsLT(VT.getScalarType()) && "Should only be an extending load, not truncating!");

So I think we need to use a non-extending load for element sizes less than 8 bits in "DAGTypeLegalizer::SplitVecOp_EXTRACT_VECTOR_ELT", roughly like this:

if (N->getOperand(0).getValueType().getVectorElementType().getSizeInBits() < 8) {
  return DAG.getLoad(N->getValueType(0), dl, Store, StackPtr,
                     MachinePointerInfo());
} else {
  return DAG.getExtLoad(ISD::EXTLOAD, dl, N->getValueType(0), Store, StackPtr,
                        MachinePointerInfo(), EltVT);
}

What do you think?
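
As a rough illustration of why the assertion fires, the width check can be modelled outside LLVM. This is only a sketch under stated assumptions: ScalarVT, isExtendingLoad and extLoadResultBits are hypothetical names, and a type is reduced to its bit width. Only the comparison itself mirrors the quoted assert.

```cpp
#include <cassert>

// Hypothetical stand-in for a scalar value type, reduced to its bit width.
struct ScalarVT {
  unsigned bits;
  // True when this type is strictly narrower than `other`.
  bool bitsLT(ScalarVT other) const { return bits < other.bits; }
};

// Mirrors the condition in the quoted assertion: an extending load needs the
// in-memory type to be strictly narrower than the result type.
bool isExtendingLoad(ScalarVT memVT, ScalarVT vt) { return memVT.bitsLT(vt); }

// Models a checked extending load: asserts the precondition, then reports
// the width of the value that would be produced.
unsigned extLoadResultBits(ScalarVT memVT, ScalarVT vt) {
  assert(isExtendingLoad(memVT, vt) &&
         "Should only be an extending load, not truncating!");
  return vt.bits;
}
```

With MemVT = i8 and VT = i1 the predicate is false, which corresponds to the failure I see; with MemVT = i1 and VT = i8 it holds.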

Thanks,
JinGu Kang

Please open a bugzilla ticket and attach your testcase. It will allow us to debug and fix the problem.

Thanks

Um… To reproduce the issue, we need to add an ‘i1’ register class and avoid all vector register classes in the TargetLowering class… I am hitting the issue on my custom target. I will try to find an existing target that reproduces the issue, but I am not sure whether any of the existing targets can reproduce it…

Thanks,

JinGu Kang

So I think we need to use a non-extending load for element sizes less than 8 bits in "DAGTypeLegalizer::SplitVecOp_EXTRACT_VECTOR_ELT", roughly like this:

if (N->getOperand(0).getValueType().getVectorElementType().getSizeInBits() < 8) {
  return DAG.getLoad(N->getValueType(0), dl, Store, StackPtr,
                     MachinePointerInfo());
} else {
  return DAG.getExtLoad(ISD::EXTLOAD, dl, N->getValueType(0), Store, StackPtr,
                        MachinePointerInfo(), EltVT);
}

I assume that we need the opposite:

if (… < 8)
  getExtLoad // VT should be MVT::i8, MemVT should be MVT::i1
else
  getLoad

I think EXTRACT_VECTOR_ELT’s output type should be the same as the load’s output type. That means that if the element’s size is less than 8 bits, we need a Truncate after the load. I had guessed that a load of less than 8 bits would be lowered. It could be better to add the truncation explicitly.
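
The "load, then truncate explicitly" idea can be sketched as a small standalone model. This is an illustration only and not a patch: loadAndTruncate is a hypothetical helper, the stack slot is a plain byte array, and truncation is modelled as keeping the low eltBits bits.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not an LLVM API): reads the byte at `index` from a
// spilled vector and keeps only the low `eltBits` bits, so the result
// matches the element width of the original vector.
std::uint8_t loadAndTruncate(const std::uint8_t *slot, std::size_t index,
                             unsigned eltBits) {
  assert(eltBits >= 1 && "element width must be at least one bit");
  std::uint8_t loaded = slot[index];    // plain i8 load, no extension
  if (eltBits >= 8)
    return loaded;                      // already the right width
  std::uint8_t mask = static_cast<std::uint8_t>((1u << eltBits) - 1u);
  return static_cast<std::uint8_t>(loaded & mask);  // explicit truncation
}
```

For a byte holding 0xFF, loadAndTruncate(slot, 0, 1) yields 1 in this model, so the upper bits of the stored byte do not leak into the i1 result.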

Thanks,

JinGu Kang