ISD::VAARG and DAGTypeLegalizer::WidenVectorResult

I am trying to debug a crash (llvm_unreachable) where
'DAGTypeLegalizer::WidenVectorResult' is being called with an SDNode of type
'ISD::VAARG' for a vector of type 'v2i32'. This node type is not handled in
this function, so I am assuming something is broken even earlier.

The target does not support 'v2i32' as a native vector type, but it does
support a native 'v4i32'. I have configured
'MYTargetLowering::getPreferredVectorAction' to return 'TypeWidenVector',
but this does not seem to help. I have also tried various methods in
'MYTargetLowering::ReplaceNodeResults' but this doesn't help either.

It looks like the type should have already been widened by the time it gets
to 'DAGTypeLegalizer::WidenVectorResult', but I cannot see why this is not
happening. It is also strange that a VAARG node is getting this far too.

The caller side pushes the 'v2i32' on the stack as a 'v4i32' as expected
(with two elements of rubbish), but when 'va_arg' is used in the callee for
the type 'v2i32' it fails in this way.
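
As a rough illustration of the layout described above, here is a minimal standalone C++ sketch (not LLVM code) of a callee reading a narrow vector from a full-width argument slot. The 16-byte slot size, the lane order, and the names `kSlotBytes` and `readNarrowVector` are assumptions made for illustration only; they are not taken from any real target ABI.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Assumed slot size: one native 128-bit vector (v4i32). Hypothetical value.
constexpr std::size_t kSlotBytes = 16;

// Reads 'numElts' 32-bit lanes from the slot at 'cursor' in the argument
// area, then advances 'cursor' by a whole slot so that the next read starts
// at the following argument rather than in the padding lanes.
std::vector<std::int32_t> readNarrowVector(const unsigned char *argArea,
                                           std::size_t &cursor,
                                           std::size_t numElts) {
  assert(numElts * sizeof(std::int32_t) <= kSlotBytes &&
         "vector must fit in one slot");
  std::vector<std::int32_t> lanes(numElts);
  std::memcpy(lanes.data(), argArea + cursor,
              numElts * sizeof(std::int32_t));
  cursor += kSlotBytes; // Skip the rubbish lanes as well
  return lanes;
}
```

The point of the sketch is that the read has to advance by the widened size even though only the low lanes are meaningful, which is what a widened VAARG would be expected to do.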

This happens for all vectors that do not match a native vector, so it
applies to vectors that should be widened and vectors that should be split.

Passing vectors to variadic functions is probably a fairly strange thing to
do, but it occurs in the OpenCL variant of 'printf'.

Has anybody seen this kind of problem before, and does anyone have any
recommendations I could use to identify the cause of this problem so that I
can figure out where best to fix it?

Thanks,

  Martin O'Riordan (Movidius Ltd.)

I have been having trouble getting 'va_arg' to work with vector parameters.
It seems to work fine for each of the native vector types in our processor
(128-bit vectors, of 8, 16 and 32 bit elements). But if I throw something
like 'v2i32' into the mix I get a crash in the type legalizer. This is all
with respect to the v3.5 release, but the same seems to be the case in the
pending v3.6 release.

Examining how other targets do this doesn't seem to show that I am handling
VAARG any differently, and I have registered a custom lowering action for
ISD::VAARG. However, the crash occurs a lot earlier than lowering.

What I can't tell for sure is whether this is just absent functionality in
LLVM, or whether I have missed some hook or TD description necessary to make
this work.

In C an example that fails is:

  typedef int __attribute__((ext_vector_type(2))) int2;
  ...
  int2 vt = va_arg(vl, int2);

This results in LLVM IR that dumps as:

  %1 = va_arg i8** %vl, <2 x i32>

But if I rewrite the C as:

  // v2i32 is not natively supported
  typedef int __attribute__((ext_vector_type(2))) int2;
  // v4i32 is natively supported
  typedef int __attribute__((ext_vector_type(4))) int4;
  ...
  int2 vt = (int2)va_arg(vl, int4).s01; // Or '.xy'

it all works exactly as expected. This form results in the following IR:

  %1 = va_arg i8** %vl, <4 x i32>
  %2 = shufflevector <4 x i32> %1, <4 x i32> undef, <2 x i32> <i32 0, i32 1>
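
To make the lane selection in that 'shufflevector' concrete, here is a minimal standalone C++ model of the instruction's semantics. It is not LLVM code, and the names `Lane` and `shuffleLanes` are hypothetical. A mask index below the input length picks a lane from the first input, an index at or above it picks from the second input, and -1 produces an undefined lane.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// A lane is either a defined 32-bit value or undefined.
struct Lane {
  bool defined;
  int value;
};

// Standalone model of shufflevector: mask index i in [0, N) selects lane i
// of 'a', an index in [N, 2N) selects lane (i - N) of 'b', and -1 yields an
// undefined lane. The result has one lane per mask entry.
std::vector<Lane> shuffleLanes(const std::vector<Lane> &a,
                               const std::vector<Lane> &b,
                               const std::vector<int> &mask) {
  assert(a.size() == b.size() && "inputs must have the same lane count");
  const int n = static_cast<int>(a.size());
  std::vector<Lane> result;
  result.reserve(mask.size());
  for (int idx : mask) {
    assert(idx < 2 * n && "mask index out of range");
    if (idx < 0)
      result.push_back(Lane{false, 0}); // Undefined lane
    else if (idx < n)
      result.push_back(a[static_cast<std::size_t>(idx)]);
    else
      result.push_back(b[static_cast<std::size_t>(idx - n)]);
  }
  return result;
}
```

With a four-lane input and the mask {0, 1}, the model returns just the two low lanes, matching the '<2 x i32>' result in the IR above.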

The problem is harder when it is a 'v3i32' because there is no 'MVT::v3i32'
declaration.

Looking at how the scalar 'va_arg's are legalized for inspiration, integers
use 'ExpandRes_VAARG' and 'PromoteIntRes_VAARG', while floating-point uses
'ExpandRes_VAARG' and 'SoftenFloatRes_VAARG', but there is no equivalent
handling for vectors.

I thought I'd add handlers for vectors to see if I could achieve the IR
transformation above, but still haven't had any luck. My attempted
resolution for widening follows this message, but I have not yet attempted a
solution for the splitting variant 'DAGTypeLegalizer::SplitVecRes_VAARG'.

So am I just not seeing the wood for the trees, or is this scenario just not
yet implemented in LLVM?

Thanks in advance for any insights,

    Martin O'Riordan - Movidius Ltd.

====== My Attempt at Resolving This ======

To 'class DAGTypeLegalizer' in 'LegalizeTypes.h' I added the declaration:

SDValue WidenVecRes_VAARG(SDNode *N);

and to the switch in 'DAGTypeLegalizer::WidenVectorResult()' in
'LegalizeVectorTypes.cpp' I added a case:

case ISD::VAARG: Res = WidenVecRes_VAARG(N); break;

Finally I implemented 'DAGTypeLegalizer::WidenVecRes_VAARG()' in
'LegalizeVectorTypes.cpp' as follows:

SDValue DAGTypeLegalizer::WidenVecRes_VAARG(SDNode *N) {
   assert(N->getValueType(0).isVector() && "Operand must be a vector");
#ifndef NDEBUG
   dbgs() << "DAGTypeLegalizer::WidenVecRes_VAARG: Before widening:\n";
   N->dump(&DAG);
   dbgs() << "\n";
#endif
   SDValue Chain = N->getOperand(0); // Get the chain
   SDValue Ptr = N->getOperand(1); // Get the pointer
   EVT VT = N->getValueType(0); // Get the requested type
   EVT WidenVT = TLI.getTypeToTransformTo(*DAG.getContext(),
                                          N->getValueType(0));

   SDLoc dl(N);
   MVT RegVT = TLI.getRegisterType(*DAG.getContext(), VT);

   // Construct a VAARG chain with the replacement type
   Chain = DAG.getVAArg(RegVT, dl, Chain, Ptr, N->getOperand(2),
     N->getConstantOperandVal(3));

   // Now use a vector shuffle to extract the elements required from the
   // widened vector. Create a mask of the elements to select from the
   // vector

   const unsigned NumElts = VT.getVectorNumElements();
   const unsigned WidenNumElts = WidenVT.getVectorNumElements();

   // Adjust mask based on new input vector length
   SmallVector<int, 16> NewMask;
   for (unsigned i = 0; i != NumElts; ++i)
     NewMask.push_back(i); // Select this element
   for (unsigned i = NumElts; i != WidenNumElts; ++i)
     NewMask.push_back(-1); // Set this element to undefined

   SDValue Res = DAG.getVectorShuffle(WidenVT, dl, Chain,
                                      DAG.getUNDEF(WidenVT),
                                      NewMask.data());

   // Modified the chain result - switch anything that used the old chain to
   // use the new one
   ReplaceValueWith(SDValue(N, 1), Chain);

#ifndef NDEBUG
   dbgs() << "DAGTypeLegalizer::WidenVecRes_VAARG: After widening:\n >>>> ";