Extend SLPVectorizer to struct operations that are isomorphic to vector operations?

While playing with SLPVectorizer, I noticed that it will happily vectorize cases involving extractelement/insertelement, but won't vectorize isomorphic cases involving extractvalue/insertvalue (such as the attached example). Is that something that would be straightforward to add to SLPVectorizer, or are there some hard issues? In particular, the transformation would seem to require casts of structures to vectors and back, but the bitcast instruction requires a non-aggregate value.

I'm thinking such vectorization might be useful for codes that use structs for tuples, like (x,y,z) coordinates or complex numbers.
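
For readers without the attachment, here is a small sketch of the kind of isomorphic pair being described. It is not the contents of vec4.ll; the function names `@add_vec` and `@add_struct` and the type name `%quad` are made up for illustration. Both functions do the same four scalar adds, and differ only in whether the lanes come from a vector or from a struct.

```llvm
; Hypothetical illustration, not the attached vec4.ll.
%quad = type { float, float, float, float }

; Vector form: lanes are read with extractelement and rebuilt with insertelement.
define <4 x float> @add_vec(<4 x float> %a, <4 x float> %b) {
  %a0 = extractelement <4 x float> %a, i32 0
  %a1 = extractelement <4 x float> %a, i32 1
  %a2 = extractelement <4 x float> %a, i32 2
  %a3 = extractelement <4 x float> %a, i32 3
  %b0 = extractelement <4 x float> %b, i32 0
  %b1 = extractelement <4 x float> %b, i32 1
  %b2 = extractelement <4 x float> %b, i32 2
  %b3 = extractelement <4 x float> %b, i32 3
  %s0 = fadd float %a0, %b0
  %s1 = fadd float %a1, %b1
  %s2 = fadd float %a2, %b2
  %s3 = fadd float %a3, %b3
  %r0 = insertelement <4 x float> undef, float %s0, i32 0
  %r1 = insertelement <4 x float> %r0, float %s1, i32 1
  %r2 = insertelement <4 x float> %r1, float %s2, i32 2
  %r3 = insertelement <4 x float> %r2, float %s3, i32 3
  ret <4 x float> %r3
}

; Struct form: same arithmetic, but fields are read with extractvalue
; and the result is rebuilt with insertvalue.
define %quad @add_struct(%quad %a, %quad %b) {
  %a0 = extractvalue %quad %a, 0
  %a1 = extractvalue %quad %a, 1
  %a2 = extractvalue %quad %a, 2
  %a3 = extractvalue %quad %a, 3
  %b0 = extractvalue %quad %b, 0
  %b1 = extractvalue %quad %b, 1
  %b2 = extractvalue %quad %b, 2
  %b3 = extractvalue %quad %b, 3
  %s0 = fadd float %a0, %b0
  %s1 = fadd float %a1, %b1
  %s2 = fadd float %a2, %b2
  %s3 = fadd float %a3, %b3
  %r0 = insertvalue %quad undef, float %s0, 0
  %r1 = insertvalue %quad %r0, float %s1, 1
  %r2 = insertvalue %quad %r1, float %s2, 2
  %r3 = insertvalue %quad %r2, float %s3, 3
  ret %quad %r3
}
```

The four `fadd` instructions are identical in both functions; only the instructions that feed them and consume their results differ.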

- Arch D. Robison

vec4.ll (837 Bytes)

Vectorization of struct values is not supported because it is not something we considered until now. It never showed up in any workload I looked at. It should not be too difficult to implement. We already insert casts when we vectorize loads and stores from memory.

So, the first thing to understand (for Arch who may not have this context)
is that almost all insertvalue/extractvalue instructions should be
optimized out of the IR long before any vectorizer sees it. The SROA pass
completely removes these instructions.

The only time they are likely to show up is when forming (or decomposing)
aggregates passed or returned by value at the LLVM IR level due to ABI
concerns. It would indeed be nice if the SLPVectorizer could vectorize
through these so that we end up with vector code and a tiny scalar peel
right at the ABI boundary where we need to arrange the elements into
specific registers.
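
To make the "vector code plus a tiny scalar peel" idea concrete, here is a hand-written sketch of what the result might look like for a function that adds two four-float structs passed by value. This is not the output of any existing pass; `@add_struct_vectorized` and `%quad` are hypothetical names. Note that no bitcast of the aggregate is needed: the struct is taken apart and reassembled field by field at the boundary, and only the arithmetic in the middle is done on a vector.

```llvm
; Hand-written illustration of the desired shape, not actual pass output.
%quad = type { float, float, float, float }

define %quad @add_struct_vectorized(%quad %a, %quad %b) {
  ; Scalar peel on entry: pull the fields out of the by-value aggregates.
  %a0 = extractvalue %quad %a, 0
  %a1 = extractvalue %quad %a, 1
  %a2 = extractvalue %quad %a, 2
  %a3 = extractvalue %quad %a, 3
  %b0 = extractvalue %quad %b, 0
  %b1 = extractvalue %quad %b, 1
  %b2 = extractvalue %quad %b, 2
  %b3 = extractvalue %quad %b, 3

  ; Pack the fields into vectors.
  %va0 = insertelement <4 x float> undef, float %a0, i32 0
  %va1 = insertelement <4 x float> %va0, float %a1, i32 1
  %va2 = insertelement <4 x float> %va1, float %a2, i32 2
  %va3 = insertelement <4 x float> %va2, float %a3, i32 3
  %vb0 = insertelement <4 x float> undef, float %b0, i32 0
  %vb1 = insertelement <4 x float> %vb0, float %b1, i32 1
  %vb2 = insertelement <4 x float> %vb1, float %b2, i32 2
  %vb3 = insertelement <4 x float> %vb2, float %b3, i32 3

  ; The actual work: one vector add instead of four scalar adds.
  %vs = fadd <4 x float> %va3, %vb3

  ; Scalar peel on exit: unpack the lanes and rebuild the struct to return.
  %s0 = extractelement <4 x float> %vs, i32 0
  %s1 = extractelement <4 x float> %vs, i32 1
  %s2 = extractelement <4 x float> %vs, i32 2
  %s3 = extractelement <4 x float> %vs, i32 3
  %r0 = insertvalue %quad undef, float %s0, 0
  %r1 = insertvalue %quad %r0, float %s1, 1
  %r2 = insertvalue %quad %r1, float %s2, 2
  %r3 = insertvalue %quad %r2, float %s3, 3
  ret %quad %r3
}
```

With a single add, the packing and unpacking obviously cost more than they save; the shape only pays off when there is a longer chain of arithmetic between the entry and exit peels.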

Thanks for the information about SROA. It was missing from the Julia pass list, though adding it didn’t help a larger example where the structs were being loaded and stored to/from memory. I’ll have to poke around to figure out what scared it off (or maybe I misplaced the SROA pass).

- Arch