vectorize a def-use chain

I'd like to replace scalar instructions with vector instructions so that the code corresponding to the following tree is vectorized:

  load0   load1
      \   /
       \ /
   add/sub/mul
        |
      store

I tried load0->replaceAllUsesWith(vec_load0), but it complains about mismatching types (which makes sense: the add's other operand is still scalar at that point).

Is the only way to create a vectorized version of this tree to:

1) create the vector loads vec_load0 AND vec_load1
2) retrieve the opcode of the arithmetic instruction and create a new instruction according to that opcode

How is step 2) done in practice? By cloning the instruction and replacing the operands? (Would that bypass the type checking for a moment?)

Thanks,
Frank

Hi Frank,

The SLP vectorizer does something like this. For question (2), take a look at BoUpSLP::vectorizeTree(TreeEntry *E) (in lib/Transforms/Vectorize/SLPVectorizer.cpp); it also has an example of retrieving an instruction's opcode and constructing vector operands.
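
To make the "vector operands first, then a new instruction with the same opcode" idea concrete, here is a minimal sketch in plain C++. It is a toy model and not LLVM code: Node, Graph, Opcode and vectorizeTree below are hypothetical stand-ins, and a "type" is reduced to a lane count. It only illustrates the order of operations, under the assumption that every node in the tree can simply be rebuilt at the wider width.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// Toy stand-ins for IR concepts; none of these are LLVM classes.
enum class Opcode { Load, Add, Sub, Mul, Store };

struct Node {
    Opcode op;
    unsigned lanes;                  // 1 = scalar, >1 = vector width
    std::vector<Node*> operands;
};

struct Graph {
    std::vector<std::unique_ptr<Node>> nodes;   // owns every node

    Node* make(Opcode op, unsigned lanes, std::vector<Node*> ops = {}) {
        // Mimic a type check: operands must have the same lane count
        // as the node being created.
        for (Node* o : ops)
            assert(o->lanes == lanes && "operand type mismatch");
        nodes.push_back(std::make_unique<Node>(Node{op, lanes, std::move(ops)}));
        return nodes.back().get();
    }
};

// Build a vector copy of the tree rooted at `n`. Operands are rebuilt
// first, so the type check in make() never sees a scalar/vector mix.
// The scalar tree is left untouched.
Node* vectorizeTree(Graph& g, const Node* n, unsigned width) {
    std::vector<Node*> vecOps;
    for (const Node* o : n->operands)
        vecOps.push_back(vectorizeTree(g, o, width));
    // Same opcode as the scalar node, wider type.
    return g.make(n->op, width, std::move(vecOps));
}
```

Calling vectorizeTree on the store yields a separate vector tree; in this toy model the old scalar nodes would then be dead and could be erased.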

Michael

Hi Frank,

I'm assuming you are widening the loads for your vectorization, so what I would do is something like this:

Step #1: Widen the load and extract:
   vec_load0      load1
       |           /
extract_element   /
          \      /
           \    /
        add/sub/mul
             |
           store

Step #2: Do the same for the second load.
   vec_load0        vec_load1
       |                |
extract_element  extract_element
           \        /
            \      /
          add/sub/mul
               |
             store

Step #3: Promote the arithmetic operation to a vector one (mutate type).
vec_load0   vec_load1
       \     /
        \   /
   vec_add/sub/mul
          |
   extract_element
          |
        store

Step #4: Promote the store and get rid of the extract.
vec_load0   vec_load1
       \     /
        \   /
   vec_add/sub/mul
          |
      vec_store

Note: There is a class in codegen prepare that helps promote a scalar operation to a vector operation. It was designed to be reusable, so check it out; if it helps, we could move its implementation elsewhere.

How is step 2) done in practice?

What we did in codegen prepare is this: we create a new operation (e.g., extract_element undef) and use it to replace all the users of the initial operation. Then we mutate the type of the initial operation and use it in the new operation (e.g., extract_element undef becomes extract_element vec_op).
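
The placeholder trick can be sketched as follows. This is a toy model in plain C++ and not the codegen prepare implementation: Node, Graph and promoteToVector are hypothetical names, types are reduced to lane counts, and the sketch assumes the operation's operands are already extract_elements of vectors (as after Step #2 above).

```cpp
#include <algorithm>
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// Toy stand-ins for IR concepts; none of these are LLVM classes.
enum class Opcode { Undef, VecLoad, Add, ExtractElement, Store };

struct Node {
    Opcode op;
    unsigned lanes;                  // 1 = scalar, >1 = vector width
    std::vector<Node*> operands;
};

struct Graph {
    std::vector<std::unique_ptr<Node>> nodes;   // owns every node

    Node* make(Opcode op, unsigned lanes, std::vector<Node*> ops = {}) {
        nodes.push_back(std::make_unique<Node>(Node{op, lanes, std::move(ops)}));
        return nodes.back().get();
    }

    // Type-checked replacement: both values must have the same type.
    void replaceAllUsesWith(Node* from, Node* to) {
        assert(from->lanes == to->lanes && "mismatching types");
        for (auto& n : nodes)
            std::replace(n->operands.begin(), n->operands.end(), from, to);
    }
};

// Promote the scalar operation `op` to a vector operation in place.
// Returns the extract_element that now feeds op's former users.
Node* promoteToVector(Graph& g, Node* op, unsigned width) {
    // 1. Placeholder: a scalar extract_element of an undef vector.
    Node* undefVec = g.make(Opcode::Undef, width);
    Node* extract = g.make(Opcode::ExtractElement, 1, {undefVec});

    // 2. Scalar replaced by scalar, so the type check passes.
    g.replaceAllUsesWith(op, extract);

    // 3. op has no users left; feed it the vectors directly and
    //    mutate its type (assumes operands are extract_elements).
    for (Node*& o : op->operands) {
        assert(o->op == Opcode::ExtractElement);
        o = o->operands[0];
    }
    op->lanes = width;

    // 4. extract_element undef becomes extract_element vec_op.
    extract->operands[0] = op;
    return extract;
}
```

The point of the ordering is that the type-checked replacement only ever swaps a scalar for a scalar; the type of the operation changes only while it has no users.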

Cheers,
-Quentin

Two ways I’ve done this:

  1. Create a ReplaceAllUsesWithUnsafe() function that does the same as ReplaceAllUsesWith but without the check for compatible types, knowing that you will eventually end up with fully compatible instructions.
  2. Do a recursive traversal up the use-def chain from the store back through all the args. This ensures that at every instruction, all the args have already been vectorized by the recursion, and all you really need to do at that point is call Instruction::mutateType() so that the return type of the instruction matches the vector args.
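
A minimal sketch of the second approach, again as a toy model in plain C++ and not LLVM code. Node and vectorizeInPlace are hypothetical, types are reduced to lane counts, and the assignment to `lanes` stands in for Instruction::mutateType(). The toy treats loads as if they could simply be widened in place, which glosses over what a real load would need, and it assumes nothing outside the tree uses these values.

```cpp
#include <cassert>
#include <vector>

// Toy stand-ins for IR concepts; none of these are LLVM classes.
enum class Opcode { Load, Add, Sub, Mul, Store };

struct Node {
    Opcode op;
    unsigned lanes;                  // 1 = scalar, >1 = vector width
    std::vector<Node*> operands;
};

// Post-order walk from the store up the use-def chain: every operand
// is widened before the node that uses it.
void vectorizeInPlace(Node* n, unsigned width) {
    if (n->lanes == width)
        return;                      // already widened (shared operand)

    for (Node* o : n->operands)
        vectorizeInPlace(o, width);

    // All operands are vectors by now, so changing this node's type
    // leaves it consistent with them.
    for (Node* o : n->operands)
        assert(o->lanes == width);
    n->lanes = width;                // stand-in for mutateType()
}
```

Because the nodes are changed in place, no replaceAllUsesWith call is needed in this sketch; the cost is that the tree is briefly inconsistent while the recursion is in progress.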

Thanks
Jason