[LV][VPlan] Status Update on VPlan ----- where we are currently, and what's ahead of us

Hi Hideki/Ayal/Gil et al.,

First of all, thank you very much for the (past, current and future)
efforts in the vectoriser. It's much appreciated!

With the first patch, we introduced the concept of VPlan to LV and started explicitly recording decisions such as interleaved memory access optimization and serialization. In that patch, we resisted introducing VPInstructions and introduced VPRecipes instead, in an attempt to avoid duplicating Instructions in the abstract HCFG representation (i.e., abstract Instructions in the HCFG that are separate from the incoming IR Instructions). As we moved on, it became more and more apparent that we need to introduce new abstract Instructions (see https://reviews.llvm.org/D38676 for more details), which in turn requires representing new use-def relations that do not exist among the incoming IR Instructions. As a result, with the second patch, as part of explicitly modeling masking in VPlan, we introduced VPInstruction, which is an abstraction of an IR Instruction.

This was expected, as we move into a radically different model. I
think the current approach to implement & refactor is a good one and
we must continue like that. Pushing for too many features will break
the compiler and too much refactoring will break the spirits of
everyone involved.

Additional Work Needed to Handle Higher Complexity:
---------------------------------------------------
* Construct VPlan near the beginning of LV (right after Legal or Must-Vectorize directive check)

Additional Work Needed for Outer Loop Auto-Vectorization:
---------------------------------------------------------
* Legality check
* Cost modeling (compare it to the inner loop vectorization strategy in an apples-to-apples manner).

On these points, we may need to make it more clear what happens when.
There is an overall legality check, but there also may be
VPlan-specific legality issues (especially as we move to outer-loop
vectorisation) that will not be obvious before we create the VPlans.

I'm not too worried about illegal transformations made legal by VPlans
(for example Polyhedral or inner-loop LICM), but the other way round,
where we may break things outside a VPlan (for instance, A->C is legal
but A->B->C is not). I can't think of anything right now (why I used
"A" and "B"), but I'd welcome thoughts on the impact of more complex
VPlans on the whole legality->cost->transform model.

A summary of the current state of the VPlan infrastructure project is presented, and the remaining steps towards outer loop vectorization are listed. We are currently at a point where we can slow down the refactoring effort in order to expedite the big functionality boost, outer loop vectorization, and by doing so encourage more participation from the wider LLVM community in the refactoring effort, which should expedite the overall transition to the VPlan framework.

Sounds like a plan!

cheers,
--renato

Hi,

That sounds like an excellent idea! Any concrete ideas/plans on how people could get involved, besides doing reviews?

Let's talk about this in the context of the RFC: http://lists.llvm.org/pipermail/llvm-dev/2017-December/119523.html
The Divergence Analysis work mentioned there is a good example.

Thanks,
Hideki

This was expected, as we move into a radically different model. I think the current approach to implement & refactor is a good one and we must continue like that.

The outer loop vectorization implementation plan (http://lists.llvm.org/pipermail/llvm-dev/2017-December/119523.html) follows the same approach. Since outer loops were never
supported in LV, we can safely do everything on the VPlan infrastructure without causing any regressions in functionality or performance. That should help everyone think
about where the best places in VPlan are to fit the (remaining) non-VPlan aspects of LV.

On these points, we may need to make it more clear what happens when.
There is an overall legality check, but there also may be VPlan-specific legality issues (especially as we move to outer-loop
vectorisation) that will not be obvious before we create the VPlans.

I'm not too worried about illegal transformations made legal by VPlans (for example Polyhedral or inner-loop LICM), but the other way round, where we may break things outside a VPlan (for instance, A->C is legal but A->B->C is not). I can't think of anything right now (why I used "A" and "B"), but I'd welcome thoughts on the impact of more complex VPlans on the whole legality->cost->transform model.

I'm not 100% sure that you and I are talking about the same thing here, but one of the easy ways to transform a legal-to-vectorize loop into an illegal-to-vectorize loop is THEN and ELSE flipping
when a THEN ==> ELSE forward dependence exists. After vectorization legality is "ensured" (OpenMP simd is one such case, ensured by the programmer before clang parses the code), we can't flip THEN
and ELSE without making sure that no dependence exists between THEN and ELSE, and this is relevant in the inner loop vectorization scenario as well. I think we need to document
when and where the different kinds of vectorization legality assurance happen, and which transformations can break them before the actual vectorization transformation kicks in.
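
To make the THEN/ELSE flipping hazard concrete, here is a minimal C++ sketch. The loop body, the State struct, and the functions runScalar and runMasked are all hypothetical names made up for illustration; none of this is LV or VPlan code. It emulates if-converted (masked) execution one vector chunk at a time, on a loop where THEN in iteration i writes a[i + 1] and ELSE in iteration i + 1 reads it.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical loop body (illustration only):
//   if (cond[i]) a[i + 1] = x[i];   // THEN: writes a[i + 1]
//   else         b[i]     = a[i];   // ELSE: reads  a[i]
// THEN of iteration i feeds ELSE of iteration i + 1: a forward dependence.
struct State {
  std::vector<int> a; // n + 1 elements
  std::vector<int> b; // n elements
};

// Reference semantics: plain scalar execution in iteration order.
State runScalar(const std::vector<bool> &cond, const std::vector<int> &x,
                State s) {
  for (std::size_t i = 0; i < cond.size(); ++i) {
    if (cond[i])
      s.a[i + 1] = x[i];
    else
      s.b[i] = s.a[i];
  }
  return s;
}

// Emulates masked vector execution with vector length vl: within each chunk,
// one block runs for all of its active lanes before the other block starts.
// flipped == false keeps source order (THEN, then ELSE); true swaps them.
State runMasked(const std::vector<bool> &cond, const std::vector<int> &x,
                State s, std::size_t vl, bool flipped) {
  assert(vl > 0);
  const std::size_t n = cond.size();
  auto thenBlock = [&](std::size_t lo, std::size_t hi) {
    for (std::size_t i = lo; i < hi; ++i)
      if (cond[i]) s.a[i + 1] = x[i];
  };
  auto elseBlock = [&](std::size_t lo, std::size_t hi) {
    for (std::size_t i = lo; i < hi; ++i)
      if (!cond[i]) s.b[i] = s.a[i];
  };
  for (std::size_t base = 0; base < n; base += vl) {
    const std::size_t hi = std::min(base + vl, n);
    if (!flipped) {
      thenBlock(base, hi);
      elseBlock(base, hi);
    } else {
      elseBlock(base, hi); // reads a[] before THEN has written it
      thenBlock(base, hi);
    }
  }
  return s;
}
```

With an alternating condition (true, false, true, ...), the source-order variant reproduces the scalar result, while the flipped variant leaves stale values in b, because ELSE lanes read a[i] before the neighbouring THEN lane has stored to it.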

Thanks,
Hideki

Hello,

Just a minor comment.

  • Improve uniformity/divergence analysis: Uniformity in innermost loop vectorization is
    invariance. For outer loop vectorization, there are uniform values that are not invariant.

I believe that uniformity/divergence analysis is one of the key technologies for efficient vectorization, so I appreciate you bringing this up and look forward to an extensive and comprehensive framework here.

In fact, there is uniformity in inner loop vectorization that is not invariance. Expressions like a[i/16] are uniform under certain conditions (namely, i starts at 0 mod min(VL, 16) and 16 % VL == 0) while not being invariant. It is unfortunate for many media codes operating on blocks that the loop vectorizer (at least in my experience) cannot detect and harness this uniformity. I may even try to look into improving this if someone gives me pointers on where to start.
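
The a[i/16] condition can be checked by brute force with a small C++ sketch. The function isBlockIndexUniform is a hypothetical helper written only for this illustration; it is not how the loop vectorizer or any divergence analysis works. It enumerates the vector iterations and tests whether all VL lanes see the same value of i / block.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (illustration only): returns true when the index
// expression i / block has the same value in all vl lanes of every full
// vector iteration of a loop running i = start, start + 1, ..., n - 1.
bool isBlockIndexUniform(std::size_t start, std::size_t n, std::size_t vl,
                         std::size_t block) {
  assert(vl > 0 && block > 0);
  for (std::size_t base = start; base + vl <= n; base += vl) {
    const std::size_t first = base / block; // value seen by lane 0
    for (std::size_t lane = 1; lane < vl; ++lane)
      if ((base + lane) / block != first)
        return false; // two lanes index different blocks: not uniform
  }
  return true;
}
```

For example, isBlockIndexUniform(0, 64, 4, 16) holds (VL divides 16 and the start is aligned), while a start of 2, a VL of 3, or a VL of 32 each produce a vector iteration that straddles a block boundary.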

Regards,
Serge Preis

We are working with the Univ. of Saarland folks on this aspect. What you wrote is true (and you know I know that); I just didn't write too
many details in that one-liner explanation of why we need to work in that area, as I expect Simon Moll (U. Saarland) to be sending in his
RFC on this topic in the not too distant future. We think the Divergence Analysis (DA) code from the Region Vectorizer (RV) project has good potential
for reuse in the Outer Loop Vectorization project (RFC: http://lists.llvm.org/pipermail/llvm-dev/2017-December/119523.html), and good
divergence analysis should also help innermost loop vectorization (e.g., gather/scatter versus unit-stride).

I suggest first getting in touch with Simon if you are interested in this aspect of vectorization, to see what the DA in RV already has. Let us
know if you are also interested in outer loop vectorization. There are plenty of things for everyone interested.

Thanks,
Hideki