Just to be clear, you don’t think it has potential because it’s been
designed into an inner-loop corner and would take extensive rewriting to
handle OLV?
First of all, if one wants to do dependence analysis for OLV (which is what I’m currently working on), they may get away
with extending LAA, or writing their own OLV checker.
That said, no, it doesn’t seem that the current LAA can help. But the fact that we can’t support OLV currently
is not the core problem for me. We may do it eventually, but if the implementation is still hacky,
it would still be bad, I think. The core problem is that there’s no clear theoretical foundation to support it.
The code seems to me like a set of “Oh, we needed to handle this case so we added an if
there”.
To me, that is not the correct way to go, and it’s pretty much the opposite of DA.
(Again, I don’t mean to criticize LAA implementors - they probably had their reasons and I guess
they know way more than me)
it certainly seems like extending DA is the way to go
Maybe but maybe not. Maybe a good theory for run-time checks ends up
being different than DA’s ideas (this is partly what I have to do this summer).
In any case, extending
but I’d like to hear from the current vectorizer
maintainers because I don’t have enough knowledge to make an informed
judgment.
Me too!
There’s the VPlan infrastructure which I have not heard much about for
several months. What is going on with that? Yes, that’s a vector
codegen issue but it may be useful to have a more complete picture of
how all this works or will work.
Sorry, I don’t know much about VPlan. I’m involved in the RV (https://github.com/cdl-saarland/rv)
Note that when the development of LAA started it also did static checks only, even though DA already existed in the code base
Interesting, thanks.
Thanks for sharing your analysis.
No problem 
I am not sure if that is an entirely fair characterization of LAA. LAA is being used by the vectorizer (and other passes) in production for a few years now. None of the in-tree users of DA seem to be enabled by default and therefore LAA probably has an order of magnitude more testing, bug fixes & tuning.
No argument there. The fact that it is used doesn’t necessarily mean that it’s clean or that it has some strong theory supporting it.
DA’s implementation might be cleaner, but as mentioned earlier, DA handles only a small subset of things LAA handles and hence I am not sure comparing the code-complexity is too helpful.
DA does not handle a small subset of LAA’s checks, unless I miss something. It handles way more when it comes to static checking.
I think that comparing code complexity is important. DA is about double the size of LAA, yet it’s way more understandable. And I don’t think
the reason is that it does something more trivial. Rather, it’s based on a clear paper and implements it clearly.
IMO a lot of LAA complexity comes from things DA does not handle, in particular runtime check generation.
I agree.
LAA also analyses & processes a whole loop, whereas DA only checks dependences between 2 memory accesses. LAA also decides whether it is profitable to generate runtime checks.
It processes innermost loops only, and I’m not sure that handling a whole loop rather than independent accesses is a good path. For LAA’s usage it’s necessary, but it creates a form of coupling (and complexity).
There is definitely potential for improving the structure & organization of LAA, as well as improving the documentation. Happy to collaborate on that.
Are we really sure of that? Personally, I was thinking of submitting a patch but I’m not sure it is worth the effort. However, I’m glad to hear that you’re happy to collaborate. 
We can talk about that more if you want.
I am not convinced it makes sense to add runtime check generating to DA directly, because I don’t think the static dependence checks really need to be strongly coupled with runtime-check generati
I agree with the latter, maybe with the former. In any case, I’d like to see the current DA stay as it is, and to move the discussion to “what is the future of run-time checks”. That could be
extending LAA, extending DA, or something else completely.
To clarify, LAA does static checks and only generates runtime checks if it cannot prove statically that the dependence is safe for vectorization. Granted, the static checks mostly boil down to distance computations on SCEV expressions, but for the current use cases it seems to work well enough.
Yes, sorry for not stressing that LAA does static checks too. As I said though, in my understanding they’re very weak, yet still enough for its usage. And I agree that LAA’s capabilities are probably enough for innermost loop vectorization.
The important thing I believe is the future.
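To make the "distance computations" mentioned above concrete, here is a minimal sketch of a distance-based static check. It is not LAA’s actual code: the `Access` struct and `isSafeForVF` are hypothetical names, and the model assumes both accesses have the form `A[i + offset]` with unit stride into the same array. It is deliberately conservative and ignores the direction of the dependence.

```cpp
#include <cstdint>
#include <cstdlib>

// Hypothetical, simplified model of one memory access in a loop over i:
// the access touches A[i + offset] (unit stride, same base array).
struct Access {
  int64_t offset; // constant offset relative to the induction variable
  bool isWrite;   // true for a store, false for a load
};

// Returns true if, judged only by the constant distance between the two
// accesses, vectorizing with width `vf` cannot make two lanes of the same
// vector iteration touch the same element.
bool isSafeForVF(const Access &a, const Access &b, int64_t vf) {
  // Two loads never conflict with each other.
  if (!a.isWrite && !b.isWrite)
    return true;
  // Dependence distance in elements.
  int64_t dist = std::llabs(a.offset - b.offset);
  // Distance 0: the dependence stays within one scalar iteration.
  // Distance >= vf: the conflicting iterations land in different vector iterations.
  return dist == 0 || dist >= vf;
}
```

For example, under these assumptions `A[i] = A[i - 1] + 1` has distance 1 and is rejected for a width of 4, while `A[i + 4] = A[i]` has distance 4 and is accepted.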
It might be feasible to use DA for the static checks in LAA. That might help for a few multi-dimensional cases, but in practice generating proper runtime-checks for multi-dimensional cases is probably more important, due to aliasing issues.
Well… I tried that and it doesn’t seem to be very useful, unfortunately. The way arrays are defined in C/C++ is probably why DA is not that useful: a row can alias with another row in 2D arrays. The theory behind DA
would be quite powerful if we knew that they don’t alias. Right now, it just gives up.
I don’t think that LAA can handle multi-dimensional cases either though, nor do I have a good idea about how to do it myself (in or out of LAA / DA).
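As a small illustration of the row-aliasing problem above, here is a sketch that assumes a row-major 2D array accessed through a flat pointer as `A[i * cols + j]`. The helper names `linearIndex` and `touchSameElement` are hypothetical. Unless the analysis can prove `j < cols`, two different subscript pairs may name the same element.

```cpp
#include <cstddef>

// Linearized index of element (i, j) in a row-major array with `cols` columns.
constexpr std::size_t linearIndex(std::size_t i, std::size_t j, std::size_t cols) {
  return i * cols + j;
}

// Two subscript pairs refer to the same memory exactly when their
// linearized indices are equal, even if the pairs themselves differ.
constexpr bool touchSameElement(std::size_t i1, std::size_t j1,
                                std::size_t i2, std::size_t j2,
                                std::size_t cols) {
  return linearIndex(i1, j1, cols) == linearIndex(i2, j2, cols);
}
```

With `cols = 4`, the pairs `(0, 5)` and `(1, 1)` both linearize to index 5, so a per-dimension dependence test that treats rows as disjoint would be wrong without a bound on `j`.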
Best,
Stefanos
On Wed, Jul 8, 2020 at 12:48 AM, Florian Hahn <florian_hahn@apple.com> wrote: