I have a loop that fails to vectorize for AArch64 SVE due to an `llvm.vector.extract` intrinsic call that is both conditional and loop invariant. I expected LICM to speculatively hoist the call, and a godbolt example shows that doing so is sufficient to trigger vectorization. LICM fails to hoist the call because these intrinsics are not currently marked `IntrSpeculatable`. My question is: should they be?
The text in the language reference is ambiguous enough that the answer is not obvious. Specifically:

> `idx` must be a constant multiple of the known-minimum vector length of the result type. If the result type is a scalable vector, `idx` is first scaled by the result type's runtime scaling factor. Elements `idx` through `(idx + num_elements(result_type) - 1)` must be valid vector indices. If this condition cannot be determined statically but is false at runtime, then the result vector is undefined.
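To make the constraint concrete, here is a minimal sketch (invented names, not my original test case) of extracting a fixed-width subvector from a scalable vector. With a `<vscale x 4 x i32>` source and a `<4 x i32>` result, the result type's known-minimum length is 4, so `idx` must be a constant multiple of 4; whether the extracted elements actually exist depends on the runtime value of `vscale`.

```llvm
declare <4 x i32> @llvm.vector.extract.v4i32.nxv4i32(<vscale x 4 x i32>, i64 immarg)

define <4 x i32> @extract_second_quadword(<vscale x 4 x i32> %vec) {
  ; idx = 4 is a constant multiple of the result's known-minimum length (4).
  ; The source has 4 * vscale elements, so elements 4..7 exist only when
  ; vscale >= 2; otherwise, per the current LangRef wording, "the result
  ; vector is undefined" -- and it is exactly this phrase that is ambiguous.
  %sub = call <4 x i32> @llvm.vector.extract.v4i32.nxv4i32(<vscale x 4 x i32> %vec, i64 4)
  ret <4 x i32> %sub
}
```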
I suspect the intent is for “result vector is undefined” to mean one of two things:
- The result vector is a poison value, similar to how `extractelement` and `insertelement` are defined.
- The behavior is undefined. This seems to have been the conclusion for a similar issue with the vector predication intrinsics, which suffer from the same ambiguity in the language reference. See D125296.
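The distinction matters directly for the transform in question. A hypothetical before-hoisting loop (names invented for illustration) might look like this:

```llvm
declare <4 x i32> @llvm.vector.extract.v4i32.nxv4i32(<vscale x 4 x i32>, i64 immarg)

define void @before(<vscale x 4 x i32> %v, i1 %c, ptr %p, i64 %n) {
entry:
  br label %loop
loop:
  %i = phi i64 [ 0, %entry ], [ %i.next, %latch ]
  br i1 %c, label %then, label %latch
then:
  ; Loop-invariant, conditional call. If an out-of-range idx merely yields
  ; poison, LICM may hoist this call to the preheader: the poison result
  ; feeds only the originally guarded store, so no new UB is introduced.
  ; If the call itself has UB, speculating it onto the %c-is-false path
  ; would be illegal.
  %sub = call <4 x i32> @llvm.vector.extract.v4i32.nxv4i32(<vscale x 4 x i32> %v, i64 4)
  store <4 x i32> %sub, ptr %p
  br label %latch
latch:
  %i.next = add i64 %i, 1
  %done = icmp eq i64 %i.next, %n
  br i1 %done, label %exit, label %loop
exit:
  ret void
}
```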
If it's the former, we should be able to mark these intrinsics `IntrSpeculatable`, right? Targets will need to generate code that doesn't crash or otherwise misbehave when `idx` is out of range, but if I am reading the code correctly, the target-independent SelectionDAG legalization of these intrinsics is already doing that. Here is another godbolt example showing the generated code for AArch64.
Any guidance on how to proceed would be appreciated. At a minimum, I would like to tighten up the specification in the language ref.
Thanks,
Dave Kreitzer