Vectorization factor limitation in Loop Vectorizer

Shahid_Asghar-ahmad · December 11, 2014, 6:48am

Hi Nadav/Devs

I am exploring the Loop Vectorizer to vectorize i8 scalar operations into 8 x i8 vector operations.

I was expecting the Loop Vectorizer to analyze the profitability of a vectorization factor (VF) of 8. However, it does not do so because of the widest-type calculation done for the blocks inside the loop.

Maybe I am missing something, but I am curious: why does the Loop Vectorizer limit the profitability check to the widest type instead of also considering narrower types?

Regards,

Shahid

Nadav_Rotem1 · December 12, 2014, 9:16pm

Hi Shahid,

The vectorizer stops the search for profitable vectorization factors at the widest type because higher vectorization factors would require the compiler to split the vectorized value into multiple registers. The vectorizer’s cost model first tries to optimize for SIMD instruction utilization. Later, we optimize for ILP by doubling the vectorization factor (we call this “interleaving”), which exposes more instruction-level parallelism.
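
The bound described above can be sketched roughly as follows. This is a simplified model under stated assumptions, not the Loop Vectorizer's actual code: the helper names `maxVFForWidestType` and `registersPerValue` are hypothetical, and the register widths in the usage note are example values only.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper (not an LLVM API): upper bound on the vectorization
// factor when one vector of the widest scalar type must fit in one register.
// TypeBitsInLoop holds the bit widths of the types the analysis considers.
unsigned maxVFForWidestType(unsigned RegisterBits,
                            const std::vector<unsigned> &TypeBitsInLoop) {
  if (TypeBitsInLoop.empty())
    return 1; // nothing to measure, fall back to scalar
  unsigned WidestBits =
      *std::max_element(TypeBitsInLoop.begin(), TypeBitsInLoop.end());
  if (WidestBits == 0 || WidestBits > RegisterBits)
    return 1; // widest type does not fit in a register at all
  return RegisterBits / WidestBits;
}

// Hypothetical helper: number of registers one vectorized value occupies
// when VF elements of ElemBits each are packed into RegisterBits-wide registers.
unsigned registersPerValue(unsigned VF, unsigned ElemBits,
                           unsigned RegisterBits) {
  return (VF * ElemBits + RegisterBits - 1) / RegisterBits; // round up
}
```

Assuming 128-bit registers, a loop that touches both i8 and i32 values gets a bound of 4 under this model, while a loop that only touches i8 values gets 16. Picking VF = 16 for the mixed loop would make each i32 vector span four registers, which is the splitting mentioned above.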

Thanks,
Nadav

Shahid_Asghar-ahmad · December 13, 2014, 3:43pm

So IMO, if we use TTI to modify the VF calculation for targets/subtargets where a higher VF is supported, the vectorizer’s scope will become wider.

Do you foresee any issue with this?

Thanks,

Shahid

James_Molloy1 · December 13, 2014, 10:18pm

Hi Shahid,

Having looked into this in the past, the problem is that we’re not taking into account the fact that integer promotions and truncations will disappear during code generation. The canonicalisation phase likes to create arithmetic on i32 and wider types, which introduces extensions and truncations. However, the vectoriser can of course use i8 arithmetic in vectors, so those should go away.

Often this is the cause of insufficient vectorisation factors being picked, and we should fix it.
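
A small source-level illustration of where the wider arithmetic comes from, offered as a sketch rather than as the case James had in mind. The function name `addBytes` is made up for this example. The code only shows that the language promotes the byte operands to `int` and that the result is truncated back to 8 bits; whether a given backend then selects an 8-bit vector add is the point made above, not something this snippet demonstrates.

```cpp
#include <cstddef>
#include <cstdint>
#include <type_traits>

// uint8_t operands are promoted to int before the addition, so the sum is
// computed in a wider type even though both inputs are bytes.
static_assert(
    std::is_same<decltype(std::uint8_t{1} + std::uint8_t{2}), int>::value,
    "uint8_t + uint8_t has type int");

// Element-wise add of two byte arrays. Only the low 8 bits of each int sum
// survive the cast, so an 8-bit wrapping add would produce the same bytes.
void addBytes(const std::uint8_t *A, const std::uint8_t *B, std::uint8_t *C,
              std::size_t N) {
  for (std::size_t I = 0; I < N; ++I)
    C[I] = static_cast<std::uint8_t>(A[I] + B[I]);
}
```

In the IR for a loop like this, the loads and stores are i8 while the add may appear as i32 between an extension and a truncation. If the cost model charges for those conversions, or sizes the vector by the i32 intermediate, a smaller VF can look more attractive than it really is.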

James

Shahid_Asghar-ahmad · December 15, 2014, 3:25pm

Hi James,

Thanks for the clarification. I hope to come back once I have a fix.

Please let me know if you have any suggestions.

Thanks,

Shahid
