Vectorizing min/max reduction pattern

madhur13490 · April 9, 2025, 9:54am

Hi @Mel-Chen, @fhahn
@rjj and I are looking at hotloop in polyhedron benchmark, particularly rnflow.f90. It is in Fortran, but here is the equivalent C version.

int func(int *restrict arr, int N) {
  int minloc = 0;
  for (int i = 0; i < N; ++i)
    if (arr[i] < arr[minloc])
      minloc = i;
  return minloc;
}

Godbolt link: https://godbolt.org/z/qWsWWr3Mq

I have pasted the example on the related issue, too: [GVN] Missed partial redundancy elimination opportunity of load through select instruction · Issue #58569 · llvm/llvm-project · GitHub

This is highlighted in the context of GVN/PRE, but we’re looking at vectorizing this as it offers more gains.

I saw Mel-Chen’s talk in EuroLLVM 2023 and the pending patch ⚙ D143465 [LoopVectorize] Vectorize the reduction pattern of integer min/max with index..

We’d like to know if there are any plans to progress on this, as it has been pending since 2023.

On the other hand, if this is not relevant anymore, then @fhahn is there any work planned in VPlan for handling such cases?

Returning the index and the value both was the crux of Mel’s patch but we are looking at a slightly reduced case.

Thanks
Madhur

Mel-Chen · April 10, 2025, 10:08am

Hi, @madhur13490,
The project for min/max with index is still ongoing. Its prerequisite, the FindLastIV reduction, only recently became available in LLVM 20, and @fhahn just stabilized it.

The loop you mentioned falls under the category of int min/max with the first index, though the min/max result itself is not used outside the loop. We’re currently redeveloping min/max with index in our downstream repo to align with the current VPlan design principles, and to support cases where the min/max result is not used outside the loop.

Mel

madhur13490 · April 10, 2025, 5:24pm

Thanks for the update @Mel-Chen

As a heads-up, we are working on a LoopIdiomVectorize-based solution for the above pattern. Ideally, VPlan would be good, but I’ll let you handle it since it is in a lot of development.

Mel-Chen · June 2, 2025, 6:53am

Hi, @madhur13490 ,
[LoopVectorize] Vectorize the reduction pattern of integer min/max with index. (2/2) by Mel-Chen · Pull Request #142335 · llvm/llvm-project · GitHub is the patch for your case. There are still some TODO items, but feel free to try it out and see if it fits your needs.

Mel

madhur13490 · June 3, 2025, 1:40pm

Hi @Mel-Chen
Thanks for the update.

The exact test case I am looking at is in the PR [LoopVectorize] Add test case for minloc reduction by madhur13490 · Pull Request #141556 · llvm/llvm-project · GitHub

I tried with your mentioned PR, but it doesn’t handle the case; it is still not vectorized and LV bails out. The test has two PHI nodes, and the structure is a bit different.

Do you think you can handle it in the same patch?

Mel-Chen · June 3, 2025, 2:34pm

After taking a closer look, I found that although [LoopVectorize] Add test case for minloc reduction by madhur13490 · Pull Request #141556 · llvm/llvm-project · GitHub and the C source you mentioned earlier differ slightly, they both fall outside the scope of what the current vectorizer can handle.
Compiler Explorer
The key issue is that the index recurrence phi has a loop-internal user that is not part of the recurrence chain:

%12 = phi i32 [ 0, %4 ], [ %20, %10 ]
%13 = getelementptr inbounds nuw i32, ptr %0, i64 %11
%14 = load i32, ptr %13, align 4, !tbaa !7
%15 = zext nneg i32 %12 to i64  ; <-- this user

This prevents it from forming a valid index reduction.
Currently, we can only vectorize semantically equivalent code like the following C source:

int func2(int *restrict arr, int N) {
  int minloc = 0;
  int min = arr[minloc];
  for (int i = 0; i < N; ++i)
    if (arr[i] < min) {
      min = arr[i];
      minloc = i;
    }
  return minloc;
}

Do you think you can handle it in the same patch?

This is not a easy pattern, I think.

Topic		Replies	Views
llvm 10: Why is float experimental_vector_reduce_fmin not tried? LLVM Dev List Archives	1	76	December 2, 2020
First attempt at recognizing pointer reduction LLVM Dev List Archives	13	74	October 24, 2013
How to efficiently vectorize linalg.generic reduction op? MLIR	2	382	January 9, 2024
loop vectorizer and storing to uniform addresses LLVM Dev List Archives	4	128	November 8, 2013
max/min intrinsics LLVM Dev List Archives	13	152	December 17, 2012

Vectorizing min/max reduction pattern

Related topics