Hi @Mel-Chen, @fhahn
@rjj and I are looking at hotloop in polyhedron benchmark, particularly rnflow.f90. It is in Fortran, but here is the equivalent C version.
int func(int *restrict arr, int N) {
int minloc = 0;
for (int i = 0; i < N; ++i)
if (arr[i] < arr[minloc])
minloc = i;
return minloc;
}
Godbolt link: https://godbolt.org/z/qWsWWr3Mq
I have pasted the example on the related issue, too: [GVN] Missed partial redundancy elimination opportunity of load through select instruction · Issue #58569 · llvm/llvm-project · GitHub
This is highlighted in the context of GVN/PRE, but we’re looking at vectorizing this as it offers more gains.
I saw Mel-Chen’s talk in EuroLLVM 2023 and the pending patch ⚙ D143465 [LoopVectorize] Vectorize the reduction pattern of integer min/max with index..
We’d like to know if there are any plans to progress on this, as it has been pending since 2023.
On the other hand, if this is not relevant anymore, then @fhahn is there any work planned in VPlan for handling such cases?
Returning the index and the value both was the crux of Mel’s patch but we are looking at a slightly reduced case.
Thanks
Madhur
1 Like
Hi, @madhur13490,
The project for min/max with index is still ongoing. Its prerequisite, the FindLastIV reduction, only recently became available in LLVM 20, and @fhahn just stabilized it.
The loop you mentioned falls under the category of int min/max with the first index, though the min/max result itself is not used outside the loop. We’re currently redeveloping min/max with index in our downstream repo to align with the current VPlan design principles, and to support cases where the min/max result is not used outside the loop.
Mel
Thanks for the update @Mel-Chen
As a heads-up, we are working on a LoopIdiomVectorize
-based solution for the above pattern. Ideally, VPlan would be good, but I’ll let you handle it since it is in a lot of development.
Hi @Mel-Chen
Thanks for the update.
The exact test case I am looking at is in the PR [LoopVectorize] Add test case for minloc reduction by madhur13490 · Pull Request #141556 · llvm/llvm-project · GitHub
I tried with your mentioned PR, but it doesn’t handle the case; it is still not vectorized and LV bails out. The test has two PHI nodes, and the structure is a bit different.
Do you think you can handle it in the same patch?
After taking a closer look, I found that although [LoopVectorize] Add test case for minloc reduction by madhur13490 · Pull Request #141556 · llvm/llvm-project · GitHub and the C source you mentioned earlier differ slightly, they both fall outside the scope of what the current vectorizer can handle.
Compiler Explorer
The key issue is that the index recurrence phi has a loop-internal user that is not part of the recurrence chain:
%12 = phi i32 [ 0, %4 ], [ %20, %10 ]
%13 = getelementptr inbounds nuw i32, ptr %0, i64 %11
%14 = load i32, ptr %13, align 4, !tbaa !7
%15 = zext nneg i32 %12 to i64 ; <-- this user
This prevents it from forming a valid index reduction.
Currently, we can only vectorize semantically equivalent code like the following C source:
int func2(int *restrict arr, int N) {
int minloc = 0;
int min = arr[minloc];
for (int i = 0; i < N; ++i)
if (arr[i] < min) {
min = arr[i];
minloc = i;
}
return minloc;
}
Do you think you can handle it in the same patch?
This is not a easy pattern, I think.