Hello all,
I believe I have found a bug in overload resolution for CUDA code. I am currently awaiting permission to post to the bug tracker.
The following code doesn’t compile with newer versions of clang:
template <typename T>
__device__ __host__ int foo(T *x) {
  return 1;
}

__device__ int foo(int *x) {
  return 2;
}

__host__ int foo(long *x) {
  return 3;
}

__device__ __host__ int bar() {
  auto long_val = 1l;
  return foo(&long_val);
}
Compiling with
clang++ -O2 -g -x cuda --cuda-gpu-arch=sm_61 -std=c++14 -o main -c main.cpp
gives me:
error: reference to __host__ function 'foo' in __host__ __device__ function
  return foo(&long_val);
         ^
main.cpp:10:14: note: 'foo' declared here
__host__ int foo(long *x) {
I believe that the issue is at https://github.com/llvm/llvm-project/blob/8224c5047e9cef2db4b0e31427cdf90a2568a341/clang/lib/Sema/SemaOverload.cpp#L9860
IdentifyCUDAPreference can return CFP_HostDevice for a valid overload, but this code doesn’t erase the wrong-side candidates in that case. Because the wrong-side candidate is otherwise an exact match (ignoring its __host__/__device__ attributes), clang then picks it as the best overload.
If I rewrite those lines as:
bool ContainsSameSideCandidate =
    llvm::any_of(Candidates, [&](OverloadCandidate *Cand) {
      // Check viable function only.
      if (Cand->Viable && Cand->Function) {
        auto MatchType = S.IdentifyCUDAPreference(Caller, Cand->Function);
        return MatchType == Sema::CFP_HostDevice ||
               MatchType == Sema::CFP_SameSide;
      }
      return false;
    });
With that change, my code compiles again. I can submit a bug report once I am approved, but I figured I would post here in the meantime.
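To make the failure mode concrete, here is a toy model in plain C++. It is not clang's code: Pref, Candidate, prune, pick_best and device_pass_candidates are all hypothetical names, the preference values assigned to each overload are my assumption about what IdentifyCUDAPreference reports during the device-side pass, and "best match" is reduced to "a non-template beats a template". It only sketches why counting CFP_HostDevice in the same-side check changes which foo wins.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the values returned by IdentifyCUDAPreference.
enum class Pref { WrongSide, HostDevice, SameSide };

// Hypothetical, heavily simplified overload candidate.
struct Candidate {
  std::string name;
  bool viable;       // argument types can be matched at all
  bool is_template;  // produced from a function template
  Pref pref;         // assumed preference relative to the caller
};

// Erase viable wrong-side candidates, but only if some viable candidate
// counts as "same side". count_host_device models the proposed change.
std::vector<Candidate> prune(std::vector<Candidate> cands, bool count_host_device) {
  bool has_same_side =
      std::any_of(cands.begin(), cands.end(), [&](const Candidate &c) {
        if (!c.viable) return false;
        return c.pref == Pref::SameSide ||
               (count_host_device && c.pref == Pref::HostDevice);
      });
  if (has_same_side) {
    cands.erase(std::remove_if(cands.begin(), cands.end(),
                               [](const Candidate &c) {
                                 return c.viable && c.pref == Pref::WrongSide;
                               }),
                cands.end());
  }
  return cands;
}

// Simplified ranking: among viable candidates, a non-template wins over a template.
std::string pick_best(const std::vector<Candidate> &cands) {
  const Candidate *best = nullptr;
  for (const Candidate &c : cands) {
    if (!c.viable) continue;
    if (best == nullptr || (best->is_template && !c.is_template)) best = &c;
  }
  return best ? best->name : std::string();
}

// Candidates for foo(&long_val) inside bar(), as assumed for the device pass.
std::vector<Candidate> device_pass_candidates() {
  return {
      {"__host__ __device__ foo<long>(long *)", true, true, Pref::HostDevice},
      {"__device__ foo(int *)", false, false, Pref::SameSide},  // long* does not convert to int*
      {"__host__ foo(long *)", true, false, Pref::WrongSide},
  };
}

void demo() {
  // Only SameSide counts: nothing is erased and the __host__ overload wins.
  assert(pick_best(prune(device_pass_candidates(), false)) == "__host__ foo(long *)");
  // HostDevice counts too: the wrong-side overload is erased, the template wins.
  assert(pick_best(prune(device_pass_candidates(), true)) ==
         "__host__ __device__ foo<long>(long *)");
}
```

In this model the only viable same-side overload is missing, so with the stricter check nothing triggers the erasure and the __host__ overload is selected, which matches the error above.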
-Drew