Hi all,
One of our students ran into an issue: the code below seems to deadlock when the target region containing omp_set_lock is executed on a GPU.
We agree that using locks on a GPU is not the best idea, but as far as we can tell, nothing in the OpenMP spec prevents the use of locks in a target region.
By declaring the lock variable globally in a declare target block, we tried to place it in the device's global memory (we are not sure whether this actually helps).
Any hints on how this could be fixed?
For similar code that uses a doacross loop (ordered + depend), I saw the following linker errors, and execution fell back to the host:
nvlink error : Undefined reference to '__kmpc_doacross_init' in '/tmp/ordered-test-a55a1a.cubin'
nvlink error : Undefined reference to '__kmpc_doacross_fini' in '/tmp/ordered-test-a55a1a.cubin'
nvlink error : Undefined reference to '__kmpc_doacross_wait' in '/tmp/ordered-test-a55a1a.cubin'
nvlink error : Undefined reference to '__kmpc_doacross_post' in '/tmp/ordered-test-a55a1a.cubin'
So perhaps a target region with omp_set_lock should also fall back to the host?
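The doacross test case itself is not reproduced in this post. As a rough sketch of the kind of loop meant here (the function name doacross_count and the loop body are illustrative assumptions, not the student's actual code), an ordered(1) loop with depend(sink)/depend(source) inside a target region could look like this:

```c
/* Hypothetical helper, not the original test case: counts to n with a
 * doacross loop, so that iteration i waits for iteration i-1. */
int doacross_count(int n) {
  int counter = 0;
  #pragma omp target map(tofrom: counter)
  {
    #pragma omp parallel for ordered(1)
    for (int i = 0; i < n; i++) {
      /* Wait until iteration i-1 has posted; ignored for i == 0. */
      #pragma omp ordered depend(sink: i - 1)
      counter++;
      /* Signal that iteration i is done. */
      #pragma omp ordered depend(source)
    }
  }
  return counter;
}
```

On the host, doacross_count(100) should return 100. A loop of this shape is what makes the compiler emit the __kmpc_doacross_* runtime calls named in the linker errors above.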
Best
Joachim
#include <omp.h>
#include <stdio.h>

#define N 100

int countervar = 0;

/* Declare the lock at file scope inside declare target so that a device copy exists. */
#pragma omp declare target
omp_lock_t lock;
#pragma omp end declare target

int count() {
  /* NOTE: if(0) forces this region to run on the host; remove it to offload to device 0. */
  #pragma omp target map(tofrom:countervar) device(0) if(0)
  {
    omp_init_lock(&lock);
    #pragma omp parallel for
    for (int i = 0; i < N; i++) {
      omp_set_lock(&lock);   /* serialize the print and the increment */
      printf("%i: %i\n", omp_get_thread_num(), i);
      countervar++;
      omp_unset_lock(&lock);
    }
    omp_destroy_lock(&lock);
  }
  return 0;
}

int main() {
  count();
  printf("counter: %i expected: %i\n", countervar, N);
  return 0;
}
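
For comparison, if only the counter update needed protection (an assumption, since the code above also serializes the printf), the lock could be avoided altogether with an atomic update. The function name count_atomic is hypothetical, and this is only a sketch of a possible workaround, not a fix for the lock behaviour itself:

```c
/* Hypothetical lock-free variant: protects only the increment. */
int count_atomic(int n) {
  int counter = 0;
  #pragma omp target map(tofrom: counter)
  {
    #pragma omp parallel for
    for (int i = 0; i < n; i++) {
      /* Atomic read-modify-write instead of set_lock/unset_lock. */
      #pragma omp atomic update
      counter++;
    }
  }
  return counter;
}
```

A reduction(+:counter) clause on the parallel for would be another option along the same lines.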