Avoid using is_device_ptr clause

Is there a way to avoid adding many is_device_ptr clauses every time the
program enters a target region? I would like the llvm_omp_target_alloc_shared
API to handle the device pointer internally and make it accessible from the
target region as well.

You can use the target pointer association API (see 5.12 in the new
OpenMP Examples 5.1 document). This should help you avoid usage of
is_device_ptr and actually get proper mapping behavior.

- Joachim

Joachim, all,
Is the example you pointed me to just an expansion of llvm_omp_target_alloc_shared, then?

The way I understood your problem is that you want to pass
a device accessible pointer that is not mapped into a target
region, right?

When we implement 5.2 semantics you can pass pointers into
a target region and they will not be nulled but instead retain
their value if no mapping was found. That makes use of
managed or device memory possible without declaring a mapping
or adding the is_device_ptr clause.

For now, you might get around is_device_ptr by type punning,
e.g., cast the pointers into intptr_t before the target region
and back inside.
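
A minimal sketch of that workaround, under stated assumptions: the helper
names alloc_device_accessible and sum_on_device are hypothetical, and
std::malloc stands in for the allocator so the sketch also builds without
an offloading toolchain. With real offloading, the buffer would have to
come from something like llvm_omp_target_alloc_shared so that the address
is valid on the device.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical stand-in allocator. With real offloading, replace the
// std::malloc call with a device-accessible allocation such as
// llvm_omp_target_alloc_shared(count * sizeof(int), device).
int *alloc_device_accessible(std::size_t count) {
  return static_cast<int *>(std::malloc(count * sizeof(int)));
}

// Sums n ints inside a target region without is_device_ptr or a map
// clause for the buffer.
long sum_on_device(const int *data, int n) {
  assert(data != nullptr);
  // Pun the pointer to an integer on the host. Scalar variables are
  // firstprivate in a target region by default, so the address is copied
  // by value and the runtime does not try to translate it.
  const std::uintptr_t addr = reinterpret_cast<std::uintptr_t>(data);
  long sum = 0;
#pragma omp target teams distribute parallel for reduction(+ : sum) map(tofrom : sum)
  for (int i = 0; i < n; i++) {
    // Cast back to a pointer inside the region; the pointer variable
    // 'data' itself is never referenced here.
    const int *p = reinterpret_cast<const int *>(addr);
    sum += p[i];
  }
  return sum;
}
```

The pointer variable is deliberately not referenced inside the region, so
only the integer copy of the address crosses into it.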

~ Johannes

Johannes,

Do you mean code like this?

#include <cstdint>  // std::intptr_t
#include <cstdio>   // std::printf
#include <cstdlib>  // size_t

extern "C" {
  void* llvm_omp_target_alloc_shared(size_t, int);
}

int main()
{
  std::intptr_t *p5 =
      (std::intptr_t *)llvm_omp_target_alloc_shared(sizeof(int) * 1024, 0);
  p5[0]= 999;

#pragma omp target parallel for device(0)
  for (int i=0;i < 1024;i++) {
    std::printf("p5 %d %d\n", i, reinterpret_cast<int*>(p5)[0]);
  }
  return 0;
}

Johannes,

Can you give us a rough estimate of when the 5.2 semantics will be
merged upstream? That would allow users to handle vectors much more
easily, since there is a limit on how many std::uintptr_t values can be
created on the stack per process.
I'd be happy to follow the discussion if you point us to the entry.

Thanks,
Itaru.