Taskgroup task_reduction with target in_reduction segfaults

The following code segfaults at the end of the taskgroup when waiting for all tasks. If I remove nowait on the target tasks, the segfault goes away, which suggests the synchronisation at the end of the taskgroup construct is failing.

Tested with LLVM at commit 9ad0ace2, built for -fopenmp-targets=nvptx64 on a node with 4 A100 GPUs.

#include <stdio.h>
#include <stdlib.h>
#include <assert.h>
#include <math.h>

#include <omp.h>

int main(void) {

  int N = 1000000;
  double *A, *B;
  A = malloc(N*sizeof(double));
  B = malloc(N*sizeof(double));

  double sum = 0.0;

  for (int i = 0; i < N; ++i) {
    A[i] = 1.5;
    B[i] = 2.0;
  }

  int num_devices = omp_get_num_devices();
  printf("Num devices: %d\n", num_devices);
  assert (N % num_devices == 0);
  int len = N / num_devices;

  #pragma omp taskgroup task_reduction(+:sum)
  {
    for (int dev = 0; dev < num_devices; ++dev) {
      int start = dev * len;
      int end = start + len;
      #pragma omp target map(to: A[start:len], B[start:len]) device(dev) in_reduction(+:sum) nowait
      {
        #pragma omp loop
        for (int i = start; i < end; ++i) {
          sum += A[i] * B[i];
        }
      }
    }
  }

  assert(fabs(sum - (3.0*N)) < 1.0E-10);

  return 0;
}



Last time I checked, the in_reduction clause was not yet implemented for target regions, only for normal tasks.
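For comparison, the host-task form of the same pattern, which is the case described here as implemented, might look like the sketch below. The function name dot_with_host_tasks and the chunks parameter are made up for illustration, and the sketch assumes chunks divides N evenly.

```c
#include <assert.h>

/* Hypothetical helper: dot product of A and B using host tasks that
   participate in a taskgroup reduction. Assumes chunks divides N. */
double dot_with_host_tasks(const double *A, const double *B, int N, int chunks) {
  assert(chunks > 0 && N % chunks == 0);
  int len = N / chunks;
  double sum = 0.0;

  #pragma omp parallel
  #pragma omp single
  {
    /* The taskgroup declares the reduction; each task joins it. */
    #pragma omp taskgroup task_reduction(+:sum)
    {
      for (int c = 0; c < chunks; ++c) {
        int start = c * len;
        int end = start + len;
        #pragma omp task in_reduction(+:sum) firstprivate(start, end)
        {
          for (int i = start; i < end; ++i) {
            sum += A[i] * B[i];
          }
        }
      }
    } /* all tasks are complete and sum is combined here */
  }

  return sum;
}
```

Compiled without OpenMP enabled, the pragmas are ignored and the loop simply runs serially, which gives a baseline to compare against.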

This is actually something my group was thinking about implementing.

@jdoerfert do you know if someone is working on this?

It’s listed on the feature list as worked on, with Review r308768.

I know, but it has been listed that way for years (see this update of the table in July 2019), and I have never been able to find this review r308768…

Wrt. in_reduction: I doubt anyone is working on it, though we can ask on Wednesday in our weekly meeting. HPE or Intel may have something. That said, we just picked up some steam on reductions on targets in general, so this may be a good time to address it.
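Until in_reduction is available on target, one possible way to sidestep it is a plain reduction clause on a combined target construct, one device at a time, with the partial results added up on the host. This is only a sketch: the function name dot_per_device is hypothetical, the device count is passed in rather than queried, and because there is no nowait the devices run one after another instead of concurrently.

```c
#include <assert.h>

/* Hypothetical helper: each device reduces its own slice into a local
   partial sum, which the host then accumulates. Assumes num_devices
   divides N. */
double dot_per_device(const double *A, const double *B, int N, int num_devices) {
  assert(num_devices > 0 && N % num_devices == 0);
  int len = N / num_devices;
  double sum = 0.0;

  for (int dev = 0; dev < num_devices; ++dev) {
    int start = dev * len;
    int end = start + len;
    double partial = 0.0;

    /* Synchronous offload: the reduction result is mapped back into
       partial before the host continues. */
    #pragma omp target teams distribute parallel for device(dev) \
        map(to: A[start:len], B[start:len]) map(tofrom: partial) \
        reduction(+:partial)
    for (int i = start; i < end; ++i) {
      partial += A[i] * B[i];
    }

    sum += partial;
  }

  return sum;
}
```

With the reproducer's data, a call such as dot_per_device(A, B, N, omp_get_num_devices()) should give 3.0*N, at the cost of the overlap between devices that the nowait version was after.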

@tomdeakin Could you file an issue on GH? Will be easier to subscribe the right people there, I think.

Done: GitHub Issue #57522
