Using taskyield to emulate a background service

Dear devs,

I am trying to emulate recurring service calls in OpenMP (similar to what OmpSs-2 offers) in order to progress outstanding MPI operations. My attempt is to have one task executed by some thread that invokes the progress service and calls taskyield to participate in the execution of tasks generated by any of the other threads. The task's code is similar to this snippet:

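   #pragma omp task shared(do_progress)
   {
     /* Service task: keep polling until another thread clears the flag. */
     while (do_progress) {
       call_progress();
       /* Offer this thread to other ready tasks before polling again. */
       #pragma omp taskyield
     }
   }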

The variable `do_progress` is a volatile flag that is unset by another thread once progress is not needed anymore.

What I found is that the thread executing this service task does not participate in the execution of available tasks from the other threads. In other words, the yield does nothing. However, if I set KMP_TASK_STEALING_CONSTRAINT=0, the thread does participate in the execution of tasks. All tasks are tied (because of issues with untied tasks).

Can someone tell me why taskyield does not steal tasks by default? Is that related to the fact that all tasks are tied?

Many thanks!

Cheers
Joseph

Hi Joseph,

The OpenMP specification defines Task Scheduling Constraints that restrict which new tasks a thread may schedule. The first constraint is:

1. Scheduling of new tied tasks is constrained by the set of task regions that are currently tied to the
  thread and that are not suspended in a barrier region. If this set is empty, any new tied task may
  be scheduled. Otherwise, a new tied task may be scheduled only if it is a descendent task of
  every task in the set.

So while a tied task is suspended on a thread (here, the service task at its taskyield), that thread can only start new tied tasks that are descendants of it. It looks like your program is affected by this constraint: the tasks generated by the other threads are presumably not descendants of the service task. Setting KMP_TASK_STEALING_CONSTRAINT=0 disables the constraint check in the runtime. That makes the program officially non-conforming, but it can still work as long as the tree of generated tasks is simple enough. In general, complicated task trees may not work if the constraints are ignored and can end in, for example, deadlocks.
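
To make the rule concrete, here is a minimal sketch in plain C that models the check described by the constraint above. It is only an illustration, not the runtime's actual implementation: the type task_t and the functions is_descendant and may_schedule_tied are hypothetical names made up for this example, and a task is reduced to a pointer to its parent.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a task: it only knows the task that created it. */
typedef struct task {
    const struct task *parent;   /* NULL for a root task */
} task_t;

/* Returns true if t is a strict descendant of ancestor. */
bool is_descendant(const task_t *t, const task_t *ancestor)
{
    for (const task_t *p = t->parent; p != NULL; p = p->parent) {
        if (p == ancestor)
            return true;
    }
    return false;
}

/*
 * Models scheduling constraint 1 for a new tied task.
 * tied_set[0..n-1] are the task regions currently tied to the thread
 * and not suspended in a barrier region (an assumption of this model:
 * the caller has already filtered out the barrier case).
 */
bool may_schedule_tied(const task_t *new_task,
                       const task_t **tied_set, size_t n)
{
    assert(new_task != NULL);
    for (size_t i = 0; i < n; i++) {
        if (!is_descendant(new_task, tied_set[i]))
            return false;        /* not a descendant of every task in the set */
    }
    return true;                 /* also covers the empty set (n == 0) */
}
```

In this model, a service task suspended at taskyield sits in tied_set. A work task created by another thread is a sibling of the service task rather than a descendant, so may_schedule_tied returns false for it, while a task created inside the service task itself would be allowed.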

Regards,
Andrey