[OpenMP]: wrong threadprivate storage name in task reduction

Hi,

I've noticed that clang does not generate correct code for task reductions over variable-length arrays and/or classes whose reduction initializer uses omp_orig.

Because the reduction initialize/combine/finalize functions only receive references to the items they operate on, clang generates additional storage to hold the size of the reduction array and a pointer to omp_orig. This storage is obtained through __kmpc_threadprivate_cached() calls in both the reduction functions and the outlined task function.

The problem is that the storage used in the outlined task function does not match the storage used in the reduction functions. This happens in programs that have a task or taskloop construct with the in_reduction clause.
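For context, here is a minimal sketch of the kind of program that exercises this path: a user-defined reduction whose initializer reads omp_orig, used through task_reduction on a taskgroup and in_reduction on the child tasks. The names Hist, hist_merge and count_in_tasks are hypothetical and chosen only for illustration; this is not the program from the report.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical accumulator: a private copy is only usable if it is created
// with the same number of bins as the original, which is what omp_orig is for.
struct Hist {
    std::vector<long> bins;
    explicit Hist(std::size_t n) : bins(n, 0) {}
    void merge(const Hist &other) {
        assert(bins.size() == other.bins.size());
        for (std::size_t i = 0; i < bins.size(); ++i)
            bins[i] += other.bins[i];
    }
};

// The initializer reads omp_orig, so the generated init function needs a way
// to reach the original variable.
#pragma omp declare reduction(hist_merge : Hist : omp_out.merge(omp_in)) \
    initializer(omp_priv = Hist(omp_orig.bins.size()))

// Creates n_tasks tasks that each add 1 to one bin and returns the merged result.
Hist count_in_tasks(int n_tasks, std::size_t n_bins) {
    assert(n_bins > 0);
    Hist h(n_bins);
    #pragma omp parallel
    #pragma omp single
    {
        #pragma omp taskgroup task_reduction(hist_merge : h)
        {
            for (int t = 0; t < n_tasks; ++t) {
                // Inside the task, h names the task's private copy.
                #pragma omp task in_reduction(hist_merge : h) firstprivate(t, n_bins)
                h.bins[t % n_bins] += 1;
            }
        }
    }
    return h;
}
```

Calling count_in_tasks(8, 4) should leave 2 in every bin when the private copies are sized correctly from omp_orig. Compiled without -fopenmp the pragmas are ignored and the code runs serially with the same result.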

To solve this I made an ugly patch that reuses the SourceLocation of CodeGenFunction::EmitOMPTaskgroupDirective() in CodeGenFunction::EmitOMPTaskBasedDirective() so that the storages match and the code works. I suppose there is a better way to fix this, though.

Index: CGStmtOpenMP.cpp

Hi, thanks for the report, I’ll take a look.

Hi,

I did a little research and realized that the patch I attached doesn't fix the problem completely. It only fixes the storage of the data. When using a task reduction with a VLA, or with a class that uses omp_orig to initialize the private copies, the size of the VLA and the omp_orig pointer are kept in storage obtained via __kmpc_threadprivate_cached().

The remaining problem is that, because each reduction in the outlined parallel function has the lazy_priv flag set, the initialization is done in __kmpc_task_reduction_get_th_data(), called from the outlined task function. This happens before the VLA size or omp_orig is stored, so the initialization is wrong in these cases.

I think it should work if the store is emitted before the call to __kmpc_task_reduction_get_th_data().
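To make the ordering issue concrete, here is a small model in plain C++ with no OpenMP runtime involved. All names are hypothetical stand-ins: Slot plays the role of the threadprivate storage that holds the VLA size, and lazy_get_private plays the role of a lazily initializing lookup such as __kmpc_task_reduction_get_th_data(). It only illustrates why the store has to precede the lookup; it does not reproduce what clang actually emits.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for the threadprivate storage that carries the VLA length.
struct Slot {
    std::size_t vla_len = 0;
};

// Stand-in for a lazily initializing lookup: the private copy is created on
// first access, using whatever length the slot holds at that moment.
std::vector<int> &lazy_get_private(const Slot &slot, std::vector<int> &priv,
                                   bool &initialized) {
    if (!initialized) {
        priv.assign(slot.vla_len, 0);
        initialized = true;
    }
    return priv;
}

// Order described as problematic: lookup first, store second.
std::size_t task_body_lookup_first(std::size_t n) {
    Slot slot;
    std::vector<int> priv;
    bool initialized = false;
    std::vector<int> &p = lazy_get_private(slot, priv, initialized);
    slot.vla_len = n;  // too late: the copy was already sized from the old value
    return p.size();
}

// Proposed order: store first, lookup second.
std::size_t task_body_store_first(std::size_t n) {
    Slot slot;
    std::vector<int> priv;
    bool initialized = false;
    slot.vla_len = n;  // the length is in place before lazy initialization runs
    std::vector<int> &p = lazy_get_private(slot, priv, initialized);
    return p.size();
}

// Usage: the first order yields an empty copy, the second the expected size.
void check_ordering() {
    assert(task_body_lookup_first(5) == 0);
    assert(task_body_store_first(5) == 5);
}
```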

Anyway, this implementation of task reductions is not fully functional, as they are only allowed if the task is created in the scope of a taskgroup.

Sorry for the mail duplicate.

Regards,

Raúl

WARNING / LEGAL TEXT: This message is intended only for the use of the individual or entity to which it is addressed and may contain information which is privileged, confidential, proprietary, or exempt from disclosure under applicable law. If you are not the intended recipient or the person responsible for delivering the message to the intended recipient, you are strictly prohibited from disclosing, distributing, copying, or in any way using this message. If you have received this communication in error, please notify the sender and destroy and delete any copies you may have received.

Hi,

Anyway, this implementation of task reductions is not fully functional, as they are only allowed if the task is created in the scope of a taskgroup.

What do you mean? Could you provide some more details?

Hi,

I mean that this implementation seems to restrict the tasks that participate in the reduction to those defined in the same lexical scope. I have not been able to find this restriction in the OpenMP specification. What the spec says on this topic is:

"A list item that appears in an in_reduction clause of a task-generating construct must appear
in a task_reduction clause of a construct corresponding to a taskgroup region that includes
the participating task in its taskgroup set. The construct corresponding to the innermost region
that meets this condition must specify the same reduction-identifier as the in_reduction
clause."

I tried to compile the attached file. The compile command is:

clang -fopenmp task_red.cc

Regards,
Raúl

task_red.cc (410 Bytes)

Your code is not correct. The standard does not say that the reduced items must share the same memory; it says that they must use the same variables. In your example the variables are different, even though they share the same memory.

I see. In that case, what about using task reductions with a global variable like the attached example?
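A hypothetical sketch of the global-variable pattern being asked about (not the attached file): the taskgroup with task_reduction is in one function and the task with in_reduction is created in another, and both clauses name the same global. Whether this is conforming is exactly what is being discussed in this thread, so treat it as an illustration of the pattern and not as a statement about what the standard permits. The names total, add_in_task and sum_first_n are invented.

```cpp
#include <cassert>

long total = 0;  // global named by both the task_reduction and in_reduction clauses

// The participating task is created outside the lexical scope of the taskgroup.
void add_in_task(long value) {
    #pragma omp task in_reduction(+ : total) firstprivate(value)
    total += value;
}

// Sums 1..n using one task per term.
long sum_first_n(long n) {
    assert(n >= 0);
    total = 0;
    #pragma omp parallel
    #pragma omp single
    {
        #pragma omp taskgroup task_reduction(+ : total)
        {
            for (long i = 1; i <= n; ++i)
                add_in_task(i);
        }
    }
    return total;
}
```

With a compiler and runtime that accept this pattern, sum_first_n(10) should return 55; without -fopenmp the pragmas are ignored and the serial result is the same.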

I've seen that you fixed the storage calls. I've tested the fix, but omp_orig initialization still doesn't work. If I'm not mistaken, the reduction init function uses the pointer returned by __kmpc_threadprivate_cached() directly, without dereferencing it.
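The missing dereference can be pictured with a small model in plain C++. This is only a sketch under assumptions: Counter, load_orig and use_slot_as_orig are hypothetical names, and the local variable slot stands in for the threadprivate storage that holds a pointer to omp_orig. It does not reproduce the code clang emits.

```cpp
#include <cassert>

struct Counter {
    int start;
};

// Intended behaviour: the lookup returns the address of the slot, and the
// pointer to the original object has to be loaded from that slot.
const Counter *load_orig(const void *slot_addr) {
    return *static_cast<const Counter *const *>(slot_addr);
}

// Behaviour described in the report: the slot address itself is treated as
// the address of the original object.
const Counter *use_slot_as_orig(const void *slot_addr) {
    return static_cast<const Counter *>(slot_addr);
}

// Usage: only the dereferencing version reaches the original object.
void check_deref() {
    Counter orig{7};
    const Counter *slot = &orig;    // slot stores a pointer to the original
    const void *slot_addr = &slot;  // what the storage lookup hands back
    assert(load_orig(slot_addr) == &orig);
    assert(load_orig(slot_addr)->start == 7);
    assert(use_slot_as_orig(slot_addr) != &orig);  // points at the slot instead
}
```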

Regards,
Raúl

task_red_gvar.cc (307 Bytes)

1. I don't think this is a correct construct either. Since we don't support this kind of construct for locals, we should not support it for globals,
since OpenMP does not distinguish between locals and globals. I think the problem is just that the standard is not quite clear about it.
We could probably support this construct, but I'm not sure that it is required. I don't want to spend my time implementing things that won't be used and
are not allowed by the standard.
2. I'll check it. All the problems with it arise because the runtime is not quite correct and does not fully support task reductions. Because of that I had to introduce
all that stuff with the threadprivates.