[OpenMP] Runtime problem with subtraction reduction

DylanFleming-arm · August 11, 2022, 2:45pm

I have recently begun working on adding OpenMP reductions to Flang. I’ve come across a peculiar problem while trying to implement subtraction.

For a loop such as:

!$omp parallel
!$omp do reduction(-:x)
do i = 1, 100
  x = x - i
end do
!$omp end do
!$omp end parallel

The correct output here would be -5050 (x is initialized to 0 before the loop); however, when I run it, x prints as 0 instead. The same loop reduces correctly for addition and multiplication.

I’ve been trying to debug the issue, but can’t seem to find the problem.
This is the MLIR emitted:

module attributes {fir.defaultkind = "a1c4d8i4l4r4", fir.kindmap = "", llvm.target_triple = "aarch64-unknown-linux-gnu"} {
  omp.reduction.declare @subtract_reduction_i_32 : i32 init {
  ^bb0(%arg0: i32):
    %c0_i32 = arith.constant 0 : i32
    omp.yield(%c0_i32 : i32)
  } combiner {
  ^bb0(%arg0: i32, %arg1: i32):
    %0 = arith.subi %arg0, %arg1 : i32
    omp.yield(%0 : i32)
  } 
  func.func @_QPreduction_subtract() {
    %0 = fir.alloca i32 {bindc_name = "i", uniq_name = "_QFreduction_subtractEi"}
    %1 = fir.alloca i32 {bindc_name = "x", uniq_name = "_QFreduction_subtractEx"}
    %c0_i32 = arith.constant 0 : i32
    fir.store %c0_i32 to %1 : !fir.ref<i32>
    omp.parallel   {
      %2 = fir.alloca i32 {adapt.valuebyref, pinned}
      %c1_i32 = arith.constant 1 : i32
      %c100_i32 = arith.constant 100 : i32
      %c1_i32_0 = arith.constant 1 : i32
      omp.wsloop   reduction(@subtract_reduction_i_32 -> %1 : !fir.ref<i32>) for  (%arg0) : i32 = (%c1_i32) to (%c100_i32) inclusive step (%c1_i32_0) {
        fir.store %arg0 to %2 : !fir.ref<i32>
        %3 = fir.load %2 : !fir.ref<i32>
        omp.reduction %3, %1 : !fir.ref<i32>
        omp.yield
      }
      omp.terminator
    }
    return
  }
}

The only difference between the MLIR here and for addition is the subi in omp.reduction.declare. I’m unsure if I’m overlooking something here and would appreciate any help/suggestions.

I’ve created a WIP patch on Phabricator containing the changes here:
D131679 [WIP][Flang][OpenMP] Add support for integer subtraction reduction in worksharing-loop

clementval · August 11, 2022, 3:59pm

I guess you cannot simply subtract the temporary accumulators from one another when you combine them in the reduction. Each accumulator already holds a negative partial result, so subtracting one from the other makes them cancel out.

-25 - -25 = -25 + 25 = 0
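
To make the cancellation concrete, here is a minimal Python sketch that simulates the per-thread accumulators. It is not what the OpenMP runtime actually executes: the two-way split of 1..100 and the names `private_partial` and `chunks` are assumptions made up for illustration, and the exact wrong value in a real run depends on how many threads there are and how the runtime pairs the partial results.

```python
from functools import reduce

def private_partial(chunk):
    """One simulated thread: private copy starts at 0 and subtracts its share."""
    acc = 0
    for i in chunk:
        acc = acc - i
    return acc

# Hypothetical static split of the iteration space across two threads.
chunks = [range(1, 51), range(51, 101)]
partials = [private_partial(c) for c in chunks]

# Combining the (already negative) partials with subtraction flips a sign.
combined_with_sub = reduce(lambda a, b: a - b, partials)
# Combining them with addition keeps every contribution negative.
combined_with_add = reduce(lambda a, b: a + b, partials)

print(partials)           # [-1275, -3775]
print(combined_with_sub)  # 2500
print(combined_with_add)  # -5050
```

When two partials happen to be equal, as in the -25 - -25 case above, the subtraction combiner cancels them completely.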

DylanFleming-arm · August 11, 2022, 5:01pm

I think you’re correct, and it makes logical sense that the code generated would act that way, because the combining operator for the rest of the reductions is the same as the reduction operator.

My problem is that changing the operator in the MLIR to arith.addi makes the end result 5050, because the one MLIR line is used to emit both blocks, so it simply becomes an addition reduction. Is there an extra line I can emit in MLIR to change this behaviour? Or is the solution going to be to change how MLIR is lowered further?
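
The distinction being asked about can be sketched in Python by keeping the loop-body update and the cross-thread combiner as two separate operators. This is only a simulation under assumptions: `simulate`, its parameters, and the round-robin split are hypothetical names and choices, not anything emitted by Flang or MLIR. Using addition as the combiner for the `-` reduction matches my reading of the OpenMP specification (initializer 0, partial results added together), but treat that as something to check against the spec rather than as settled.

```python
import operator

def simulate(body, combine, n=100, num_threads=4, x_initial=0):
    """Simulate a reduction where the in-loop update and the combiner may differ."""
    indices = list(range(1, n + 1))
    # Hypothetical round-robin distribution of iterations to threads.
    chunks = [indices[t::num_threads] for t in range(num_threads)]
    x = x_initial
    for chunk in chunks:
        private = 0                     # init region: identity value
        for i in chunk:
            private = body(private, i)  # update written in the loop body
        x = combine(x, private)         # combiner folds the partial into x
    return x

# Body subtracts, combiner adds: the expected result.
print(simulate(operator.sub, operator.add))  # -5050
# Body and combiner both add: an ordinary addition reduction.
print(simulate(operator.add, operator.add))  # 5050
```

The result is independent of the thread count only because the combiner is associative and commutative, which addition is and subtraction is not.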

jeffhammond · August 11, 2022, 5:12pm

Subtractive reductions are an absolutely nonsense garbage feature that never should have gone into OpenMP. The committee barely knows what they are supposed to do. You’d be doing the world a favor by throwing an error instead of trying to support them. No one will ever use them and the compliance test suites that verify this feature are wasting everyone’s time.

kkwli · August 11, 2022, 5:38pm

The minus operator for reductions was deprecated in OpenMP 5.2.

DylanFleming-arm · August 12, 2022, 2:53pm

Ah, okay that makes things easier! Thank you everyone for the help.
