Compiler directives between OpenACC constructs and DO loops

Hello!

I was testing an internal codebase with flang and for the following MRE I got a compiler error:

subroutine foo
  integer :: bar

  bar = 1
  !$acc parallel loop vector reduction(+:bar)
  !dir$ ivdep
  do i = 1, 100, 1
    bar = bar + i
  end do
end subroutine foo

$ build/bin/flang-new -fopenacc t.f90
error: Semantic errors in t.f90
./t.f90:5:9: error: A DO loop must follow the PARALLEL LOOP directive
    !$acc parallel loop vector reduction(+:bar)
          ^^^^^^^^^^^^^

This is probably an extension, but a common one: both nvfortran and gfortran accept it without even a warning. Note that for gfortran the directive prefix has to be changed from dir$ to gcc$.

So, is there a patch for this behavior somewhere downstream that is meant to be upstreamed?
If not, should I submit a patch for this?

I have 2 possible solutions in mind:

  1. Change the parse tree so that OpenACC constructs or DO loops can contain the compiler directives associated with them.
  2. Simply move the intervening compiler directives (the ones unrelated to OpenACC) before the OpenACC construct. The needed information is preserved with this solution and can be used later.

I am currently working on the second proposed solution because the first one seems more radical.
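
To make the second solution concrete, here is a minimal sketch on a toy model, not flang's actual parse tree. A block is assumed to be a flat list of (kind, text) pairs, and the kind names "acc", "dir", "do" and "stmt" as well as the function name hoist_directives are made up for this illustration.

```python
# Toy model: a block is a flat list of (kind, text) tuples.
# The kinds "acc", "dir", "do", "stmt" are illustrative, not flang names.

def hoist_directives(block):
    """Move compiler directives that sit between an OpenACC directive
    and its DO loop to just before the OpenACC directive."""
    out = []
    i = 0
    while i < len(block):
        if block[i][0] == "acc":
            j = i + 1
            # Skip over the run of compiler directives after the OpenACC one.
            while j < len(block) and block[j][0] == "dir":
                j += 1
            found_dirs = j > i + 1
            if found_dirs and j < len(block) and block[j][0] == "do":
                out.extend(block[i + 1:j])  # hoisted directives come first
                out.append(block[i])        # then the OpenACC directive
                i = j                       # the DO loop is copied next
                continue
        out.append(block[i])
        i += 1
    return out


block = [
    ("stmt", "bar = 1"),
    ("acc", "!$acc parallel loop vector reduction(+:bar)"),
    ("dir", "!dir$ ivdep"),
    ("do", "do i = 1, 100, 1"),
]
hoisted = hoist_directives(block)
for _, text in hoisted:
    print(text)
```

After the rewrite the DO loop immediately follows the OpenACC directive again, so the existing semantic check would pass, and the ivdep line is still present in the block for a later pass to use.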

It is mostly @clementval who works on OpenACC for Flang. Because he has been focusing on upstreaming, he has not been able to work on it for close to a year now. He might restart soon, now that upstreaming is close to completion. Valentin is currently away but will be back in the second week of June. Please wait.

  1. Can ivdep be converted to the independent clause? If it is semantically equivalent, then this is a reasonable choice. Otherwise, will this be an extension of the standard?
  2. If there are nested OpenACC directives, will this involve movement across multiple directives? Also, this will require the lowering of the ivdep directive to be made aware of intervening OpenACC directives.
  3. Would another alternative be to not issue this semantic error in the presence of an ivdep, and then add code in lowering to handle the nested ivdep?

1.

Can ivdep be converted to the independent clause. If it is semantically equivalent then this is a reasonable choice.

To be honest, I didn’t quite get what you mean here. Could you provide an example, please?

Otherwise will this be an extension of the standard?

This whole thing itself is an extension because:

  • The Fortran 2018 standard says nothing about compiler directives, as far as I can tell; I searched for them and could not find any mention.
  • OpenACC 3.1 specification states (2.9):

    The OpenACC loop construct applies to a loop which must immediately follow this directive.

So it is an extension with respect to the OpenACC specification, yet other compilers support it.

2.

If there are nested openacc directives will this involve movement across multiple directives?

No, just the one. Are there cases where moving across several is needed?

Also, this will necessitate the lowering of the ivdep directive being made aware of intervening openacc directives.

AFAIK, there is currently no lowering for any compiler directive. The upstream LLVM version hits an internal compiler error when ivdep is used, and the F18 fir-dev branch emits a warning that all compiler directives are ignored.

3.

If we just ignore the presence of a compiler directive between an OpenACC directive and a DO loop, the ignored compiler directive will end up after the OpenACC construct, because the semantic pass puts the DO loop node inside the OpenACC directive node. This behavior is implemented in the canonicalize-acc.cpp file (which also contains a sketch of the parse tree before and after the rewrite).
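
A rough sketch of that effect, again on a toy model rather than flang's real data structures: the nesting step below only imitates what is described above, and the tuple layout and the name canonicalize_ignoring_dirs are assumptions made for this example.

```python
# Toy model of the canonicalization described above. Nodes are tuples:
# ("stmt", text), ("dir", text), ("do", text) or ("acc", text, nested_do).

def canonicalize_ignoring_dirs(block):
    """Nest the DO loop inside the preceding OpenACC node while simply
    skipping (not moving) compiler directives that sit in between."""
    out = []
    i = 0
    while i < len(block):
        if block[i][0] == "acc":
            skipped = []
            j = i + 1
            while j < len(block) and block[j][0] == "dir":
                skipped.append(block[j])
                j += 1
            if j < len(block) and block[j][0] == "do":
                # The DO loop becomes a child of the OpenACC node ...
                out.append(("acc", block[i][1], block[j]))
                # ... so the skipped directives are left behind, after it.
                out.extend(skipped)
                i = j + 1
                continue
        out.append(block[i])
        i += 1
    return out


tree = canonicalize_ignoring_dirs([
    ("acc", "!$acc parallel loop vector reduction(+:bar)"),
    ("dir", "!dir$ ivdep"),
    ("do", "do i = 1, 100, 1"),
])
for node in tree:
    print(node)
```

In this model the ivdep node ends up as a sibling that follows the OpenACC construct, detached from the loop it was written for.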

I suggest moving compiler directives before the OpenACC constructs because, when someone implements ivdep (or any other compiler directive), it will be easier to find the construct that the ivdep is associated with if that construct immediately follows the ivdep rather than precedes it.

And then adding code in lowering to handle the nested ivdep.

I do not think that nesting ivdep or any other directive is an option. It requires rewriting the parse tree structure, and I do not think that is worth it, especially for a feature that is currently ignored and not used yet.

—

Did I answer your questions, or are there things I missed?
Also, I prepared a draft revision as a proof of concept: D126649 [flang] Allow compiler directives between OpenACC constructs and loops.

I was asking whether the following

  !$acc parallel loop vector reduction(+:bar)
  !dir$ ivdep
  do i = 1, 100, 1

can be converted to (note the addition of the independent clause)

  !$acc parallel loop independent vector reduction(+:bar)
  do i = 1, 100, 1

and whether they are equal in semantics?

If we look at the following definition for ivdep, there is a requirement to keep the directive close to a loop. So how can we prefer one over the other to get a semantic check to pass?

Yes, but at some point, when we lower either ivdep or the OpenACC directive, the lowering has to be made aware of intervening directives, doesn't it?

Thanks. I was thinking that the ivdep might have been parsed as nesting inside the OpenACC directive. I missed the OpenACC canonicalization part.

I was asking whether the following

  !$acc parallel loop vector reduction(+:bar)
  !dir$ ivdep
  do i = 1, 100, 1

can be converted to (note the addition of the independent clause)

  !$acc parallel loop independent vector reduction(+:bar)
  do i = 1, 100, 1

and whether they are equal in semantics?

This would actually be nice.

The independent clause and the ivdep directive seem similar. If I understand correctly, both assert that there are no data dependencies between loop iterations, but their goals differ. The goal of ivdep is to vectorize the loop, while the goal of the independent clause, according to the OpenACC spec (quoted below), is to run iterations in parallel. Does that also mean that vectorization is allowed? If so, then I think ivdep can be converted to the independent clause.

The independent clause tells the implementation that the loop iterations must be data independent, except for vars which appear in a reduction clause or which are modified in an atomic region. This allows the implementation to generate code to execute the iterations in parallel with no synchronization.
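
Purely as an illustration of what such a conversion would do to the source, here is a text-level sketch. It assumes the two are semantically equivalent, which this thread has not established, and it works on source lines rather than on a parse tree; the function name fold_ivdep is hypothetical.

```python
import re

def fold_ivdep(lines):
    """Drop a '!dir$ ivdep' line that directly follows an '!$acc ... loop'
    directive and add the 'independent' clause to that directive instead.
    ASSUMPTION: ivdep and independent are treated as equivalent here."""
    out = []
    for line in lines:
        prev = out[-1] if out else ""
        is_ivdep = line.strip().lower() == "!dir$ ivdep"
        prev_is_acc_loop = re.match(r"\s*!\$acc\b.*\bloop\b", prev, re.I)
        if is_ivdep and prev_is_acc_loop:
            if not re.search(r"\bindependent\b", prev, re.I):
                out[-1] = re.sub(r"\bloop\b", "loop independent", prev,
                                 count=1, flags=re.I)
            continue  # the ivdep line itself is dropped
        out.append(line)
    return out


source = [
    "  !$acc parallel loop vector reduction(+:bar)",
    "  !dir$ ivdep",
    "  do i = 1, 100, 1",
]
converted = fold_ivdep(source)
print("\n".join(converted))
```

An ivdep that does not directly follow an OpenACC loop directive is left untouched in this sketch.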

IMHO, the general solution is to prefer the standard directive (OpenACC) over the non-standard one (ivdep in this case). Given that, there are several ways to solve the issue:

  1. (Current) Disallow any intervening directives.
  2. Allow directives to intervene with each other if a special flag is specified, e.g. -fallow-intervening-directives.
  3. Allow directives to intervene with each other with no special flag specified as an extension supported by default.

The first one is the most correct from the point of view of the OpenACC standard, but users' code already relies on this extension, and I think flang should allow such behavior in one way or another.

The second one is the best compromise. It disallows non-standard behavior by default, but still lets users compile old code written this way.

The third one can be chosen if we want to mimic the behavior of other compilers. It can also emit a warning that the code is written in a non-standard manner.
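
The three options can be summarized as a small decision function. This is only a sketch of the policies as described above: the enum and function names are invented for the example, and the flag stands for the hypothetical -fallow-intervening-directives, which does not exist in flang.

```python
from enum import Enum

class Policy(Enum):
    DISALLOW = 1          # option 1: current behavior
    ALLOW_WITH_FLAG = 2   # option 2: opt-in via a (hypothetical) flag
    ALLOW_BY_DEFAULT = 3  # option 3: accepted as a default extension

def diagnose(policy, flag_given=False):
    """Return the outcome for a compiler directive found between an
    OpenACC loop directive and its DO loop."""
    if policy is Policy.DISALLOW:
        return "error"
    if policy is Policy.ALLOW_WITH_FLAG:
        return "accepted" if flag_given else "error"
    # Option 3 accepts the code but may still flag it as non-standard.
    return "warning"


for policy in Policy:
    print(policy.name, diagnose(policy), diagnose(policy, flag_given=True))
```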

Let’s assume we decide that ivdep can be converted to the independent clause. In this case we could:

  1. Collect all compiler directives that can be converted to OpenACC (and probably OpenMP) clauses and implement the conversions if both directives are associated with the same tree node, like DO loop.
  2. Emit a warning or a portability diagnostic that goes something like this:

    warning: ivdep directive following OpenACC loop directive is a non-standard extension.
    note: ivdep can be converted to independent clause of the corresponding OpenACC directive.

This could even be extended to converting non-standard directives into standard ones. For example, if the OpenMP feature is enabled and there is a standalone ivdep directive, the ivdep could be converted to a dedicated !$omp simd directive.

Yes, you are right. I can come up with 2 ways to achieve this:

  1. Change the parse tree to set connections between directives: either put one inside the other one or just add some kind of reference between them.
  2. Use some kind of pattern-matching of the tree in lowering or semantic passes.

I think that the first one is more reliable yet harder to implement. The second one is easier to implement, but it is probably more fragile with respect to changes in the semantic checks.
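
To show the difference between the two approaches, here is a sketch on a toy node type. Nothing here mirrors flang's real classes: Node, attach_directives and directives_before are hypothetical names used only for this comparison.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Node:
    kind: str   # "stmt", "acc", "dir" or "do" (illustrative kinds)
    text: str
    attached: List["Node"] = field(default_factory=list)  # used by approach 1

def attach_directives(block):
    """Approach 1: store an explicit reference from each DO loop to the
    compiler directives written just before it."""
    out, pending = [], []
    for node in block:
        if node.kind == "dir":
            pending.append(node)
            continue
        if node.kind == "do":
            node.attached.extend(pending)  # the link lives in the tree
        else:
            out.extend(pending)            # no loop follows: keep in place
        pending = []
        out.append(node)
    out.extend(pending)
    return out

def directives_before(block, do_index):
    """Approach 2: pattern-match on siblings whenever the information is
    needed, walking back over directives and OpenACC nodes."""
    found = []
    i = do_index - 1
    while i >= 0 and block[i].kind in ("dir", "acc"):
        if block[i].kind == "dir":
            found.append(block[i])
        i -= 1
    return list(reversed(found))


block = [
    Node("stmt", "bar = 1"),
    Node("acc", "!$acc parallel loop vector reduction(+:bar)"),
    Node("dir", "!dir$ ivdep"),
    Node("do", "do i = 1, 100, 1"),
]
matched = directives_before(block, 3)   # approach 2, on the flat block
linked = attach_directives(block)       # approach 1, builds the links
print([n.text for n in matched])
print([n.text for n in linked[-1].attached])
```

With approach 1 every later pass reads the link from the node, whereas approach 2 has to repeat the sibling walk and will break if an earlier pass reorders or nests the nodes.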

If we could convert all of the most commonly used non-standard directives to OpenMP / OpenACC clauses or directives, we would not need to solve this issue at all. But that is hardly possible.