[RFC] Proposal for a coarray implementation in LLVM Flang

Hi All,

Introduction

The Fortran 2008 standard introduced a new form of parallelism named “Coarray” that is integrated directly into the Fortran language and based on the SPMD model.
The Fortran 2018 standard extends coarrays with the notions of “teams” and “collective” procedures.
The advantage of coarrays is that only minor changes to existing Fortran code are required to take advantage of this parallelism.

GCC currently supports all coarray-related features from Fortran 2008 and 2018. Coarrays can be used in one of three modes, selected at compile time with the -fcoarray= option:
- none: Disables coarray features and reports an error when a coarray construct is encountered.
- single: An optimized version of coarrays that uses only one image.
- lib: A version based on a communication library, for which OpenCoarrays provides three backends (MPI, GASNet, ARMCI).
OpenCoarrays is an open-source library that provides the ABI used by the GCC Fortran frontend to build programs with coarrays, and that abstracts away the underlying communication library. [GitHub - sourceryinstitute/OpenCoarrays: A parallel application binary interface for Fortran 2018 compilers.]

Today, LLVM Flang can only parse and perform semantic analysis on code containing coarray features. Execution is not possible because the instructions and intrinsics related to coarrays are not yet implemented in LLVM.

Proposal

I’m currently working on integrating the OpenCoarrays library into LLVM.
In the same way as GCC (explained above), I’d like to integrate the -fcoarray= option, which would take none, single and lib as arguments.

To compile a program in lib mode with GCC, we also need to link against libcaf_mpi if we’re using the OpenCoarrays MPI backend, or against libcaf_gasnet and libcaf_mpi if we’re using the GASNet backend. These libraries need to be located in LD_LIBRARY_PATH or in a path given with the -L option.

However, GCC has a libcaf_single and libcaf_mpi in libgfortran, which it can link against to use coarrays. Flang doesn’t have these libraries. So it would be possible to add a backend selection option, such as -fcoarray-backend=, to tell LLVM Flang which of libcaf_mpi, libcaf_gasnet, or another library to link against. These libraries would need to be in LD_LIBRARY_PATH.

  • What is your feeling about this option?
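To make the question concrete, here is a minimal Python sketch of how a driver might translate the two options into linker arguments. It is only an illustration: the function name coarray_link_args and the backend keys "mpi" and "gasnet" are hypothetical, the -l spellings assume the usual convention for linking libcaf_mpi and libcaf_gasnet, and none of this is existing Flang behavior.

```python
# Hypothetical sketch of a driver-side mapping; not existing Flang behavior.

# Libraries per backend, following the description above
# (the GASNet backend is linked together with libcaf_mpi).
BACKEND_LIBS = {
    "mpi": ["caf_mpi"],
    "gasnet": ["caf_gasnet", "caf_mpi"],
}


def coarray_link_args(mode, backend=None):
    """Return the extra linker arguments for a given -fcoarray= mode."""
    if mode in ("none", "single"):
        # Assumption: neither mode needs a communication library.
        return []
    if mode != "lib":
        raise ValueError(f"unknown -fcoarray= value: {mode!r}")
    if backend not in BACKEND_LIBS:
        raise ValueError(f"unknown -fcoarray-backend= value: {backend!r}")
    return [f"-l{name}" for name in BACKEND_LIBS[backend]]
```

With this mapping, the backend only matters in lib mode, and an unknown backend is rejected at compile time rather than failing later at link time.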

Currently, OpenCoarrays can only be built with GCC. When trying to build this library with LLVM Flang, we encounter problems due to the missing implementation of coarray-related features.

Other problems stem from the way OpenCoarrays has been developed: it implements certain functions depending on the version of GCC used.

So today I have a proof of concept in which the vast majority of the Fortran 2008 features and some of the Fortran 2018 features are implemented in lib mode using calls to functions from the OpenCoarrays library.
However, I’m working with a version of libcaf built with GCC, so several symbols I use are prefixed with _gfortran_caf.

  • Is it appropriate to use symbols prefixed with _gfortran_caf in LLVM, even if this is currently the only way to use coarray functions from this library?

  • Should OpenCoarrays support flang-new before we integrate its use in LLVM? That would only be possible once flang-new implements the basic functionality of coarrays.

  • Can I first release a version that uses _gfortran_caf symbols?

At the end of my work, it will be possible to build the OpenCoarrays library with LLVM Flang, with symbols prefixed with, for example, _flang_caf.
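As a rough illustration of the renaming, the following Python sketch maps entry points from one prefix to the other. The helper name neutral_symbol is hypothetical, _flang_caf_ is only the example prefix mentioned above, and the sketch covers names only: it does not address differences in argument types or descriptors.

```python
# Hypothetical sketch: renaming runtime entry points by prefix.
GFORTRAN_PREFIX = "_gfortran_caf_"
FLANG_PREFIX = "_flang_caf_"  # example prefix only, not an agreed name


def neutral_symbol(symbol):
    """Map a _gfortran_caf_* symbol to its _flang_caf_* counterpart."""
    if not symbol.startswith(GFORTRAN_PREFIX):
        # Leave unrelated symbols untouched.
        return symbol
    return FLANG_PREFIX + symbol[len(GFORTRAN_PREFIX):]
```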

Thank you in advance for your feedback.
Best regards,

Jean-Didier

I would like the coarray runtime to not be dependent on compiler flags, but rather runtime controls (environment variables, probably), except perhaps to support the special case where doing so eliminates all runtime library overhead and turns coarray operations into load-store (which does not imply single). One can create a shared-memory optimized runtime that has no library calls on the critical path because coarray references compile to load-store operations against shared-memory segments (or atomics, obviously, if appropriate). One can do this with MPI, for example, by using MPI_Win_allocate_shared rather than MPI_Win_allocate (although there is a slight change here - for the better - in MPI 4.1, which we will vote on this week).

In particular, if we support multiple runtimes, e.g. MPI, GASNet and/or OpenSHMEM, this choice should depend on the runtime environment. This is necessary to make it possible to have a single binary that works everywhere. We should not enable or motivate a situation where we have app_gasnet.exe, app_mpi.exe, etc. as this puts a greater burden on the end user and diminishes the portability of the Fortran coarray model.
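A minimal Python sketch of what runtime-controlled selection could look like, assuming a hypothetical environment variable CAF_BACKEND and hypothetical backend names; a real runtime would do this in its startup code and then load the matching communication layer.

```python
import os

# Hypothetical backend names, in order of preference.
KNOWN_BACKENDS = ("mpi", "gasnet", "openshmem")


def select_backend(environ=None, available=KNOWN_BACKENDS):
    """Pick a communication backend at program start, not at compile time."""
    environ = os.environ if environ is None else environ
    requested = environ.get("CAF_BACKEND")  # hypothetical variable name
    if requested is None:
        # No preference expressed: take the first backend that is available.
        if not available:
            raise RuntimeError("no coarray backend available")
        return available[0]
    requested = requested.lower()
    if requested not in available:
        raise RuntimeError(f"coarray backend {requested!r} is not available")
    return requested
```

The same binary can then be launched with CAF_BACKEND=gasnet on one system and CAF_BACKEND=mpi on another, without rebuilding.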

We can observe the runtime dependent action in the GCC and LLVM implementations of parallel STL, which look for and use libtbb.so if available, otherwise they run sequentially. (At least this has been the behavior in the past - perhaps it has changed.) I do not love the silent failure to run in parallel but it is better than having binaries that do not work at all. For coarray programs, one can of course verify the use of parallelism with the image count.
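The "probe and fall back" pattern can be sketched in a few lines of Python. This is not how any of the libraries mentioned above are implemented; the library name caf_backend and the function choose_execution are hypothetical, and the sketch warns instead of failing silently.

```python
import ctypes.util
import warnings


def choose_execution(library="caf_backend", finder=ctypes.util.find_library):
    """Return ("parallel", name) if the library is found, else ("single", None)."""
    located = finder(library)
    if located is None:
        # Make the fallback visible rather than silently running one image.
        warnings.warn(f"lib{library} not found; running with a single image")
        return ("single", None)
    return ("parallel", located)
```

The finder argument is injectable only so that the lookup can be exercised without the library being installed.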

I would prefer to not use the gfortran namespace anywhere. If we use OpenCoarrays, we should figure out different entry points, to make it unambiguous that nothing depends on the GCC license.

For context, I have contributed to the development of MPI 3+ RMA, OpenSHMEM, ARMCI-MPI, Global Arrays, OpenSHMEM-over-MPI, and - to a small degree - both OpenCoarrays and Intel’s implementation of coarray Fortran. On the other hand, I know virtually nothing about compilers. I hope that once somebody else defines a coarray Fortran runtime library API that the compiler uses, I will be able to contribute to the MPI and possibly OpenSHMEM implementations of said library.

Finally, while my employer is heavily involved in the Flang project, these comments are entirely my own.

Hi Jean-Didier,

Welcome to flang! Thank you for offering to help with coarrays!

I think the various components of a coarray implementation are:

  1. Parsing

  2. Semantics

  3. Tests for parsing and semantics

  4. Design document for a coarray MLIR dialect, runtime model, and lowering

  5. Extensions of HLFIR and FIR for coarrays

  6. Lowering of coarrays to HLFIR and FIR

  7. Tests for the MLIR lowering

  8. Implement/integrate the runtime library

  9. Unit tests for runtime API

  10. Lowering to the runtime API

  11. End-to-end tests for coarray in each model

  12. Driver options and linker pipeline

  13. Tests for driver and linker pipeline

To which of these steps do you intend to contribute and support?

Before going too far, I think we should have a document covering point 4, a design document for a coarray MLIR dialect, runtime model, and lowering.

A good model for this document is llvm-project/flang/docs/ParameterizedDerivedTypes.md

- Steve

P.S. If you’d like to split up some work, let us know what you are interested in doing.

Thanks @JDPailleux for the coarray proposal.

@rouson, @ktras, Brad and others have been working on the coarray implementation of Flang. Yesterday, in the Flang call, @rouson mentioned that the plan is to use Caffeine, since it does not have dependencies on gfortran and it also works with multiple backends. I believe they will post a reply here. Just adding below a few documents that were submitted previously regarding their work.

Hi @jeffhammond, is what you describe at the beginning, relying on runtime controls rather than compiler flags, closer to Intel’s approach to coarrays?
I also agree that we should avoid using the gfortran namespace and find a better entry point to remove this dependency. This was one of my concerns about whether using gfortran symbols would be accepted. Thank you.

Thank you very much for your reply and for the links. I will have a look at the work done by @rouson and the others.

A coarray implementation that pushes all runtime model dependences into the runtime library would necessarily miss compile-time loop optimization opportunities that benefit the potentially highest-performing runtime models. I want those opportunities to be available, at least by way of an option, even when it would restrict execution to fewer models (or to one).

The amount of work remaining to complete semantics for coarrays seems to overshadow any of the other tasks, but nobody has contributed to that area in a long time.

@JDPailleux, thanks for your interest in LLVM Flang and OpenCoarrays. I’ve led the OpenCoarrays project since inception, although I wrote only a tiny fraction of the core OpenCoarrays library and have contributed mainly through fundraising and writing the installer. I now lead the project that contributes to LLVM Flang at Berkeley Lab.

Overall, I recommend against incorporating OpenCoarrays into LLVM Flang. Our current thinking on a path forward for developing a runtime library supporting parallel Fortran 2018 features is summarized in the following paper: “Caffeine: CoArray Fortran Framework of Efficient Interfaces to Network Environments”. Subsequent to that paper being presented, we started working on a compiler-independent and runtime-independent Coarray Fortran Parallel Runtime Interface (CAF-PRI) Design Document, as cited in this thread by @kiranchandramohan. Importantly, OpenCoarrays is neither compiler-independent nor runtime-independent. The gfortran dependence is not just about a namespace: roughly half of the functions in OpenCoarrays require a gfortran descriptor that, FWIW, is not fully compliant with the Fortran 2018 standard CFI_cdesc_t descriptor. Also, as noted in several OpenCoarrays README files, the only currently maintained communication layer is the one based on MPI. The others are present only for archival purposes because they are cited in publications. Finally, OpenCoarrays lacks full support for several Fortran 2018 parallel features.

@kiranchandramohan to clarify, the CAF-PRI Design Document under development is intended to facilitate the use of any backend that implements the interface. We would like to develop Caffeine as one such alternative based on the exascale networking middleware GASNet-EX, which is also developed here at Berkeley Lab and offers some performance benefits relative to MPI. However we expect others might contribute implementations of the interface targeting different communication layers.

I hope this helps and I look forward to further discussion.

Intel Fortran also has single/shared/distributed as a compile-time option (docs). I don’t love the Intel design because the interaction with MPI is annoying. They require Intel MPI and make it hard/impossible to use MPICH, even though the libraries are ABI-compatible.

The pattern so far has been to model programming models as MLIR dialects:
OpenMP, OpenACC, SYCL, MPI, …
Coarrays should follow that pattern.