We, the IBM XL Fortran compiler team, are interested in representing Fortran alias information in LLVM IR. We use the XL Fortran frontend to emit LLVM IR that includes alias information, and feed that IR to LLVM to create object files. For the Fortran alias representation in LLVM IR, we considered both TBAA and the scoped alias.scope/noalias metadata. We think the alias.scope/noalias metadata is more appropriate for the refined alias information available in Fortran. The XL Fortran frontend emits alias information in terms of which other symbols a given symbol may alias. We experimented with a scheme that represents this alias relation using noalias and alias.scope metadata in LLVM IR. An example is shown in the attached slides, and the full .ll file for the example is also attached.
In this experiment, we observe that the performance gain varies from workload to workload, ranging from a few percent to 2x. Compile time and the size of the IR increase as well.
We briefly investigated the possible causes of the long compile times and the large IR size. For compile-time performance, we observe:
- Each alias query (ScopedNoAliasAAResult::mayAliasInScopes) partitions a metadata set based on the domains of its metadata elements. Pre-partitioning the metadata sets and maintaining the partitions on updates may help.
- Intersection of noalias sets is O(n^2) as metadata elements do not have any ordering. Defining some order on the elements can help significantly.
- Some optimizations do not scale well when the size of the working instruction set increases, e.g. SCEV functions.
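
The first two points can be sketched in a simplified model. This is only an illustration and not LLVM's actual data structures: a scope is modelled as a hypothetical (domain id, scope id) pair of integers, and the names Scope, Partitioned, partitionByDomain and intersectSorted are made up for this sketch. The mayAliasInScopes function here is a free function that mirrors, as we understand it, the per-domain subset check of the real query, assuming the partitions were built once ahead of time and are kept sorted.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <map>
#include <utility>
#include <vector>

// Hypothetical stand-in for a scope metadata node: (domain id, scope id).
using Scope = std::pair<int, int>;
// Scopes grouped by domain; each group is kept sorted and duplicate-free.
using Partitioned = std::map<int, std::vector<int>>;

// Build the per-domain partition once, instead of on every alias query.
Partitioned partitionByDomain(const std::vector<Scope> &scopes) {
  Partitioned out;
  for (const Scope &s : scopes)
    out[s.first].push_back(s.second);
  for (auto &entry : out) {
    std::vector<int> &ids = entry.second;
    std::sort(ids.begin(), ids.end());
    ids.erase(std::unique(ids.begin(), ids.end()), ids.end());
  }
  return out;
}

// Returns false (no alias) if, for some domain, every scope of one access
// in that domain appears in the other access's noalias list.
bool mayAliasInScopes(const Partitioned &aliasScopes,
                      const Partitioned &noAlias) {
  for (const auto &entry : noAlias) {
    auto it = aliasScopes.find(entry.first);
    if (it == aliasScopes.end() || it->second.empty())
      continue;
    // Both ranges are sorted, so the subset test is linear.
    if (std::includes(entry.second.begin(), entry.second.end(),
                      it->second.begin(), it->second.end()))
      return false;
  }
  return true;
}

// With an ordering on the elements, intersection is O(n + m), not O(n * m).
std::vector<int> intersectSorted(const std::vector<int> &a,
                                 const std::vector<int> &b) {
  std::vector<int> out;
  std::set_intersection(a.begin(), a.end(), b.begin(), b.end(),
                        std::back_inserter(out));
  return out;
}
```

The ordering used here is simply integer comparison on the made-up scope ids; which ordering would be practical for real metadata nodes is an open question for this thread.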
For the size of LLVM IR, the noalias metadata requires a flattened set of metadata nodes. A hierarchical representation can reduce memory footprint.
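
One possible shape for such a hierarchical representation is sketched below, purely as an assumption for discussion. Scopes form a tree, and a noalias list that names a parent scope implicitly covers all scopes beneath it, so the list does not need to enumerate every leaf. ScopeTree, addScope, isWithin and coveredBy are hypothetical names and do not correspond to anything in LLVM today.

```cpp
#include <cassert>
#include <vector>

// Hypothetical scope tree: parent[i] is the parent of scope i, or -1 for a root.
struct ScopeTree {
  std::vector<int> parent;

  // Adds a scope under parentId (or as a root) and returns its id.
  int addScope(int parentId = -1) {
    parent.push_back(parentId);
    return static_cast<int>(parent.size()) - 1;
  }

  // True if `scope` is `ancestor` itself or lies anywhere beneath it.
  bool isWithin(int scope, int ancestor) const {
    for (int s = scope; s != -1; s = parent[s])
      if (s == ancestor)
        return true;
    return false;
  }
};

// True if `scope` is covered by any entry of a hierarchical noalias list.
bool coveredBy(const ScopeTree &tree, int scope,
               const std::vector<int> &noAlias) {
  for (int n : noAlias)
    if (tree.isWithin(scope, n))
      return true;
  return false;
}
```

In this model, a noalias list holding a single parent scope replaces a flattened list of all of that parent's descendants, at the cost of walking up the tree during a query.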
With these findings, we would like to start a thread to discuss how to express Fortran alias information in LLVM IR. Any comments, as well as information about previous approaches, are welcome.
Fortran alias in LLVM.pdf (213 KB)
example.ll (2.59 KB)