[RFC][Flang] Representation for objects inside physical storage

szakharin · August 22, 2025, 11:26pm

Hello,

I would like to propose and discuss representation of objects inside the physical storage to aid the generation of more optimal TBAA in Flang.

This came up during my attempt to generate distinct TBAA sub-trees for variables inside COMMON blocks: [flang] Create TBAA subtree for COMMON block variables. by vzakhari · Pull Request #153918 · llvm/llvm-project · GitHub

For the following Fortran code:

common /blk/ a, b, c
a = b + c

I would like to create TBAA tags using different sub-tree roots to make it clear that a, b, c cannot alias. The TBAA tree structure for the particular function containing a = b + c would look like this:

  "global data"
  |
  |- "blk_"
     |
     |- "blk_/a" <- TBAA tag for a uses this root
     |- "blk_/b" <- TBAA tag for b uses this root
     |- "blk_/c" <- TBAA tag for c uses this root

There is a problem, though, with variables participating in EQUIVALENCE statements, e.g.:

module data2
  real :: glob1, glob2, glob3
  equivalence (glob1, glob2)
  common /glob1c/ glob1, glob3
end module data2
subroutine test2
  use data2
  glob1 = glob2 + glob3
end subroutine test2

glob1 and glob2 alias and reside at the same 0 offset within glob1c storage, and they are repsented with different fir.declare operations at the point Flang assigns TBAA tags:

  fir.global common @glob1c_(dense<0> : vector<8xi8>) {alignment = 4 : i64} : !fir.array<8xi8>
  func.func @_QPtest2() {
    %1 = fir.address_of(@glob1c_) : !fir.ref<!fir.array<8xi8>>
    %2 = fir.convert %1 : (!fir.ref<!fir.array<8xi8>>) -> !fir.ref<!fir.array<?xi8>>
    %3 = fir.coordinate_of %2, %c0 : (!fir.ref<!fir.array<?xi8>>, index) -> !fir.ref<i8>
    %4 = fir.convert %3 : (!fir.ref<i8>) -> !fir.ptr<f32>
    %5 = fir.declare %4 {uniq_name = "_QMdata2Eglob1"} : (!fir.ptr<f32>) -> !fir.ptr<f32>
    %6 = fir.declare %4 {uniq_name = "_QMdata2Eglob2"} : (!fir.ptr<f32>) -> !fir.ptr<f32>
    %7 = fir.coordinate_of %2, %c4 : (!fir.ref<!fir.array<?xi8>>, index) -> !fir.ref<i8>
    %8 = fir.convert %7 : (!fir.ref<i8>) -> !fir.ref<f32>
    %9 = fir.declare %8 {uniq_name = "_QMdata2Eglob3"} : (!fir.ref<f32>) -> !fir.ref<f32>

Flang’s AddAliasTags pass discovers that the source of all the loads and stores is the global glob1c_, but there is little information available about the overlapping (if any) between the members of the physical storage. Flang Lowering deliberately using !fir.ptr type for the fir.declare operations related to the member variables that participate in any EQUIVALENCE. I used this fact to generate TBAA tags like this:

  "global data"
  |
  |- "glob1c_" <- TBAA tags for glob1 and glob2 use this root
     |
     |- "glob1c_/glob3" <- TBAA tag for glob3 uses this root

This works, but it is suboptimal, because neither glob1 nor glob2 alias with glob3, whereas TBAA does not convey this information. The actual TBAA representation I want to have is the following:

  "global data"
  |
  |- "glob1c_"
     |
     |- "glob1c_/glob3" <- TBAA tag for glob3 uses this root
     |- "glob1c_/<TBD>" <- TBAA tag for glob1 and glob2 use this root

Basically, I need to create subsets of the members within a single COMMON block, where:

In each subset, a variable overlaps with at least one another variable in the subset.
Two variables from different subsets never overlap.

This will allow me to create different TBAA sub-tree for each subset.

I do not think we can generate better TBAA for variables withing the subsets. At least Clang is not generating any useful TBAA for C unions, where the same [partial ]overlapping of the members happen.

I think recognizing the fir.coordinate arithmetic in the AddAliasTags to discover the overlapping members is not the right approach. Flang FE has all the information about the offsets of the members within the storage, so I think we should propagate it up to AddAliasTags pass. Other passes may also benefit from it, e.g. AddDebugInfoPass is processing fir.coordinate arithmetic currently and it is only looking at particular sequence of operations, which is not very reliable.

I am suggesting adding a AnyReferenceLike operand and an IntegerAttr to [hl]fir.declare that will represent a reference to the beginning of the physical storage and the byte offset within the storage where the variable starts. The pair is optional, and its absence means that the variable’s location is known to reside at offset 0 in memory as defined by [hl]fir.declare’s memref.

Example for the above case:

    %1 = fir.address_of(@glob1c_) : !fir.ref<!fir.array<8xi8>>
    %2 = fir.convert %1 : (!fir.ref<!fir.array<8xi8>>) -> !fir.ref<!fir.array<?xi8>>
    %3 = fir.coordinate_of %2, %c0 : (!fir.ref<!fir.array<?xi8>>, index) -> !fir.ref<i8>
    %4 = fir.convert %3 : (!fir.ref<i8>) -> !fir.ptr<f32>
    %5 = fir.declare %4 (storage %1+0) {uniq_name = "_QMdata2Eglob1"} : (!fir.ptr<f32>) -> !fir.ptr<f32>
    %6 = fir.declare %4 (storage %1+0) {uniq_name = "_QMdata2Eglob2"} : (!fir.ptr<f32>) -> !fir.ptr<f32>
    %7 = fir.coordinate_of %2, %c4 : (!fir.ref<!fir.array<?xi8>>, index) -> !fir.ref<i8>
    %8 = fir.convert %7 : (!fir.ref<i8>) -> !fir.ref<f32>
    %9 = fir.declare %8 (storage %1+4) {uniq_name = "_QMdata2Eglob3"} : (!fir.ref<f32>) -> !fir.ref<f32>

Another example for local EQUIVALENCE variables:

subroutine test()
  real :: local1(10), local2(10)
  equivalence(local1(2), local2(5))
end subroutine test

HLFIR:

    %1 = fir.alloca !fir.array<52xi8> {uniq_name = "_QFtestElocal1"}
    %c12 = arith.constant 12 : index
    %2 = fir.coordinate_of %1, %c12 : (!fir.ref<!fir.array<52xi8>>, index) -> !fir.ref<i8>
    %3 = fir.convert %2 : (!fir.ref<i8>) -> !fir.ptr<!fir.array<10xf32>>
    %c10 = arith.constant 10 : index
    %4 = fir.shape %c10 : (index) -> !fir.shape<1>
    %5:2 = hlfir.declare %3(%4) (storage %1+12) {uniq_name = "_QFtestElocal1"} : (!fir.ptr<!fir.array<10xf32>>, !fir.shape<1>) -> (!fir.ptr<!fir.array<10xf32>>, !fir.ptr<!fir.array<10xf32>>)
    %c0 = arith.constant 0 : index
    %6 = fir.coordinate_of %1, %c0 : (!fir.ref<!fir.array<52xi8>>, index) -> !fir.ref<i8>
    %7 = fir.convert %6 : (!fir.ref<i8>) -> !fir.ptr<!fir.array<10xf32>>
    %c10_0 = arith.constant 10 : index
    %8 = fir.shape %c10_0 : (index) -> !fir.shape<1>
    %9:2 = hlfir.declare %7(%8) (storage %1+0) {uniq_name = "_QFtestElocal2"} : (!fir.ptr<!fir.array<10xf32>>, !fir.shape<1>) -> (!fir.ptr<!fir.array<10xf32>>, !fir.ptr<!fir.array<10xf32>>)

Having this information, I can construct the subsets of overlapping COMMON members (the size of the members is always static).

Please let me know if you see any issues with that representation or if you can think of a different representation that might be more useful in other parts of Flang.

Thank you!

szakharin · August 22, 2025, 11:27pm

@tblah @abidh FYI

szakharin · August 25, 2025, 11:54pm

Here is the initial PR modifying the operations’ definitions: [flang] Added storage specification for [hl]fir.declare. by vzakhari · Pull Request #155325 · llvm/llvm-project · GitHub

tblah · August 26, 2025, 10:11am

This looks great to me. TBAA can describe offsets of structure members although I am not sure how easy that would be to take advantage of given the atypical way we are using TBAA. I think your solution is better anyway because it is more consistent with what already exists.

abidh · August 26, 2025, 3:57pm

This looks good to me. As you mentioned, AddDebugInfo pass could use this information and get rid of the manually traversing the IR.

jeanPerier · September 2, 2025, 2:07pm

+1, thanks @szakharin !

The “global data” entry in your example is under the current function TBAA tree or in a tree at the compilation unit level? I am asking to check if you need to deal with the cases where a same common block is defined with different members in different functions of the same compilation unit.

szakharin · September 2, 2025, 6:33pm

Yes, Jean, the “global data” entry is under the function’s TBAA root, as usual.

szakharin · September 3, 2025, 5:56pm

For information, there are two patches in review:

github.com/llvm/llvm-project

[flang] Create TBAA subtree for COMMON block variables.

main ← vzakhari:tbaa_common2

opened 12:16AM - 03 Sep 25 UTC

vzakhari

+1058 -40

In order to help LLVM disambiguate accesses to the COMMON block variables, this… patch creates a TBAA sub-tree for each COMMON block, and the places all variables belonging to this COMMON block into this sub-tree. The structure looks like this: ``` common /blk/ a, b, c "global data" | |- "blk_" | |- "blk_/bytes_0_to_3" |- "blk_/bytes_4_to_7" |- "blk_/bytes_8_to_11" ``` The TBAA tag for "a" is created in "blk_/bytes_0_to_3" root, etc. The byte spans are created based on the `storage` information provided by `fir.declare` (#155742). See also: https://discourse.llvm.org/t/rfc-flang-representation-for-objects-inside-physical-storage/88026

Topic		Replies	Views
[RFC] Propagate FIR Alias Analysis Information using TBAA Flang	8	798	April 28, 2025
RFC: Representing unions in TBAA LLVM Dev List Archives	92	826	August 23, 2017
RFC: Generate plain !tbaa tags in place of !tbaa.struct ones LLVM Dev List Archives	7	176	December 7, 2017
Request for comments: TBAA on call LLVM Dev List Archives	21	232	October 13, 2013
An ambiguity in TBAA info format LLVM Dev List Archives	5	144	November 2, 2017

[RFC][Flang] Representation for objects inside physical storage

Related topics