Annotating loops with codegen options

arpith · February 18, 2021, 2:59pm

It is sometimes beneficial to annotate loops generated in LLVM with metadata to control code generation. See llvm.loop, parallel_access, and friends.

We would like a way to do so from high-level MLIR constructs. E.g. scf::ParallelOp may lower to LLVM IR with the parallel_access annotation. This RFC proposes a way to control this generation of LLVM metadata using MLIR attributes. See also the associated code:
https://reviews.llvm.org/D96820

During lowering of ParallelOp (or a similar op) to the standard dialect, we tag the branch instruction in the loop latch block with a loop name and associate that name with a list of loop option attributes attached to the module operation. This is carried through the lowering into the LLVM dialect, as shown:

module attributes {llvm.loops = {loop1 = [
  #llvm.loopopt<parallel_access = true>,
  #llvm.loopopt<disable_unroll = true>,
  #llvm.loopopt<disable_licm = true>,
  #llvm.loopopt<interleave_count = 1>]}} {
  llvm.func @loopOptions(%arg1 : i32, %arg2 : i32) {
      %0 = llvm.mlir.constant(0 : i32) : i32
      llvm.br ^bb3(%0 : i32)
    ^bb3(%1: i32):
      %2 = llvm.icmp "slt" %1, %arg1 : i32
      llvm.cond_br %2, ^bb4, ^bb5 {llvm.loop = "loop1"}
    ^bb4:
      %3 = llvm.add %1, %arg2  : i32
      llvm.br ^bb3(%3 : i32) {llvm.loop = "loop1"}
    ^bb5:
      llvm.return
  }
}

This information is used during translation to generate the required metadata, which is attached to the branch instruction. For example:

!7 = distinct !{!7, !8, !10, !11, !12}
!8 = !{!"llvm.loop.parallel_accesses", !9}
!9 = distinct !{}
!10 = !{!"llvm.loop.unroll.disable"}
!11 = !{!"llvm.licm.disable"}
!12 = !{!"llvm.loop.interleave.count", i32 1}
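
As a rough illustration of the mapping from loop options to metadata nodes, here is a minimal Python sketch. It is not the translation code from D96820: the function name `render_loop_metadata`, the `first_id` parameter, and the node numbering scheme are assumptions made only to reproduce the textual output above.

```python
def render_loop_metadata(options, first_id=7):
    """Render a loop-options dict as textual LLVM metadata lines.

    Hypothetical helper: option names follow the RFC's attribute names,
    and unknown or disabled options are silently skipped.
    """
    loop_id = first_id
    next_id = first_id + 1
    operands, lines = [], []

    def fresh():
        # Hand out the next unused metadata id.
        nonlocal next_id
        node_id = next_id
        next_id += 1
        return node_id

    for name, value in options.items():
        if name == "parallel_access" and value:
            # The option node points at a distinct, empty access-group node.
            node, group = fresh(), fresh()
            lines.append(f'!{node} = !{{!"llvm.loop.parallel_accesses", !{group}}}')
            lines.append(f"!{group} = distinct !{{}}")
        elif name == "disable_unroll" and value:
            node = fresh()
            lines.append(f'!{node} = !{{!"llvm.loop.unroll.disable"}}')
        elif name == "disable_licm" and value:
            node = fresh()
            lines.append(f'!{node} = !{{!"llvm.licm.disable"}}')
        elif name == "interleave_count":
            node = fresh()
            lines.append(f'!{node} = !{{!"llvm.loop.interleave.count", i32 {value}}}')
        else:
            continue
        operands.append(node)

    # The loop node is distinct and lists itself as its first operand.
    refs = ", ".join(f"!{i}" for i in [loop_id] + operands)
    return [f"!{loop_id} = distinct !{{{refs}}}"] + lines


if __name__ == "__main__":
    opts = {"parallel_access": True, "disable_unroll": True,
            "disable_licm": True, "interleave_count": 1}
    print("\n".join(render_loop_metadata(opts)))
```

Note that the access-group node (`!9` above) is allocated as part of handling `parallel_access`, which is why the remaining option nodes start at `!10`.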

Feedback on this approach is appreciated. If there is a better way to model this information in MLIR please let me know.

ftynse · February 18, 2021, 3:21pm

I think this is a reasonable modeling that does not require introducing new core concepts into MLIR.

If we were to introduce new concepts, we could add a mechanism to generate uninspectable “token” attributes, unique in context. Such attributes can be used as identifiers inside other attributes, e.g. for loops here. In printed form, they would be identified by some transient names that are generated by the printer (alternatively, the attribute can store a unique increasing value with the context having a counter of such values).
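
To make the difference between ordinary (uniqued) attributes and such “token” attributes concrete, here is a toy Python model. It is only a sketch of the counter-based variant mentioned above; the `Context` class and its method names are hypothetical and do not correspond to any MLIR API.

```python
import itertools


class Context:
    """Toy context: uniques ordinary attributes, mints fresh tokens."""

    def __init__(self):
        self._uniqued = {}                 # value -> the single shared attribute
        self._counter = itertools.count()  # source of increasing token ids

    def get_attr(self, value):
        """Uniqued attribute: equal values always yield the same object."""
        return self._uniqued.setdefault(value, ("attr", value))

    def new_token(self):
        """Token attribute: every call yields a new, distinct identifier."""
        return ("token", next(self._counter))


if __name__ == "__main__":
    ctx = Context()
    print(ctx.get_attr("x") is ctx.get_attr("x"))  # same uniqued object
    print(ctx.new_token() == ctx.new_token())      # tokens never compare equal
```

In this model a token carries no inspectable payload other than its identity, so it can serve as a loop identifier inside other attributes without relying on string names.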

mehdi_amini · February 18, 2021, 6:12pm

It seems to me that you’re the first one to hit the discrepancy between how metadata is handled in LLVM and what we have in MLIR so far. This is interesting, thanks for bringing it up!

In particular, LLVM has the concept of “distinct” metadata nodes, which are stored in the context and share its lifetime, but aren’t uniqued there.
We don’t have this in MLIR, and you’re trying to reproduce this storage as a custom dictionary attribute on the module, with string-stitching to keep references from every operation.

Unfortunately this seems quite ad-hoc to me and reminds me of the “JSON of compiler” analogy we had in many slides presenting the concept of MLIR. As such I’m not convinced by this design at the moment:

Anyone manipulating the module has to know about the semantics of the attached attributes: they become load-bearing and can’t just be dropped. This is in contrast to LLVM, where the distinct metadata is attached to the operations directly and only the operation knows about it. We should avoid such use of attributes as much as possible. An LLVM Module linker does not need to handle such attributes in LLVM, and forcing it in MLIR just indicates a scaling issue with the design.
The “string stitching” is a very fragile mechanism for maintaining the reference: it is out-of-band from anything that MLIR can track (compared to symbolic references, for example). I’m opposed to such modeling in general: it indicates a hole in the system that requires design, and such a solution is a “non-design” approach (path of least resistance).

Since this won’t be the last time we hit the issue of distinct metadata when we reach the LLVM dialect, I’ll insist that we take the time to design it properly.
We could for instance (non-exhaustive list):

Extend MLIR Core, for example by adding the concept of “distinct attributes” in the MLIRContext directly to map what LLVM allows. We could also look into a refcounting system or something along these lines.
Not change MLIR and find a way to store it. Instead of string-stitching and out-of-band JSON-like references to maintain the integrity of the system, I’d favor stronger modeling, for instance using symbolic references to operations in a special llvm.metadata region living in the module.
Here is a quick draft of a modeling that relies on a more structured relationship in MLIR:

module {
 llvm.func @loopOptions(%arg1 : i32, %arg2 : i32) {
      %0 = llvm.mlir.constant(0 : i32) : i32
      llvm.br ^bb3(%0 : i32)
    ^bb3(%1: i32):
      %2 = llvm.icmp "slt" %1, %arg1 : i32
      llvm.cond_br %2, ^bb4, ^bb5 {llvm.loop = @_metadatas::@loop1}
    ^bb4:
      %3 = llvm.add %1, %arg2  : i32
      llvm.br ^bb3(%3 : i32) {llvm.loop = @_metadatas::@loop1}
    ^bb5:
      llvm.return
  }
 llvm.metadatas @_metadatas {
  llvm.loop @loop1 { parallel_access = true, disable_unroll = true, disable_licm = true, interleave_count = 1 }
 }
}

Using an operation to contain the LLVM metadata feels more structured, generalizes the concept of distinct metadata better, and provides a unique point of entry for verification and manipulation. This is a technique we used for shape functions and in other similar situations. Relying on symbolic references instead of custom strings leverages the existing MLIR infrastructure instead of reimplementing it, allowing, for example, auto-renaming of the metadata when inserting a new one, better verification, and reuse of the existing use-def infrastructure associated with symbolic references.
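
The benefits of symbolic references over free-form strings can be sketched with a toy symbol table in Python. This is an illustration only, assuming a simplified model: the `SymbolTable` class, its `insert` renaming scheme, and the `resolve` function are hypothetical and are not MLIR's actual symbol table implementation.

```python
class SymbolTable:
    """Toy symbol table: maps names to nested tables or payloads."""

    def __init__(self):
        self.symbols = {}

    def insert(self, name, op):
        """Insert op under name, renaming on collision; return the final name."""
        unique, suffix = name, 0
        while unique in self.symbols:
            unique = f"{name}_{suffix}"
            suffix += 1
        self.symbols[unique] = op
        return unique


def resolve(root, ref):
    """Resolve a nested reference such as '@_metadatas::@loop1'.

    Raises KeyError for a dangling reference, which is the kind of
    verification that plain string tags cannot provide.
    """
    node = root
    for part in ref.split("::"):
        if not part.startswith("@"):
            raise ValueError(f"malformed symbol reference: {ref!r}")
        node = node.symbols[part[1:]]
    return node


if __name__ == "__main__":
    module = SymbolTable()
    metadatas = SymbolTable()
    module.insert("_metadatas", metadatas)
    metadatas.insert("loop1", {"disable_unroll": True})
    print(resolve(module, "@_metadatas::@loop1"))
    # A second insertion under the same name is renamed instead of clobbering.
    print(metadatas.insert("loop1", {"interleave_count": 1}))
```

Because the table owns the names, inserting a second `loop1` yields a fresh name rather than silently aliasing the first, and a reference to a missing symbol fails at lookup time.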

mehdi_amini · February 18, 2021, 7:45pm

@River707 pointed out to me that the LLVM LangRef indicates that the !llvm.loop metadata isn’t intended to act as a unique identifier for loops. I’m a bit confused now about where the distinct comes into play, then.
From the LLVM Language Reference Manual (LLVM 18.0.0git documentation):

Loop metadata nodes cannot be used as unique identifiers. They are neither persistent for the same loop through transformations nor necessarily unique to just one loop.

This would fit the use of a regular attribute; we should dig a bit more into LLVM here.

mehdi_amini · February 19, 2021, 1:13am

Following up on llvm-dev@ FYI: [llvm-dev] !llvm.loop ID metadata clarification

mehdi_amini · February 19, 2021, 5:52pm

Michael confirmed that we don’t need distinct here, so it seems we can use an attribute directly on the branch, without any indirection!

mehdi_amini · February 19, 2021, 9:03pm

To clarify my last comment, that means we can safely do:

module {
  llvm.func @loopOptions(%arg1 : i32, %arg2 : i32) {
      %0 = llvm.mlir.constant(0 : i32) : i32
      llvm.br ^bb3(%0 : i32)
    ^bb3(%1: i32):
      %2 = llvm.icmp "slt" %1, %arg1 : i32
      llvm.cond_br %2, ^bb4, ^bb5 {llvm.loop = { parallel_access = true, disable_unroll = true} }
    ^bb4:
      %3 = llvm.add %1, %arg2  : i32
      llvm.br ^bb3(%3 : i32) {llvm.loop = { parallel_access = true, disable_unroll = true} }
    ^bb5:
      llvm.return
  }
}

mehdi_amini · February 20, 2021, 6:42pm

Actually, parallel_accesses isn’t a boolean: it points to a distinct empty metadata node that represents an “access group” (see the LLVM Language Reference Manual, LLVM 16.0.0git documentation).

Memory accesses (load/store) can also point to the same access group metadata, in which case the loop does not carry a dependency for such memory accesses.

We’re back to needing a “unique identifier” (for the access group) that has to be shared by a specific set of operations.
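
Why identity matters for access groups can be shown with a small Python model. This is a simplified sketch, assuming each memory access belongs to at most one group (LLVM also allows an access to list several); the `AccessGroup` class and `carries_no_dependency` function are hypothetical names.

```python
class AccessGroup:
    """Stand-in for a distinct empty metadata node: identity is all it has."""


def carries_no_dependency(loop_groups, access_groups):
    """True if every memory access is tagged with a group the loop lists.

    loop_groups: groups named by the loop's parallel_accesses option.
    access_groups: one entry per load/store in the loop (None if untagged).
    """
    return all(any(g is lg for lg in loop_groups) for g in access_groups)


if __name__ == "__main__":
    group_a = AccessGroup()
    group_b = AccessGroup()  # structurally identical, yet a different group

    print(carries_no_dependency([group_a], [group_a, group_a]))
    print(carries_no_dependency([group_a], [group_a, group_b]))
    print(carries_no_dependency([group_a], [group_a, None]))
```

Two empty groups have the same contents, so a value-uniqued attribute would merge them; only a distinct identity lets the loop and a specific set of loads and stores agree on which group they share.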

arpith · March 3, 2021, 6:08am

Updated https://reviews.llvm.org/D96820 to use an llvm.metadata operation as described in #3.
