[RFC] Allow symbol references to escape from SymbolTable

This RFC proposes to add a mechanism for references within a symbol table to target symbols defined outside of that symbol table.

Background

Currently, it is impossible to reference, from within a symbol table, symbols that are defined outside that symbol table. From the symbol table documentation:

// This `func.func` operation defines a symbol named `symbol`.
func.func @symbol()

// Here we define a nested symbol table. References within this operation will
// not resolve to any symbols defined above.
module {
  // Error. We resolve references with respect to the closest parent operation
  // that defines a symbol table, so this reference can't be resolved.
  "foo.user"() {uses = [@symbol]} : () -> ()
}

However, it is often desirable to define scopes and nesting of symbols while still being able to navigate back to the parent scope. Many programming languages offer this feature in their item path references, which is hard to model in MLIR using symbols.

The issue has already been encountered in upstream MLIR, for example in IRDL:

irdl.dialect @some_dialect { // symbol table
    irdl.type @foo
}

// From outside the dialect definition, some_dialect.foo can easily
// be referenced.
%foo_type_from_outside = irdl.base @some_dialect::@foo

irdl.dialect @other_dialect { // symbol table
    irdl.type @bar

    irdl.operation @baz {
        // Referencing a type defined in the same dialect is possible.
        %bar_type = irdl.base @bar

        // Referencing another dialect is however impossible.
        %foo_type = irdl.base @??? // How to reference some_dialect.foo here?
        
        // ...
    }
}

Proposal

We propose a way for a symbol reference to target symbol tables that are ancestors of the current symbol table. This is achieved by prefixing symbol references with a super-like construct, for which the proposed syntax is @.super. Symbol tables would no longer confine inner symbol references to themselves.

A general-purpose super-like referencing construct has the unfortunate consequence of creating multiple paths to reference the same symbol. Indeed, the symbol reference @foo::@bar would now be equivalent to @foo::@.super::@foo::@bar. To limit this effect and keep the implementation simple, we propose that @.super may only be used at the beginning of a symbol reference. While @.super::@foo::@bar and @bar are still equivalent inside a symbol table called @foo, the implementation does not need backward lookups. It may nonetheless be interesting to offer infrastructure to canonicalize symbol references at a later time, as passes like CSE may not be able to prove equality between symbols otherwise.
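
For illustration, here is a hedged sketch of this equivalence (the foo.user operation is a placeholder symbol user, as in the documentation example above):

// Hypothetical example under the proposed semantics.
module {
    module @foo {
        func.func @bar()

        // Both references resolve to the same function. The first stays
        // within @foo; the second goes up to the parent module and back
        // down through @foo. A pass such as CSE would need extra
        // infrastructure to prove this equality, hence the canonicalization
        // suggestion.
        "foo.user"() {uses = [@bar]} : () -> ()
        "foo.user"() {uses = [@.super::@foo::@bar]} : () -> ()
    }
}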

The syntax for a symbol reference would now be:

symbol-part ::= `@` suffix-id | string-literal
symbol-ref-id ::= `@.super::`* symbol-part (`::` symbol-part)* 

Consider the following example:

module {
    module @foo {
        func.func @nested()
    }

    module @bar {
        "symbol_user.user"() {uses = [@.super::@foo::@nested]} : () -> ()
    }
}

The user operation here references the @nested function from the @foo module.

The resolution algorithm is unchanged, except that each @.super:: prefix on the reference moves the symbol table selected for resolution one level higher in the parent hierarchy.
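
As a hedged sketch of how chained prefixes would compose under this rule (the module, function, and foo.user operation names are made up for the example):

// Hypothetical example under the proposed semantics.
module @outer {
    func.func @f()

    module @middle {
        func.func @g()

        module @inner {
            // One @.super:: prefix: resolution starts in @middle, so this
            // resolves to @g defined in @middle.
            "foo.user"() {uses = [@.super::@g]} : () -> ()

            // Two @.super:: prefixes: resolution starts in @outer, so this
            // resolves to @f defined in @outer.
            "foo.user"() {uses = [@.super::@.super::@f]} : () -> ()
        }
    }
}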

It may be important, for example for multithreading purposes, to ensure symbol references cannot escape certain scopes. So far, this has been achieved by implicitly bundling this property with the SymbolTable trait. We propose to decouple the two by introducing an IsolatedSymbols trait. When going up the parent hierarchy to find which symbol table to start resolution from, if an operation implementing IsolatedSymbols is encountered, an appropriate error is raised, informing that a symbol reference attempted to cross the symbol isolation boundary. This not only allows symbol tables that do not isolate symbols, it also enables operations that are not symbol tables to act as boundaries.
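
As a hypothetical illustration, my.container below is a made-up operation that is not a symbol table but carries the proposed IsolatedSymbols trait:

// Hypothetical example; my.container and foo.user are placeholder ops.
module @outer {
    func.func @f()

    "my.container"() ({
        module @inner {
            // Error: resolving this reference requires walking up past
            // my.container, which implements IsolatedSymbols, so the
            // reference is reported as crossing the isolation boundary.
            "foo.user"() {uses = [@.super::@f]} : () -> ()
        }
    }) : () -> ()
}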

In the scope of this RFC, IsolatedSymbols still allows accessing inner symbols from outside, in order to mimic the current symbol resolution behavior. It is not clear whether it would be useful to introduce visibility modifiers to let users disallow this.

Implementation

The implementation will be carried out as a single change, in which:

  • The SymbolRefAttr attribute will be updated to support the @.super prefix.
  • The IsolatedSymbols trait will be created and manually added to all upstream operations with the SymbolTable trait, to avoid breaking any previous assumptions.
  • The symbol resolution algorithm will be updated to skip as many symbol tables as required by the reference, checking for the presence of IsolatedSymbols operations along the way.

Existing upstream users of SymbolTable that wish to remove the isolation currently provided may then drop the IsolatedSymbols trait.

Alternatives

This RFC makes symbol isolation explicitly opt-in via the IsolatedSymbols trait. While this is taken care of manually upstream, it may inconvenience downstream users who depend on the isolation in some way. The opposite approach could be considered, where operations with the SymbolTable trait could implement a TransparentSymbolTable trait that allows going to the parent via @.super. However, we believe that isolation being opt-in is a more natural default.

We would like to hear from users:

  • Does your use of symbol tables depend on the provided isolation?
  • Do you know of a better way to communicate the changes to downstream users if we keep the opt-in isolation approach?

Thank you for your attention.


This has the potential to break many invariants in the model of symbols in MLIR (did we discuss this in person? I remember a discussion on this exact topic, maybe at EuroLLVM).
As such, this is a can of worms that requires a proposal that first deeply understands the invariants of the current system (the current proposal seems to just consider this an arbitrary restriction that can be freely removed).

Right now, accessing symbols from a sibling table isn’t modeled natively in the system but enabled through external references: that is just like calling a function from another module in LLVM.

Isn’t it the “nested” visibility?

Which programming models? Also, you're not making it very clear in the motivation why it is desirable to use MLIR native symbols for this purpose.
The alternative can also be to not rely on MLIR symbols for this kind of modeling.


I recall discussions of relative and absolute symbol “paths” (e.g., adding a .. or /), but yes, I'm not sure with whom or where. I recall part of this discussion was had even when symbols were initially added.

Now, where this has also come up, the end result was often a pivot to multiple contexts/files and having a “top-level” marker followed by symbols. These are similar to what Mehdi mentions wrt sibling tables. It came up where folks were reaching for a single, self-contained .mlir file, but then ended up instead with multiple contexts and files, as the lifetime, reuse, etc. didn't line up.

Thank you for the replies!

Could you detail this more? My understanding is that IsolatedSymbols addresses the cases where this feature may not be desirable by, in practice, locally reverting to the isolated model. In any case, I would like to know where this limitation is depended on. I know that in IREE there are operations containing kernels that should be compiled entirely separately, for which I believe IsolatedSymbols is enough.

Yes. We believe it would be an improvement for MLIR to have a mechanism to model symbol hierarchies where sibling tables can be accessed.

No, I am pretty sure we did not; I have only become personally interested in making this change in the past week.

Sorry, I meant adding visibility modifiers to IsolatedSymbols. This is otherwise handled by visibility levels on the symbol operations themselves, which is why it is probably not necessary to add visibility modifiers on IsolatedSymbols.

The most obvious example that comes to my mind is Rust modules, where, within a single file, you can reference items (such as functions, constants, types, traits, etc.) while navigating a hierarchy of modules. Python also has some of this, although I think it is limited to navigating file hierarchies.

In the example above, this sort of hierarchical access is used to reference things like functions, which in MLIR are modeled using symbols. What else would you use to reference functions from a call site in MLIR? For IRDL, what other infrastructure would be used to allow referencing other operations in a statically checked manner?

The reverse question is also interesting: what are nested MLIR symbols intended for if not this?

Could you detail what you mean by this? I am not sure I understand what one would be pivoting towards instead of symbol resolution?

IsolatedSymbols isn’t an answer: the invariant isn’t about “some projects depend on this” but rather that the MLIR Core infra is built around this invariant.
I can’t provide a comprehensive list off the top of my head, but at least it interacts with SymbolTable construction, as well as their invalidation.
At the moment a SymbolTable operation is also a “standalone” entity that is “moveable”, which wouldn’t be the case anymore (again the implications are hard to grasp).

Yes, one of the implications of this change is that symbol tables would not be movable by default; IsolatedSymbols operations would be. This RFC is about making this change, which is why we want to make sure it will not be too much of a breaking change in practice, and to find the list of things that should be modified. Do you mean the symbol table infrastructure is frozen?

No: but to reiterate what I wrote before: profound changes to the infra require a lot of consideration of the existing invariants and all the implications.

Could you clarify what you meant here?

  • construction: right now building a SymbolTable walks an operation. Now for completeness it would need to walk up and down.
  • invalidation: right now when I have a SymbolTable object, it is valid as long as I don’t change any symbol nested under the associated symbol table operation. If I rename a symbol, I don’t need to think about a SymbolTable constructed on a sibling operation.

Ah I think I see what you mean. The way we wanted to implement this was by just walking up the parent chain on lookup, similarly to how getNearestSymbolTable does it. There would not be a need to cache anything in the symbol tables, and in fact the data structure would not change. Does that clarify your concerns?

It does not actually: this is overly expensive to do on a per-query basis.
Actually, all the APIs like getNearestSymbolTable are problematic and we have discussed quite a few times deprecating them (or at least name-spacing them to make their cost very explicit and discourage their use). Instead, there are some options for making the SymbolTable more integrated with the PassManager in order to facilitate keeping them up-to-date through the pass pipeline.

Note also that I mentioned before that this is not intended to be an exhaustive list of issues.

Okay. We have a concrete use case for this in IRDL. Without this feature, this is the sort of solution we have. It feels very suboptimal as it requires arbitrarily moving the root of symbol lookups, which is hard to predict and inefficient.

Is this the sort of workaround you would recommend as an alternative to adding the proposed functionality? Or do you have a suggestion for a different API to upstream to MLIR proper?

Just to reiterate what Mehdi is saying: this is a very large change to the semantics – so much so that it seems closer to something else than a modification of the symbol mechanism.

It may not be obvious, but the current symbol system is very loosely coupled to MLIR: there is really nothing stopping someone from defining something else.

There are many kinds of scopes and nesting data structures in the universe, and I don’t think it is the job of this one to justify why it shouldn’t be a different one. I don’t have any insights into the needs of IRDL, but I can tell you that the invariants that SymbolTable currently provides are very important for performance of the compilers we build with MLIR, and I think they are about as general as they can be while still enabling that performance. The design isn’t frozen, but it would be a high bar to justify dramatic changes at this point.

A half step to something else may be a different kind of symbol ref attr and a different mechanism for resolution/verification than SymbolTable.


I agree with @Moxinilian that there are weaknesses with the upstream SymbolTable implementation and they are reflected in the motivation in this RFC. E.g. the inability to refer to symbols in certain contexts.

However, the base infrastructure (interfaces, lookup helpers, SymbolTableCollection, etc.) is flexible enough to implement different lookup semantics on top of it without reinventing the whole world. This is @stellaraccident’s point about rolling your own symbol tables. But I’m not a huge fan of people solving the same problems downstream. Perhaps we can cleanly separate this “base infrastructure” from the APIs that implement lookup semantics to provide multiple symbol reference semantics upstream? I would love to have absolute references, personally.

I’ve recently been thinking about this, because I’ve been trying to work out how I would implement something other than the current symbol infrastructure that would allow modeling a “normal” module system in MLIR, while retaining some of the nice aspects of the current system.

The current situation, I think, can be summarized as follows:

  • Symbols belong to SymbolTables, which are themselves Symbols, and thus Symbols are by definition namespaced. From a scope containing some SymbolTable, one refers to some Symbol in that table using its qualified (namespaced) name, e.g. @foo::@bar
  • It is not currently possible to reference a Symbol in a sibling/ancestor SymbolTable. Instead, the way this gets modeled is via a “declaration” of the desired Symbol in the current SymbolTable. There are a couple problems that fall out from this:
    1. This “declaration” must encode the fully-qualified name of the “real” Symbol, so that whatever is responsible for making sense of references to the declaration can work out what the reference means (i.e. how to actually link against that symbol). As a result, in order to understand the meaning of a given symbol name, you need to first determine if it is a declaration or not, and if so, have some out-of-band rule for how to recover the information you need from the symbol string (a sketch of what this looks like in practice follows this list).
    2. Because you are stuffing the fully-qualified external symbol name into what is supposed to be an unqualified symbol name, this leads to the declaration having two different names, the one used within the current symbol table, and the implicit namespaced name in the IR that is derived from the relationship between SymbolTables and their Symbols.
    3. As a consequence of 2, it is not possible to re-export symbols that were “imported” from somewhere else, e.g. as seen in the Wasm Component Model.
  • Because of the inability to directly reference Symbols in sibling/ancestor SymbolTables, you lose some nice things:
    • Consistent naming that is represented in the hierarchy of SymbolTables and their Symbols
    • The ability to validate that a symbol reference is allowed by the visibility assigned to that symbol. You can trivially “declare” a symbol to exist, and you won’t know that the reference is invalid until link-time. That’s not ideal.
    • The builtin tracking of symbol users, and the interaction with dead-code elimination
    • A single canonical definition for important details about a given Symbol (type information, ABI details, etc.), and the interaction with the CallOpInterface and CallableOpInterface. IIRC, the dataflow analysis framework leverages those interfaces, and information about Symbol usage. If you build your own thing, you either have to reinvent the wheel there, or maintain a fork of the core infra with whatever modifications you need to make.
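
As a rough, hypothetical sketch of points 1 and 2 (the module and function names are invented for the example), the declaration workaround looks something like this today:

module @a {
    // Hypothetical example of the current declaration-based workaround.
    // The external function must be re-declared inside this symbol table.
    // Its fully-qualified name is stuffed into a nominally unqualified
    // symbol name, and an out-of-band convention is needed to map it back
    // to the real definition at link time.
    func.func private @"external_lib::external_func"(i32) -> i32

    func.func @foo(%arg0: i32) -> i32 {
        %0 = call @"external_lib::external_func"(%arg0) : (i32) -> i32
        return %0 : i32
    }
}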

The result is that the current system works pretty well for modeling a single module/namespace, but not much else. It can handle references into nested modules, but provides no way to do the inverse, so there is not even an option to represent cross-module references via explicit import/export-like operations. Ultimately, it is of quite limited utility in modeling anything high-level of any significant complexity (e.g. Rust’s module system). It matches LLVM’s notion of symbols just fine, but LLVM isn’t designed to model higher-level abstractions, while MLIR is.

An ideal system would:

  • Be consistent, i.e. you don’t have Symbol names which mean different things depending on whether the Symbol is a declaration or not. One should be able to model their namespacing hierarchy using SymbolTable and Symbol.
  • Be able to model externally-defined Symbols using declarations, similar to today, but by placing those declarations at appropriate places in the SymbolTable hierarchy, rather than duplicating them in every SymbolTable where they might be referenced
  • Provide the means for referencing Symbols given a fully-qualified name (aka absolute path), in addition to unqualified references. This would at least allow frontends to produce qualified names during lowering.
  • Provide the ability to track symbol usage and participate in inter-procedural analysis, even across SymbolTables
  • Integrate with the various relevant core interfaces, e.g. CallOpInterface
  • Permit separate/concurrent compilation. This is the trickier part, and may really be the aspect that presents the most issues. IMO, a simple way to enable this, is to require each context to have its own view of the IR that it cares about, i.e. you’d model everything as externally-defined, but using the standard SymbolTable/Symbol hierarchy. This would require some work at link-time to incorporate information about symbol usage (i.e. removing unused symbols), and/or to perform inter-procedural optimization across codegen units (by merging the IR into a single context and running whatever optimizations you want to run at that point) - but at a minimum it provides the flexibility to decide how to slice up your IR to balance concurrency and optimization opportunities, while retaining a single system for symbols/symbol resolution.

Some other aspects I struggle with when trying to come up with a unified design for this:

  1. You need some kind of top-level SymbolTable-like thing that represents the global namespace in which all symbols are expected to be resolvable. This thing doesn’t need a name, which could imply that some SymbolTable impls are not also Symbols. Having some kind of top-level operation is needed so that you have a place to start resolving a fully-qualified symbol name from, while being able to identify when a given symbol is undefined. It might be possible to avoid this by assuming that, if you’ve reached a SymbolTable operation with no parent, then the first symbol in the path must either refer to that operation, or by definition refers to an externally-defined symbol that is unknown (because we cannot resolve it to anything).
  2. What does a symbol reference actually, concretely, consist of? Is it just a vector of symbol names, where a vector of length 1 implies an unqualified symbol, and anything else is a fully-qualified symbol, resolved by starting from the root namespace and resolving each symbol in the vector in order? Or is it something more flexible than that, so as to support relative references in addition to absolute (or local) references?
  3. How to represent re-exported, externally-defined symbols? I guess these would amount to aliases of some kind. Having some facility for this might be an alternative approach to representing these kind of relationships, but I’m not sure what that would look like concretely. Supporting this is maybe less important, but might be worth examining to see if it provides an interesting alternative.

Anyway, I'm definitely interested if anyone has made any private progress on this, or just in getting some more discussion going on this topic. I don’t expect anything to happen on this front for some time, but I would like to explore various approaches and see if something better could come out of it in the end.


One more aspect that I would inject in the design: if I remember correctly we also modeled it to mesh well with MLIR Pass-Manager multi-threading. Basically this aligns with the SSA value model in MLIR (IsolatedFromAbove is the level of isolation there): the symbol infrastructure has a level of isolation that means that you can anchor the analysis/transformation on a SymbolTable operation and be isolated from the code running concurrently on sibling SymbolTable operations while using the symbol table infrastructure safely.

Yeah that’s basically what I was getting at with my point about an ideal system permitting separate/concurrent compilation.

The approach that I suggested might work, i.e. requiring each codegen unit to contain its own view of the world; it would allow a lot of flexibility for deciding where you want the concurrency boundary to be, so you can balance the level of concurrency with the level of optimization that is possible. I should clarify that by codegen unit, I mean the subset of IR that is being compiled by a single thread.

As an aside: I can’t imagine there is a situation where you’d want to choose a codegen unit boundary that doesn’t fall on an IsolatedFromAbove operation, since what is being modeled here are symbols, and IIRC, Symbol impls must also be IsolatedFromAbove. The interaction with the pass manager is something I hadn’t considered too closely though, as any design here would need to incorporate its behavior, or change it as well. The approach I’m describing here is sort of premised on the idea that each codegen unit corresponds to the unit of concurrency, each codegen unit gets its own pass manager, and that pass manager does not provide any concurrency on its own, unless specifically configured to do so. I’m not sure how much I like the idea of the pass manager making decisions like that on its own anyway, but regardless, there is already an implicit footgun there, as you noted.

But to illustrate what I mean by each codegen unit having its own view of the world, consider the following examples:

First, here’s a rough sketch of structuring “everything in a single thread”, i.e. you want maximum optimization, and don’t care about concurrency. I’m using MLIR-ish syntax and operations here, but mostly just to convey the basic idea in an easier to grasp representation:

/// The `world` operation is just intended to represent an unnamed 
/// container from which all absolute paths are resolved, i.e. the 
/// root namespace. Obviously this doesn't exist today, and doesn't 
/// have to even be a builtin, just something that provides those 
/// semantics
world {
    /// Here we're modeling an externally-defined, separately compiled
    /// library as a module consisting only of declarations.
    ///
    /// Because we can track whether there are any symbols in this module
    /// with uses, we can leverage that information at link-time to remove 
    /// the dependency if it is not needed for whatever reason, possibly as 
    /// the result of dead code elimination
    module $external_lib {
        pub func $external_func(%0 i32) -> i32;
    }

    /// Then we have a couple of modules illustrating the use of both
    /// unqualified and absolute symbol references, for intra- and 
    /// inter-module references.
    ///
    /// Because all of the definitions are available, the compiler is able to fully
    /// optimize both locally and inter-procedurally, the only exception being
    /// the one call to an externally-defined function.

    module $a {
        pub func $foo(%0 i32) -> i32 {
            %1 = call @increment(%0)
            ret %1
        }

        func $increment(%0 i32) -> i32 {
            %1 = call @external_lib::@external_func(%0)
            ret %1
        }
    }

    module $b {
        pub func $bar(%0 i32) -> i32 {
            %1 = call @a::@foo(%0)
            ret %1
        }
    }
}

That’s perfectly fine for a small program, but what about real-world programs, where there are many modules and functions can be large, and we want to compile them concurrently in order to keep compilation fast, without sacrificing too much in performance?

One way to slice up the previous IR for that purpose would be to compile each module separately. This would look very similar, but with some key differences:

NOTE: Each world here is assumed to be compiled in a separate thread, I’m just placing them together in the same code block to make comparison easier:

/// This world represents everything needed by module `a`, as declarations
world {
    module $external_lib {
        pub func $external_func(%0 i32) -> i32;
    }

    module $a {
        pub func $foo(%0 i32) -> i32 {
            %1 = call @increment(%0)
            ret %1
        }

        func $increment(%0 i32) -> i32 {
            %1 = call @external_lib::@external_func(%0)
            ret %1
        }
    }
}

/// This world represents everything needed by module `b`
world {
    module $a {
        pub func $foo(%0 i32) -> i32;
    }

    module $b {
        pub func $bar(%0 i32) -> i32 {
            %1 = call @a::@foo(%0)
            ret %1
        }
    }
}

So hopefully it is readily apparent that, while this reduces the opportunity for inter-procedural optimization across modules, it retains module-local inter-procedural optimizations, as those are most likely to be the biggest wins. In exchange, we’re able to double the throughput of compilation for the original example (obviously in general, the degree of improvement is dependent on the number of threads, the number of modules, amongst other things).

This technique can be further extended to allow for function-level concurrency as well. You’d sacrifice all inter-procedural optimization opportunity, and it would require a lot of duplicate IR objects to represent each function’s “world”, but you’d gain a significant amount of additional concurrency.

The two main tradeoffs with this idea, as I see it, are:

  1. You basically need to decide how you are going to slice up the IR for concurrency a priori, because your frontend will need to lower into MLIR accordingly. It’s not impossible to make the choice dynamic, but it would require the frontend to know how to extract the declarations needed to build a “world” for whatever operation you decide is going to represent a codegen unit.
  2. The backend, if it wants to incorporate symbol usage information, and/or perform inter-procedural optimizations at the very end of the pipeline, will need to merge “worlds” to do so (or in the case of symbol usage, at least accumulate symbols and usage information from each “world”, and use that to make decisions about whether a given symbol is needed or not). This seems fine to me? I could see it being prohibitively expensive, given that late-stage inter-procedural optimization is much less likely to be valuable compared to earlier in the pipeline, but by the same token, that feels like part of the calculus for where you draw the codegen unit boundary.

It’s probably also worth mentioning that, while I use world, module and function operations in my example and descriptions, I picked those mostly arbitrarily, since I think they are easier to reason about than something more abstract. The only thing in here that doesn’t really have a counterpart today is world, which amounts to a SymbolTable operation that has no parent, and is not itself a Symbol, and essentially acts as a root for symbol resolution of fully-qualified symbols. Maybe a different trait is needed for that, but I could also imagine an argument being made that maybe SymbolTable impls shouldn’t be required to also be Symbol impls (i.e. an anonymous SymbolTable would behave like introducing a new lexical scope, and you would not be able to reference the symbols it contains from outside). Then, an anonymous SymbolTable with no parent by definition requires all symbols to be resolvable within its body.

Sorry for the brain dump, hopefully it is useful!
