Caching dialect specific types?

One nice thing that builtin types enjoy is caching for important singletons, like empty strings, empty dictionaries, the bool attributes etc.

Is there a way to do this for dialect specific types? In circt, the firrtl.uint<1> type is the vastly most commonly used type, and I’d love for UIntType::get(context, 1) to be able to do a few pointer dereferences instead of diving into the guts of the uniquing logic.

I could store this in a global variable, but that wouldn’t work for the (fairly academic :slight_smile: case where you have multiple MLIRContext’s floating around. Is this a thing in MLIR already, or is this an unpaved road?

-Chris

What I’ve done in the past is cache them in the dialect they belong to. Haven’t looked to see if we can do something more scientific, but it’s generally worked alright when I’ve done it.

– River

Cool, adding them to the relevant dialect’s object would be great. Is there a good way to get that from an MLIRContext? Did you just use context->getLoadedDialect<FIRRTLDialect>()? it looks like that goes through a string hash table.

Hmm, I could have swore that we were using the TypeID of the dialect class to do the lookup when possible. Maybe we should add a TypeID->Dialect* mapping in the context? We do dialect lookups quite often in certain situations, and I’d prefer that to not be a string lookup

– River

1 Like

That’s an interesting idea, Type::getDialect is also very hot. Wouldn’t that require more fancy initialization in the TypeID stuff? Also, how do you handle TypeID’s being global but types being MLIRContext specific?

TypeID’s are a bit annoying because they don’t contain integers so they can’t be used for indexing into side arrays (see the stuff @GeorgeL is struggling with). Now that we require all dialect types to be registered, we could assign them an integer value the first time they’re registered.

TypeID’s are process global (not MLIRContext specific), but I’m thinking something like this would work:

  • TypeID contains an (atomic) uint32_t that is initialized to ~0U, and we have a process global “next type id” atomic.
  • When registering the type with a dialect, if it is ~0U, it tries to assign the next number with CAS.
  • Numbers are never unregistered, so we’d get a “mostly dense” assignment of numbers, which would be fine for indexing.

Either as an assertion or in expensive checks mode we could make various type accessors (e.g. the ::get methods) verify the TypeId isn’t ~0U.

WDYT?

2 Likes

That is surprising, given that it is effectively just impl->getAbstractType()->dialect. Sent out a small revision that inlines a few methods I’ve seen on profiles (⚙ D104756 [mlir][NFC] Move several small methods from .cpp to .h to allow more aggressive inlining).

The TypeIDs are generally used for isa<> like things, so it doesn’t matter that it isn’t tied to a specific context. It’s similar in usage to using an “enum kind” value, except that you don’t need to statically enumerate all of the possible things. Creating the mapping wouldn’t require anything additional from the TypeID class, I was just imagining something as simple as DenseMap<TypeID, Dialect *>. We grab the type id for the dialect class via TypeID::get<T>, and use that for the lookup.

Do you have a link to that stuff? (I’m preparing for a move right now, so haven’t been following a lot of things closely). What kind of indexing is necessary here? I’ve considered the atomic numbering scheme you mention in the past, but I’ve never had a really compelling use case to think about it further than “that could be interesting”.

– River

Sure, i was referring to this thread. We’d like to have an efficient “static MLIR type” to “swift type ctor” mapping. We can do the equiv of DenseMap<TypeID, funcptr> but it would be far more efficient to do std::vector<funcptr>.