Will MLIR support a hashmap type in the near future?

Hi,
Will MLIR support a hashmap type in the near future? The memref type only supports sequentially stored data, just like a dense multi-dimensional tensor. For operations like tf.Unique, a hashmap type is needed to compute the unique ids.
There are two possible ways to implement a hashmap:

  1. Implement a native hashmap on top of the existing type primitives. As we know, DenseHashMap is implemented with an array as its underlying data structure, so we could implement a hashmap the same way, e.g. on top of memref.
  2. Introduce an external hashmap and implement some handler functions such as set_value() and get_value(), then call these handler functions from our IR (see the sketch after this list).

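For the second approach, here is a rough sketch of what the IR might look like, assuming a hypothetical external runtime that exposes hashmap_create / hashmap_set / hashmap_get entry points and an opaque handle passed around as a plain i64 (the function names and the handle type are made up for illustration; the syntax is approximate):

// Hypothetical external runtime functions; the i64 handle stands in for an
// opaque pointer to the hashmap provided by the library.
func @hashmap_create() -> i64
func @hashmap_set(i64, i64, f32)
func @hashmap_get(i64, i64) -> f32

func @example(%key: i64, %value: f32) -> f32 {
  %map = call @hashmap_create() : () -> i64
  call @hashmap_set(%map, %key, %value) : (i64, i64, f32) -> ()
  %v = call @hashmap_get(%map, %key) : (i64, i64) -> f32
  return %v : f32
}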
Looking forward to everyone's opinions. :slight_smile:

Thanks.

-Tao

A hashmap type is especially useful in some structured ops. Besides tf.Unique, a hashmap type can also describe a collision-free id-to-embedding mapping in sparse computations.
I wonder whether either of the two approaches mentioned above is the appropriate way to bring a hashmap type into MLIR core. I remember CIL (a Clang IR) was proposed by Chris in the keynote, supporting something like:

%map = cil.alloc_stack : !cil.std::map<int, int>
cil.call @'std::map<int,int>::insert(int)'(%map, %i)

I think this IR is very clean. Is it possible to bring this kind of IR into MLIR core, so that complex types are easy to express?
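For comparison, a dedicated dialect (rather than calls into mangled C++ symbols) might read something like the following. This is only a sketch of a hypothetical hashmap dialect, nothing like it exists in MLIR core today, and every type and op name here is invented:

// A made-up hashmap dialect with a parameterized map type and a few ops.
%m = hashmap.create : !hashmap.map<i64, f32>
hashmap.insert %m[%id], %embedding : !hashmap.map<i64, f32>
%e = hashmap.lookup %m[%id] : !hashmap.map<i64, f32> -> f32
%n = hashmap.size %m : !hashmap.map<i64, f32> -> index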

Yes, I think we need something like this in MLIR core.
Hoping to hear some other opinions.

I'm not sure about the status of the CIL implementation; as far as I know, it's very early. Anyone else familiar with this, please comment.

So far, it looks like an external or self-implemented library is a good way to go. On the other hand, for a hashmap and the like, a generic map data structure may not be the best choice; LLVM has a hash implementation that is more efficient and easier to vectorize.
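If the external-library route is taken, the handler calls would presumably bottom out as plain calls into that runtime at the LLVM dialect level. A rough, hypothetical sketch, with !llvm.ptr as the opaque table handle; the runtime symbols and the exact type syntax are illustrative only:

// Declarations for a hypothetical hashmap runtime library.
llvm.func @hashmap_insert(!llvm.ptr, i64, f32)
llvm.func @hashmap_lookup(!llvm.ptr, i64) -> f32

llvm.func @example(%map: !llvm.ptr, %key: i64, %val: f32) -> f32 {
  llvm.call @hashmap_insert(%map, %key, %val) : (!llvm.ptr, i64, f32) -> ()
  %v = llvm.call @hashmap_lookup(%map, %key) : (!llvm.ptr, i64) -> f32
  llvm.return %v : f32
}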

I’m not sure what you’re actually asking about: MLIR allows you to create IR abstractions. If my input problem would benefit from modeling the concept of a hashmap, I’d start by trying to model it in my own dialect, possibly creating a dialect specifically for the interaction with the hashmap itself if it grows to be worthwhile.
What would you need to change in MLIR core? You mentioned memref originally, but even memref is just something you can define in a dialect: MLIR core isn’t coupled to memref (hopefully).

Thanks for the reply, Mehdi.
We want to compile embedding graphs in the TensorFlow scenario.
There are some ops in the TensorFlow graph, for example UniqueOp, that need a hashmap to de-duplicate ids.
Will MLIR support a builtin ‘hashmap’ type in the future?
Alternatively, maybe we should create a dialect for interacting with the hashmap; a sketch of what that might look like is below.
If there is no plan so far, we can propose an RFC for that if it’s worthwhile.
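To make the UniqueOp use case concrete, here is one way the de-duplication might be written if such a dialect existed. Everything below is hypothetical (the hashmap ops, the map type, and the surrounding loop structure are all assumptions), just to illustrate what we would like to express; the construction of the unique output is elided:

// De-duplicate the ids in %ids by inserting each one into a map.
// hashmap.insert_if_absent is a made-up op returning whether the key was new.
func @unique(%ids: memref<?xi64>) {
  %c0 = constant 0 : index
  %c1 = constant 1 : index
  %n = dim %ids, %c0 : memref<?xi64>
  %map = hashmap.create : !hashmap.map<i64, index>
  scf.for %i = %c0 to %n step %c1 {
    %id = load %ids[%i] : memref<?xi64>
    %is_new = hashmap.insert_if_absent %map[%id], %i : !hashmap.map<i64, index>
    // %is_new (an i1) would drive how the unique output is built.
  }
  return
}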

Thanks.