Using Tensor `Encoding` attribute

Hi folks, I was looking at potentially storing some information in the Encoding field of tensor types. I would then like to use this information to build the memory space attribute on the memref types that get allocated for these tensors during bufferization.

I built up a very basic prototype of this and hit a few issues. I’d like to ask about these and also see if this may have been thought of before and rejected as non-viable, as I’m sure there are a mess of corner cases I haven’t considered yet.

The first issue I hit is that the Tensor dialect drops the Encoding field on InsertSliceOp, ExtractSliceOp, and a few other ops. This seems straightforward to fix, and if this is indeed a bug/oversight I can push a commit for it right away.
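To make the first issue concrete, here is a minimal sketch in plain Python. It is a toy model, not MLIR code: `TensorType`, `slice_type_dropping_encoding`, and `slice_type_keeping_encoding` are hypothetical names invented for illustration, and the string `"my_space"` stands in for whatever attribute one might store in the encoding.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class TensorType:
    """Toy stand-in for a ranked tensor type (hypothetical, not the MLIR class)."""
    shape: Tuple[int, ...]
    element_type: str
    encoding: Optional[str] = None  # stand-in for the encoding attribute


def slice_type_dropping_encoding(source: TensorType,
                                 sizes: Tuple[int, ...]) -> TensorType:
    # Models the behaviour described above: the result type is rebuilt
    # from shape and element type only, so the encoding is lost.
    return TensorType(sizes, source.element_type)


def slice_type_keeping_encoding(source: TensorType,
                                sizes: Tuple[int, ...]) -> TensorType:
    # Models the proposed fix: forward the source encoding to the result.
    return TensorType(sizes, source.element_type, source.encoding)


src = TensorType((8, 8), "f32", encoding="my_space")
dropped = slice_type_dropping_encoding(src, (4, 4))
kept = slice_type_keeping_encoding(src, (4, 4))
```

In this model `dropped.encoding` is `None` while `kept.encoding` is still `"my_space"`, which is the difference the fix is meant to produce.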

The second issue is that bufferization doesn’t have a way to use the Encoding field. I have some code that adds a callback to the bufferization options. The callback takes the current TensorType as input and returns an optional memory space. The default implementation just returns defaultMemorySpace, but I can use this as a hook to translate the tensor Encoding into a MemRef MemorySpace. I’m also interested in pushing this up as a change to one-shot-bufferization if that is interesting to folks.
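The shape of that callback can be sketched as follows, again as a toy Python model rather than the actual bufferization API. Everything here is an assumption made for illustration: the `BufferizationOptions` class, the `memory_space_fn` field, the `ENCODING_TO_SPACE` table, and the use of plain integers for memory spaces.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass(frozen=True)
class TensorType:
    shape: Tuple[int, ...]
    element_type: str
    encoding: Optional[str] = None  # stand-in for the encoding attribute


@dataclass(frozen=True)
class MemRefType:
    shape: Tuple[int, ...]
    element_type: str
    memory_space: int  # modelled as an int for simplicity


DEFAULT_MEMORY_SPACE = 0


def default_memory_space_fn(tensor_type: TensorType) -> Optional[int]:
    # Default hook: ignore the tensor type and return the default space.
    return DEFAULT_MEMORY_SPACE


class BufferizationOptions:
    """Hypothetical options object carrying the memory-space callback."""

    def __init__(self,
                 memory_space_fn: Callable[[TensorType], Optional[int]]
                 = default_memory_space_fn):
        self.memory_space_fn = memory_space_fn


def bufferize_type(tensor_type: TensorType,
                   options: BufferizationOptions) -> MemRefType:
    # Ask the hook for a memory space; None means it could not decide.
    space = options.memory_space_fn(tensor_type)
    if space is None:
        raise ValueError("could not infer a memory space for tensor type")
    return MemRefType(tensor_type.shape, tensor_type.element_type, space)


# Hypothetical user-supplied mapping from encoding to memory space.
ENCODING_TO_SPACE = {"global": 1, "shared": 3}


def encoding_memory_space_fn(tensor_type: TensorType) -> Optional[int]:
    # Translate the encoding if present; otherwise fall back to the default.
    if tensor_type.encoding is None:
        return DEFAULT_MEMORY_SPACE
    return ENCODING_TO_SPACE.get(tensor_type.encoding)


shared = TensorType((16,), "f32", encoding="shared")
with_default = bufferize_type(shared, BufferizationOptions())
with_hook = bufferize_type(shared, BufferizationOptions(encoding_memory_space_fn))
```

With the default hook the encoding is ignored and the buffer lands in the default space; with the custom hook the `"shared"` encoding is translated into memory space 3. Returning `None` for an unknown encoding is one possible design choice; falling back to the default space would be another.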

I’m interested in getting feedback on this idea and the two issues I’ve identified.

Thanks in advance,

Ian Bearman

Hello @manbearian

I think this is generally interesting and would allow better interop for things like Triton, which had to duplicate a lot of infra/ops, in part due to the issues you point out.

I don’t think any of those are fundamental. Encodings were introduced after the insert/extract ops and bufferization, for the specific purpose of the sparse application domain, but were never connected outside of that work.

What you propose would be a welcome extension. We shouldn’t just drop semantic information on the floor…

This is great to hear. I’m going to drive through the prototyping locally. If it pans out for how I want to use it (which I believe it will), I’ll push the changes upstream to public MLIR. Should be a week or so depending on distractions.