Using Tensor `Encoding` attribute

Hi folks, I was looking at potentially storing some information in the Encoding field of tensor types. I would then like to use this information to build the memory space attribute on the memref types that get allocated from these tensors during bufferization.

I built up a very basic prototype of this and hit a few issues. I'd like to ask about them here and also see whether this has been considered before and rejected as non-viable, as I'm sure there are plenty of corner cases I haven't considered yet.

The first issue I hit is that the tensor dialect drops the Encoding field on InsertSliceOp, ExtractSliceOp, and a few other ops. This seems straightforward to fix, and if it is indeed a bug/oversight, I can push a commit for it right away.
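To make the fix concrete, here is a minimal sketch of the kind of propagation I have in mind. The helper name is illustrative and not the existing tensor dialect API, and it assumes the source encoding is still meaningful for the sliced type:

```cpp
#include "mlir/IR/BuiltinTypes.h"

using namespace mlir;

// Sketch only: when computing a slice op's result type, carry the source
// tensor's encoding along instead of dropping it. Assumes the encoding is
// still valid for the sliced shape.
static RankedTensorType inferSliceResultType(RankedTensorType sourceType,
                                             ArrayRef<int64_t> resultShape) {
  return RankedTensorType::get(resultShape, sourceType.getElementType(),
                               /*encoding=*/sourceType.getEncoding());
}
```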

The second issue is that bufferization doesn't have a way to use the Encoding field. I have some code that adds a callback to the bufferization options. The callback takes the current TensorType as input and returns an optional memory space. The default implementation just returns defaultMemorySpace, but I can use this as a hook to translate a tensor Encoding into a memref memory space. I'm also interested in pushing this upstream as a change to One-Shot Bufferization if that is interesting to folks.
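Roughly, the shape of the hook I prototyped is the sketch below. The names and the plumbing into the bufferization options are illustrative only, and the example mapping (reusing an integer encoding directly as the memory space) is just one possible policy:

```cpp
#include <functional>
#include <optional>

#include "mlir/IR/BuiltinAttributes.h"
#include "mlir/IR/BuiltinTypes.h"

using namespace mlir;

// Illustrative callback type the bufferization options could carry: given the
// tensor being bufferized, return the memory space its buffer should use, or
// std::nullopt to fall back to the existing defaultMemorySpace behavior.
using MemorySpaceFn = std::function<std::optional<Attribute>(TensorType)>;

// Example policy: if the tensor carries an integer encoding, reuse it as the
// memref memory space (an assumption for illustration only).
static std::optional<Attribute> encodingToMemorySpace(TensorType type) {
  if (auto rankedType = dyn_cast<RankedTensorType>(type))
    if (auto enc = dyn_cast_or_null<IntegerAttr>(rankedType.getEncoding()))
      return enc;
  return std::nullopt; // no usable encoding: use the default memory space
}
```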

I'm interested in getting feedback on this idea and the two issues I've identified.

Thanks in advance,

ian Bearman
Microsoft

Hello @manbearian

I think this is generally interesting and would allow better interop for things like Triton, which had to duplicate a lot of infra/ops, in part due to the issues you point out.

I don't think any of those issues are fundamental. Encodings were introduced after the insert/extract ops and bufferization, for the specific purpose of the sparse application domain, but were never connected outside of that work.

What you propose would be a welcome extension. We shouldn’t just drop semantic information on the floor…

This is great to hear. I'm going to push through the prototyping locally. If it pans out for how I want to use it (which I believe it will), I'll push the changes upstream to public MLIR. It should be a week or so depending on distractions.

ian

Hi @manbearian

Did you happen to pursue this work? I am also looking into bufferizing a tensor with an encoding attribute to a memref, but there seems to be a check that fails:

```
error: 'bufferization.to_memref' op failed to verify that type of 'tensor' is the tensor equivalent of 'memref'
```

I am attempting to store layout information in the tensor’s encoding attribute and would like to add a conversion to a memref with a layout map. I cannot find any existing support for this in MLIR.
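For concreteness, the kind of conversion I am after is roughly the sketch below; it assumes the encoding is an AffineMapAttr, and the helper name is just for illustration:

```cpp
#include "mlir/IR/BuiltinAttributes.h"
#include "mlir/IR/BuiltinTypes.h"

using namespace mlir;

// Sketch: build the memref type for a tensor whose encoding holds an affine
// layout map. If there is no such encoding, fall back to the identity layout.
static MemRefType memRefTypeWithEncodedLayout(RankedTensorType tensorType) {
  auto mapAttr = dyn_cast_or_null<AffineMapAttr>(tensorType.getEncoding());
  if (!mapAttr)
    return MemRefType::get(tensorType.getShape(), tensorType.getElementType());
  return MemRefType::get(tensorType.getShape(), tensorType.getElementType(),
                         mapAttr.getValue());
}
```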

Is there a recommended way of doing this?

Sagar

Wow, what a great coincidence. This week I finally started putting together a PR against LLVM to get this change done (I've had it locally for nearly a year :embarrassed_face:).

Would you like to collaborate on this, Sagar? I will try to get a branch up on my GitHub by the end of this week. I'd love to get your feedback on whether it meets your needs before publishing fully!

EDIT: I'm still testing this draft PR, but please take a look.

ian Bearman
Microsoft, AI Compilers

Awesome! Yes, I’d love to collaborate on this.

Thanks for sharing the PR; I'm taking a look at it.

At this point the proposed changes are passing all of the tests (no regressions). Would you like me to hold the PR to LLVM until you get a chance to give it a try? I'm not in a big rush to get this in, but I think landing it in LLVM by next week would be ideal.

ian

Hi @manbearian

The changes look good to me, and I will give them a try tomorrow (but don't let this be a blocker to pushing the changes into LLVM). I think it currently would not work for my exact use case; it may need to be extended further. By the way, I added a comment on the PR yesterday; I'm posting it here as well:

Thanks for creating this PR! I think this could be very helpful in using the encoding attribute during bufferization.

I have a few points that could be possible future extensions:

  1. Since the encoding attribute is quite generic (i.e., it can be used for storing any information), I wonder if converting it only to the memory space attribute is too restrictive. For example, in my use case I am storing layout information that may be represented with a layout map in MemRef.
  2. The Bufferization pass results in the creation of bufferization.to_memref ops. These ops have verifiers that fail when trying to bufferize a tensor with an encoding attribute. Specifically, getTensorTypeFromMemRefType (defined in mlir/include/mlir/Dialect/MemRef/IR/MemRef.h), which is used to verify that the new MemRef type created by bufferization.to_memref is equivalent to the tensor type, does not take the encoding attribute into account and therefore fails when an encoding attribute is present on the tensor type (see the sketch after this list).
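To make the second point concrete, here is a rough sketch of the kind of equivalence check I think is needed; this is a hypothetical helper, not the existing getTensorTypeFromMemRefType:

```cpp
#include "mlir/IR/BuiltinTypes.h"

using namespace mlir;

// Sketch: treat a tensor and a memref as equivalent based on shape and element
// type, without rejecting the pair merely because the tensor has an encoding.
// A fuller version would also check that the memref's layout / memory space is
// consistent with whatever the encoding is supposed to convey.
static bool isCompatibleTensorMemRefPair(RankedTensorType tensorType,
                                         MemRefType memrefType) {
  return tensorType.getShape() == memrefType.getShape() &&
         tensorType.getElementType() == memrefType.getElementType();
}
```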

I wanted to know your thoughts on these.

Thank you for the response. I didn't see your comment on GitHub, so thanks for calling that out too. I'll think about these two items and get back to you.
