Attribute storage memory consumption

Hello!

I have a question about the memory consumption of Attributes' internal storage.

As far as I understand, Attributes use the same mechanism as Types, so they are uniqued and immortal objects living in the MLIRContext. But, in contrast to Types, an Attribute holds not only metadata (like information about the internal type) but also the actual value. In some cases, like DenseElementsAttr, the values array might be large (for example, a tensor constant). Such Attributes will consume a large amount of memory even after they become unused.
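
To make it concrete, here is roughly what I mean (a minimal C++ sketch; exact header paths and APIs may differ between MLIR versions):

#include "mlir/IR/Builders.h"
#include "mlir/IR/BuiltinAttributes.h"
#include "mlir/IR/BuiltinTypes.h"
#include "mlir/IR/MLIRContext.h"

#include <cassert>
#include <vector>

int main() {
  mlir::MLIRContext context;
  mlir::Builder builder(&context);

  // A "large" constant payload (100x100 floats, just for illustration).
  std::vector<float> values(100 * 100, 1.0f);
  auto type = mlir::RankedTensorType::get({100, 100}, builder.getF32Type());

  // The attribute storage, including the value buffer, is uniqued and
  // owned by the MLIRContext.
  auto a = mlir::DenseElementsAttr::get(type, llvm::ArrayRef<float>(values));
  auto b = mlir::DenseElementsAttr::get(type, llvm::ArrayRef<float>(values));

  // Identical payloads map to the same uniqued storage object...
  assert(a == b);

  // ...and that storage stays alive until `context` is destroyed, even if
  // no operation references the attribute anymore.
  return 0;
}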

The first example: using a single MLIRContext to process several separate IRs. The first IR will create several DenseElementsAttr attributes for its own tensor constants. When we start processing the second IR, which uses its own constants (and it is unlikely that they will have the same values), all the attributes created for the first IR will still be alive, occupying a large amount of memory.

The second example: aggressive constant folding. Imagine the following code:

%0 = constant dense<some values> : tensor<100x100xf32>
%1 = constant dense<some other values> : tensor<100x100xf32>
%2 = foo.add %0, %1

Now, suppose we constant-fold the foo.add operation and replace the code with a new constant (assuming %0 and %1 are not used by any other operations):

%2 = constant dense<new values> : tensor<100x100xf32>

We will get a new DenseElementsAttr in the MLIRContext, while the values of the previous constant tensors will still be alive, occupying memory. In more complex cases, we can simply run out of memory.
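
For illustration, the fold for foo.add would end up doing something like this hypothetical helper (f32 elements assumed; this is just a sketch, not real code from any project):

#include "mlir/IR/BuiltinAttributes.h"

#include "llvm/ADT/STLExtras.h"
#include "llvm/ADT/SmallVector.h"

// Hypothetical helper that the fold hook of foo.add could call: it computes
// the element-wise sum and materializes it as a brand-new DenseElementsAttr.
mlir::DenseElementsAttr foldAdd(mlir::DenseElementsAttr lhs,
                                mlir::DenseElementsAttr rhs) {
  llvm::SmallVector<float> folded;
  folded.reserve(lhs.getType().getNumElements());
  for (auto [a, b] : llvm::zip(lhs.getValues<float>(), rhs.getValues<float>()))
    folded.push_back(a + b);

  // This allocates fresh uniqued storage in the MLIRContext. The storage of
  // lhs and rhs is not reclaimed, even if the original constant ops are
  // erased afterwards.
  return mlir::DenseElementsAttr::get(lhs.getType(),
                                      llvm::ArrayRef<float>(folded));
}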

So, my questions: is this expected behavior? Am I missing something? Or do I have a misunderstanding about how such attributes and constant folding are supposed to be used?

Thank you in advance!

I encountered the same situation. Any update on this? Thanks!

(Sorry for the delay @vinograd47, I missed this post originally)

Yes, this understanding is correct. Destroying the context and using a new one is the best way right now to clear the memory. If you have IRs with large constants, it may be better to just use a new context every time.
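
In other words, something along these lines (a rough sketch; the parser entry point shown here is the one from recent MLIR, and processOneModule is just a placeholder):

#include "mlir/IR/BuiltinOps.h"
#include "mlir/IR/MLIRContext.h"
#include "mlir/IR/OwningOpRef.h"
#include "mlir/Parser/Parser.h"

#include <string>
#include <vector>

void processOneModule(mlir::ModuleOp module);

void processAll(const std::vector<std::string> &sources) {
  for (const std::string &source : sources) {
    // One short-lived context per IR: all uniqued attribute storage
    // (including large DenseElementsAttr buffers) is freed together with it.
    mlir::MLIRContext context;
    context.allowUnregisteredDialects();

    mlir::OwningOpRef<mlir::ModuleOp> module =
        mlir::parseSourceString<mlir::ModuleOp>(source, &context);
    if (!module)
      continue;
    processOneModule(*module);
    // `context` is destroyed at the end of each iteration, reclaiming the
    // attribute storage created for this IR.
  }
}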

We have talked on multiple occasions about a system to optionally externalize the storage of constants, but we haven't gotten to it yet: it requires careful design and it touches the very core of the system.

I see.

@tungld In our project we used the following approach to avoid excessive memory usage from constants:

  1. Used OpaqueDenseElementsAttr to import large constants from outside of MLIR (e.g. from another NN framework). The flow of our project guarantees that their lifetime will be longer than the lifetime of the MLIR context (see the rough sketch after this list).
  2. Used a “lazy constant folding” approach. We have our own ConstantLike operation for tensors, which supports different types for the attribute and the result value. It performs the conversion between them on the fly, through a kind of mapped range, but it supports only a limited number of transformations.
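
Roughly, the idea behind (1) looks like this (all names here are simplified and hypothetical, not our actual code):

#include "llvm/ADT/StringMap.h"

#include <cstdint>
#include <vector>

// The heavy buffers are owned outside of MLIR and are guaranteed to
// outlive the MLIRContext.
struct ExternalConstantStorage {
  llvm::StringMap<std::vector<uint8_t>> buffers;
};

// In the IR, a constant-like op then only carries a small reference, e.g.
//   %w = "mydialect.ext_constant"() {key = "weights_0"} : () -> tensor<100x100xf32>
// and the actual bytes are looked up in ExternalConstantStorage lazily,
// converting them on the fly when the values are needed.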

Thank you @vinograd47!