Hello!
I have a question about memory consumption of the Attributes' internal storage.
As far as I understand, Attributes use the same mechanism as Types, so they are uniqued, immortal objects living in the MLIR Context. But, in contrast to Types, Attributes hold not only metadata (like information about the internal type) but also the actual value. In some cases, such as `DenseElementsAttr`, the values array might be large (for example, a tensor constant). Such attributes will consume a large amount of memory even after they become unused.
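To illustrate what I mean, here is a rough C++ sketch (assuming a reasonably recent MLIR API): the values passed to `DenseElementsAttr::get` are copied into storage owned by the `MLIRContext` and stay there for the lifetime of the context.

```cpp
#include "mlir/IR/Builders.h"
#include "mlir/IR/BuiltinAttributes.h"
#include "mlir/IR/BuiltinTypes.h"
#include "mlir/IR/MLIRContext.h"
#include "llvm/ADT/ArrayRef.h"

#include <vector>

int main() {
  mlir::MLIRContext context;
  mlir::Builder builder(&context);

  auto type = mlir::RankedTensorType::get({100, 100}, builder.getF32Type());
  std::vector<float> values(100 * 100, 1.0f);

  // The values are copied into storage uniqued inside the context.
  auto attr = mlir::DenseElementsAttr::get(type, llvm::ArrayRef<float>(values));
  (void)attr;

  // `values` can be freed now, but the copy owned by the context lives
  // until the context itself is destroyed, even if nothing uses `attr`.
  return 0;
}
```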
The first example: using a single MLIR Context to process several separate IRs. The first IR will create several `DenseElementsAttr` attributes for its own tensor constants. When we start processing the second IR, which uses its own constants (and it is unlikely they will have the same values), all the attributes created for the first IR will still be alive, occupying a large amount of memory.
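Roughly, I mean a flow like the following sketch (hypothetical code; `processModule` is just a placeholder for whatever pipeline runs on each module, and real code would register the needed dialects):

```cpp
#include "mlir/IR/BuiltinOps.h"
#include "mlir/IR/MLIRContext.h"
#include "mlir/IR/OwningOpRef.h"
#include "mlir/Parser/Parser.h"
#include "llvm/ADT/StringRef.h"

// Processing two independent IRs with one shared MLIRContext.
void runBoth(llvm::StringRef firstIR, llvm::StringRef secondIR) {
  mlir::MLIRContext context;
  // Placeholder for registering the dialects actually used by the IR.
  context.allowUnregisteredDialects();
  {
    mlir::OwningOpRef<mlir::ModuleOp> first =
        mlir::parseSourceString<mlir::ModuleOp>(firstIR, &context);
    // processModule(*first);
  } // `first` is destroyed here, but every DenseElementsAttr it created
    // stays uniqued (and allocated) inside `context`.
  mlir::OwningOpRef<mlir::ModuleOp> second =
      mlir::parseSourceString<mlir::ModuleOp>(secondIR, &context);
  // processModule(*second);  // The first IR's constants still occupy memory.
}
```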
The second example: aggressive constant folding. Imagine the following code:
```mlir
%0 = constant dense<some values> : tensor<100x100xf32>
%1 = constant dense<some other values> : tensor<100x100xf32>
%2 = foo.add %0, %1
```
Now, if we constant-fold the `foo.add` operation and replace the code with a new constant (assuming `%0` and `%1` are not used by any other operations):
```mlir
%2 = constant dense<new values> : tensor<100x100xf32>
```
We will get a new `DenseElementsAttr` in the MLIR Context, while the values of the previous constant tensors will still be alive, occupying memory. In more complex cases, we can simply run out of memory.
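For illustration, a fold hook for `foo.add` might do something like this sketch (hypothetical code, not the actual implementation): it computes the element-wise sum and materializes it as yet another `DenseElementsAttr`, while the attributes backing `%0` and `%1` remain in the context.

```cpp
#include "mlir/IR/BuiltinAttributes.h"
#include "mlir/IR/BuiltinTypes.h"
#include "llvm/ADT/ArrayRef.h"

#include <vector>

// Given the two constant operands, build the folded result as a new
// DenseElementsAttr. The operand attributes are not released; all three
// value buffers now live in the MLIRContext.
mlir::DenseElementsAttr foldAdd(mlir::DenseElementsAttr lhs,
                                mlir::DenseElementsAttr rhs) {
  mlir::ShapedType type = lhs.getType();
  std::vector<float> sum;
  sum.reserve(type.getNumElements());
  auto rhsIt = rhs.value_begin<float>();
  for (float lhsVal : lhs.getValues<float>())
    sum.push_back(lhsVal + *rhsIt++);
  // Uniques a third 100x100 buffer into the context.
  return mlir::DenseElementsAttr::get(type, llvm::ArrayRef<float>(sum));
}
```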
So, my questions: is this expected behavior? Or am I missing something? Or do I have a misunderstanding of how such attributes and constant folding are supposed to be used?
Thank you in advance!