Array globals in LLVM dialect

The LLVM dialect docs list several ways to create global variables:

// Global values use @-identifiers.
llvm.mlir.global constant @cst(42 : i32) : i32

// Non-constant values must also be initialized.
llvm.mlir.global @variable(32.0 : f32) : f32

// Strings are expected to be of wrapped LLVM i8 array type and do not
// automatically include the trailing zero.
llvm.mlir.global @string("abc") : !llvm.array<3 x i8>

// For strings globals, the trailing type may be omitted.
llvm.mlir.global constant @no_trailing_type("foo bar")

// [further down]
llvm.mlir.global private constant @y(dense<1.0> : tensor<8xf32>)
    : !llvm.array<8 x f32>

Integer tensors are supported as well:

llvm.mlir.global private constant @tensor(
    dense<[1, 4, 8, 9, 4, 5, 6, 3]> : tensor<8xi32>
) : !llvm.array<8 x i32>

However, arrays are not supported (and neither are structs, I believe; there is only specialized support for complex numbers):

llvm.mlir.global constant @array([0 : i8, 19 : i8, 42 : i8])
    : !llvm.array<3 x i8>

Since some fairly involved global constants are already supported, even including integer arrays through dense tensors, I’d expect plain old arrays and structs (possibly represented as nested arrays?) to work as well. Is there a reason I’m overlooking for why they don’t exist, or is it just that nobody had a use case for this and all that’s needed is to add support?

Thanks!

The tensor example is actually an array. The tensor type is just an implementation detail, in the sense that the global operation uses a dense attribute to store the initialization values of the array. The type of the constant itself is !llvm.array<8 x i32>.
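
Applied to the @array example from the question, the same approach would look roughly like this. It is a sketch that assumes the dense-attribute form accepts i8 elements the same way it accepts i32 in the integer tensor example; the symbol name and values are taken from the question.

```mlir
// The initializer is a dense elements attribute with a tensor type;
// the global itself has the LLVM array type.
llvm.mlir.global constant @array(dense<[0, 19, 42]> : tensor<3xi8>)
    : !llvm.array<3 x i8>
```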

It is also possible to define a struct:

llvm.mlir.global internal constant @struct_var() : !llvm.struct<"my_struct", (i32, i32)> {
  %0 = llvm.mlir.constant(42 : i32) : i32
  %1 = llvm.mlir.constant(7 : i32) : i32
  %2 = llvm.mlir.undef : !llvm.struct<"my_struct", (i32, i32)>
  %3 = llvm.insertvalue %1, %2[0] : !llvm.struct<"my_struct", (i32, i32)>
  %4 = llvm.insertvalue %0, %3[1] : !llvm.struct<"my_struct", (i32, i32)>
  llvm.return %4 : !llvm.struct<"my_struct", (i32, i32)>
}

Here the initialization is done using a region that sets the two fields of the struct to 7 and 42, respectively. A region is needed because the LLVM dialect does not have a direct counterpart to LLVM’s constant expressions.

Cool, so the tensor type is the intended way to get constant arrays. I was using the initialization region for arrays and structs, and then saw that the docs list some more convenient attribute-based initializers for certain types, but I couldn’t immediately find arrays and structs. The tensors cover the array use case, and for the structs there’s not really a corresponding attribute that you could use.

I guess what I’m asking is: would there be value in allowing llvm.mlir.global to construct an !llvm.array or !llvm.struct type from an ArrayAttr directly?

GlobalOp has an AnyAttr type that in principle can take any initialization value. The limiting factor is probably the translation to LLVM IR that happens in the getLLVMConstant method (ModuleTranslation.cpp). It seems like there is already some support for ArrayAttr limited to complex numbers.

Regarding !llvm.array, I would say that a DenseElementsAttr or a SplatElementsAttr is the preferred and more efficient way of storing an array. An ArrayAttr models a tuple rather than an array and seems too generic for this purpose.

Using an ArrayAttr makes sense for structs. We still need the region-based initialization, though, since a constant expression in LLVM can be fairly involved and may not map to an ArrayAttr. For example, a constant can be initialized with the address of another global or with its own address. The question is thus whether there is a use case for ArrayAttr-based initialization that justifies having an alternative way of initializing simple structs. I would guess that it is OK to slightly extend the support compared to what we have now (e.g., from complex numbers to simple structs that have integer/float elements only).
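
To illustrate the address-of case mentioned above, here is a minimal sketch of a struct constant that stores the address of another global. The names @counter and @with_address are hypothetical, and the snippet assumes a recent MLIR with opaque pointers (!llvm.ptr); older versions use typed pointers such as !llvm.ptr<i32> instead.

```mlir
// Hypothetical global whose address is stored in the struct below.
llvm.mlir.global internal @counter(0 : i32) : i32

// The second field holds the address of @counter, which a plain
// ArrayAttr of integer/float attributes could not express.
llvm.mlir.global internal constant @with_address() : !llvm.struct<(i32, ptr)> {
  %0 = llvm.mlir.constant(42 : i32) : i32
  %1 = llvm.mlir.addressof @counter : !llvm.ptr
  %2 = llvm.mlir.undef : !llvm.struct<(i32, ptr)>
  %3 = llvm.insertvalue %0, %2[0] : !llvm.struct<(i32, ptr)>
  %4 = llvm.insertvalue %1, %3[1] : !llvm.struct<(i32, ptr)>
  llvm.return %4 : !llvm.struct<(i32, ptr)>
}
```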

It does not make sense to use ArrayAttr for !llvm.array-typed constants. The former is a misnomer for “tuple” and can contain attributes of different kinds as elements, whereas !llvm.array cannot. Something like Dense*ArrayAttr could make sense, but so does the Dense*ElementsAttr that is currently used. The “tensor example” is in fact a DenseFPElementsAttr with a tensor type, and it could use another shaped type.

It may be possible to use ArrayAttr for fully initializing simple structs. That is, we shouldn’t have an equivalent of polymorphic undef as a new kind of attribute that may appear in ArrayAttr to avoid initializing one of the struct fields, let alone the other pieces of LLVM IR constant expressions that would just duplicate the IR constructs. Also, this will need a verifier that checks that the types of the attributes contained in the ArrayAttr match the element types of the struct.

That makes a lot of sense! Thanks for the pointers, I wasn’t aware of all the intricacies involved here :-).

How would you declare a global array of structs?

I suppose the region initializer with individual llvm.insertvalue creating first the structs and then the array will do the trick.
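
A minimal sketch of that approach, assuming the same llvm.insertvalue syntax as in the struct example earlier in the thread. The global name @struct_array, the element values, and the unnamed struct type are chosen only for illustration.

```mlir
llvm.mlir.global internal constant @struct_array()
    : !llvm.array<2 x struct<(i32, i32)>> {
  %c1 = llvm.mlir.constant(1 : i32) : i32
  %c2 = llvm.mlir.constant(2 : i32) : i32
  %c3 = llvm.mlir.constant(3 : i32) : i32
  %c4 = llvm.mlir.constant(4 : i32) : i32

  // Build the two struct elements first: {1, 2} and {3, 4}.
  %s = llvm.mlir.undef : !llvm.struct<(i32, i32)>
  %s0a = llvm.insertvalue %c1, %s[0] : !llvm.struct<(i32, i32)>
  %s0 = llvm.insertvalue %c2, %s0a[1] : !llvm.struct<(i32, i32)>
  %s1a = llvm.insertvalue %c3, %s[0] : !llvm.struct<(i32, i32)>
  %s1 = llvm.insertvalue %c4, %s1a[1] : !llvm.struct<(i32, i32)>

  // Then insert the structs into the array.
  %a = llvm.mlir.undef : !llvm.array<2 x struct<(i32, i32)>>
  %a0 = llvm.insertvalue %s0, %a[0] : !llvm.array<2 x struct<(i32, i32)>>
  %a1 = llvm.insertvalue %s1, %a0[1] : !llvm.array<2 x struct<(i32, i32)>>
  llvm.return %a1 : !llvm.array<2 x struct<(i32, i32)>>
}
```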

Thank you. It worked here as well.