`DenseI32ArrayAttr` cannot be constructed from `llvm::ArrayRef<uint32_t>`

FranklandJack · June 11, 2024, 5:06pm

Hey folks,

Sorry, this might be a silly question but… I’m trying to construct a DenseI32ArrayAttr from a llvm::ArrayRef<uint32_t> instance. Given that I32 is a “signless” type this doesn’t seem like an unreasonable thing to do given there is no signless 32 bit integer in C++. However, calling:

DenseI32ArrayAttr::get(...);

with an llvm::ArrayRef<uint32_t> causes a compilation error since the template specialization of DenseI32ArrayAttr is defined as follows:

using DenseI32ArrayAttr = detail::DenseArrayAttrImpl<int32_t>;

and the implementation of DenseI32ArrayAttr::get comes from the class template and expects the llvm::ArrayRef template type to match that of the class: DenseArrayAttrImpl<T>::get(MLIRContext *context, ArrayRef<T> content).

Is this a bug, or is there a reason you shouldn’t be able to construct a DenseI32ArrayAttr from unsigned integer data?

I can work around this by reintrepet casting the underlying data but it’d be nice to not have to do that I’m happy to try and push a patch for this, I just wanted to check before I started working on it.

Thanks in advance.

kuhar · June 11, 2024, 5:20pm

This sounds like something that will violate the strict aliasing rules. This would be my guess as to why element type conversions are not supported by the constructor.

FranklandJack · June 11, 2024, 5:41pm

Right yeah, good point, thanks. In that case you’d need to do some sort of copy first though just to construct the array…

The definition of get just seems to be doing this anyway: llvm-project/mlir/lib/IR/BuiltinAttributes.cpp at main · llvm/llvm-project · GitHub

template <typename T>
DenseArrayAttrImpl<T> DenseArrayAttrImpl<T>::get(MLIRContext *context,
                                                 ArrayRef<T> content) {
  Type elementType = DenseArrayAttrUtil<T>::getElementType(context);
  auto rawArray = ArrayRef<char>(reinterpret_cast<const char *>(content.data()),
                                 content.size() * sizeof(T));
  return llvm::cast<DenseArrayAttrImpl<T>>(
      Base::get(context, elementType, content.size(), rawArray));
}

Which is fine because I think you can always alias a char * without violating the strict aliasing rule (I can’t remember), but if that’s the case maybe I should just use a StringAttr and give it a reinterpret casted pointer to a const char *.

mehdi_amini · June 11, 2024, 8:06pm

All the API and storage around DenseI32ArrayAttr is setup for signed integer. Would you want a constructor that would do the copy/narrowing conversion for you? That seems like a potentially acceptable helper constructor to me (even though it can be a bit bug prone, it’s in line with C++ implicit conversion rules somehow right?), feel free to send a patch.

kuhar · June 11, 2024, 9:06pm

If anything, wouldn’t a constructor that takes a byte array ref be better, with the specific intent to allow for type punning?

mehdi_amini · June 11, 2024, 9:13pm

We don’t want type punning though, if we were taking a uint32_t the process would be to explicitly narrow and convert during the copy. When read back the data would always be int32_t: if the user wants some unsafe type punning, then I’d rather have this visible in the user code.

kuhar · June 11, 2024, 9:34pm

if we were taking a uint32_t the process would be to explicitly narrow and convert during the copy. When read back the data would always be int32_t

This sounds like a limited implementation of type punning for me: T1 → copy → T2, where you are suggesting to limit what T1 can be and have T2 be fixed. I’d imagine that this won’t stop at just (u)int32_t and would extend to other dense array attr element types. Wouldn’t it be simpler to add a new universal get function that allows for char* if we are going to go this way, with the added benefit that I’d expect pointer casts to be more apparent at the call sites where types don’t match?

mehdi_amini · June 11, 2024, 9:42pm

I’m not sure I totally follow: to me type punning is a reinterpret_cast, while a type conversion through copy (T1->T2) is semantically quite different?
In any case, we want int32_t, if folks want to do “type punning” why wouldn’t they reinterpret_cast to int32_t* ?

kuhar · June 11, 2024, 9:57pm

There are many ways to implement type punning: reinterpret_cast/memcpy/union (with some limitations depending on the context). What we want to prevent is having the caller pass in one type T1 (here uint32_t) and then have something else access it as an incompatible type T2 (say int32_t). This won’t be the case here if the storage that we copy to is a byte array.

What I’m saying is that, IMO, it makes more sense to expose a char*-ish overload that makes callers perform any such casts on their side if they wish to do type conversions. Otherwise you would have to check what ::get does and inspect it to make sure the implementation is safe and won’t violate strict aliasing. Exposing an explicit overload for unsigned types is also fine, but feels more like a special case to me given that we have dense array attrs for floats and doubles too.

mehdi_amini · June 11, 2024, 10:51pm

Right, and you are proposing an API that puts us in the reinterpret_cast/union category. I don’t understand why it is a better thing here than exposing what we intend to support?
You’re basically breaking type-safety entirely and exposing a footgun to the user somehow, and I can’t see the benefits on the other hand?

And what would the use case be for the floating point variant to have a char* API?

FranklandJack · June 12, 2024, 9:07am

Thanks both for your suggestions.

Honestly, I think the easiest thing to do might just be to reinterpret_cast<const char *> create a StringAttr and be done with it. For a bit more context though, my use case is that I’ve got a serialized SPIR-V module stored in a SmallVector<uint32_t> and I’m trying to attach it to a gpu.module as an attribute, something like:

llvm::SmallVector<uint32_t> jacksSPIRV(...);
DenseI32ArrayAttr spirvAttr = mlir::DenseI32ArrayAttr::get(&getContext(), jacksSPIRV);
gpuMod->setAttr(gpu::getDefaultGpuBinaryAnnotation(), sirvAttr);

later in the pipline I’d like to get that same attribute, something like:

auto spirvAttr = cast<DenseI32ArrayAttr>(gpuMod->getAttr(gpu::getDefaultGpuBinaryAnnotation()));
auto jacksSPIRV = llvm::SmallVector<uint32_t>(after.asArrayRef());

It’d be nice if the above pattern worked and the user didn’t need to reintpret_cast<const char *>(jacksSPIRV) and use a StringAttr or copy like llvm::SmallVector<int32_t> jacksInt32SPIRV(std::begin(jackSPIRV), std::end(jacksSPIRV); in order to create the attribute, but its not the end of the world.

mehdi_amini · June 12, 2024, 11:55am

StringAttr seems like appropriate, but you may want to look at the Blob resource management as well! (unfortunately one of the only, if not the only, thing in MLIR that isn’t documented I believe)

FranklandJack · June 12, 2024, 12:05pm

Thanks Mehdi, I’ve never heard of the blob resource manager , time to do some reading, either way I think this discussion has answered my so marking as resolved.

Cheers.

Topic		Replies	Views
[RFC] Introduce a new Dense Array attribute MLIR	8	1295	July 3, 2022
MLIR-ODS DenseXXXArrayAttr + ArrayMinCount seem like a footgun MLIR	8	519	August 23, 2022
UI64ArrayAttr not found MLIR	10	449	November 8, 2021
DenseElementsAttr + i1 element type MLIR	13	537	May 19, 2022
[RFC] Inconsistency between dynamic and static attributes (i64 v. index) MLIR	7	663	February 2, 2023

`DenseI32ArrayAttr` cannot be constructed from `llvm::ArrayRef<uint32_t>`

Related topics