Sorry, this might be a silly question but… I’m trying to construct a DenseI32ArrayAttr from a llvm::ArrayRef<uint32_t> instance. Given that I32 is a “signless” type this doesn’t seem like an unreasonable thing to do given there is no signless 32 bit integer in C++. However, calling:
DenseI32ArrayAttr::get(...);
with an llvm::ArrayRef<uint32_t> causes a compilation error since the template specialization of DenseI32ArrayAttr is defined as follows:
using DenseI32ArrayAttr = detail::DenseArrayAttrImpl<int32_t>;
and the implementation of DenseI32ArrayAttr::get comes from the class template and expects the llvm::ArrayRef template type to match that of the class: DenseArrayAttrImpl<T>::get(MLIRContext *context, ArrayRef<T> content).
Is this a bug, or is there a reason you shouldn’t be able to construct a DenseI32ArrayAttr from unsigned integer data?
I can work around this by reintrepet casting the underlying data but it’d be nice to not have to do that I’m happy to try and push a patch for this, I just wanted to check before I started working on it.
This sounds like something that will violate the strict aliasing rules. This would be my guess as to why element type conversions are not supported by the constructor.
Which is fine because I think you can always alias a char * without violating the strict aliasing rule (I can’t remember), but if that’s the case maybe I should just use a StringAttr and give it a reinterpret casted pointer to a const char *.
All the API and storage around DenseI32ArrayAttr is setup for signed integer. Would you want a constructor that would do the copy/narrowing conversion for you? That seems like a potentially acceptable helper constructor to me (even though it can be a bit bug prone, it’s in line with C++ implicit conversion rules somehow right?), feel free to send a patch.
We don’t want type punning though, if we were taking a uint32_t the process would be to explicitly narrow and convert during the copy. When read back the data would always be int32_t: if the user wants some unsafe type punning, then I’d rather have this visible in the user code.
if we were taking a uint32_t the process would be to explicitly narrow and convert during the copy. When read back the data would always be int32_t
This sounds like a limited implementation of type punning for me: T1 → copy → T2, where you are suggesting to limit what T1 can be and have T2 be fixed. I’d imagine that this won’t stop at just (u)int32_t and would extend to other dense array attr element types. Wouldn’t it be simpler to add a new universal get function that allows for char* if we are going to go this way, with the added benefit that I’d expect pointer casts to be more apparent at the call sites where types don’t match?
I’m not sure I totally follow: to me type punning is a reinterpret_cast, while a type conversion through copy (T1->T2) is semantically quite different?
In any case, we want int32_t, if folks want to do “type punning” why wouldn’t they reinterpret_cast to int32_t* ?
There are many ways to implement type punning: reinterpret_cast/memcpy/union (with some limitations depending on the context). What we want to prevent is having the caller pass in one type T1 (here uint32_t) and then have something else access it as an incompatible type T2 (say int32_t). This won’t be the case here if the storage that we copy to is a byte array.
What I’m saying is that, IMO, it makes more sense to expose a char*-ish overload that makes callers perform any such casts on their side if they wish to do type conversions. Otherwise you would have to check what ::get does and inspect it to make sure the implementation is safe and won’t violate strict aliasing. Exposing an explicit overload for unsigned types is also fine, but feels more like a special case to me given that we have dense array attrs for floats and doubles too.
Right, and you are proposing an API that puts us in the reinterpret_cast/union category. I don’t understand why it is a better thing here than exposing what we intend to support?
You’re basically breaking type-safety entirely and exposing a footgun to the user somehow, and I can’t see the benefits on the other hand?
And what would the use case be for the floating point variant to have a char* API?
Honestly, I think the easiest thing to do might just be to reinterpret_cast<const char *> create a StringAttr and be done with it. For a bit more context though, my use case is that I’ve got a serialized SPIR-V module stored in a SmallVector<uint32_t> and I’m trying to attach it to a gpu.module as an attribute, something like:
later in the pipline I’d like to get that same attribute, something like:
auto spirvAttr = cast<DenseI32ArrayAttr>(gpuMod->getAttr(gpu::getDefaultGpuBinaryAnnotation()));
auto jacksSPIRV = llvm::SmallVector<uint32_t>(after.asArrayRef());
It’d be nice if the above pattern worked and the user didn’t need to reintpret_cast<const char *>(jacksSPIRV) and use a StringAttr or copy like llvm::SmallVector<int32_t> jacksInt32SPIRV(std::begin(jackSPIRV), std::end(jacksSPIRV); in order to create the attribute, but its not the end of the world.
StringAttr seems like appropriate, but you may want to look at the Blob resource management as well! (unfortunately one of the only, if not the only, thing in MLIR that isn’t documented I believe)
Thanks Mehdi, I’ve never heard of the blob resource manager , time to do some reading, either way I think this discussion has answered my so marking as resolved.