Volatile load of packed struct

Hi! I’m wondering about LLVM’s treatment of volatile loads of packed structs. Let’s assume a type definition and corresponding load operation like this one:

%mystruct = type <{ i8, i8, i8, i8 }>
[...]
%result = load volatile %mystruct, ptr %p

During code generation, the load is split into four byte-sized loads even if the target supports 32-bit loads. Is this expected? The documentation doesn’t seem entirely clear to me, or maybe I’m misreading it:

Platforms may rely on volatile loads and stores of natively supported data width to be executed as single instruction. For example, in C this holds for an l-value of volatile primitive type with native hardware support, but not necessarily for aggregate types.

The first sentence suggests to me that the struct load should still be a single instruction (because the data width is supported natively), whereas the second sentence says that this may not be the case (for C?).

Is the splitting of the load operation expected behavior, and would it make sense to mention this more explicitly in the documentation?

It’s expected behavior, and I think the phrasing “data width” is ambiguous. No struct is a native type, even when its total size happens to match a native integer width. The sentence is really referring to the case in the “for example” part: an integer type whose width has hardware support.
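If a single 32-bit access is what you need, one option is to do the volatile load on an integer type and rebuild the struct from it afterwards. Here is a minimal sketch, not a definitive recipe. The function name @load_packed_as_word is made up for illustration, and the sketch assumes that %p is 4-byte aligned, that the target has a native 32-bit load, and that the target is little-endian (on a big-endian target the bytes would come out in the opposite order).

```llvm
%mystruct = type <{ i8, i8, i8, i8 }>

; Hypothetical helper: reads the four bytes with one volatile i32 load.
; Assumes %p is 4-byte aligned and the target is little-endian.
define %mystruct @load_packed_as_word(ptr %p) {
entry:
  ; The volatile access is on a natively supported integer width.
  %word = load volatile i32, ptr %p, align 4

  ; Split the word into bytes; byte 0 is the lowest-addressed byte
  ; under the little-endian assumption.
  %b0 = trunc i32 %word to i8
  %s1 = lshr i32 %word, 8
  %b1 = trunc i32 %s1 to i8
  %s2 = lshr i32 %word, 16
  %b2 = trunc i32 %s2 to i8
  %s3 = lshr i32 %word, 24
  %b3 = trunc i32 %s3 to i8

  ; Rebuild the packed struct value from the individual bytes.
  %r0 = insertvalue %mystruct poison, i8 %b0, 0
  %r1 = insertvalue %mystruct %r0, i8 %b1, 1
  %r2 = insertvalue %mystruct %r1, i8 %b2, 2
  %r3 = insertvalue %mystruct %r2, i8 %b3, 3
  ret %mystruct %r3
}
```

The shifts and insertvalue operations are ordinary non-volatile computations, so the only memory access that has to stay intact is the i32 load.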

I see, thanks for the clarification and quick reply @arsenm!