Status of GEPs into vectors of overaligned elements

Summary

GEPs into vectors of overaligned elements are currently allowed and generated by generic passes like SROA. However, such GEPs are broken, because offsets are incorrectly computed in many cases.
I’d like to discuss how to improve the situation.

See also https://discourse.llvm.org/t/status-of-overaligned-i8 for a recent similar discussion regarding overaligned types.

Details

Vectors can contain overaligned elements, in which case the elements are tightly packed in the vector, not respecting the ABI alignment (LLVM Language Reference Manual — LLVM 16.0.0git documentation).

GEPs into such vectors are currently allowed. However, offsets of such GEPs are inconsistently computed, as many places (including GEPOperator itself) incorrectly use getTypeAllocSize() for element sizes, which respects ABI alignment.

The dedicated GEP guide (The Often Misunderstood GEP Instruction — LLVM 16.0.0git documentation) mentions that GEPs into vectors are not recommended, and that GEPs into vectors might be outright disallowed in the future.

Even if frontends avoid GEPs into vectors, generic passes introduce them, so there is a need for change: For example, SROA tries to rewrite byte-based accesses as "natural GEP"s using DataLayout::getGEPIndexForOffset which (correctly) returns GEP indices into such a vector if the byte access happens to match a vector element. However, later steps (e.g. GetElementPtrInst::accumulateConstantOffset) compute incorrect offsets.

See the test case overaligned-datalayout.ll in ⚙ D139034 [IR] GEP: Fix byte-offsets in vectors of overaligned types for a miscompilation caused by this issue.

Options

I see the following options to improve the situation:

Fix offset calculations

I recently tried to fix these GEP offsets in ⚙ D139034 [IR] GEP: Fix byte-offsets in vectors of overaligned types
But @nikic correctly pointed out that there are far more places in LLVM that rely on the same assumption, and suggested to add some sort of gep_offset_iterator that could be used everywhere instead. It seems this would be possible, but it’s not clear to me whether the nontrivial work for that is actually needed.

Forbid GEPs into vectors of overaligned elements

Given the currently broken state of such GEPs, it seems unlikely there are any users depending on such GEPs. For DXIL, DXC seems to replace vectors by arrays in case objects are alloca’ed (@beanz, can you comment on this?).

So we could forbid such GEPs instead, which I’d personally prefer. I’m not sure how such a rule could be enforced though, except for updating the LangRef and adding a few asserts?

Forbid GEPs into vectors

This would formally also be an option. I don’t have an opinion on this, but it seems to be a fairly large change that is not sufficiently motivated by this corner case?

For the sake of completeness: Is it viable to forbid overaligned primitive types entirely? From the recent discussion about DXIL i8 alignment it sounds like DXIL was unnecessarily marking these as overaligned – is that specific to just i8, or might we do away with this entirely?

In any case, I don’t think we want to forbid only GEPs into vectors of overaligned elements. I would be fine with forbidding all GEPs into vectors though. Forbidding it only in the overaligned case requires all the complexity of forbidding it entirely, but hidden behind a configuration that nobody tests.

Longer-term, I expect this problem to mostly solve itself as part of the switch from type-based to offset-based GEPs.

There’s been a note in some document somewhere for over a decade that indexing into vectors with GEP will be disallowed. Can we just do that?

I think that indexing into vectors should be allowed and identical to the corresponding array if the vector element is a whole number of bytes and not overaligned. We would like to depend on that vector in-memory representation for Rust’s project-portable-simd since we are trying to define the in-memory representation of Rust’s Simd type to be identical to an array but with possibly larger alignment, but never large enough alignment to introduce padding after the end of the array (e.g. f32x7 has align at most 4 since otherwise there would be padding, but f32x8 can have align 32 since that doesn’t introduce padding).

From what I understand from the linked issue, Rust’s portable-simd will use arrays for in-memory representation and pointers, so you’re never going to gep into a vector?

FYI, currently, the alignment of vectors in LLVM depends alone on its size and is independent of its element type. So, alignof(<3 x i32>) = alignof(<6 x i16>) = alignof(<96 x i1>) = alignof(<32 x i3>).
The “store size” of a vector in LLVM is the size of the vector elementCount * elementType.BitSize aligned to it’s alignment (which does not depend on the element type).
There were and are proposals to change that.

I think it’s reasonable to forbid GEPs into a vector, for the current semantics where the size and alignment is independent of the element type.

Is it viable to forbid overaligned primitive types entirely?

There is no target that overaligns i8 (or any integers) like dxil, but it’s common to overalign vector types to the next power of 2. I.e. for amdgpu, <3 x i16> with size 48 bit is aligned to 64 bit.

yes, but we are hoping to be able to use vector-typed load/stores, e.g.:

%p = alloca [5 x float], align 4
%v = fadd <5 x float> %a, %b
store <5 x float> %v, ptr %p, align 4

so that being valid should guarantee gep is valid on <5 x float> unless you add imho artificial restrictions.

This is still making an assumption about the memory layout of the vector elements. You could still perform the same getelementptr on the array type. This whole conversation feels pretty pointless with opaque pointers, there isn’t much reason to have getelementptr type indexing anymore.

which “this”? load/stores? geps? afaict the llvm lang ref guarantees those load/stores work correctly, as part of the definition of vectors in-memory layout in terms of integers (e.g. <3 x float> is guaranteed to load/store just like i96)

Yes, the load and store do work correctly. Your example didn’t actually include a GEP indexing into a vector which is the problematic bit, and only used a well behaved float type. Consider the non-byte sized vector cases described here

that’s exactly what I proposed: non-power-of-2 Simd types have wrong size · Issue #319 · rust-lang/portable-simd · GitHub

ok, then we just decide gep is artificially limited to not allow indexing into vector types, you have to use array types for gep instead.

trying to use gep to get a pointer to an element of a vector with non-byte-aligned elements sounds like nonsense to me, since pointers can’t point to anything smaller than a byte (unless we want to add new bit-pointer types that basically no one will use?).

As first step, I’ve landed ⚙ D142146 [IR] Avoid creation of GEPs into vectors (in one place) to avoid creation of GEPs into vectors during transforms.