Hi, I’m trying to vectorize load and store instructions in LLVM IR generated for a new GPU backend.

However, I found that the SLP vectorizer fails to vectorize these loads and stores because it cannot prove that the stores access consecutive memory.

I then tried manually rewriting the *getelementptr* instructions in my LLVM IR as follows:

**From**

```
%linear_index3 = add nuw nsw i32 %linear_index_plus_base, 3
%linear_index2 = add nuw nsw i32 %linear_index_plus_base, 2
%linear_index1 = add nuw nsw i32 %linear_index_plus_base, 1
getelementptr inbounds half, ptr %base_ptr, i64 %linear_index_plus_base
getelementptr inbounds half, ptr %base_ptr, i64 %linear_index1
getelementptr inbounds half, ptr %base_ptr, i64 %linear_index2
getelementptr inbounds half, ptr %base_ptr, i64 %linear_index3
```

**To**

```
%ld_base = getelementptr inbounds half, ptr %base_ptr, i32 %linear_index_plus_base
getelementptr inbounds half, ptr %ld_base, i64 0
getelementptr inbounds half, ptr %ld_base, i64 1
getelementptr inbounds half, ptr %ld_base, i64 2
getelementptr inbounds half, ptr %ld_base, i64 3
```

With this change, the SLP vectorizer successfully vectorizes these loads and stores.
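For anyone who wants to reproduce this, here is a minimal sketch of the rewritten form as a complete function. The function name `@copy4`, the arguments `%src`, `%dst`, `%idx`, and the store side (`%st_base` and the `%q*` pointers) are hypothetical names I made up for illustration. My real kernel is larger, and the snippet has no target triple or data layout, so whether it vectorizes will depend on the target it is compiled for.

```
; Hypothetical reproducer: @copy4, %src, %dst and %idx are illustrative names,
; not taken from the real kernel.
define void @copy4(ptr %src, ptr %dst, i32 %idx) {
entry:
  ; One GEP per buffer carries the variable part of the index.
  %ld_base = getelementptr inbounds half, ptr %src, i32 %idx
  %st_base = getelementptr inbounds half, ptr %dst, i32 %idx

  ; The per-element pointers differ from the base only by a constant offset.
  %p1 = getelementptr inbounds half, ptr %ld_base, i64 1
  %p2 = getelementptr inbounds half, ptr %ld_base, i64 2
  %p3 = getelementptr inbounds half, ptr %ld_base, i64 3
  %q1 = getelementptr inbounds half, ptr %st_base, i64 1
  %q2 = getelementptr inbounds half, ptr %st_base, i64 2
  %q3 = getelementptr inbounds half, ptr %st_base, i64 3

  ; Four scalar loads from consecutive half elements.
  %v0 = load half, ptr %ld_base, align 2
  %v1 = load half, ptr %p1, align 2
  %v2 = load half, ptr %p2, align 2
  %v3 = load half, ptr %p3, align 2

  ; Four scalar stores to consecutive half elements.
  store half %v0, ptr %st_base, align 2
  store half %v1, ptr %q1, align 2
  store half %v2, ptr %q2, align 2
  store half %v3, ptr %q3, align 2
  ret void
}
```

Assuming the file is saved as `repro.ll` (again, a made-up name), it can be run through the pass in isolation with `opt -passes=slp-vectorizer -S repro.ll`, after adding the target triple and data layout of the backend in question.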

Is there an existing LLVM pass that performs this kind of rewrite automatically? Any hints or tips would be greatly appreciated.