load instruction to gather intrinsics

Hi All,

Can I change a vector load to gather intrinsic? If so, how can I do it? For example, I want to change the following IR code

%1 = load <2 x i64>* %arrayidx1, align 8

to

%1 = call <2 x i64> @llvm.masked.gather.v2i64(<2 x i64*> %arrayidx1, i32 8, <2 x i1> <i1 true, i1 true>, <2 x i64> undef)

Basically, I am not sure how to get two consecutive addresses started from arrayidx1. Thanks for your time and help in advance.

Best,
Zhi


> Can I change a vector load to gather intrinsic? If so, how can I do it?
> For example, I want to change the following IR code
>
> %1 = load <2 x i64>* %arrayidx1, align 8
>
> to
>
> %1 = call <2 x i64> @llvm.masked.gather.v2i64(<2 x i64*> %arrayidx1, i32 8, <2 x i1> <i1 true, i1 true>, <2 x i64> undef)

How would that IR be generated: by the frontend or by your own IRBuilder code? And why
do you want to use the gather intrinsic? According to the LangRef [1], it seems to be
mainly for non-contiguous memory locations.

> Basically, I am not sure how to get two consecutive addresses started from
> arrayidx1. Thanks for your time and help in advance.

Maybe `getelementptr` is what you need to calculate the consecutive addresses.

[1] LLVM Language Reference Manual (LLVM 16.0.0git documentation)

HTH,
chenwj

The frontend generates the load in the IR, and I am using IRBuilder to generate the gather. I know gather is mainly for non-contiguous memory locations. It’s a long story why I want to use it, but in short I want to gather several memory locations. Suppose there is an array A that I manually duplicated elsewhere at an offset x. Now we have two arrays, A and A’, where the address of A’[i] minus the address of A[i] is x. I want to gather the two values at A+i and A’+i whenever a load instruction reads a value from A+i.

This is easy when the code is not vectorized, but it becomes tricky once it is. If there is a load %1 = load <2 x i64>* %Ai, align 8, which loads two consecutive values starting at Ai, I need to gather the values from A’i as well. So I think I need to use a gather to get the values at A+i, A+i+1, A’+i, and A’+i+1.
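
To make the four addresses concrete, here is a minimal sketch in plain C++. It is only a model of the address arithmetic, not LLVM API code. The names `gather_addresses` and `gather4` are hypothetical, and it assumes the offset x is counted in elements and that A’ sits in the same allocation as A.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: addresses of lanes i and i+1 of A, followed by the
// same two lanes of the duplicate A', which is assumed to start `offset`
// elements after A.
std::array<const std::int64_t*, 4> gather_addresses(const std::int64_t* A,
                                                    std::ptrdiff_t offset,
                                                    std::size_t i) {
    assert(A != nullptr);
    const std::int64_t* Ai = A + i;
    return {Ai, Ai + 1, Ai + offset, Ai + offset + 1};
}

// Load through each address one lane at a time, the way a gather with an
// all-true mask would.
std::array<std::int64_t, 4> gather4(const std::int64_t* A,
                                    std::ptrdiff_t offset, std::size_t i) {
    std::array<std::int64_t, 4> out{};
    const auto ptrs = gather_addresses(A, offset, i);
    for (std::size_t lane = 0; lane < ptrs.size(); ++lane)
        out[lane] = *ptrs[lane];
    return out;
}
```

In IR terms, the array returned by `gather_addresses` plays the role of the vector of pointers handed to the gather.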


I think you can use `getelementptr` to calculate the addresses needed by the gather
intrinsic, though I don't know if there is a better way to achieve your goal.

But I think I still need to get A+i and A+i+1 from %1 = load <2 x i64>* %Ai, align 8 even if I want to use getelementptr, is that correct?
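
As far as I can tell, yes: the load only carries a single pointer operand, so the per-lane addresses have to be derived from it. Lane k of a contiguous vector sits at base + k, which is the kind of computation a `getelementptr` with a vector of indices such as <0, 1> expresses. The sketch below models that step and the gather itself in plain C++. It is not LLVM API code; `lane_pointers` and `masked_gather` are hypothetical names, and `base` stands in for the load's pointer operand viewed as an element pointer.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Lane k of a contiguous 2-element vector lives at base + k.
std::array<const std::int64_t*, 2> lane_pointers(const std::int64_t* base) {
    assert(base != nullptr);
    return {base, base + 1};
}

// Model of a masked gather: enabled lanes load through their pointer,
// disabled lanes keep the pass-through value.
std::array<std::int64_t, 2> masked_gather(
    const std::array<const std::int64_t*, 2>& ptrs,
    const std::array<bool, 2>& mask,
    const std::array<std::int64_t, 2>& passthru) {
    std::array<std::int64_t, 2> out = passthru;
    for (std::size_t lane = 0; lane < out.size(); ++lane)
        if (mask[lane]) out[lane] = *ptrs[lane];
    return out;
}
```

With an all-true mask this returns the same two values as the original contiguous load, so under these assumptions the rewrite should preserve the loaded value.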