Automatically scaled offset loads/stores for arrays


I have a question about how to support load/store instructions where the offset is automatically scaled by the type. Simply put, any array index would be scaled by the element width, so that there no longer needs to be a separate offset computation. For example, given

int32_t arr[] = {…};
for (int i = 0; i < 100; i++) {
    x += arr[i];
}

the LLVM code (the code below is an approximation of what would show up)
%a = ld i32 %arr_ptr, %offset
%x = add i32 %x %a
%offset = %offset + 4
%i = %i + 1
%cond = icmp %i 100 ICMP_ULT
br for.body

would lower to:

%a = ld i32 %arr_ptr, %i
%x = add i32 %x %a
%i = %i + 1
%cond = icmp %i 100 ICMP_ULT
br for.body

Right now, my only idea is to replace getelementptr instructions with an intrinsic that computes the offset, pass its result to the load instruction, and then, during instruction selection, look for loads fed by that intrinsic and convert them to a scaled_arr load.

Sorry if what I am asking is somewhat confusing.

This is a reasonably common pattern. Typically earlier LLVM passes
(Loop Strength Reduction in particular) wrangle equivalent induction
variables and offsets into what's best for your machine (using
callbacks like TargetTransformInfo::getScalingFactorCost by the looks
of it).
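
To make the idea of a scaling-factor cost concrete, here is a small standalone sketch. It does not use LLVM at all: toyScalingFactorCost and pickCheapestScale are hypothetical names made up for illustration, and the real hook has a different signature. It only models a target saying "a scale equal to the access width is free, anything else costs extra", and an optimizer choosing between candidate scales on that basis.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a target's scaling-factor cost hook.
// This is NOT LLVM's real signature; it only models the idea that the
// target reports which index scales its loads/stores fold for free.
int toyScalingFactorCost(unsigned accessBytes, std::int64_t scale) {
    if (scale == static_cast<std::int64_t>(accessBytes))
        return 0;  // assumed: hardware scales the index by the access width
    return 1;      // assumed: any other scale needs an extra instruction
}

// Pick the cheapest scale among the candidates an optimizer is considering.
// Ties keep the earliest candidate.
std::int64_t pickCheapestScale(unsigned accessBytes,
                               const std::vector<std::int64_t> &candidates) {
    assert(!candidates.empty() && "need at least one candidate scale");
    std::int64_t best = candidates.front();
    for (std::int64_t s : candidates) {
        if (toyScalingFactorCost(accessBytes, s) <
            toyScalingFactorCost(accessBytes, best))
            best = s;
    }
    return best;
}
```

With a 4-byte access and candidate scales 1 and 4, this toy model prefers 4, which is the choice you want the real cost callbacks to steer towards.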

In this case, I'd expect that once you make a scale of 4 free and a
scale of 1 expensive, it will produce something like

    %ptr = %arr_ptr + 4 * %i
    %a = load i32 %ptr
    %x = %x + %a
    %i = %i + 1
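
As a sanity check that the two loop shapes really are interchangeable, here is a minimal C++ sketch that sums an array both ways: once with a separate byte offset bumped by 4 each iteration, and once with the address computed as base + 4 * i. The function names are made up for illustration, and the loads go through memcpy purely to keep the byte-level addressing explicit.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Load a 32-bit value from base + byteOffset.
static std::int32_t loadI32(const unsigned char *base, std::size_t byteOffset) {
    std::int32_t v;
    std::memcpy(&v, base + byteOffset, sizeof v);
    return v;
}

// Form 1: a separate byte offset that advances by 4 every iteration.
std::int32_t sumWithByteOffset(const std::int32_t *arr, std::size_t n) {
    assert(arr != nullptr || n == 0);
    const unsigned char *base = reinterpret_cast<const unsigned char *>(arr);
    std::int32_t x = 0;
    std::size_t offset = 0;
    for (std::size_t i = 0; i < n; ++i) {
        x += loadI32(base, offset);
        offset += sizeof(std::int32_t);
    }
    return x;
}

// Form 2: only the index is kept; the address is base + 4 * i.
std::int32_t sumWithScaledIndex(const std::int32_t *arr, std::size_t n) {
    assert(arr != nullptr || n == 0);
    const unsigned char *base = reinterpret_cast<const unsigned char *>(arr);
    std::int32_t x = 0;
    for (std::size_t i = 0; i < n; ++i)
        x += loadI32(base, sizeof(std::int32_t) * i);
    return x;
}
```

Both functions visit the same addresses in the same order; the second form is the one a scaled-index load can implement without the extra offset variable.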

At that point you have the much simpler task of looking for patterns
like "(load (add $base, (mul $offset, 4)))" during ISel. Quite a few
other targets do this (they usually find it's actually simpler to do
it in C++ using ComplexPatterns, see AArch64's "SelectAddrModeXRO"
functions for example).
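
To show roughly what such a C++ matcher does, here is a toy, self-contained sketch over a made-up expression tree. Node, make, ScaledAddr and matchScaledLoad are all hypothetical and are not LLVM types or functions; a real ComplexPattern selector works on selection DAG nodes instead. The shape is similar, though: return true and fill in the base and index operands when the address looks like "(add $base, (mul $offset, scale))", otherwise return false so another pattern can be tried.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <utility>

// Toy expression node; a stand-in for a DAG node, not LLVM's SDNode.
struct Node {
    enum Kind { Reg, Const, Add, Mul, Load };
    Kind kind = Reg;
    std::int64_t value = 0;          // register number (Reg) or literal (Const)
    std::shared_ptr<Node> lhs, rhs;  // operands; Load keeps its address in lhs
};
using NodePtr = std::shared_ptr<Node>;

NodePtr make(Node::Kind k, std::int64_t v = 0,
             NodePtr l = nullptr, NodePtr r = nullptr) {
    auto n = std::make_shared<Node>();
    n->kind = k;
    n->value = v;
    n->lhs = std::move(l);
    n->rhs = std::move(r);
    return n;
}

struct ScaledAddr {
    std::int64_t baseReg = 0;
    std::int64_t indexReg = 0;
};

// Match (mul index, scale) in either operand order.
static bool matchScaledIndex(const NodePtr &n, std::int64_t scale,
                             std::int64_t &indexReg) {
    if (!n || n->kind != Node::Mul || !n->lhs || !n->rhs) return false;
    const Node *a = n->lhs.get();
    const Node *b = n->rhs.get();
    if (a->kind == Node::Const) std::swap(a, b);  // put the constant second
    if (a->kind != Node::Reg || b->kind != Node::Const || b->value != scale)
        return false;
    indexReg = a->value;
    return true;
}

// Match (load (add base, (mul index, scale))); fill `out` on success.
bool matchScaledLoad(const NodePtr &n, std::int64_t scale, ScaledAddr &out) {
    assert(scale > 0 && "scale must be a positive element size");
    if (!n || n->kind != Node::Load || !n->lhs || n->lhs->kind != Node::Add)
        return false;
    const NodePtr &add = n->lhs;
    if (!add->lhs || !add->rhs) return false;
    // Add is commutative, so try the scaled index on either side.
    for (int i = 0; i < 2; ++i) {
        const NodePtr &base = (i == 0) ? add->lhs : add->rhs;
        const NodePtr &idx = (i == 0) ? add->rhs : add->lhs;
        std::int64_t indexReg = 0;
        if (base->kind == Node::Reg && matchScaledIndex(idx, scale, indexReg)) {
            out.baseReg = base->value;
            out.indexReg = indexReg;
            return true;
        }
    }
    return false;
}
```

Handling both operand orders by hand like this is one reason doing the match in C++ tends to be less painful than enumerating every variant as a separate pattern.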

Let me know if I've been unclear anywhere.