Why does LLVM keep some loads in the loops even after applying the O3 optimization?

Hello all,

I am looking at the assembly code of a loop body generated at -O3. Here it is:

.LBB4_19: @ %for.body.91
@ =>This Inner Loop Header: Depth=1
ldr r0, [r5]
mov r1, r8
add r0, r0, r7
vldr s0, [r0]
mov r0, r6
vcvt.f64.f32 d0, s0
vmov r2, r3, d0
bl fprintf
cmp r0, #0
blt .LBB4_25
@ BB#20: @ %for.cond.89
@ in Loop: Header=BB4_19 Depth=1
ldr r0, .LCPI4_2
add r4, r4, #1
add r7, r7, #4
ldr r0, [r0]
cmp r4, r0
blt .LBB4_19

There are no other basic blocks in the loop. I am wondering why the first load instruction (ldr r0, [r5]) is executed on every iteration even though the load address (r5) never changes in the loop body. Shouldn't this instruction be hoisted out of the loop by the -licm pass? The load could be executed once before the loop, with the result kept in a register and used inside the loop. I'd greatly appreciate it if anyone could tell me why this is not the case.

Thank you in advance,
Fami

Hi Fami,

r0 gets overwritten inside the loop (assuming the operand order is dst, src, src). Is ldr r0, [r5] needed to reinitialize r0 at the start of each iteration?

Ryan Taylor via llvm-dev <llvm-dev@lists.llvm.org> writes:

> r0 gets overwritten inside the loop (assuming dst, src, src), is ldr
> r0, [r5] needed to initialize r0 for the loop at each iteration?

Register allocation should handle that if the load is hoisted.

I'm with the others. The fprintf call is the most likely culprit: the compiler can't prove that the call leaves the memory at [r5] untouched, so it reloads.

                      -David
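To make the point concrete, here is a hypothetical C reconstruction of the kind of source that could produce a loop like the one posted. The original source was not shared, so the names samples, sample_count, dump_reloading and dump_hoisted are invented for illustration. Since fprintf is an external call, the compiler generally has to assume it may write to any global, which would explain re-reading both globals on every iteration in the first function. The second function does the hoisting by hand, using locals whose addresses are never taken.

```c
#include <stdio.h>

/* Hypothetical globals standing in for whatever the original code used. */
float *samples;
int sample_count;

/* Both globals are read inside the loop. The compiler cannot assume that
 * fprintf leaves them unchanged, so it is expected to reload them. */
int dump_reloading(FILE *out) {
    for (int i = 0; i < sample_count; i++) {
        if (fprintf(out, "%f\n", samples[i]) < 0)
            return -1; /* stop on an output error */
    }
    return 0;
}

/* Hand-hoisted variant: base and n are locals that never escape, so no
 * call can modify them and they can stay in registers across fprintf. */
int dump_hoisted(FILE *out) {
    float *base = samples;
    int n = sample_count;
    for (int i = 0; i < n; i++) {
        if (fprintf(out, "%f\n", base[i]) < 0)
            return -1; /* stop on an output error */
    }
    return 0;
}
```

Note that the two functions only behave identically if nothing changes samples or sample_count while the loop runs. That is exactly the fact the compiler cannot establish by itself here.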

Thank you all for your responses.

Is there any structure at the LLVM backend that holds this information? I mean, can I get one machine instruction load and find out which other instructions in this function or even in other functions may modify its accessed memory location?

> Is there any structure at the LLVM backend that holds this information?

The general topic is known as alias analysis. Within a function it's
designed to answer the question, given two instructions, of whether
they could be accessing the same memory. The answers are often
approximate though; if the compiler can't prove two accesses
definitely don't overlap it'll respond with something like MayAlias.
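As a small illustration of the "can't prove they don't overlap" case, consider the sketch below. The function names reload_needed and reload_optional are made up for this example. In the first function the two pointers may refer to the same int, so the second read of *q has to happen after the store. In the second, the C99 restrict qualifier is the programmer's promise that they don't overlap, which allows (but does not force) a compiler to reuse the first load.

```c
/* p and q may point to the same int, so the store through p can change
 * what *q yields. The second read of *q cannot reuse the first one. */
int reload_needed(int *p, const int *q) {
    int first = *q;
    *p = 0;
    return first + *q;
}

/* restrict promises that p and q do not overlap, so a compiler is allowed
 * to treat the second *q as equal to first. Calling this with overlapping
 * pointers would be undefined behaviour. */
int reload_optional(int *restrict p, const int *restrict q) {
    int first = *q;
    *p = 0;
    return first + *q;
}
```

Calling reload_needed with the same address for both arguments shows why the conservative answer matters: the result then depends on the value written through p.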

When function calls are involved things get a lot cruder. Basically
LLVM may sometimes know that a function doesn't access any memory, or
only accesses memory through its arguments, but that's not terribly
common or easy to prove. In the general case it assumes a function
might modify any memory that's not strictly local to the caller.

For example:

   void foo(int *in) { // "in" could be modified by any call; who
                       // knows where it really lives?
     int var = 42;
     bar(); // var hasn't escaped yet, so we wouldn't have to reload
            // after this call.

     myGlobal = &var; // Bad: var is now accessible from any function call.
     baz(&var); // Bad: baz could save a copy of &var for use by any
                // other function.
     myStruct.thing = &var;
     baz1(myStruct); // Bad: basically the same as baz above but
                     // slightly more hidden.
   }
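A self-contained variant of the same idea, showing a call that really does change an escaped local, might look like the sketch below. The names saved, bump and escaped_local are hypothetical. Everything lives in one file here so the example can be compiled alone; in real code the callee would typically sit in another translation unit where the compiler cannot see its body.

```c
#include <stddef.h>

/* Hypothetical global that plays the role of myGlobal above. */
static int *saved = NULL;

/* Stand-in for an opaque call: it modifies whatever saved points to. */
static void bump(void) {
    if (saved != NULL)
        *saved += 1;
}

int escaped_local(void) {
    int var = 42;
    bump();           /* var has not escaped: this call cannot touch it */
    saved = &var;     /* var escapes through the global */
    bump();           /* this call does modify var */
    int result = var; /* must be read again after the call */
    saved = NULL;     /* don't leave a dangling pointer behind */
    return result;
}
```

After the second call var holds 43, so a compiler that kept the old value of var in a register across that call would produce the wrong result.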

> I mean, can I get one machine instruction load and find out which other instructions in this function or even in other functions may modify its accessed memory location?

There's no specific interface like you're describing. You might be
able to enumerate all accesses in the same function that could
interfere, but it sounds potentially expensive. If possible you should
rethink your algorithm in terms of just comparing two MachineInstrs
you care about.

To do that you'd use MachineInstr::mayAlias, and your pass would have
to declare that it requires valid alias analysis (it's sometimes
costly to keep, so LLVM doesn't maintain that unless requested). Take
a look at (for example)
lib/Target/AArch64/AArch64LoadStoreOptimizer.cpp:getAnalysisUsage for
how it's done.

Cheers.

Tim.