Loop invariant expression using pointers not being optimized

Hi all,

For the following snippet (Compiler Explorer):

#include <cstdint>

template <typename T>
void bar(T arg);

void foo0(char *ptr) {
    while (true) {
        const auto mask = (std::uintptr_t)ptr & 0xf;
        bar(mask);
        ptr += 16;
    }
}

void foo1(std::int64_t ptr) {
    while (true) {
        const auto mask = ptr & 0xf;
        bar(mask);
        ptr += 16;
    }
}

In both cases the computation of mask can be hoisted out of the loop: ptr is only ever incremented by 0x10, a multiple of 16, so its four least significant bits (the ones selected by the 0xf mask) never change. However, in the first case mask is recomputed on every iteration.
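Hand-hoisting the mask computation, this is what I would expect the optimizer to effectively produce (foo0_hoisted is just an illustrative name):

#include <cstdint>

template <typename T>
void bar(T arg);

void foo0_hoisted(char *ptr) {
    // ptr only ever advances by 16, a multiple of 16, so the low
    // 4 bits selected by 0xf never change.
    const auto mask = (std::uintptr_t)ptr & 0xf;
    while (true) {
        bar(mask);
        ptr += 16;
    }
}

Taking a look at the IR: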

define dso_local void @_Z4foo0Pc(ptr noundef %0) local_unnamed_addr #3 !dbg !908 {
  call void @llvm.dbg.value(metadata ptr %0, metadata !912, metadata !DIExpression()), !dbg !916
  br label %2, !dbg !917

2:                                                ; preds = %1, %2
  %3 = phi ptr [ %0, %1 ], [ %6, %2 ]
  call void @llvm.dbg.value(metadata ptr %3, metadata !912, metadata !DIExpression()), !dbg !916
  %4 = ptrtoint ptr %3 to i64, !dbg !918
  %5 = and i64 %4, 15, !dbg !919
  call void @llvm.dbg.value(metadata i64 %5, metadata !913, metadata !DIExpression()), !dbg !920
  tail call void @_Z3barImEvT_(i64 noundef %5), !dbg !921
  %6 = getelementptr inbounds i8, ptr %3, i64 16, !dbg !922
  call void @llvm.dbg.value(metadata ptr %6, metadata !912, metadata !DIExpression()), !dbg !916
  br label %2, !dbg !917, !llvm.loop !923
}

define dso_local void @_Z4foo1l(i64 noundef %0) local_unnamed_addr #3 !dbg !932 {
  call void @llvm.dbg.value(metadata i64 %0, metadata !936, metadata !DIExpression()), !dbg !940
  %2 = and i64 %0, 15, !dbg !941
  br label %3, !dbg !941

3:                                                ; preds = %1, %3
  call void @llvm.dbg.value(metadata i64 poison, metadata !936, metadata !DIExpression()), !dbg !940
  call void @llvm.dbg.value(metadata i64 %2, metadata !937, metadata !DIExpression()), !dbg !942
  tail call void @_Z3barIlEvT_(i64 noundef %2), !dbg !943
  call void @llvm.dbg.value(metadata i64 poison, metadata !936, metadata !DIExpression(DW_OP_plus_uconst, 16, DW_OP_stack_value)), !dbg !940
  br label %3, !dbg !941, !llvm.loop !944
}

For foo0, even though the cast %4 = ptrtoint ptr %3 to i64 is necessary, couldn’t LLVM still infer that the masked value is loop invariant? What else am I missing here? I thought the opaque pointer representation could help here.
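For comparison, this is roughly the output I would expect for foo0, mirroring what foo1 gets (a hand-written sketch with debug intrinsics and metadata omitted; once the and is hoisted, the pointer phi and the increment are dead and can be dropped):

define dso_local void @_Z4foo0Pc(ptr noundef %0) local_unnamed_addr {
  %2 = ptrtoint ptr %0 to i64
  %3 = and i64 %2, 15
  br label %4

4:                                                ; preds = %1, %4
  tail call void @_Z3barImEvT_(i64 noundef %3)
  br label %4
}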

Many thanks!

PS: I have asked this in llvm/llvm-project issue #59633 on GitHub as well

The “and” instruction isn’t loop invariant in the traditional sense; the transform requires rewriting the PHI node into a different form.
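Schematically, with a minimal hand-written example (names invented) of what such a loop looks like in IR:

define i64 @example(i64 %start, i64 %n) {
entry:
  br label %loop

loop:
  %p = phi i64 [ %start, %entry ], [ %p.next, %loop ]
  ; %m is not loop invariant in the traditional sense: its operand %p
  ; changes on every iteration. Its value is nonetheless the same on
  ; every trip, because (%p + 16) & 15 == %p & 15.
  %m = and i64 %p, 15
  %p.next = add i64 %p, 16
  %done = icmp eq i64 %p.next, %n
  br i1 %done, label %exit, label %loop

exit:
  ret i64 %m
}

To hoist %m, the transform has to prove that fact and rewrite the loop so %m is computed from %start before entering it; once that is done the phi may become dead entirely, as it does in foo1.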

Looks like this is just a missing optimization; the code that implements the transform for your foo1 isn’t expecting a ptrtoint instruction.

@efriedma-quic, thanks for your comments. I’m aware that, as you say, this is not loop invariant in the traditional sense; it is quite subtle. Still, I think it is something the compiler could do.

I’d like to work on this optimization, but I’m not sure where to start. Is this something to tackle at the Loop Invariant Code Motion (-licm) pass level, or is it something lower-level, like rewriting the phi nodes?
Thanks

If you use “-mllvm -print-after-all”, you’ll see that foo1 is getting optimized by IndVarSimplifyPass; I’d start there.
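For example (file names hypothetical; the dumps go to stderr):

clang++ -O2 -c test.cpp -mllvm -print-after-all 2> passes.log
grep 'IR Dump After' passes.log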


Thanks, that seems like a good entry point for me.
I’ll let you know if I manage to get something done in this regard.
Cheers!