Loops Prevent Function Pointer Inlining?

Hi - I've been trying to track down why a function pointer isn't being
inlined in some cases, and have nailed it down to the following repro:

struct s
{
    int (*cmp) (const void *, const void *);
    int its[2];
};

static int __uint_compare(const void *e1, const void *e2)
{
    const int *i1 = e1;
    const int *i2 = e2;
    return *i2 - *i1;
}

int main()
{
    struct s hp;
    hp.cmp = __uint_compare;
    hp.its[0] = 1;
    hp.its[1] = 2;

    for (int i = 0; i < 1; i++)
    {
        int cmp = hp.cmp(&hp.its[i], &hp.its[i + 1]);
        if(cmp < 0)
        {
            return 1;
        }
    }

    return 0;
}

I've compiled it (as test2.c) with the following commands, and have
attached the unoptimised and optimised outputs.

clang -O0 -S -emit-llvm -o test2.ll test2.c
opt -O3 -debug -S test2.ll > test2.opt.ll 2>log2

However, if I replace the main function with the code below (test.c),
the function call to __uint_compare is inlined and the whole function
is correctly replaced with a "ret i32 0".

int main()
{
    struct s hp;
    hp.cmp = __uint_compare;
    hp.its[0] = 1;
    hp.its[1] = 2;

    int cmp = hp.cmp(&hp.its[0], &hp.its[1]);
    if(cmp < 0)
    {
        return 1;
    }

    return 0;
}

As you can see, the first example always executes the for-loop exactly
once, so the semantics of both examples are identical. From looking at
the differences in the two (attached) log files, it seems like the
problem's in the GVN pass, as only the log for the second example has:

GVN iteration: 0
GVN removed: %5 = load i32 (i8*, i8*)** %1, align 8
GVN iteration: 1

Is this analysis correct? Or is some other pass getting tricked by the
for loop (maybe because the load & store are no longer in the same
basic block)? I was thinking of trying to fix it myself, but am not
sure exactly where in the codebase I should get started.

Thanks -

Nick

test.c (368 Bytes)

test.ll (2.72 KB)

test.opt.ll (716 Bytes)

test2.c (412 Bytes)

test2.ll (3.4 KB)

test2.opt.ll (2.35 KB)

log (32.6 KB)

log2 (48.3 KB)

Looks a bit similar to http://llvm.org/bugs/show_bug.cgi?id=20801

That'd be my guess. Maybe looking for where the debug messages are
printed out and follow the code up.

cheers,
--renato

I've CC'ed Chad Rosier as I think this behaviour is a side-effect of
his revert of IndVarSimplify.cpp (git
c6b1a7e577a0b9e9cff9f9b7ac35a2cde7c448d8, SVN 217962). The change
basically makes the IndVar pass change:

; <label>:4 ; preds = %6, %0
  %i.0 = phi i32 [ 0, %0 ], [ %11, %6 ]
  %5 = icmp eq i32 %i.0, 0
  br i1 %5, label %6, label %17

To:

; <label>:4 ; preds = %7, %0
  %indvars.iv = phi i64 [ %indvars.iv.next, %7 ], [ 0, %0 ]
  %5 = trunc i64 %indvars.iv to i32
  %6 = icmp eq i32 %5, 0
  br i1 %6, label %7, label %15

Whereas before the patch it would be:

; <label>:4 ; preds = %7, %0
  %indvars.iv = phi i64 [ %indvars.iv.next, %7 ], [ 0, %0 ]
  %5 = icmp eq i64 %indvars.iv, 0
  br i1 %5, label %6, label %15

This has a knock-on effect on the GVN pass; in the basic block that is
the body of the loop it only knows that "trunc i64 %indvars.iv to i32"
is zero, not that "%indvars.iv" is zero, so it can't convert the "add
i64 %indvars.iv, 1" to a constant 1, and so doesn't realise the body
of the loop is only ever executed once.

I'll keep digging...

Thanks,

Nick

Oh, I remember similar effects of truncate in the vectorizer. We
discussed a bit about that for other cases, maybe you can find it in
the list (probably end of last year, beginning of this year). I don't
remember the outcome, though. :frowning:

--renato

Nick,
The work done in r217953 was to widen the loop compare to eliminate the
trunc from in between the induction variable and the compare. The result
is that this enables LSR to consider more cases (i.e., LSR doesn't play
nice with the trunc).

So the code prior to r217953 looked like this:

  %indvars.iv = phi i64 [ %indvars.iv.next, %7 ], [ 0, %0 ]
  %5 = trunc i64 %indvars.iv to i32
  %6 = icmp eq i32 %5, 0
  br i1 %6, label %7, label %15

After my commit the compare was widened:

  %indvars.iv = phi i64 [ %indvars.iv.next, %7 ], [ 0, %0 ]
  %5 = icmp eq i64 %indvars.iv, 0
  br i1 %5, label %6, label %15

but then I realized the unsigned comparison wasn't handled correctly, so I
made a partial revert in r217962.

I'm currently working on the unsigned solution. As I pointed out in a
previous email, I'm pretty sure this is safe if the induction variable
steps by unit 1. Hopefully, I'll have to committed shortly.

I still don't follow your point (I'm not a GVN expert by any means), but I
do hope this helps. If you have any more questions, please let me know.

Chad

Thanks - that does help. I didn't realise the earlier code put the
`trunc` in too - that's what's preventing the GVN pass from removing
the loop entirely. I'm looking forward to your unsigned patch - then
the generated IR for my example will be much better :slight_smile:

Thanks - that does help. I didn't realise the earlier code put the
`trunc` in too - that's what's preventing the GVN pass from removing
the loop entirely. I'm looking forward to your unsigned patch - then
the generated IR for my example will be much better :slight_smile:

I'll try to have something up to review shortly.

Chad