Detecting reduction operations

I want to be able to detect reduction operations using a method
similar to that described here:

http://portal.acm.org/citation.cfm?id=237578.237581

(I am open to other suggestions if there is a better technique).

I am curious if anyone has done this with LLVM or if there are and
recommendations for where to start with my implementation. I am only
interested in identifying the reductions -- I will not need to do any
kind of transformation to exploit the parallelism. However, if this
sort of work might be useful to others, I can try to generalize the
interface based on any suggestions in case my work gets to the point
where it could be checked in.

Scott

To be more specific, it would be helpful to have some utilities for
finding dependencies (true, output, and anti-). Where is a good place
to start for this kind of analysis?

Thanks,
Scott

To be more specific, it would be helpful to have some utilities for
finding dependencies (true, output, and anti-). Where is a good place
to start for this kind of analysis?

Hi Scott,

Do you mean loop carried dependencies? There is some initial work on dependence analysis, but it is still pretty young. We also have support for dependence between memory operations that are not loop aware.

-Chris

Hi Scott,

Do you mean loop carried dependencies? There is some initial work on
dependence analysis, but it is still pretty young. We also have support for
dependence between memory operations that are not loop aware.

-Chris

I think the dependence analysis will have to be loop aware. For example:

bb:
        %indvar = phi i64 [ 0, %bb.nph ], [ %indvar.next, %bb ]
        %sum = phi i32 [ 0, %bb.nph ], [ %3, %bb ]
        %1 = getelementptr i32* %X, i64 %indvar
        %2 = load i32* %1, align 4
        %3 = add i32 %2, %sum
        %indvar.next = add i64 %indvar, 1
        %exitcond = icmp eq i64 %indvar.next, %tmp.
        br i1 %exitcond, label %bb2, label %bb

I would like to recognize that there is circular dependence (true and
anti-) between %3 and %sum and that the only operations that form this
dependence are associative+commutative (e.g. addition). In this
example, we can see that by just looking at the operands of the %sum
and %3 instructions. But things get more challenging when we toss in,
say, a constant scalar multiply into each iteration. Then the
dependencies have an intermediate operations between them:

bb:
        %indvar = phi i64 [ 0, %bb.nph ], [ %indvar.next, %bb ]
        %sum = phi i32 [ 0, %bb.nph ], [ %3, %bb ]
        %1 = getelementptr i32* %X, i64 %indvar
        %2 = load i32* %1, align 4
        %3 = mul i32 %2, 4
%3 = add i32 %2, %sum
        %indvar.next = add i64 %indvar, 1
        %exitcond = icmp eq i64 %indvar.next, %tmp.
        br i1 %exitcond, label %bb2, label %bb

1) %3 has a true dependency on %sum (this is trivial by just looking
at the operands of the add inst)
2) %sum has an anti

But things get more challenging when we toss in,
say, a constant scalar multiply into each iteration. Then the
dependencies have an intermediate operations between them:

bb:
       %indvar = phi i64 [ 0, %bb.nph ], [ %indvar.next, %bb ]
       %sum = phi i32 [ 0, %bb.nph ], [ %3, %bb ]
       %1 = getelementptr i32* %X, i64 %indvar
       %2 = load i32* %1, align 4
       %3 = mul i32 %2, 4
%3 = add i32 %2, %sum
       %indvar.next = add i64 %indvar, 1
       %exitcond = icmp eq i64 %indvar.next, %tmp.
       br i1 %exitcond, label %bb2, label %bb

1) %3 has a true dependency on %sum (this is trivial by just looking
at the operands of the add inst)
2) %sum has an anti

Whoops, ignore the second example.

Anyway, I clearly need to think this through a bit more. But it might
be helpful to start looking at the dependence analysis stuff that is
available. Memory dependence will not help in general I think because
the arithmetic could be all done in intermediate values.

Hi Scott,

Do you mean loop carried dependencies? There is some initial work on
dependence analysis, but it is still pretty young. We also have support for
dependence between memory operations that are not loop aware.

-Chris

I think the dependence analysis will have to be loop aware. For example:

Okay, so the short answer is that we don't have what you need out of the box.

The longer answer is that there are two completely different types of loop dependences: memory dependences (load/store dependencies) and SSA value dependences like you're talking about here.

If you want to detect simple reductions like this, just walk the SSA graph starting from the PHIs in the loop header. The code should look very similar to how LoopInfo identifies the canonical induction variable for a loop, it should not be difficult to write.

-Chris

ScalarEvolution is another place to look for examples. It discovers add
recurrences today and altering the analysis slightly to detect in-register
reductions shouldn't be hard to do.

                               -Dave