Scalar dependences in Dependence Analysis pass

Hi,

I am trying to understand the dependence analysis framework in LLVM. In particular, regarding scalar dependencies, I noticed that the precision of the dependences returned were less than optimal for some simple kernels like matrix multiplication.

1.
for (i = 0; i < 100; i++)
  for (j = 0; j < 100 j++)
    for (k = 0; k < 100; k++)
      C[i][j] += (A[i][k] * B[k][j])

As remarked on the project webpage (https://sites.google.com/site/parallelizationforllvm/), the scalar dependence for 'C' along the 'k' dimension is (0, 0, S). However, in this case, it is safe to consider this dependence to be (0, 0, 1), which is more precise.

Similarly,

2. For the Hyperbolic PDE-style kernel (taken from M.Wolf's PLDI 91 paper)
  for (i = 0; i < 100; i++) {
    for (j = 0; j < 101; j++) {
      A[j+1] = (1/3) * (A[j] + A[j+1] + A[j+2]);
    }
  }
The set of dependences would be { (0,1), (1,0), (1,-1) }. However, due to 'A' being a scalar along the 'i' dimension, the analysis reports { (S, 1), (S, 0), (S, -1) }.

I am curious to understand why this precision loss is happening and what are the ways to fix it. I would be happy to contribute this to the LLVM tree if I can fix it.

Thanks,

- Vaivaswatha

Hi,

The (0, 0, S) is a summary of the entire set of dependences, namely
(0, 0, 1), (0, 0, -1), (0, 0, 2), (0, 0, -2), (0, 0, 3), (0, 0, -3),I agree with this.

(0, 0, S) is more precise than (0, 0, 1)
(0, 0, 1) captures all the dependences that exist in the loop nest (more clarifications in the next para). What I mean to say is that, in this case it is safe for a loop transformation to assume that the only dependence in the loop nest is (0, 0, 1) and go ahead. So I would call it more precise since a constant distance dependence (0, 0, 1) allows more transforms than what a (0, 0, S) would allow.

The reason I claim that (0, 0, 1) captures all the dependences in the loop is this:
(0, 0, S) = { (0, 0, 1), (0, 0, -1), (0, 0, 2), (0, 0, -2), (0, 0, 3), (0, 0, -3), … } is really only capturing the memory dependences and not a value based dependence. That is the value computed in iteration (say) (0,0,0) is killed in iteration (0, 0, 1) and hence the dependence ends there. Iteration (0, 0, 2) is not dependent on iteration (0, 0, 0), although it is dependent on iteration (0, 0, 1).
So, the dependence (0, 0, 2) (and similarly others) is actually an over approximation. All that is required is that each iteration in the ‘k’ dimension be totally ordered, and the dependence (0, 0, 1) captures that.

These slides kind of helped me understand this idea of value based dependences.
http://www.cs.colostate.edu/~mstrout/CS553/slides/lecture18-valuedep.pdf
Note that it mentions Bill Pugh’s work.

I am now wondering if implementing the ideas in the paper “Detecting Value-Based Scalar Dependence” (Eric Stoltz and Michael Wolfe) is the way to go here, or if Bill Pugh’s work is more suited.

My reason for considering this to be an important feature is that, even for simple loop nests (such as the Hyperbolic PDE-style kernel I sent), we need to have constant distance dependences reported by the dependence analysis. Just direction vectors may not be sufficient to go ahead with many transforms.

Thanks,

  • Vaivaswatha

Hi,

I talked a bit with David Callahan about your question.

He argues that (for your 1st example, say) that (0, 0, S) is more precise than (0, 0, 1); indeed, that it is more accurate. The (0, 0, S) is a summary of the entire set of dependences, namely

(0, 0, 1), (0, 0, -1), (0, 0, 2), (0, 0, -2), (0, 0, 3), (0, 0, -3), …

Using only (0, 0, 1) ignores many dependences of the complete set.

Of course, you must interpret the S correctly when you’re attempting loop xforms and such.

Callahan notes that direction vectors and distance vectors were both developed as schemes for summarizing big sets of dependences for practical use in a compiler. Bill Pugh’s work pushed on the idea of manipulating the complete sets precisely, without summarizing.

Feel free to post this on the LLVM list if you like.

Hope it helps,
Preston