DependenceAnalysis and PR14241

Hey Preston,

I wanted to let you know that we found a really serious problem with
DependenceAnalysis in PR14241. In summary, DA seems to have a baked-in
assumption that the base pointer of the GEPs it inspects are loop
invariant. It appears to only do analysis on the subscripts.

This is especially important for LLVM because C++ code (compiled
through Clang) very frequently expresses loops as pointer loops rather
than indexed loops, and thus I would expect a large number of the
interesting dependence queries to actually require the subscript
analysis your pass currently does to additionally analyze any loop
variant base pointers. Such base pointers can essentially factor the
subscripts across two separate GEPs, so that may also be a bit tricky
to model correctly.

In trying to give you a nice, short test case, I also noticed that the
DA testing infrastructure has a deeply surprising limitation: it can
only handle dependence queries for stores in a loop which are
*followed* by a load. The PR in question (and I would imagine many
other common scenarios) actually take the form of a load followed by a
store. or some other more complex interaction of loads and stores.

I think it might make more sense to make your testing print routine
look at the full set of load/store pairs, regardless of sequencing.

Just to give you the flavor right away of the types of loop that led
to PR14241, and the utility of the testing strategy I mentioned, I
made a quick local patch to my tree. Then I have:

% cat PR14241.ll
; ModuleID = 'PR14241.ll'
target datalayout =
"e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

define void @f(i32* %s, i64 %size) {
entry:
  %end.idx = add i64 %size, -1
  %end.ptr = getelementptr inbounds i32* %s, i64 %end.idx
  br label %while.body

while.body:
  %phi.ptr = phi i32* [ %s, %entry ], [ %next.ptr, %while.body ]
  %src.ptr = getelementptr inbounds i32* %phi.ptr, i64 1
  %val = load i32* %src.ptr, align 4
  %dst.ptr = getelementptr inbounds i32* %phi.ptr, i64 0
  store i32 %val, i32* %dst.ptr, align 4
  %next.ptr = getelementptr inbounds i32* %phi.ptr, i64 1
  %cmp = icmp eq i32* %next.ptr, %end.ptr
  br i1 %cmp, label %exit, label %while.body

exit:
  ret void
}

With the old DependenceAnalysis testing I get:
% ./bin/opt -S -o - PR14241.ll -analyze -basicaa -da
Printing analysis 'Basic Alias Analysis (stateless AA impl)':
Pass::print not implemented for pass: 'Basic Alias Analysis (stateless
AA impl)'!
Printing analysis 'Dependence Analysis' for function 'f':

With my patch to the testing output:
% ./bin/opt -S -o - PR14241.ll -analyze -basicaa -da
Printing analysis 'Basic Alias Analysis (stateless AA impl)':
Pass::print not implemented for pass: 'Basic Alias Analysis (stateless
AA impl)'!
Printing analysis 'Dependence Analysis' for function 'f':
da analyze - none!

Note the "none" conclusion. There most certainly is a dependence
between the load and the store though. ;]

Let me know if I can help with fixing this, I'm really excited about
having proper dependence analysis in LLVM!
-Chandler

Hi Chandler,

Thanks for writing.
Could you give me some C (or C++) for an illustrative example.
I think I understand your concern, but I’d like to be sure.

Thanks,
Preston

Hi Preston,

The bug report at http://llvm.org/bugs/show_bug.cgi?id=14241 has more details, including a C++ test case.

- Ben

I see, thanks Ben.

So yes, I certainly misunderstood (more likely “misunderstand”) how things work.

My initial guess is that a conservative fix is quick and small (make sure the underlying pointers are loop invariant, otherwise give up). A better approach would be to somehow turn code like the example into array references that can be analyzed. I’ll need to think about this and do some reading.

Regarding testing… I wrote it the way I did to allow me to focus on particular tests without being distracted by the n^2 explosion resulting from testing everything against everything. But I can see the problem of not being able to handle arbitrary test cases that come from real examples, so I guess it’ll need to be changed.

Thanks for your efforts,
Preston

Hi Preston,

I looked at this test case. I am not sure what you are exactly doing, but I have the feeling you start from the getelementptr instruction. If you directly pass the pointer that is pointed to by the loads and stores to SCEV, there should just be a single base pointer %s in the resulting SCEVS. You can then extract this base pointer, subtract it from the scev and analyze the remaining scev as subscript. The base pointer in this test case is loop invariant.

Cheers
Tobi

From: "Tobias Grosser" <tobias@grosser.es>
To: "preston briggs" <preston.briggs@gmail.com>
Cc: "Benjamin Kramer" <benny.kra@gmail.com>, "LLVM Developers Mailing List" <llvmdev@cs.uiuc.edu>
Sent: Friday, November 2, 2012 12:56:53 PM
Subject: Re: [LLVMdev] DependenceAnalysis and PR14241

>
> My initial guess is that a conservative fix is quick and small
> (make
> sure the underlying pointers are loop invariant, otherwise give
> up). A
> better approach would be to somehow turn code like the example into
> array references that can be analyzed. I'll need to think about
> this and
> do some reading.

Hi Preston,

I looked at this test case. I am not sure what you are exactly doing,
but I have the feeling you start from the getelementptr instruction.
If
you directly pass the pointer that is pointed to by the loads and
stores
to SCEV, there should just be a single base pointer %s in the
resulting
SCEVS. You can then extract this base pointer, subtract it from the
scev
and analyze the remaining scev as subscript. The base pointer in this
test case is loop invariant.

Does const SCEV *getPointerBase(const SCEV *V) do this?

-Hal

It give you the base pointer. Yes.

Tobi

Here’s the current code (abstracted a bit)

const Instruction *Src,
const Instruction *Dst,

// make sure they are loads and stores, then

const Value *SrcPtr = getPointerOperand(Src); // hides a little casting, then Src->getPointerOperand
const Value *DstPtr = getPointerOperand(Dst); // ditto

// see how underlying objects alias, then

const GEPOperator *SrcGEP = dyn_cast(SrcPtr);
const GEPOperator *DstGEP = dyn_cast(DstPtr);

After that, most everything is done by disassembling the SrcGEP and DstGEP.
The conservative approach would be to make sure SrcPtr and DstPtr are both loop invariant.

I’m not sure what you’re suggesting. How to I pass one (both?) of these pointers to SCEV?

Thanks,
Preston

From: "Preston Briggs" <preston.briggs@gmail.com>
To: "Tobias Grosser" <tobias@grosser.es>
Cc: "Hal Finkel" <hfinkel@anl.gov>, "Benjamin Kramer" <benny.kra@gmail.com>, "LLVM Developers Mailing List"
<llvmdev@cs.uiuc.edu>
Sent: Friday, November 2, 2012 1:40:16 PM
Subject: Re: [LLVMdev] DependenceAnalysis and PR14241

Here's the current code (abstracted a bit)

const Instruction *Src,
const Instruction *Dst,
// make sure they are loads and stores, then

const Value *SrcPtr = getPointerOperand(Src); // hides a little
casting, then Src->getPointerOperand
const Value *DstPtr = getPointerOperand(Dst); // ditto
// see how underlying objects alias, then

const GEPOperator *SrcGEP = dyn_cast<GEPOperator>(SrcPtr);
const GEPOperator *DstGEP = dyn_cast<GEPOperator>(DstPtr);

After that, most everything is done by disassembling the SrcGEP and
DstGEP.
The conservative approach would be to make sure SrcPtr and DstPtr are
both loop invariant.

I'm not sure what you're suggesting. How to I pass one (both?) of
these pointers to SCEV?

const Instruction *Src;
const Instruction *Dst;
...
const Value *SrcPtr = getPointerOperand(Src);
const Value *DstPtr = getPointerOperand(Dst);
...
SCEV *SrcPtrSCEV = SE->getSCEV(SrcPtr);
SCEV *DstPtrSCEV = SE->getSCEV(DstPtr);
...
SCEV *SrcPtrBaseSCEV = SE->getPointerBase(SrcPtrSCEV);
SCEV *DstPtrBaseSCEV = SE->getPointerBase(DstPtrSCEV);
...
SCEV *SrcPtrOffset = SE->getMinusSCEV(SrcPtrSCEV, SrcPtrBaseSCEV);
SCEV *DstPtrOffset = SE->getMinusSCEV(DstPtrSCEV, DstPtrBaseSCEV);
[these expressions may be the answer you'd like multiplied by the size of the type]
...
if (isa<SCEVAddRecExpr>(SrcPtrOffset))
  cast<SCEVAddRecExpr>(SrcPtrOffset)->evaluateAtIteration(...)
// same with Dst

I think something like this will do what you need.

-Hal

I hadn’t reviewed this closely before, but it’s actually overkill to use DependenceAnalysis in LoopIdiomRecognize. LoopIdiomRecognize needs to do the work to recognize a memmove pattern either way, and while it’s nice to use memcpy instead of memmove, (a) that’s something that’s nice to do in general, not just in memmove calls recognized by LoopIdiomRecognizer, and (b) Converting memcpy to memmove requires simple alias analysis, not generalized loop dependence analysis.

It happens that LLVM’s current AliasAnalysis API isn’t quite flexible enough for this; it doesn’t handle dynamic sizes. However, one can manually perform the alias analysis with the kind of SCEV code being discussed on this thread.

Dan