How to make Polly ignore some non-affine memory accesses

Hello everyone,

I'm trying to use LLVM+Polly to obtain a polyhedral representation of
some loops to use later for passes I want to implement, but seems like
Polly will stop when reaching any statement that has non-affine access
functions of the loop bounds discarding the whole SCoP entirely.

What I would like to achieve is still getting some information for the
statements that , instead, have affine access functions, practically
"ignoring" the non-affine ones.

There is a way to do such a thing?

Thanks

Marcello

Hi Marcello,

this is not yet integrated into Polly, but a very valuable extension which I planned to integrate myself. In case you want to give it a try,
I am very glad to help you.

Basically, what you want is to adapt lib/Analysis/ScopDetection.cpp to
not reject non-affine constructs. In lib/Analysis/ScopInfo.cpp we would
need to adapt the translation into the polyhedral description. Instead of creating a map defining an affine must access (such as
Stmt[i,j] -> A[2 * i + j]), we create a map that defines a may access to all elements of the corresponding array. Something like 'Stmt[i,j] -> A'.

Cheers
Tobi

Hi Tobias,

thanks for the answer. I'll try to give a look to the code you pointed
me to , and I'll try to make the modification myself. I'm new to LLVM
and Polly, but the code of both seem clean and understandable, so I
hope to be able to do it myself. In case I'll ask here for support :slight_smile:

Marcello

Hi, I'd like to ask another thing about Polly and SCoP discarding.

I've noticed that Polly discards quite simple loops like:

for (int i = 1; i < 1000; i++) {}

or

for (int i= 0; i < 1000; i+=2) {}

is this an intended behavior or there is some way to make it accept
these kind of loops ?

Thanks,
Marcello

Hi Marcello,

can you provide complete test cases and explain how you run Polly. These loop constructs contain nothing Polly cannot handle, but other parts of your programs may block a detailed analysis. Furthermore, Polly relies on LLVM preparing passes to canonicalize the code. If they
are not run beforehand than Polly will hardly detect any loops.

In your case I propose you attach the complete C and LLVM-IR files and explain how you run Polly. I am sure I can help you with this.

Tobi

I add also the output of these commands:

[hades@artemis examples]$ ./compile_ex.sh super_simple_loop
Printing analysis 'Polly - Detect Scops in functions' for function 'main':

[hades@artemis examples]$

modifying it in :

#include <stdio.h>

int main()
{
         int A[1024];
         int j, k=10;
         for (j = 0; j < 1024; j++)
                 A[j] = k;

       return 0;
}

gives:

[hades@artemis examples]$ ./compile_ex.sh super_simple_loop
Printing analysis 'Polly - Detect Scops in functions' for function 'main':
Valid Region for Scop: %1 => %5

[hades@artemis examples]$

Hi,

for example this loop:

  #include<stdio.h>

  int main()
  {
          int A[1024];
          int j, k=10;
          for (j = 1; j< 1024; j++)
                  A[j] = k;

        return 0;
}

run with:

#!/bin/bash
source ../set_path.source
clang -S -emit-llvm $1.c -o $1.s
opt -S -mem2reg -loop-simplify -indvars $1.s> $1.preopt.ll
opt -load ${PATH_TO_POLLY_LIB}/LLVMPolly.so -polly-detect -analyze $1.preopt.ll

Using the instructions found on the Polly website.

There are two reasons why it does not work with these instructions

1. Not the right preoptimization passes

The instructions on the Polly website use a very short sequence of preoptimization passes that just matches the simple example presented. To optimize real world programs that are not 100% standard you need a larger sequence of prepasses. In your example the loop iv does not start at '0' so further canonicalication is required.

The minimal sequence of optimization passes for your example is:
opt -mem2reg -instcombine -simplifycfg -loop-rotate -indvars

2. -enable-iv-rewrite

LLVM recently disabled by default the induction variable rewrite. This feature is necessary for the Polly preoptimizations. Hence, you need to reenable it with the above flag during preoptimization.

-> opt -mem2reg -instcombine -simplifycfg -loop-rotate -indvars -enable-iv-rewrite test.s -S > test.preopt.ll

In general I use an even longer sequence of preoptimizations. This sequence is built directly into Polly and is defined in RegisterPasses.cpp. To get the corresponding opt flags you can load Polly into clang and use the following command (-O3 is important):

> clang test.c -O3 -Xclang -load -Xclang lib/LLVMPolly.so -mllvm -debug-pass=Arguments
Pass Arguments: -targetdata -no-aa -targetlibinfo -tbaa -basicaa -preverify -domtree -verify -mem2reg -instcombine -simplifycfg -tailcallelim -simplifycfg -reassociate -domtree -loops -loop-simplify -lcssa -loop-rotate -instcombine -scalar-evolution -loop-simplify -lcssa -indvars -polly-prepare -postdomtree -domfrontier -regions -polly-region-simplify -scalar-evolution -loop-simplify -lcssa -indvars -postdomtree -domfrontier -regions -polly-detect -polly-independent -polly-analyze-ir -polly-scops -polly-dependences -polly-optimize-isl -polly-cloog -polly-codegen -simplifycfg -domtree -scalarrepl -early-cse -lower-expect

If you remove the first pass (-targetdata) and the passes after -polly-detect you get the list of preparing transformations (and the analysis they need).

Today I also further improved clang support. With the patch: 'Fix parsing of command line options for LLVM plugins' as posted today on cfe-commits, you can get the following new features:

1. Optimize a .c file automatically

Run: clang test.c -O3 -Xclang -load -Xclang lib/LLVMPolly.so -mllvm -enable-polly-viewer -mllvm -enable-iv-rewrite

Polly + all the preparing transformations are automatically scheduled.

2. Graphical SCoP viewer

Run: clang test.c -O3 -Xclang -load -Xclang lib/LLVMPolly.so -mllvm -enable-polly-viewer -mllvm -enable-iv-rewrite

Show for every function a graphviz graph that highlights the detected
SCoPs. Here we also show for every non-scop region, the reason why it
is not a SCoP. (This needs graphviz installed when run LLVM configure is run. You may also consider to install xdot.py [1] as a more convenient .dot file viewer. xdot.py will used automatically, if it is
in the PATH during 'configure' of LLVM)

3. OpenMP:

RUN: clang test.c -O3 -Xclang -load -Xclang lib/LLVMPolly.so -mllvm -enable-polly-openmp -lgomp

This will automatically OpenMP parallelize your code.

(Remember for all this LLVM/Clang/Polly need to be built together such
that they are in sync and .so loading works)

Cheers
Tobi

[1] http://code.google.com/p/jrfonseca/wiki/XDot

I was trying the new feature you introduce about printing out the
graphs, so I updated my version of llvm/clang/polly synchronizing them
to the last version, but I get this error launching clang (also , I
recently switched to MacOS X for development):

$ clang not_so_simple_loop.c -O3 -Xclang -load -Xclang
${PATH_TO_POLLY_LIB}/LLVMPolly.dylib -mllvm -enable-polly-viewer
-mllvm -enable-iv-rewrite

error: unable to load plugin
'/Users/Kariddi/Documents/Sviluppo/Tesi/git-prefix/lib//LLVMPolly.dylib':
'dlopen(/Users/Kariddi/Documents/Sviluppo/Tesi/git-prefix/lib//LLVMPolly.dylib,
9): Symbol not found: __ZN4llvm10RegionInfo2IDE
  Referenced from:
/Users/Kariddi/Documents/Sviluppo/Tesi/git-prefix/lib//LLVMPolly.dylib
  Expected in: flat namespace
in /Users/Kariddi/Documents/Sviluppo/Tesi/git-prefix/lib//LLVMPolly.dylib'
clang (LLVM option parsing): Unknown command line argument
'-enable-polly-viewer'. Try: 'clang (LLVM option parsing) -help'
clang (LLVM option parsing): Did you mean '-enable-ppc-preinc'?

$ clang -v
clang version 3.1 (trunk 142724)
Target: x86_64-apple-darwin11.2.0
Thread model: posix

Seems like it tries to load a symbol that it doesn't find ...
I have synchronized all clang/llvm/polly to the latest version and I
compiled them all together. Loading polly with "opt" works strangely
... :

opt -S -load ${PATH_TO_POLLY_LIB}/LLVMPolly.dylib -mem2reg -no-aa
-targetlibinfo -tbaa -basicaa -preverify -domtree -verify -mem2reg
-instcombine -simplifycfg -tailcallelim -simplifycfg -reassociate
-domtree -loops -loop-simplify -lcssa -loop-rotate -instcombine
-scalar-evolution -loop-simplify -lcssa -indvars -polly-prepare
-postdomtree -domfrontier -regions -polly-region-simplify
-scalar-evolution -loop-simplify -lcssa -indvars -postdomtree
-domfrontier -regions -loop-simplify -indvars -enable-iv-rewrite
not_so_simple_loop.s

strange ...

I was trying the new feature you introduce about printing out the
graphs, so I updated my version of llvm/clang/polly synchronizing them
to the last version, but I get this error launching clang (also , I
recently switched to MacOS X for development):

$ clang not_so_simple_loop.c -O3 -Xclang -load -Xclang
${PATH_TO_POLLY_LIB}/LLVMPolly.dylib -mllvm -enable-polly-viewer
-mllvm -enable-iv-rewrite

I documented the use of Polly in clang on a new website:
http://polly.grosser.es/example_load_Polly_into_clang.html

error: unable to load plugin
'/Users/Kariddi/Documents/Sviluppo/Tesi/git-prefix/lib//LLVMPolly.dylib':
'dlopen(/Users/Kariddi/Documents/Sviluppo/Tesi/git-prefix/lib//LLVMPolly.dylib,
9): Symbol not found: __ZN4llvm10RegionInfo2IDE
   Referenced from:
/Users/Kariddi/Documents/Sviluppo/Tesi/git-prefix/lib//LLVMPolly.dylib
   Expected in: flat namespace
  in /Users/Kariddi/Documents/Sviluppo/Tesi/git-prefix/lib//LLVMPolly.dylib'
clang (LLVM option parsing): Unknown command line argument
'-enable-polly-viewer'. Try: 'clang (LLVM option parsing) -help'
clang (LLVM option parsing): Did you mean '-enable-ppc-preinc'?

$ clang -v
clang version 3.1 (trunk 142724)
Target: x86_64-apple-darwin11.2.0
Thread model: posix

Seems like it tries to load a symbol that it doesn't find ...
I have synchronized all clang/llvm/polly to the latest version and I
compiled them all together. Loading polly with "opt" works strangely
... :

opt -S -load ${PATH_TO_POLLY_LIB}/LLVMPolly.dylib -mem2reg -no-aa
-targetlibinfo -tbaa -basicaa -preverify -domtree -verify -mem2reg
-instcombine -simplifycfg -tailcallelim -simplifycfg -reassociate
-domtree -loops -loop-simplify -lcssa -loop-rotate -instcombine
-scalar-evolution -loop-simplify -lcssa -indvars -polly-prepare
-postdomtree -domfrontier -regions -polly-region-simplify
-scalar-evolution -loop-simplify -lcssa -indvars -postdomtree
-domfrontier -regions -loop-simplify -indvars -enable-iv-rewrite
not_so_simple_loop.s

strange ...

Very strange. I have seen such problems previously and they were often
related to enabling rtti (runtime type info) or exceptions in Polly, but no clang or the other way around. I am not sure if this is the case her.

Another possibility is that clang does not use the RegionInfo stuff and
it gets dead code eliminated.

Can you run the following commands on the clang binary:

ldd /path/to/clang
objdump -xC /path/to/clang | grep RegionInfo

Can you also run them on the opt tool?

Are you building with cmake or autoconf? Do you use --enable-shared/BUILD_SHARED_LIBS. In case you do/don't it would be interesting to try to switch. Especially using shared libraries may help as it may block some dead code elimination.

Cheers
Tobi

[..]

Seems like it tries to load a symbol that it doesn't find ...
I have synchronized all clang/llvm/polly to the latest version and I
compiled them all together. Loading polly with "opt" works strangely
... :

opt -S -load ${PATH_TO_POLLY_LIB}/LLVMPolly.dylib -mem2reg -no-aa
-targetlibinfo -tbaa -basicaa -preverify -domtree -verify -mem2reg
-instcombine -simplifycfg -tailcallelim -simplifycfg -reassociate
-domtree -loops -loop-simplify -lcssa -loop-rotate -instcombine
-scalar-evolution -loop-simplify -lcssa -indvars -polly-prepare
-postdomtree -domfrontier -regions -polly-region-simplify
-scalar-evolution -loop-simplify -lcssa -indvars -postdomtree
-domfrontier -regions -loop-simplify -indvars -enable-iv-rewrite
not_so_simple_loop.s

strange ...

Very strange. I have seen such problems previously and they were often
related to enabling rtti (runtime type info) or exceptions in Polly, but
no clang or the other way around. I am not sure if this is the case her.

Another possibility is that clang does not use the RegionInfo stuff and
it gets dead code eliminated.

I am even more convinced this is the case. Can you try the attached patch and report if it works for you?

Cheers
Tobi

0001-Add-LinkAllPasses-to-clang.patch (901 Bytes)

otool is equivalent to objtool on macosx

MacBook-Pro-di-Marcello:llvm-build Kariddi$ otool -tV
../git-prefix/bin/clang|grep RegionInfo
MacBook-Pro-di-Marcello:llvm-build Kariddi$

MacBook-Pro-di-Marcello:llvm-build Kariddi$ otool -tV
../git-prefix/bin/opt|grep RegionInfo
00000001000152d0 callq __ZN4llvm20createRegionInfoPassEv
00000001002bc65c callq __ZN4llvm24initializeRegionInfoPassERNS_12PassRegistryE
__ZN4llvm6RegionC1EPNS_10BasicBlockES2_PNS_10RegionInfoEPNS_13DominatorTreeEPS0_:
000000010039e138 callq __ZN4llvm6RegionC2EPNS_10BasicBlockES2_PNS_10RegionInfoEPNS_13DominatorTreeEPS0_
__ZN4llvm6RegionC2EPNS_10BasicBlockES2_PNS_10RegionInfoEPNS_13DominatorTreeEPS0_:
000000010039f3b3 callq __ZNK4llvm10RegionInfo12getRegionForEPNS_10BasicBlockE
__ZNK4llvm10RegionInfo12getRegionForEPNS_10BasicBlockE:
000000010039fc34 callq __ZN4llvm10RegionInfo12setRegionForEPNS_10BasicBlockEPNS_6RegionE
__ZN4llvm10RegionInfo12setRegionForEPNS_10BasicBlockEPNS_6RegionE:
00000001003a022e callq __ZNK4llvm10RegionInfo12getRegionForEPNS_10BasicBlockE
00000001003a02f7 callq __ZN4llvm6RegionC1EPNS_10BasicBlockES2_PNS_10RegionInfoEPNS_13DominatorTreeEPS0_
00000001003a050e callq __ZN4llvm6RegionC1EPNS_10BasicBlockES2_PNS_10RegionInfoEPNS_13DominatorTreeEPS0_
__ZNK4llvm10RegionInfo19isCommonDomFrontierEPNS_10BasicBlockES2_S2_:
__ZNK4llvm10RegionInfo8isRegionEPNS_10BasicBlockES2_:
00000001003a0edf callq __ZNK4llvm10RegionInfo19isCommonDomFrontierEPNS_10BasicBlockES2_S2_
__ZNK4llvm10RegionInfo14insertShortCutEPNS_10BasicBlockES2_PNS_8DenseMapIS2_S2_NS_12DenseMapInfoIS2_EES5_EE:
__ZNK4llvm10RegionInfo14getNextPostDomEPNS_15DomTreeNodeBaseINS_10BasicBlockEEEPNS_8DenseMapIPS2_S6_NS_12DenseMapInfoIS6_EES8_EE:
__ZNK4llvm10RegionInfo15isTrivialRegionEPNS_10BasicBlockES2_:
__ZN4llvm10RegionInfo16updateStatisticsEPNS_6RegionE:
__ZN4llvm10RegionInfo12createRegionEPNS_10BasicBlockES2_:
00000001003a1437 callq __ZNK4llvm10RegionInfo15isTrivialRegionEPNS_10BasicBlockES2_
00000001003a14a5 callq __ZN4llvm6RegionC1EPNS_10BasicBlockES2_PNS_10RegionInfoEPNS_13DominatorTreeEPS0_
00000001003a1527 callq __ZN4llvm10RegionInfo16updateStatisticsEPNS_6RegionE
__ZN4llvm10RegionInfo20findRegionsWithEntryEPNS_10BasicBlockEPNS_8DenseMapIS2_S2_NS_12DenseMapInfoIS2_EES5_EE:
00000001003a15e9 callq __ZNK4llvm10RegionInfo14getNextPostDomEPNS_15DomTreeNodeBaseINS_10BasicBlockEEEPNS_8DenseMapIPS2_S6_NS_12DenseMapInfoIS6_EES8_EE
00000001003a162a callq __ZNK4llvm10RegionInfo8isRegionEPNS_10BasicBlockES2_
00000001003a1648 callq __ZN4llvm10RegionInfo12createRegionEPNS_10BasicBlockES2_
00000001003a16c6 callq __ZNK4llvm10RegionInfo14insertShortCutEPNS_10BasicBlockES2_PNS_8DenseMapIS2_S2_NS_12DenseMapInfoIS2_EES5_EE
__ZN4llvm10RegionInfo14scanForRegionsERNS_8FunctionEPNS_8DenseMapIPNS_10BasicBlockES5_NS_12DenseMapInfoIS5_EES7_EE:
00000001003a17d8 callq __ZN4llvm10RegionInfo20findRegionsWithEntryEPNS_10BasicBlockEPNS_8DenseMapIS2_S2_NS_12DenseMapInfoIS2_EES5_EE
__ZN4llvm10RegionInfo16getTopMostParentEPNS_6RegionE:
__ZN4llvm10RegionInfo16buildRegionsTreeEPNS_15DomTreeNodeBaseINS_10BasicBlockEEEPNS_6RegionE:

.... lots of other stuff.

Now I'll try the patch you posted and report, thanks :slight_smile:

Strange , with --enable-shared (I use auto tool by the way ...) it gives:

MacBook-Pro-di-Marcello:examples Kariddi$ ./compile_ex.sh not_so_simple_loop
clang (LLVM option parsing): Unknown command line argument
'-enable-polly-viewer'. Try: 'clang (LLVM option parsing) -help'
clang (LLVM option parsing): Did you mean '-enable-polly-vector'?

Seems like it doesn't compile the viewer option ...

This error message is because the option name was changed. Try -polly-show.

Furthermore, I plan to commit the change to clang, that solves the failure when loading clang. Though it will probably take a day until it gets through review.

Cheers
Tobi

Perfect, thank you very much :slight_smile:

Mmm, this code seems to kill polly:

#include <stdio.h>
#include <stdlib.h>

int main()
{
        char *B;
  int i,j,k,h;
  const int x = 0, y=0;
  B = (char *)malloc(sizeof(char)*1024*1024);

  for (i = 1; i < 1024; i++)
    for (j = 1; j < 1024; j++)
    {
      if (i+j > 1000)
        B[j] = i;
}
  printf("Random Value: %d", B[rand() % 1024*1024]);

  return 0;
}

running:

opt -load ${PATH_TO_POLLY_LIB}/LLVMPolly.dylib -polly-scops -analyze
code.preopt.ll

I get:

Printing analysis 'Polly - Create polyhedral description of Scops' for
region: 'for.body3 => for.inc.single_exit' in function 'main':
Invalid Scop!
0 libLLVM-3.1svn.dylib 0x0000000103fab905 _ZL15PrintStackTracePv + 53
1 libLLVM-3.1svn.dylib 0x0000000103fabf79 _ZL13SignalHandleri + 361
2 libsystem_c.dylib 0x00007fff94c8acfa _sigtramp + 26
3 libLLVM-3.1svn.dylib 0x0000000103526810
llvm::SmallVectorTemplateCommon<llvm::SCEV
const*>::operator(unsigned int) + 128
4 LLVMPolly.dylib 0x0000000106867af8
SCEVAffinator::getLoopDepth(llvm::Loop const*) + 72
5 LLVMPolly.dylib 0x00000001068676c7
SCEVAffinator::visitAddRecExpr(llvm::SCEVAddRecExpr const*) + 247
6 LLVMPolly.dylib 0x000000010686710b
llvm::SCEVVisitor<SCEVAffinator, isl_pw_aff*>::visit(llvm::SCEV
const*) + 283
7 LLVMPolly.dylib 0x0000000106866f71
SCEVAffinator::visit(llvm::SCEV const*) + 449
8 LLVMPolly.dylib 0x000000010686766c
SCEVAffinator::visitAddRecExpr(llvm::SCEVAddRecExpr const*) + 156
9 LLVMPolly.dylib 0x000000010686710b
llvm::SCEVVisitor<SCEVAffinator, isl_pw_aff*>::visit(llvm::SCEV
const*) + 283
10 LLVMPolly.dylib 0x0000000106866f71
SCEVAffinator::visit(llvm::SCEV const*) + 449
11 LLVMPolly.dylib 0x000000010685fd49
SCEVAffinator::getPwAff(polly::ScopStmt const*, llvm::SCEV const*,
llvm::Value const*) + 57
12 LLVMPolly.dylib 0x000000010685d076
polly::ScopStmt::buildConditionSet(polly::Comparison const&) const +
54
13 LLVMPolly.dylib 0x000000010685d414
polly::ScopStmt::addConditionsToDomain(isl_set*, polly::TempScop&,
llvm::Region const&) const + 196
14 LLVMPolly.dylib 0x000000010685d519
polly::ScopStmt::buildDomain(polly::TempScop&, llvm::Region const&)
const + 169
15 LLVMPolly.dylib 0x000000010685d90d
polly::ScopStmt::ScopStmt(polly::Scop&, polly::TempScop&, llvm::Region
const&, llvm::BasicBlock&, llvm::SmallVectorImpl<llvm::Loop*>&,
llvm::SmallVectorImpl<unsigned int>&) + 861
16 LLVMPolly.dylib 0x000000010685d59d
polly::ScopStmt::ScopStmt(polly::Scop&, polly::TempScop&, llvm::Region
const&, llvm::BasicBlock&, llvm::SmallVectorImpl<llvm::Loop*>&,
llvm::SmallVectorImpl<unsigned int>&) + 77
17 LLVMPolly.dylib 0x000000010685ef60
polly::Scop::buildScop(polly::TempScop&, llvm::Region const&,
llvm::SmallVectorImpl<llvm::Loop*>&, llvm::SmallVectorImpl<unsigned

&, llvm::LoopInfo&) + 624

18 LLVMPolly.dylib 0x000000010685eeab
polly::Scop::buildScop(polly::TempScop&, llvm::Region const&,
llvm::SmallVectorImpl<llvm::Loop*>&, llvm::SmallVectorImpl<unsigned

&, llvm::LoopInfo&) + 443

19 LLVMPolly.dylib 0x000000010685ebee
polly::Scop::Scop(polly::TempScop&, llvm::LoopInfo&,
llvm::ScalarEvolution&, isl_ctx*) + 462
20 LLVMPolly.dylib 0x000000010685ea15
polly::Scop::Scop(polly::TempScop&, llvm::LoopInfo&,
llvm::ScalarEvolution&, isl_ctx*) + 53
21 LLVMPolly.dylib 0x000000010685f8f3
polly::ScopInfo::runOnRegion(llvm::Region*, llvm::RGPassManager&) +
227
22 libLLVM-3.1svn.dylib 0x00000001034f4dc1
llvm::RGPassManager::runOnFunction(llvm::Function&) + 1185
23 libLLVM-3.1svn.dylib 0x0000000103a0f261
llvm::FPPassManager::runOnFunction(llvm::Function&) + 497
24 libLLVM-3.1svn.dylib 0x0000000103a0f5cd
llvm::FPPassManager::runOnModule(llvm::Module&) + 125
25 libLLVM-3.1svn.dylib 0x0000000103a0f855
llvm::MPPassManager::runOnModule(llvm::Module&) + 549
26 libLLVM-3.1svn.dylib 0x0000000103a0ffac
llvm::PassManagerImpl::run(llvm::Module&) + 172
27 libLLVM-3.1svn.dylib 0x0000000103a10491
llvm::PassManager::run(llvm::Module&) + 33
28 opt 0x00000001032dbb76 main + 6966
29 opt 0x00000001032ca5a4 start + 52
Stack dump:
0. Program arguments: opt -load
/Users/Kariddi/Documents/Sviluppo/Tesi/git-prefix/lib//LLVMPolly.dylib
-polly-scops -analyze strange_pointer.preopt.ll
1. Running pass 'Function Pass Manager' on module 'strange_pointer.preopt.ll'.
2. Running pass 'Region Pass Manager' on function '@main'
3. Running pass 'Polly - Create polyhedral description of Scops' on
basic block '%for.body3.single_entry'
./compile_ex.sh: line 9: 36824 Segmentation fault: 11 opt -load
${PATH_TO_POLLY_LIB}/LLVMPolly.dylib -polly-scops -analyze
$1.preopt.ll

Also I want to ask:

Why Bitcast instructions make polly discard scops?

Thanks
Marcello

Mmm I found out a very strange behavior (to me) of the SCEV analysis
of the loop bound of the external loop I posted.
When in ScopDetection it gets the SCEV of the external loop bound in
the "isValidLoop()" function with:
    const SCEV *LoopCount = SE->getBackedgeTakenCount(L);

It returns a SCEVCouldNotCompute, but if I change the "if" block
inside the loop from:
    if (i+j > 1000)
to:
    if (i > 1000)

that SCEV results valid.

Later in the ScopInfo pass it crashes analyzing the SCEV of the
comparison expression (in buildConditionSets ) . It's like if it
recognizes that "i" is a recurring expression that depends on the
execution of a loop, but can't derive that loop, and segfaults in
getLoopDepth ...

Seems like a bug of the SCEV engine to me., but I'm not sure :frowning:

Hi Marcello,

sorry for taking so long to reply. I just had a look at your test case
and found the problem. What happens is that Polly fails to accept the
outer loop (an unrelated issue), such that only the inner loop is detected as a scop (you can verify this with -view-scops). Now when
building the polyhedral representation we analyze the SCEV expressions
in the inner loop (the scop). Here the condition has an expression that
looks similar to this:

{1, +, 1024}<i> + {1,+,1024}<j>

We translate this expression. When reaching {1, +, 1024}<i> we do not
check if it is part of the scop and always handle it as a loop index.
This is incorrect and fails when we try to find the corresponding loop
in the scop. The right thing to do is to handle this expression as a parameter.

Consequently this is no SCEV bug, but an ordinary Polly bug. It most probably slipped in when I recently refactored the SCEV parsing. I attached a hackish patch that should fix the issue for exactly this
test case. Can you check it works for you?

I will commit a complete fix later on.

Cheers
Tobi

0001-Hack-around-parameter-handled-as-IV-issue.patch (1.87 KB)

Hi Tobias.

I worked on enabling Polly accepting non affine memory accesses and I
produced a patch.
I saw that there were a lot of updates in Polly recently, so I had to
redo a lot of the work I did and that slowed me quite a bit.
I tested the patch on some programs and it seems to work and not to
break anything, but you know the topic much better than me, so if you
find something wrong please tell me! (at least it shouldn't produce a
crash I believe ... I tested the patch quite a lot from that point of
view ...). The patch is in the attachment to this mail and was last
tested on LLVM SVN revision 144502 .

I also found a bug in polly or ISL/GMP libraries. If polly encounters
an array access of type:

A[i][i]

it enters an infinite loop (I believe it is an infinite loop because I
waited 5 minutes and it didn't terminate compiling a code that was
like 15 lines long). I tried to track down the problem with gdb , but
it seems to be in some gmp function called by ISL of which I'm not a
very expert of.
This is the backtrace of the manually terminated process in gdb:
(gdb) bt
#0 0x0000000101828422 in __gmpn_gcd_1 ()
#1 0x00000001018102ea in __gmpz_gcd ()
#2 0x0000000101cd0f33 in isl_aff_scale_down ()
#3 0x0000000101a5a175 in polly::MemoryAccess::MemoryAccess
(this=0x101911ed0, Access=@0x10190fc48, Statement=0x10190ffa0) at
/Users/Kariddi/Documents/Sviluppo/Tesi/llvm/tools/polly/lib/Analysis/ScopInfo.cpp:308
#4 0x0000000101a5b22a in polly::ScopStmt::buildAccesses
(this=0x10190ffa0, tempScop=@0x10190eb70, CurRegion=@0x10190eed0) at
/Users/Kariddi/Documents/Sviluppo/Tesi/llvm/tools/polly/lib/Analysis/ScopInfo.cpp:588
#5 0x0000000101a5bc08 in polly::ScopStmt::ScopStmt (this=0x10190ffa0,
parent=@0x10190fdf0, tempScop=@0x10190eb70, CurRegion=@0x10190eed0,
bb=@0x101905bd0, NestLoops=@0x7fff5fbfec20, Scatter=@0x7fff5fbfebe0)
at /Users/Kariddi/Documents/Sviluppo/Tesi/llvm/tools/polly/lib/Analysis/ScopInfo.cpp:716
#6 0x0000000101a5b86d in polly::ScopStmt::ScopStmt (this=0x10190ffa0,
parent=@0x10190fdf0, tempScop=@0x10190eb70, CurRegion=@0x10190eed0,
bb=@0x101905bd0, NestLoops=@0x7fff5fbfec20, Scatter=@0x7fff5fbfebe0)
at /Users/Kariddi/Documents/Sviluppo/Tesi/llvm/tools/polly/lib/Analysis/ScopInfo.cpp:699
#7 0x0000000101a5d460 in polly::Scop::buildScop (this=0x10190fdf0,
tempScop=@0x10190eb70, CurRegion=@0x10190eed0,
NestLoops=@0x7fff5fbfec20, Scatter=@0x7fff5fbfebe0, LI=@0x101909a40)
at /Users/Kariddi/Documents/Sviluppo/Tesi/llvm/tools/polly/lib/Analysis/ScopInfo.cpp:1032
#8 0x0000000101a5d3ab in polly::Scop::buildScop (this=0x10190fdf0,
tempScop=@0x10190eb70, CurRegion=@0x10190ef40,
NestLoops=@0x7fff5fbfec20, Scatter=@0x7fff5fbfebe0, LI=@0x101909a40)
at /Users/Kariddi/Documents/Sviluppo/Tesi/llvm/tools/polly/lib/Analysis/ScopInfo.cpp:1025
#9 0x0000000101a5d0de in polly::Scop::Scop (this=0x10190fdf0,
tempScop=@0x10190eb70, LI=@0x101909a40, ScalarEvolution=@0x10190b510,
Context=0x1019094b0) at
/Users/Kariddi/Documents/Sviluppo/Tesi/llvm/tools/polly/lib/Analysis/ScopInfo.cpp:918
#10 0x0000000101a5cf05 in polly::Scop::Scop (this=0x10190fdf0,
tempScop=@0x10190eb70, LI=@0x101909a40, ScalarEvolution=@0x10190b510,
Context=0x1019094b0) at
/Users/Kariddi/Documents/Sviluppo/Tesi/llvm/tools/polly/lib/Analysis/ScopInfo.cpp:907
#11 0x0000000101a5df23 in polly::ScopInfo::runOnRegion
(this=0x101909440, R=0x10190ef40, RGM=@0x10190ce80) at
/Users/Kariddi/Documents/Sviluppo/Tesi/llvm/tools/polly/lib/Analysis/ScopInfo.cpp:1085
#12 0x00000001003b5641 in llvm::RGPassManager::runOnFunction
(this=0x10190ce80, F=@0x1019052a0) at
/Users/Kariddi/Documents/Sviluppo/Tesi/llvm/lib/Analysis/RegionPass.cpp:96
#13 0x00000001005e90d1 in llvm::FPPassManager::runOnFunction
(this=0x10190a240, F=@0x1019052a0) at
/Users/Kariddi/Documents/Sviluppo/Tesi/llvm/lib/VMCore/PassManager.cpp:1512
#14 0x00000001005e943d in llvm::FPPassManager::runOnModule
(this=0x10190a240, M=@0x1019067f0) at
/Users/Kariddi/Documents/Sviluppo/Tesi/llvm/lib/VMCore/PassManager.cpp:1534
#15 0x00000001005e96c5 in llvm::MPPassManager::runOnModule
(this=0x1019089e0, M=@0x1019067f0) at
/Users/Kariddi/Documents/Sviluppo/Tesi/llvm/lib/VMCore/PassManager.cpp:1588
#16 0x00000001005e9e1c in llvm::PassManagerImpl::run
(this=0x1019086a0, M=@0x1019067f0) at
/Users/Kariddi/Documents/Sviluppo/Tesi/llvm/lib/VMCore/PassManager.cpp:1672
#17 0x00000001005ea301 in llvm::PassManager::run (this=0x7fff5fbffa38,
M=@0x1019067f0) at
/Users/Kariddi/Documents/Sviluppo/Tesi/llvm/lib/VMCore/PassManager.cpp:1716
#18 0x00000001000127d6 in main (argc=7, argv=0x7fff5fbffb38) at
/Users/Kariddi/Documents/Sviluppo/Tesi/llvm/tools/opt/opt.cpp:706
(gdb)

And this is the code that causes the problem:

#include <stdio.h>
#include <stdlib.h>

int main()
{
       int A[1024][1024];
       int i,j,k,h;
       const int x = 0, y=0;
       for (i = 1; i < 1024; i++)
               for (j = 1; j < 1024 ; j++)
               {
                               A[i][j] = A[j][j];
               }
       printf("Random Value: %d", A[rand() % 1024*1024]);

       return 0;
}

I actually don't know if it is an ISL bug or an improper use by polly.
To compile isl I use gmp-5.0.2

Marcello

accept_affine.patch (5.35 KB)

Hi Tobias.

I worked on enabling Polly accepting non affine memory accesses and I
produced a patch.

Great.

I saw that there were a lot of updates in Polly recently, so I had to
redo a lot of the work I did and that slowed me quite a bit.

Ups, sorry! However, I believe without these changes detecting non-affine memory access functions would have been a lot more complicated.

I tested the patch on some programs and it seems to work and not to
break anything, but you know the topic much better than me, so if you
find something wrong please tell me! (at least it shouldn't produce a
crash I believe ... I tested the patch quite a lot from that point of
view ...). The patch is in the attachment to this mail and was last
tested on LLVM SVN revision 144502 .

OK. Review is inlined in the patch.

I also found a bug in polly or ISL/GMP libraries. If polly encounters
an array access of type:

A[i][i]

it enters an infinite loop (I believe it is an infinite loop because I
waited 5 minutes and it didn't terminate compiling a code that was
like 15 lines long). I tried to track down the problem with gdb , but
it seems to be in some gmp function called by ISL of which I'm not a
very expert of.

[remove backtrace & code]

I actually don't know if it is an ISL bug or an improper use by polly.
To compile isl I use gmp-5.0.2

I will look into this.

--- ./include/polly/ScopDetection.h 2011-11-13 02:37:59.000000000 +0100
+++ ../llvm2/tools/polly/./include/polly/ScopDetection.h 2011-11-13 01:11:17.000000000 +0100
@@ -145,6 +145,8 @@
    /// @return True if the call instruction is valid, false otherwise.
    static bool isValidCallInst(CallInst&CI);

+ const Value *GetBaseValue(const SCEV *S) const;

Why is this function needed? For me it looks like it implements
the same as

SCEVValue *Operand = dyn_cast<SCEVValue>(getPointerOperand())

if (!Operand)
   return invalid Value;

return Operand->getValue();

    /// @brief Check if a memory access can be part of a Scop.
    ///
    /// @param Inst The instruction accessing the memory.
--- ./include/polly/TempScopInfo.h 2011-11-13 02:37:59.000000000 +0100
+++ ../llvm2/tools/polly/./include/polly/TempScopInfo.h 2011-11-13 02:34:47.000000000 +0100
@@ -45,12 +45,13 @@
  private:
    unsigned ElemBytes;
    TypeKind Type;
+ bool is_affine;

I think IsAffine matches more the LLVM coding conventions.

  public:
    explicit IRAccess (TypeKind Type, const Value *BaseAddress,
- const SCEV *Offset, unsigned elemBytes)
+ const SCEV *Offset, unsigned elemBytes, bool affine)

'affine' should start with an uppercase letter. Polly itself has some inconsistent naming, but we started to follow the LLVM coding conventions since a while.

      : BaseAddress(BaseAddress), Offset(Offset),
- ElemBytes(elemBytes), Type(Type) {}
+ ElemBytes(elemBytes), Type(Type), is_affine(affine) {}

    enum TypeKind getType() const { return Type; }

@@ -60,7 +61,10 @@

    unsigned getElemSizeInBytes() const { return ElemBytes; }

+ bool isAffine() const { return is_affine; }
+
    bool isRead() const { return Type == READ; }
+
  };

  class Comparison {
--- ./lib/Analysis/ScopDetection.cpp 2011-11-13 02:38:06.000000000 +0100
+++ ../llvm2/tools/polly/./lib/Analysis/ScopDetection.cpp 2011-11-13 01:28:00.000000000 +0100
@@ -226,6 +226,24 @@
    return false;
  }

+const Value *ScopDetection::GetBaseValue(const SCEV *S) const {
+ if (const SCEVAddRecExpr *AR = dyn_cast<SCEVAddRecExpr>(S)) {
+ // In an addrec, assume that the base will be in the start, rather
+ // than the step.
+ return GetBaseValue(AR->getStart());
+ } else if (const SCEVAddExpr *A = dyn_cast<SCEVAddExpr>(S)) {
+ // If there's a pointer operand, it'll be sorted at the end of the list.
+ const SCEV *Last = A->getOperand(A->getNumOperands()-1);
+ if (Last->getType()->isPointerTy())
+ return GetBaseValue(Last);
+ } else if (const SCEVUnknown *U = dyn_cast<SCEVUnknown>(S)) {
+ // This is a leaf node.
+ return U->getValue();
+ }
+ // No Identified object found.
+ return 0;
+}

As remarked earlier, I believ this can be replaced by getPointerOperand().

  bool ScopDetection::isValidMemoryAccess(Instruction&Inst,
                                          DetectionContext&Context) const {
    Value *Ptr = getPointerOperand(Inst);
@@ -236,8 +254,12 @@
    BasePointer = dyn_cast<SCEVUnknown>(SE->getPointerBase(AccessFunction));

    if (!BasePointer)
- INVALID(AffFunc, "No base pointer");
-
+ {
+// INVALID(AffFunc, "No base pointer");
+ BaseValue = (Value*) GetBaseValue(AccessFunction);
+ }
+ else
+ {
    BaseValue = BasePointer->getValue();

I don't see a need for any change here. Both functions get the base pointer and we should fail, if we cannot derive a base pointer. Do you have a test case where getPointerBase() does not yield a valid base pointer, but GetBaseValue(AccessFunction) does?

    if (isa<UndefValue>(BaseValue))
@@ -245,8 +267,9 @@

    AccessFunction = SE->getMinusSCEV(AccessFunction, BasePointer);

- if (!isAffineExpr(&Context.CurRegion, AccessFunction, *SE, BaseValue))
- INVALID(AffFunc, "Bad memory address "<< *AccessFunction);
+ //if (!isAffineExpr(&Context.CurRegion, AccessFunction, *SE, BaseValue))
+ // INVALID(AffFunc, "Bad memory address "<< *AccessFunction);

Can you leave this and add a command line option
'-polly-allow-nonaffine' to allow non affine access functions.

+ }

    // FIXME: Alias Analysis thinks IntToPtrInst aliases with alloca instructions
    // created by IndependentBlocks Pass.
--- ./lib/Analysis/ScopInfo.cpp 2011-11-13 02:38:06.000000000 +0100
+++ ../llvm2/tools/polly/./lib/Analysis/ScopInfo.cpp 2011-11-13 02:36:12.000000000 +0100
@@ -310,9 +310,13 @@
    Type = Access.isRead() ? Read : Write;
    statement = Statement;

- isl_pw_aff *Affine = SCEVAffinator::getPwAff(Statement, Access.getOffset());
    BaseAddr = Access.getBase();

+ if (Access.isAffine())
+ {

This should be

if (Access.isAffine()) {

+
+ isl_pw_aff *Affine = SCEVAffinator::getPwAff(Statement, Access.getOffset());
+
    setBaseName();

    // Devide the access function by the size of the elements in the array.
@@ -334,6 +338,12 @@
    AccessRelation = isl_map_set_tuple_name(AccessRelation, isl_dim_out,
                                            getBaseName().c_str());
  }
+ else
+ {

this should be
     } else {

+ Type = MayWrite;

You are right, we should use may-accesses here. But why always setting the type to may-write? A read should remain a read (as there is no difference between read and may-read).

+ AccessRelation = isl_map_from_basic_map(createBasicAccessMap(Statement));
+ }
+}

  void MemoryAccess::realignParams() {
    isl_space *ParamSpace = statement->getParent()->getParamSpace();
--- ./lib/Analysis/TempScopInfo.cpp 2011-11-13 02:38:06.000000000 +0100
+++ ../llvm2/tools/polly/./lib/Analysis/TempScopInfo.cpp 2011-11-13 02:35:22.000000000 +0100
@@ -98,13 +98,24 @@
          dyn_cast<SCEVUnknown>(SE->getPointerBase(AccessFunction));

        assert(BasePointer&& "Could not find base pointer");
-
        AccessFunction = SE->getMinusSCEV(AccessFunction, BasePointer);
+
+ if (isAffineExpr(&R, AccessFunction, *SE, BasePointer->getValue()))
+ {

Put the '{' in the previous line.

+
        Functions.push_back(std::make_pair(IRAccess(Type,
                                                    BasePointer->getValue(),
- AccessFunction, Size),
+ AccessFunction, Size, true),
                                           &Inst));
      }
+ else
+ {

This should be

} else {

+ Functions.push_back(std::make_pair(IRAccess(Type,
+ BasePointer->getValue(),
+ AccessFunction, Size, false),
+&Inst));
+ }
+ }
    }

    if (Functions.empty())

Besides the comments above, the patch looks great. It is a nice improvement to Polly. Can you repost it after the comments are addressed? Also could you include a minimal test case in the patch, that verifies this features is working correctly.

Thanks
Tobi

P.S.: Please post patches to the mailing lists.

The recent rewrite of the SCEV validation fixed this bug.

Tobi