The scalar evolution pass doesn't do anything when it runs except initialize some empty maps. The important one is the Value->SCEV map. SCEV is the class that holds an expression tree. Scalar evolution populates this map on demand, when a client asks for an expression via ScalarEvolution::getSCEV(Value).
IndVarSimplify and LoopStrengthReduce are example SCEV clients.
Just be careful to invalidate SCEV entries when you mutate the IR.
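To make the on-demand behaviour and the invalidation caveat concrete, here is a small standalone sketch. It is not LLVM code: ToyValue, ToyScalarEvolution, getExpr and forget are hypothetical names invented for illustration, and the "expression" is just a string. It only models the idea of a lazily filled cache that goes stale when the underlying value is mutated.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for an IR value: a name plus its defining expression.
struct ToyValue {
    std::string name;
    std::string definition;
};

// Hypothetical model of an analysis that starts empty and fills its map lazily.
class ToyScalarEvolution {
public:
    // Returns the cached expression for V, computing it only on first request.
    const std::string &getExpr(const ToyValue *V) {
        assert(V && "value must not be null");
        auto It = Cache.find(V);
        if (It != Cache.end())
            return It->second;  // cache hit: may be stale if V was mutated
        ++NumComputed;
        return Cache.emplace(V, "scev(" + V->definition + ")").first->second;
    }

    // Drops the cached entry for V; the client must call this after mutating V.
    void forget(const ToyValue *V) { Cache.erase(V); }

    std::size_t cacheSize() const { return Cache.size(); }
    unsigned numComputed() const { return NumComputed; }

private:
    std::unordered_map<const ToyValue *, std::string> Cache;
    unsigned NumComputed = 0;
};
```

In this toy, changing a value's definition without calling forget leaves the old expression in the map, which is the kind of stale entry the warning above is about.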
In general your approach is correct, but in your case scalar evolution is not able to derive an access function that is defined in terms of the loop iterators. (If it were, you would see something like {0,+,1}<loop_1> in your SCEV expression.)
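For readers unfamiliar with the notation, an add recurrence {start,+,step}<loop> describes a value that equals start on the first iteration of the loop and grows by step on every following iteration. A minimal standalone sketch of that reading, with ToyAddRec as a hypothetical type that is not part of LLVM and that only handles constant start and step:

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical model of an affine add recurrence {Start,+,Step}<Loop>.
struct ToyAddRec {
    std::int64_t Start;  // value on iteration 0
    std::int64_t Step;   // amount added on each later iteration
    std::string Loop;    // name of the loop the recurrence belongs to

    // Value of the recurrence on the given zero-based iteration.
    std::int64_t evaluateAt(std::int64_t Iteration) const {
        assert(Iteration >= 0 && "iteration counts start at zero");
        return Start + Step * Iteration;
    }

    // Renders the recurrence in the {start,+,step}<loop> style used above.
    std::string str() const {
        return "{" + std::to_string(Start) + ",+," + std::to_string(Step) +
               "}<" + Loop + ">";
    }
};
```

With this reading, {0,+,1}<loop_1> is simply the iteration counter of loop_1, which is why its absence suggests the loop has not been brought into a form scalar evolution can analyze.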
What I suspect is that you need to run some canonicalization passes before you actually run the scalar evolution pass. Most of the time these passes should be sufficient:
-correlated-propagation -mem2reg -instcombine -loop-simplify -indvars
But in Polly we use e.g.:
-basicaa -mem2reg -simplify-libcalls -simplifycfg -instcombine -tailcallelim -loop-simplify -lcssa -loop-rotate -lcssa -loop-unswitch -instcombine -loop-simplify -lcssa -indvars -loop-deletion -instcombine -polly-prepare -polly-region-simplify -indvars
If you send me the .ll file you run your tool on, I could take a closer look.
You are right, we need to run some other passes before running the scalar evolution pass. The sequence that I run for this example is -O3 -loop-simplify -reg2mem. This is why I did not obtain the expressions depending on the loop indices. So I removed the reg2mem pass and scalar evolution computes the correct functions.
However, I need to run the reg2mem pass (or any other that would eliminate the phi nodes) before calling my own passes. So probably we are going to run the scalar evolution on the code containing the phi nodes, run reg2mem and try to identify the original variables in the new code built after reg2mem.
OK. In Polly we developed a pass called 'independent-blocks-pass'. It basically creates basic blocks that can easily be rescheduled without breaking the scalar evolution analysis. Maybe something similar can help you. Details about this pass are available in my thesis.
It is extremely polyhedral. The basic idea is that all calculations that scalar evolution can analyze (including canonical induction variables), are kept in registers. Any other values are promoted to memory.
As we keep the information about operations scalar evolution can express in our polyhedral data structures, we do not care about them while reworking the CFG. When generating the changed structure, we create new expressions that calculate loop bounds and access functions.
With respect to PHI nodes, we keep the canonical induction variables as PHI nodes. Those are ignored while reworking the CFG and are regenerated from the polyhedral description. All other PHI nodes are promoted to memory. Those are basically the scalar dependences between basic blocks.
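The rule described in the last few paragraphs can be summarized as a small decision function. This is only a sketch of the rule as stated here, not Polly's implementation: ToyScalar, Placement and classify are hypothetical names, and the boolean flags stand in for whatever analysis would actually answer those questions.

```cpp
#include <cassert>
#include <string>

// Where a scalar value lives after the transformation.
enum class Placement { Register, Memory };

// Hypothetical summary of what is known about one scalar value.
struct ToyScalar {
    std::string Name;
    bool IsPhi;           // the value is defined by a PHI node
    bool IsCanonicalIV;   // the PHI is a canonical induction variable
    bool ScevAnalyzable;  // scalar evolution can express the computation
};

// Canonical induction variables and SCEV-analyzable computations stay in
// registers; every other value, including all other PHI nodes, goes to memory.
Placement classify(const ToyScalar &S) {
    assert((!S.IsCanonicalIV || S.IsPhi) &&
           "a canonical induction variable is always a PHI node");
    if (S.IsPhi)
        return S.IsCanonicalIV ? Placement::Register : Placement::Memory;
    return S.ScevAnalyzable ? Placement::Register : Placement::Memory;
}
```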