[GSoC'16] Need details on New Transformations and Analyses

Hi everyone,

I am very interested in contributing to the LLVM project in this year’s GSoC. I am new to LLVM, but it is used in the compiler course at my university, so I would like to get involved in LLVM development to gain a better knowledge of the system. I am currently preparing my proposal.

One of the projects that caught my eye is “New Transformations and Analysis”. Several code transformations and analyses have been introduced in the compiler course that I am currently taking, which is why I would like to write some new transformations and code analyses. However, the list of transformations on the LLVM Open Projects web page is too brief for me, and I need more details on the items below.

Loop Dependence Analysis Infrastructure. I have looked in the source code repository and saw a file named “DependenceAnalysis.cpp”. Does that mean this analysis has already been implemented?

Value range propagation pass. There was a discussion about this topic (https://groups.google.com/forum/#!topic/llvm-dev/XXqfemtDX74/discussion). Someone proposed implementing this pass for GSoC several years ago, but I can’t find any record of its progress. If there was no progress, does that mean that VRP based on Patterson’s paper still needs to be implemented, even though a range analysis has already been implemented?

Predictive Commoning. The presentation slides by Arie Tal seem to provide a quite clear explanation and examples of the algorithm. I guess the implementation should be straightforward, shouldn’t it?

Type Inference (aka. Devirtualization) and Value assertions.

Can I get more details on these topics? Does type inference mean the translation of the auto keyword, or something else? For value assertions, the “unreachable” intrinsic seems to have been implemented, because I can find uses of it in some of the test cases.

Finally, for this project, must I propose to do all of these analyses and transformations in my GSoC proposal, or can I propose just some of them? I am also looking for a mentor for guidance.

Looking forward to your comments and feedback.

Thank you.

Best regards,

Aries Thio.

Loop Dependence Analysis Infrastructure. I believe major progress has been made on it, but I haven’t been following it closely. I’d suggest talking to committers active in this file in the recent past to determine what useful work might be left of appropriate scope.

Value range propagation pass. This is largely stalled. The key problem is that between LazyValueInfo (constant ranges) and SCEV (symbolic ranges in loops), there’s fairly little profit to be had and range analysis is relatively expensive. I’d strongly discourage you from implementing a traditional range analysis for LLVM without deeply understanding the history here.

Predictive Commoning. The closest I know of to this in tree is LoopLoadElimination.cpp and (in some cases) the PRE code inside GVN.cpp. Building something like this on top of SCEV could be quite interesting. You should definitely talk to Adam Nemet (CC’d) about this.

Type Inference and Value assertions. I believe the “value assertions” link may be stale. If I’m reading that correctly, it looks like the motivation for @llvm.assume.

If you want further ideas, consider the list I just sent to llvm-dev a few moments ago titled “A couple ideas for possible GSoC projects”.

Hi Aries,

Regarding the person who proposed a range analysis for GSoC some years ago: that was likely me. I am one of the authors of a range analysis for LLVM:

https://code.google.com/archive/p/range-analysis/

The analysis is described here:
http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=6494996&tag=1

It’s pretty fast already (less than 2 seconds to run on SPEC’s gcc), with very good precision. I keep it up to date with the newer versions of LLVM, and eventually we might try to incorporate it into the trunk. Thus we consider the implementation finished.

Someone in last year’s GSoC proposed a floating-point range analysis, but I’m not sure what came of it. Send me an email if you need to know anything about our project.

Good luck!

Victor.

We actually have two dependence analysis (DA) frameworks at the moment. The file you mention is currently used only by the LoopInterchange pass, which is off by default. The other framework, which I’ve been working on, is LoopAccessAnalysis; it is currently used by the LoopVectorizer, LoopLoadElimination, LoopDistribution and LoopVersioningLICM (the latter two are off by default).

Predictive Commoning. The presentation slides by Arie Tal seem to provide a quite clear explanation and examples of the algorithm. I guess the implementation should be straightforward, shouldn’t it?

The closest I know of to this in tree is LoopLoadElimination.cpp and (in some cases) the PRE code inside GVN.cpp. Building something like this on top of SCEV could be quite interesting. You should definitely talk to Adam Nemet (CC’d) about this.

Yes, I think the memory operations can be handled in LoopLoadElimination. The algorithm is different from the above paper though. We use the loop-carried dependences to find opportunities to reuse loaded values from earlier iterations. Thus our approach is more similar to Loop Scalar Replacement by Steve Carr et al. It may be possible to extend this to expressions that are derived from these loads to also cover the additions in the mgrid example cited by the slides.
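To make the idea of reusing a value from an earlier iteration concrete, here is a minimal hand-written sketch in C. It is only an illustration of the source-level effect, not the pass's actual implementation; the function names are hypothetical, and the rewrite assumes that `a` and `b` do not alias.

```c
#include <assert.h>
#include <stddef.h>

/* Before: iteration i loads a[i], which iteration i-1 has just stored
   (as a[(i-1)+1]). This is a loop-carried store-to-load dependence. */
static void prefix_sum_naive(int *a, const int *b, size_t n) {
    for (size_t i = 0; i + 1 < n; ++i)
        a[i + 1] = a[i] + b[i];
}

/* After: the value is carried across iterations in a scalar, so the
   load of a[i] disappears from the loop body.
   Assumption: a and b do not overlap. */
static void prefix_sum_forwarded(int *a, const int *b, size_t n) {
    if (n < 2)
        return;
    int carried = a[0];               /* initial value, loaded once */
    for (size_t i = 0; i + 1 < n; ++i) {
        carried = carried + b[i];     /* reuse last iteration's value */
        a[i + 1] = carried;
    }
}
```

Both functions should leave identical contents in `a`; the second simply keeps the dependent value in a register instead of reloading it from memory.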

Adam

*Loop Dependence Analysis Infrastructure.* I have looked in the source
code repository and saw a file named “DependenceAnalysis.cpp”. Does
that mean this analysis has already been implemented?

I believe major progress has been made on it, but I haven't been
following it closely. I'd suggest talking to committers active in this
file in the recent past to determine what useful work might be left of
appropriate scope.

*Value range propagation pass.* There was a discussion about this
topic
(https://groups.google.com/forum/#!topic/llvm-dev/XXqfemtDX74/discussion).
Someone proposed implementing this pass for GSoC several years ago,
but I can’t find any record of its progress. If there was no progress,
does that mean that VRP based on Patterson’s paper still needs to be
implemented, even though a range analysis has already been implemented?

This is largely stalled. The key problem is that between LazyValueInfo
(constant ranges) and SCEV (symbolic ranges in loops), there's fairly
little profit to be had and range analysis is relatively expensive. I'd
strongly discourage you from implementing a traditional range analysis
for LLVM without deeply understanding the history here.
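As a rough illustration of what reasoning over constant ranges buys, the sketch below propagates closed integer intervals through a branch condition and an addition. The `Range` type and helper names are hypothetical and are not LLVM's API; overflow and wrap-around, which a real analysis has to model, are deliberately ignored.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical closed interval [lo, hi] of possible values. */
typedef struct {
    long lo;
    long hi;
} Range;

/* Range of x + y, given the ranges of x and y (no overflow modelled). */
static Range range_add(Range a, Range b) {
    Range r = { a.lo + b.lo, a.hi + b.hi };
    return r;
}

/* Narrow a range on the true edge of a branch "value < bound". */
static Range range_assume_lt(Range r, long bound) {
    if (r.hi >= bound)
        r.hi = bound - 1;
    return r;
}

/* True if every value in the range satisfies "value < bound",
   in which case a branch on that condition can be folded. */
static bool range_always_lt(Range r, long bound) {
    return r.hi < bound;
}
```

For example, if `x` starts in [0, 255] and control flow is inside `if (x < 10)`, then `x` is in [0, 9], `x + 1` is in [1, 10], and a later test `x + 1 < 20` is known to be true.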

*Predictive Commoning.* The presentation slides by Arie Tal seem to
provide a quite clear explanation and examples of the algorithm. I guess
the implementation should be straightforward, shouldn’t it?

The closest I know of to this in tree is LoopLoadElimination.cpp and (in
some cases) the PRE code inside GVN.cpp. Building something like this
on top of SCEV could be quite interesting. You should definitely talk
to Adam Nemet (CC'd) about this.
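For readers unfamiliar with the transformation being discussed, the following is a small hand-written C sketch of the source-level effect of predictive commoning: an expression computed in one iteration is reused in the next instead of being recomputed. The loop and function names are hypothetical; this is not the mgrid example from the slides and not code from any existing pass.

```c
#include <assert.h>
#include <stddef.h>

/* Before: the sum a[i+1] + a[i+2] computed in iteration i is
   recomputed as a[i] + a[i+1] in iteration i+1. */
static void stencil_naive(const int *a, int *out, size_t n) {
    for (size_t i = 0; i + 2 < n; ++i)
        out[i] = (a[i] + a[i + 1]) * (a[i + 1] + a[i + 2]);
}

/* After: the shared sub-expression is carried across iterations,
   so each iteration performs one addition instead of two. */
static void stencil_commoned(const int *a, int *out, size_t n) {
    if (n < 3)
        return;
    int prev = a[0] + a[1];                 /* computed once up front */
    for (size_t i = 0; i + 2 < n; ++i) {
        int next = a[i + 1] + a[i + 2];     /* the only new addition */
        out[i] = prev * next;
        prev = next;                        /* becomes a[i] + a[i+1] next time */
    }
}
```

Finding such reuse automatically requires knowing that the two expressions refer to the same addresses one iteration apart, which is the kind of fact SCEV-style reasoning can establish.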

*Type Inference (aka. Devirtualization) and Value assertions. *

Can I get more details on these topics? Does type inference mean
the translation of the _auto_ keyword, or something else? For value
assertions, the “unreachable” intrinsic seems to have been implemented,
because I can find uses of it in some of the test cases.

I believe the "value assertions" link may be stale. If I'm reading
that correctly, it looks like the motivation for @llvm.assume.

This indeed sounds a lot like the @llvm.assume work. I did notice, though, that these assumptions aren't yet always used when they are available.

For example, the following code will still contain a fallback loop when compiled, even though that loop can be removed according to the provided assumptions:

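/* Note: c was undeclared in the original snippet; it is assumed here
   to be a loop-invariant int and passed as a parameter. */
int foo(int *A, int n, int c) {
   int sum = 0;
   __builtin_assume(n > 7 && n % 8 == 0);
   for (int i = 0; i < n; ++i)
     sum += A[i] + c;
   return sum;
}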

This was compiled with "clang -S -O3 test.c".
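To show why the fallback loop is dead under these assumptions, here is a hand-written C sketch of the general shape a vectorized loop takes: a main loop over whole blocks followed by a scalar tail. The block width of 8, the function name, the `c` parameter, and the `tail_iterations` counter are all assumptions made for illustration; the width the vectorizer actually picks depends on the target.

```c
#include <assert.h>

/* Hypothetical model of a loop vectorized with width 8. */
static int sum_blocked(const int *A, int n, int c, int *tail_iterations) {
    int sum = 0;
    int i = 0;

    /* Main loop: handles whole blocks of 8 elements. The inner loop
       stands in for a single 8-wide vector operation. */
    for (; i + 8 <= n; i += 8)
        for (int j = 0; j < 8; ++j)
            sum += A[i + j] + c;

    /* Fallback (remainder) loop: handles the last n % 8 elements. */
    *tail_iterations = 0;
    for (; i < n; ++i) {
        sum += A[i] + c;
        ++*tail_iterations;
    }
    return sum;
}
```

When n > 7 and n % 8 == 0, the main loop ends with i == n, so the fallback loop runs zero times; that is the fact the compiler would need to derive from the assumption in order to delete it.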

I guess that there are more similar cases where we're not using these assumptions yet. Maybe that's a nice project as well.

Cheers,
  Roel

Hi Adam,

Hi Hongbin,

Hi Adam,

Yes, that sounds reasonable. It will probably take some careful generalization of the DA within LAA (mostly the MemoryDepChecker class).

Adam