Recently, I found that there’s a fascinating piece of work on polyhedral compilation being promoted in MLIR, called FPL (papers, code).
To be honest, polyhedral compilation has always been one of the topics I care about most.

So, is there a demo, or how can I use FPL to optimize my own deep learning op or network?
For example, our group tried to use PPCG to optimize the performance of some complex operators (using the default configuration, e.g. the default tile sizes) and found that the results were not good. Report Link (sorry that the report is in Chinese; if necessary, I will translate it into English and post it again).
Therefore, I very much hope that FPL can do better on such operators, and I am eager to run such an experiment.

By the way, I see that the project has not been completed yet. Is there an opportunity for me to participate in developing the unfinished parts?

At the moment, the polyhedral compiler infrastructure is yet to be built on top of FPL. We always welcome contributions to the library. You are presumably familiar with the idea of transprecision from the paper you linked. The current MLIR upstream uses fixed 64-bit integers for all operations, which can lead to overflows and, in the worst case, incorrect results. One feature we are looking to upstream is some form of transprecision to avoid such overflows.

The library-level transprecision described in our paper involves exceptions and so cannot be upstreamed. However, what we could do is use what we call matrix-level transprecision (also briefly described in the paper). Basically, the matrix class is templated on the integer type to be used, and when an overflow occurs in some row operation, it smoothly transitions to a matrix instantiation with a wider integer type. This was described in more detail in our earlier paper. Our hope is that this would still achieve pretty good performance, while also being fully correct on large test cases.

In my opinion, the two key points that attract me the most are:

Reduced memory allocations and cache misses through lightweight C++ and LLVM APIs. In my opinion, this is mainly a locality optimization.

Thanks to the guarantees of the transprecision algorithm, FPL can use low-precision computation as much as possible to improve performance.

It’s really an attractive project for me.

So, is there a demo, or how can I use FPL to optimize my own deep learning op or network?
For example, our group tried to use PPCG to optimize the performance of some complex operators (using the default configuration, e.g. the default tile sizes) and found that the results were not good. Report Link

What I meant here is auto-scheduling in deep learning compilers via polyhedral compilation (taking PPCG as an example). I expect, and believe, that FPL will play an important role in this area, perhaps through new methods for parallelism or for problems like choosing loop tile sizes. I don’t know yet; I’m just looking forward to it.

To be honest, I still need some time to understand the two papers you mentioned in depth.
I would appreciate it if there is a to-do list for this project, or some simple task that I (as a beginner) could take on.

Sorry about the delay in responding. We don’t yet have a public to-do list. One good beginner issue could be ordering the division variables within a set. We support floor divisions of variables by introducing an existentially quantified variable and constraining it to take the value of a floor division. See the documentation of IntegerPolyhedron for more details. This is also described in the paper. Consider a set like

(x, y) : exists p, q, r : p = floor((3q + 5r)/2) and q = floor((2r + 7)/3) and r = floor((2x)/5).

Often we want to iterate through all the divisions and update or change them in some way, but this becomes a problem when earlier divisions depend on later ones. For example, here p depends on q in its numerator.

Our goal is to rearrange the division variables so that each division’s numerator only involves the divisions before it, i.e., has zero coefficients for all the divisions after it. For example, in this case we want to change the order p, q, r to r, q, p:

(x, y) : exists r, q, p : r = floor((2x)/5) and q = floor((2r + 7)/3) and p = floor((3q + 5r)/2).

It’s fine to use a naive algorithm for this for now. You can make it a function on IntegerPolyhedron and use IntegerPolyhedron::getLocalReprs, which gives a numerator/denominator representation of the local (division) variables. Please don’t hesitate to ask if you have any more questions or need clarification!

One good beginner issue could be ordering division variables within a set. We support floor divisions of variables by introducing an existentially quantified variable and constraining it to take the value of a floor division.

If I understand correctly, the division operation is quite important in Presburger arithmetic, as I notice this sentence in your paper: "the test set contains 50,223 test cases involving division variables, indicating that this aspect of Presburger arithmetic sees significant use in polyhedral compilation and that the test set has good coverage for it." By that count, divisions account for about 5% of the test set.

So, yeah, I agree it’s a good beginner issue. Thanks for the advice!

I’ll work on it following your suggestions and keep you updated on my progress.