[GSoC 2016] [Polly] Implementation of tiling, interchanging and unrolling of specific loops based on the algorithm for the analytical modeling

Hi Tobias,

I think that we could split a patch that contains an implementation of
tiling, interchanging and unrolling of specific loops into three
separate patches:

1. The first one adds a class that describes a processor model. It
also adds a new command line parameter that contains all necessary
parameters of a target architecture, which are used to construct
objects of the class.

2. The second one adds methods to the class to compute parameters for
instantiations of the matrix-matrix multiplication. It also implements
tiling, interchanging and unrolling of specific loops.

3. The third one replaces manual passing of parameters of a target
architecture with utilization of information from LLVM.
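
To make the command line parameter from item 1 a bit more concrete, here is a rough sketch of how a single option string could be split into name/value pairs before a processor model object is constructed. The function name `parseTargetParams`, the `name=value,name=value` syntax, and the key names are all assumptions made for illustration. In Polly itself this would presumably go through LLVM's `cl::opt` machinery rather than hand-written parsing.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>

// Hypothetical helper: splits a string such as
// "vector-bits=256,fma-latency=8" into name/value pairs.
// The option syntax and the key names are illustrative only.
inline std::map<std::string, unsigned long>
parseTargetParams(const std::string &Spec) {
  std::map<std::string, unsigned long> Params;
  std::istringstream Stream(Spec);
  std::string Item;
  while (std::getline(Stream, Item, ',')) {
    const std::string::size_type Eq = Item.find('=');
    // Skip malformed items that have no '=' or an empty name or value.
    if (Eq == std::string::npos || Eq == 0 || Eq + 1 == Item.size())
      continue;
    // std::stoul throws if the value does not start with a digit;
    // a real implementation would report that as a usage error.
    Params[Item.substr(0, Eq)] = std::stoul(Item.substr(Eq + 1));
  }
  return Params;
}
```

The resulting map could then be handed to the constructor of the processor model class, which keeps the parsing separate from the model itself.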

What do you think about it?

P.S.: I’m not sure whether all necessary parameters of a target
architecture are accessible from LLVM, or how best to obtain them
in our case. Should we ask these questions on the mailing list now?

If I’m not mistaken, we’re interested in the following parameters:

1. Size of double-precision floating-point number.

2. Number of double-precision floating-point numbers that can be held
by a vector register.

3. Throughput of vector instructions per clock cycle.

4. Latency of instructions (i.e., the minimum number of cycles between
the issuance of two dependent consecutive instructions).

5. Parameters of cache levels (size of cache lines, associativity
degrees, sizes).
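
As a rough sketch, the five parameters above could be grouped as follows. All type, field, and function names here (`CacheLevel`, `ProcessorModel`, `doublesPerVector`, and so on) are hypothetical, and restricting the model to two cache levels is an assumption made only to keep the example short. Item 2 is derived from the vector register width instead of being stored separately.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical description of one cache level (item 5).
struct CacheLevel {
  std::size_t SizeInBytes;
  std::size_t LineSizeInBytes;
  unsigned Associativity;
};

// Hypothetical processor model covering items 1-5.
struct ProcessorModel {
  unsigned DoubleSizeInBytes;   // item 1
  unsigned VectorRegisterBits;  // used to derive item 2
  unsigned VectorInstrPerCycle; // item 3
  unsigned VectorInstrLatency;  // item 4, in cycles
  CacheLevel L1, L2;            // item 5 (two levels assumed)

  // Item 2: how many doubles fit into one vector register.
  unsigned doublesPerVector() const {
    assert(DoubleSizeInBytes != 0 && "element size must be non-zero");
    return VectorRegisterBits / (8 * DoubleSizeInBytes);
  }

  // Independent vector instructions that must be in flight to keep the
  // units busy: throughput per cycle times latency in cycles.
  unsigned instructionsInFlight() const {
    return VectorInstrPerCycle * VectorInstrLatency;
  }
};

// Number of sets in a set-associative cache.
inline std::size_t numberOfSets(const CacheLevel &C) {
  assert(C.LineSizeInBytes != 0 && C.Associativity != 0);
  return C.SizeInBytes / (C.LineSizeInBytes * C.Associativity);
}
```

For example, a model with 8-byte doubles and 256-bit vector registers would report four doubles per register.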

Hi Roman,

Hi Hongbin,

thank you for the comment!

I think that we could split a patch that contains an implementation of
tiling, interchanging and unrolling of specific loops into three
separate patches:

1. The first one adds a class that describes a processor model. It
also adds a new command line parameter that contains all necessary
parameters of a target architecture, which are used to construct
objects of the class.

Instead of creating a new class, maybe we could enhance some classes in
TargetTransformInfo.h of LLVM to achieve your goal?

In my opinion, it would be good to enhance some classes in
TargetTransformInfo.h or somewhere else instead of creating a new
class. Should we ask on the mailing list whether that’s possible?

Or is this done in step 3?

I think that at this step we could use all the necessary parameters of
a target architecture that are accessible from LLVM (e.g., the number
of bits necessary to hold the specified type) and manually pass only
the missing information (e.g., the size of a cache and its
associativity).
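
A minimal sketch of that fallback, assuming C++17 and a hypothetical helper `resolveParam`: a value reported for the target is preferred, a manually passed value is used only when the target reports nothing, and a default covers the case where neither is available. How each value would actually be queried from LLVM is left open here.

```cpp
#include <cassert>
#include <optional>

// Hypothetical helper: pick the value known for the target if there is
// one, otherwise the manually passed value, otherwise a default.
inline unsigned resolveParam(std::optional<unsigned> FromTarget,
                             std::optional<unsigned> FromCommandLine,
                             unsigned Default) {
  if (FromTarget)
    return *FromTarget;
  if (FromCommandLine)
    return *FromCommandLine;
  return Default;
}
```

With this ordering, manually passed values never override what is already known about the target. The opposite order would be just as easy to implement if overriding turns out to be useful for experiments.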