Dear LLVM contributors,

I am working on the project "Improvement of vectorization process in
Polly". At the moment I'm trying to implement tiling, interchange and
unrolling of specific loops based on the analytical modeling algorithm
described in [1]. It requires information about the following
parameters of the target architecture:

1. Size of a double-precision floating-point number.

2. Number of double-precision floating-point numbers that can be held
by a vector register.

3. Throughput of vector instructions per clock cycle.

4. Latency of instructions (i.e., the minimum number of cycles between
the issuance of two dependent consecutive instructions).

5. Parameters of cache levels (size of cache lines, associativity
degrees, sizes).

Could you please advise me where I can find such information? If I'm
not mistaken, we can get the size of a cache line and the width of the
largest vector register (which probably helps to determine the second
parameter) from TargetTransformInfo.h.

I would be very grateful for your comments, feedback and ideas.

Refs.:

[1] - http://www.cs.utexas.edu/users/flame/pubs/TOMS-BLIS-Analytical.pdf

1. Size of double-precision floating-point number.

By IEEE 754, it's always 8 bytes.

Generally, use DataLayout::getTypeAllocSize/getTypeStoreSize to get a
type's size.

2. Number of double-precision floating-point numbers that can be held
by a vector register.

TargetTransformInfo::getRegisterBitWidth (divided by 8 if double)

3. Throughput of vector instructions per clock cycle.

TargetTransformInfo::getArithmeticInstrCost

See LoopVectorizationCostModel::getInstructionCost for how to use it.

4. Latency of instructions (i.e., the minimum number of cycles between

the issuance of two dependent consecutive instructions).

I think latency and throughput cannot be queried separately. They are

combined as 'cost'.

5. Parameters of cache levels (size of cache lines, associativity
degrees, sizes).

TargetTransformInfo::getCacheLineSize

That is available for one level only (probably the L1 data cache). I
think the X86 backend doesn't even define it (it returns 0).

For the information that is missing, I suggest using command-line
options to get the information directly from the user. In the long
term, we could add it to TargetTransformInfo as well.
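
To illustrate the fallback idea, here is a minimal standalone sketch. It
is not Polly or LLVM code: the helper names (parseUnsignedOption,
valueOrFallback) and the flag name used in the usage note are made up
for illustration, and the policy "use the queried value unless it is 0"
is an assumption. Inside LLVM, one would more likely declare such options
with cl::opt from llvm/Support/CommandLine.h instead of parsing strings
by hand.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: if Arg has the form "<Name>=<digits>", store the
// number in Out and return true. Otherwise leave Out untouched and
// return false.
inline bool parseUnsignedOption(const std::string &Arg,
                                const std::string &Name, unsigned &Out) {
  const std::string Prefix = Name + "=";
  if (Arg.compare(0, Prefix.size(), Prefix) != 0)
    return false; // Different option, or no "=value" part.

  const std::string Value = Arg.substr(Prefix.size());
  // Accept only 1 to 9 decimal digits so the result always fits into a
  // 32-bit unsigned value.
  if (Value.empty() || Value.size() > 9 ||
      Value.find_first_not_of("0123456789") != std::string::npos)
    return false;

  Out = static_cast<unsigned>(std::stoul(Value));
  return true;
}

// Assumed policy: a queried value of 0 means "the target does not
// provide this", so the user-supplied value is used instead.
inline unsigned valueOrFallback(unsigned Queried, unsigned UserSupplied) {
  return Queried != 0 ? Queried : UserSupplied;
}
```

For example, with a hypothetical flag "--l1-cache-line-size=64" and a
target query that returns 0, valueOrFallback would select the
user-supplied 64.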

Michael

Perfect. That's what I would suggest as well.

Best,

Tobias

Because it is getRegister_Bit_Width, divide by 64 for double, or more
generally by DataLayout::getTypeSizeInBits().

Sorry for the mistake.
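
To make the arithmetic concrete, here is a minimal standalone sketch.
The helper name elementsPerVectorRegister is made up for illustration;
in an actual pass the two inputs would presumably come from
TargetTransformInfo::getRegisterBitWidth and
DataLayout::getTypeSizeInBits. Both are in bits, not bytes.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: number of elements of width ElementBits that fit
// into one vector register of width RegisterBits. Both widths are given
// in bits.
inline std::uint64_t elementsPerVectorRegister(std::uint64_t RegisterBits,
                                               std::uint64_t ElementBits) {
  assert(ElementBits != 0 && "element width must be nonzero");
  return RegisterBits / ElementBits;
}
```

With 64-bit doubles, a 128-bit register gives 128 / 64 = 2 elements and
a 256-bit register gives 256 / 64 = 4.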

Michael

Thank you very much for the detailed information and ideas!