Loop Tiling Pass

megha · August 3, 2023, 4:29am

Hello,
We applied loop tiling by hand to a matrix multiplication C program and saw an improvement in execution time. So we wrote a loop tiling pass, and it runs successfully. However, we do not see the same improvement when the tiling is done by the pass. In the pass ordering, it comes after the “Print Module IR” pass at the -O3 optimization level. Can anyone help us understand why it is not improving performance?

rengolin · August 3, 2023, 9:06am

It’d be hard to help without the source code. But you can do a few things to guide your team:

Take a small example, untiled and tiled, run both through upstream llvm-opt, and print the final IR.
Take the same examples, run them through llvm-opt with your pass, and compare the IR with the tiled version.

If the result of your pass on the same untiled loop does not produce the same (optimal) IR, look for the deltas as clues to what happened.
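
For reference, a small untiled/tiled pair of the kind suggested above might look like the following. This is only a sketch, not the original poster's code: the function names, the matrix size N, the tile size TILE, and the row-major flat layout are all assumptions made for illustration, and TILE is assumed to divide N evenly.

```c
#include <stddef.h>

#define N 64     /* matrix dimension (assumed for illustration) */
#define TILE 16  /* tile size (assumed; must divide N in this sketch) */

/* Untiled reference: C = A * B, row-major N x N matrices. */
void matmul_untiled(const double *A, const double *B, double *C) {
    for (size_t i = 0; i < N; ++i) {
        for (size_t j = 0; j < N; ++j) {
            double sum = 0.0;
            for (size_t k = 0; k < N; ++k)
                sum += A[i * N + k] * B[k * N + j];
            C[i * N + j] = sum;
        }
    }
}

/* Hand-tiled version: the same computation, iterated in TILE x TILE blocks. */
void matmul_tiled(const double *A, const double *B, double *C) {
    /* C is accumulated across k-tiles, so it must start at zero. */
    for (size_t i = 0; i < N * N; ++i)
        C[i] = 0.0;

    for (size_t ii = 0; ii < N; ii += TILE)
        for (size_t jj = 0; jj < N; jj += TILE)
            for (size_t kk = 0; kk < N; kk += TILE)
                /* Intra-tile loops. */
                for (size_t i = ii; i < ii + TILE; ++i)
                    for (size_t j = jj; j < jj + TILE; ++j) {
                        double sum = C[i * N + j];
                        for (size_t k = kk; k < kk + TILE; ++k)
                            sum += A[i * N + k] * B[k * N + j];
                        C[i * N + j] = sum;
                    }
}
```

Feeding the untiled function to the pass and comparing the resulting IR with the IR of the hand-tiled function gives a concrete baseline for spotting the deltas.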

A number of things can keep your pass from having the expected effect:

The loop control structures aren’t built correctly and your pass is not actually doing what you want
Alias analysis can’t prove the accesses are independent
The transform you do introduces inefficiencies elsewhere (loop structure)
Some canonicalization introduces bloat inside the inner loop
You didn’t delete the original loop and somehow the code still goes through it
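
The alias analysis item can sometimes be probed from the C side. As a sketch, assuming the kernel receives its matrices as pointer arguments (the function name and signature below are hypothetical), qualifying the pointers with C99 `restrict` promises the compiler that the arrays do not overlap. If performance or the final IR changes noticeably with `restrict` added, aliasing is a likely suspect.

```c
#include <stddef.h>

/* Hypothetical kernel: c = a * b for row-major n x n matrices.
 * `restrict` asserts that a, b and c never alias, so the store to
 * c[i*n + j] cannot change values later loaded through a or b. */
void matmul_restrict(size_t n,
                     const double *restrict a,
                     const double *restrict b,
                     double *restrict c) {
    for (size_t i = 0; i < n; ++i) {
        for (size_t j = 0; j < n; ++j) {
            c[i * n + j] = 0.0;
            for (size_t k = 0; k < n; ++k)
                c[i * n + j] += a[i * n + k] * b[k * n + j];
        }
    }
}
```

Note that `restrict` is a promise made by the programmer; calling this function with overlapping buffers is undefined behavior.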

But without looking at the code and resulting IR it’s hard to be more specific.
