The godbolt link has a simple C benchmark that tries to measure the time, in seconds, taken by a function “dot_product”, which performs some FMA operations.
ref: Compiler Explorer
With -O1 I observe that the loop deletion pass has deleted the loops in the function “dot_product”, which performs the FMA operations.
Can someone explain why the loops are getting deleted? The allocated memory is also passed to free later.
If you delete the dot_product call, the program gets faster but the output does not change.
If you printed matrixC[0], the loop would probably not be deleted.
Thank you for replying. GCC does not optimize these loops away even at -Ofast. It seems LLVM is more aggressive and already does this at -O1.
Yes I verified that.
Can we conclude that this is expected behavior? And for the “free” call, does LLVM assume that the memory is only being released and that no external user is going to access the pointees?
The loop body has no effect:
matrixA[i*N + j] = 2.0;
This is the root issue. free is not the issue. Nobody consumes the result of the loop.
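To make the "nobody consumes the result" point concrete, here is a minimal sketch of a benchmark where the result is consumed. It is not the code from the godbolt link: the kernel body, the size N, the fill values and the name run_benchmark are all assumptions made for illustration. The idea is only that summing matrixC and printing the sum gives the stores an observable use, so the loops should no longer be trivially dead. A compiler is still free to simplify the computation in other ways, so treat this as a starting point rather than a guarantee.

```c
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define N 256 /* hypothetical matrix dimension, not taken from the original benchmark */

/* Hypothetical stand-in for the kernel: C += A * B using multiply-adds. */
static void dot_product(const double *a, const double *b, double *c, int n) {
    for (int i = 0; i < n; i++) {
        for (int j = 0; j < n; j++) {
            double acc = c[i * n + j];
            for (int k = 0; k < n; k++)
                acc += a[i * n + k] * b[k * n + j];
            c[i * n + j] = acc;
        }
    }
}

/* Runs the kernel once, then consumes the result by summing and printing it. */
double run_benchmark(void) {
    size_t count = (size_t)N * N;
    double *matrixA = malloc(count * sizeof *matrixA);
    double *matrixB = malloc(count * sizeof *matrixB);
    double *matrixC = malloc(count * sizeof *matrixC);
    if (!matrixA || !matrixB || !matrixC) {
        free(matrixA);
        free(matrixB);
        free(matrixC);
        return -1.0;
    }

    /* Assumed fill values; the original only shows matrixA being set to 2.0. */
    for (size_t i = 0; i < count; i++) {
        matrixA[i] = 2.0;
        matrixB[i] = 3.0;
        matrixC[i] = 0.0;
    }

    clock_t start = clock();
    dot_product(matrixA, matrixB, matrixC, N);
    clock_t end = clock();

    /* Consume every element so the stores in the kernel are observable. */
    double checksum = 0.0;
    for (size_t i = 0; i < count; i++)
        checksum += matrixC[i];

    printf("checksum = %.1f, time = %f s\n",
           checksum, (double)(end - start) / CLOCKS_PER_SEC);

    free(matrixA);
    free(matrixB);
    free(matrixC);
    return checksum;
}
```

Calling run_benchmark() from main is enough to try it. If the checksum loop and the printf are removed, the situation is back to the one discussed above: the buffers are written, never read, and then freed, so the optimizer may drop the work.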