Performance analysis for TSVC

tblah · February 1, 2024, 10:21am

Thank you for your investigations.

There was a discussion about improving flang's address calculation for array indices here: [RFC] Changes to fircg.xarray_coor CodeGen to allow better hoisting

IIRC the latest plan is to try adding the no-signed-wrap (nsw) flag to these calculations so that LLVM is freer to rearrange them and hoist more of the work out of the loop. Unfortunately, I got pulled onto different work and haven’t had time to try this. I will get back to it, but not in the near future, so feel free to pick it up if it is a higher priority for you (but first check that this applies to your example, because I was looking at a different benchmark).

The extra subtractions in flang’s calculations are required because arrays in Fortran can have different starting indices, and these index ranges need to be adapted to match LLVM’s 0-based indexing. The hope is that, with more information, LLVM could hoist the subtractions out of loops, resulting in simpler address calculations inside the loops.
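To make the subtraction concrete, here is a minimal Python sketch of the idea. It is not flang's actual codegen: the function names, the 2-D column-major layout, and the example bounds are all assumptions picked for illustration. It compares recomputing the full 0-based offset on every iteration with computing the loop-invariant lower-bound terms once.

```python
def offsets_naive(lb1, lb2, extent1, j, i_values):
    """Recompute the full 0-based element offset of a(i, j) on every iteration."""
    # Both lower-bound subtractions happen inside the loop.
    return [(i - lb1) + (j - lb2) * extent1 for i in i_values]


def offsets_hoisted(lb1, lb2, extent1, j, i_values):
    """Same offsets, with everything that does not depend on i computed once."""
    # Loop-invariant part: the j term and the subtraction of lb1.
    base = (j - lb2) * extent1 - lb1
    return [base + i for i in i_values]


# Hypothetical array declared as a(-2:5, 3:10):
# lower bounds -2 and 3, extent of the first dimension 8.
lb1, lb2, extent1 = -2, 3, 8
i_values = range(lb1, lb1 + extent1)

# Walking column j = 4 gives the same element offsets either way.
assert offsets_naive(lb1, lb2, extent1, 4, i_values) == \
    offsets_hoisted(lb1, lb2, extent1, 4, i_values)
```

Python integers never overflow, so the two forms are trivially equal here. With fixed-width integers the compiler has to be able to justify the rewrite, which is where the extra information comes in.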

Another option would be to re-order the mathematical operations generated by flang to calculate the address so that they are easier for LLVM to optimize. This was rejected because commenters felt that LLVM should be able to do this without help.

There is more information about NSW (no signed wrap) in the LLVM language reference entries for integer arithmetic operations e.g. LLVM Language Reference Manual — LLVM 19.0.0git documentation
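As a rough illustration of what the flag means, the Python sketch below simulates a 32-bit signed subtraction and reports whether it wrapped. The helper names are made up for this example. It is only a generic picture of nsw semantics, not a claim about which specific transformation the flag unlocks for flang's address calculations. Per the language reference, an operation marked nsw produces a poison value if signed overflow occurs, so LLVM may assume the wrapping case does not happen.

```python
def wrap_i32(x):
    """Interpret x modulo 2**32 as a signed 32-bit integer."""
    x &= 0xFFFFFFFF
    return x - (1 << 32) if x & 0x80000000 else x


def sub_i32(a, b):
    """Simulated 32-bit subtraction: returns (wrapped result, did it wrap?)."""
    exact = a - b              # mathematically exact result
    wrapped = wrap_i32(exact)  # what a plain 32-bit `sub` would produce
    return wrapped, wrapped != exact


# No wrap: the 32-bit result matches the exact result, so reasoning about
# the value as an ordinary integer is safe.
assert sub_i32(10, 3) == (7, False)

# Signed wrap: the plain `sub` result differs from the exact value.
# With nsw this case would be poison instead of a defined wrapped value.
assert sub_i32(-2**31, 1) == (2**31 - 1, True)
```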

Support is already in upstream MLIR dialects: [RFC] Integer overflow flags support in `arith` dialect
[mlir][LLVM] Add nsw and nuw flags by tblah · Pull Request #74508 · llvm/llvm-project · GitHub

I made a start here: [flang][CodeGen] add nsw to address calculations by tblah · Pull Request #74709 · llvm/llvm-project · GitHub. The main thing still to do is adding nsw to loop index calculations.
