This question is probably of a very elementary nature, but I just can’t figure it out. I am hoping you experts in here will be able to help.
In this Godbolt (Compiler Explorer) link, can someone please tell me why I get the slli and srli instructions in the strange_square function? To avoid those instructions, I have to compute the more cumbersome expression mul(n, n-1) + n rather than computing mul(n, n) directly, which seems odd to me.
This is probably an “optimisation” where LLVM sees that n*n is never negative and so “optimises” the sign-extend required by the ABI into a zero-extend, but then the backend doesn’t know the result is non-negative and so has to zero-extend explicitly. Cc @topperc.
Looks like the backend inserted a sign_extend before type legalization because of the function return. DAGCombiner saw that the mul had the nsw flag, so the multiply can’t overflow and the sign bit of the result is assumed to be 0. That caused the sign extend to be converted to a zero extend. Type legalization then promoted everything to 64 bits, which dropped the nsw flag, so we have no way to reverse the transform.
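To illustrate why the two extensions are interchangeable here but not equally cheap, below is a rough C model of the 64-bit register values involved. It is only a sketch: the function names `square_promoted`, `sext_w`, and `zext_w_shift_pair` are made up for this example and are not LLVM or RISC-V names. It assumes an RV64 target without the Zba extension, where zero-extending the low 32 bits takes a shift pair (the slli/srli seen in the output), whereas sign-extending the low 32 bits is a single instruction.

```c
#include <stdint.h>

/* Hypothetical model of the multiply after promotion to 64 bits:
 * the low 32 bits hold the i32 product, the upper bits are unspecified. */
uint64_t square_promoted(int32_t n) {
    uint64_t wide = (uint64_t)(int64_t)n;
    return wide * wide; /* unsigned arithmetic, wraps modulo 2^64 */
}

/* Sign-extend the low 32 bits of a 64-bit register value
 * (models what a single sext.w would compute). */
uint64_t sext_w(uint64_t reg) {
    uint64_t low = reg & 0xFFFFFFFFu;
    return (low ^ 0x80000000u) - 0x80000000u;
}

/* Zero-extend the low 32 bits with a shift pair
 * (models slli by 32 followed by srli by 32). */
uint64_t zext_w_shift_pair(uint64_t reg) {
    return (reg << 32) >> 32;
}
```

For any n whose square fits in a signed 32-bit integer (which is what nsw promises), `sext_w(square_promoted(n))` and `zext_w_shift_pair(square_promoted(n))` give the same value, so the sext-to-zext rewrite is correct. The two only differ when bit 31 of the input is set, which is why the backend cannot simply swap the zero extend back once the nsw information is gone.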
I could add a DAGCombiner for (zext (mulnsw X, X)) before type legalization to turn it back into (sext (mul X, X)), dropping the nsw flag in the process.
Or I could convince DAGCombine that sext is cheaper than zext and that it shouldn’t do this replacement.
@rotateright @RKSimon @LebedevRI