getUserCost(): "Ext of i/fcmp results are mostly optimized away in codegen"


I wonder why the extension of an i1 is considered free by default in getUserCost(). The comment says that these are "mostly optimized away in codegen", but I wonder how that can be: if the i1 is extended, then a register must be loaded with either 0 or (-)1 after the comparison, right?

I have made some test functions that seem to result in a compare + conditional move on SystemZ (or a setcc on Intel). On both of these targets this is clearly not free.
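
For illustration, a minimal C function of the kind that produces this pattern might look like the sketch below. It is a hypothetical stand-in, not one of the actual test functions from this post, and the name cmp_to_int is made up. Clang typically turns the comparison into an icmp yielding an i1, followed by a zext to i32, which is the extension whose cost is in question.

```c
/* Hypothetical example, not one of the original test functions.
   The comparison yields an i1 in LLVM IR; returning it as an int
   requires extending that i1 to i32 (a zext of the icmp result). */
int cmp_to_int(int a, int b) {
    /* In C, a < b has type int with value 0 or 1. */
    return a < b;
}
```

On x86-64 a function like this is usually lowered to a compare followed by a setcc and a zero-extending move, so the extension does show up as real instructions.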

Are those simple tests I made special in any way so that they might call for some general rule in the generic implementation?

Any example of when this extension is actually free?


Hi Jonas,

I tried to trace the history of the code. I found it was introduced in r172998 in 2013. I don’t see any tests in the commit.
I think this code might be outdated. Maybe it was related to vector code.
There are tests for vector code. See test/Analysis/CostModel/SystemZ/cmp-ext.ll.
I don’t know if there are any tests for scalar code.
I checked the inline cost calculation. For the ARM targets, the cost of the cast was reported as free, but it was lowered into real instructions. Of course, they are very cheap and have no impact at O2/O3 because of the very high inlining thresholds. However, they will have an impact at Os/Oz, where the thresholds are much lower.
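
To make the size-budget concern concrete, here is a toy sketch in C. The opcode enum, the helper names and the unit costs are all hypothetical and are not LLVM's actual API or numbers. It only shows how treating the extension of a compare result as free under-counts code that is later lowered to real instructions.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical opcode set for a toy cost model; not LLVM's real enum. */
enum toy_op { TOY_ICMP, TOY_EXT_OF_CMP, TOY_ADD, TOY_RET };

/* Cost of one instruction. When ext_of_cmp_is_free is set, the
   extension of a compare result is counted as 0; otherwise it costs 1
   like every other instruction in this toy model. */
static int toy_cost(enum toy_op op, bool ext_of_cmp_is_free) {
    if (op == TOY_EXT_OF_CMP && ext_of_cmp_is_free)
        return 0;
    return 1;
}

/* Sum the per-instruction costs of a straight-line sequence. */
static int toy_size(const enum toy_op *ops, size_t n, bool ext_of_cmp_is_free) {
    int total = 0;
    for (size_t i = 0; i < n; ++i)
        total += toy_cost(ops[i], ext_of_cmp_is_free);
    return total;
}
```

With a sequence of compare, extension, add and return, this model reports a size of 3 when the extension is free and 4 when it is not. A difference of one per compare is negligible against a large budget, but it can decide the outcome when the budget is small.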

Evgeny Astigeevich

Hi Evgeny,

I agree with you that this code does not seem to make any sense - at least not on SystemZ, X86 or ARM… I think those tests are for CostModel, which uses getCastInstrCost(), and I can’t find any tests at all for getUserCost(), but perhaps there are some.

The inliner is just one user. CodeMetrics uses getUserCost(), and in turn some loop passes use it. For instance, the LoopRotation default threshold for the header size is 16, which makes this patch relevant, I think.

I simply removed the code, and no tests failed. This seems reasonable to me, please take a look at the Phabricator post. Since there do not seem to be any tests for getUserCost(), I also removed the one I had made, which was using the LoopUnroller to get the loop size in an awkward way.

Thanks,
Jonas