Now that 2.5 is about to branch, I'd like to bring up one of Scott's
favorite topics: certain optimizers widen or narrow arithmetic,
without regard for whether the type is legal for the target. In his
specific case, instcombine is turning an i32 multiply into an i64
multiply in order to eliminate a cast. This does simplify/reduce the
number of IR operations, but an i64 multiply is dramatically more
expensive than an i32 multiply on CellSPU.
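To make the shape of the transform concrete, here is a rough sketch in LLVM IR of the kind of fold described (function and value names are mine, not from the actual report). Note that the widened form is only equivalent when the narrow multiply cannot overflow:

```llvm
; Before: 32-bit multiply, result widened afterward.
define i64 @f(i32 %a, i32 %b) {
  %mul = mul i32 %a, %b
  %ext = sext i32 %mul to i64
  ret i64 %ext
}

; After the fold: the cast is gone, but the multiply is now i64,
; which is much more expensive on a target like CellSPU.
define i64 @f.widened(i32 %a, i32 %b) {
  %a64 = sext i32 %a to i64
  %b64 = sext i32 %b to i64
  %mul = mul i64 %a64, %b64    ; valid only if the i32 mul can't overflow
  ret i64 %mul
}
```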
There are a couple of different ways to look at this. ...
It would seem most effective to track the minimum required precision of each
operand in a transform (these often differ), and to let the target code
generator, not earlier passes, selectively widen them where that is most
efficient (although target-specific attributes could give target-neutral
mid-level optimizers better guidance). Keeping this information canonical
would let later optimizations further narrow the minimum precision required
of intermediate operands, and thereby potentially improve target code
generation; this seems especially useful for targets with small native
precision and/or multi-operand vector units. IMHO.
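As a sketch of the narrowing direction (again with hypothetical names): when only the low bits of a product are demanded, a wide multiply can always be shrunk, since the low 32 bits of a product depend only on the low 32 bits of the operands:

```llvm
; Before: 64-bit multiply, but only the low 32 bits are used.
define i32 @g(i64 %a, i64 %b) {
  %mul = mul i64 %a, %b
  %lo  = trunc i64 %mul to i32
  ret i32 %lo
}

; After narrowing: a single i32 multiply, unconditionally correct.
define i32 @g.narrowed(i64 %a, i64 %b) {
  %a32 = trunc i64 %a to i32
  %b32 = trunc i64 %b to i32
  %mul = mul i32 %a32, %b32
  ret i32 %mul
}
```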
(seemingly a bit late at this stage of the game, but possibly not?)