I think we discussed this last Thursday? My feeling is that each use of the multiply can be considered separately. If it can be combined, then we should do so. The multiply should be left in place and removed by a dead code elimination pass sometime later. This is what TOBEY does. If you want me to explain the XL method in more detail, come talk to me.
The target-independent heuristic could certainly check all uses; patches welcome. We could also consider each use separately, as Kevin suggests, but that might not be optimal on targets with only one floating-point pipeline, so we'd need to make it opt-in. You should also look at the MachineCombiner pass (added in r214832, currently only used by the ARM backend I think), which tries to solve this problem in a more sophisticated way.
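
To make the two policies concrete, here is a rough sketch on a toy straight-line IR. It is not LLVM code: `Inst`, `combine_fma`, `dce`, and the `require_all_uses` flag are all made up for illustration, and only `fadd` is treated as a combinable use. The per-use mode fuses every add it can and leaves the multiply for a later dead code elimination; the all-uses mode fuses only when every user of the multiply is combinable.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Inst:
    """Hypothetical toy instruction: dest = op(args)."""
    dest: str
    op: str
    args: tuple


def users_of(prog, name):
    """All instructions that read the value `name`."""
    return [i for i in prog if name in i.args]


def combine_fma(prog, require_all_uses=False):
    """Fuse fadd(fmul(a, b), c) into fma(a, b, c); never delete the fmul here."""
    muls = {i.dest: i for i in prog if i.op == "fmul"}
    out = []
    for inst in prog:
        if inst.op == "fadd":
            for idx, arg in enumerate(inst.args):
                mul = muls.get(arg)
                if mul is None:
                    continue
                # All-uses policy: skip unless every user of the multiply is an fadd.
                if require_all_uses and not all(
                    u.op == "fadd" for u in users_of(prog, arg)
                ):
                    continue
                addend = inst.args[1 - idx]
                inst = Inst(inst.dest, "fma", (mul.args[0], mul.args[1], addend))
                break
        out.append(inst)
    return out


def dce(prog, live_out):
    """One backward pass; enough for straight-line code in SSA form."""
    live = set(live_out)
    kept = []
    for inst in reversed(prog):
        if inst.dest in live:
            kept.append(inst)
            live.update(inst.args)
    return kept[::-1]


# The multiply has one combinable use (fadd) and one that is not (fdiv).
mixed = [
    Inst("t", "fmul", ("a", "b")),
    Inst("x", "fadd", ("t", "c")),
    Inst("y", "fdiv", ("t", "d")),
]
# Every use of the multiply is an fadd.
all_adds = [
    Inst("t", "fmul", ("a", "b")),
    Inst("x", "fadd", ("t", "c")),
    Inst("z", "fadd", ("d", "t")),
]

per_use_mixed = dce(combine_fma(mixed), {"x", "y"})
all_uses_mixed = dce(combine_fma(mixed, require_all_uses=True), {"x", "y"})
fused_all_adds = dce(combine_fma(all_adds, require_all_uses=True), {"x", "z"})
```

In the `mixed` case the per-use policy keeps the `fmul` alive for the `fdiv` and also repeats the multiplication inside the `fma`, which is the extra work that could hurt on a target with a single floating-point pipeline. In the `all_adds` case the multiply ends up with no users and the DCE pass drops it.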