I’m trying expand the Float2Int pass in order to make it able to optimize
expressions like f * x > y, where x and y are integers (we’ll assume unsigned for
simplicity) and f is a floating point constant. The optimization would convert
the expression to something like:
(a * x)/b > y
where a and b are integers guessed by the compiler (currently using continued
fractions, but some other method could also be used). The exact expression
would depend on the comparison type and on the type of input integers.
Note that dividing by an integer constant should be a cheap operation
compared to FP multiplication and comparison as this would get lowered to a
multiply+subtract+shift sequence (and should certainly be true on cores with no
The change will add some machinery to Float2Int in order to be able to
match certain known patterns. All patterns will have exactly representable
inputs and outputs. In this case, the FCMP will have two inputs, x and y and
one output (the result of the comparison). This allows us to use the existing
Float2Int infrastructure to track the integer length that would be required
for the conversion.
The main difficulty I can see with this is validating the transformation. Ideally
the property that we want to prove for the guessed integers a and b would be
that (a*x)/b == round_to_zero(f * to_float(x)), for every integer x in a given
range. The problem with this is that we would need to come up with a reasonably
quick method of proving this.
My best guess (still need to work out the exact details) would be that we only
need to check the first and last b numbers of the range. A hand-wavy argument
for this would be that since errors increase on the extremities of the range, so if
an error appears anywhere it must also appear at the edges of the range.
I still need to prove this. If there are any floating-point experts that can help
with this, I would greatly appreciate it.
Any other comments about the overall merits of the idea and the approach are
of course also welcome!