[PATCH] math: Implement remainder(x, y)

Mostly ported from the amd-builtins branch.

The amd-builtins branch uses __amdil_improved_fdiv_f32 and FTZ, neither of which is available in generic CLC.

__amdil_improved_fdiv_f32 points to native_divide, which computes native_recip(y)*x.

Since we don't have native_divide or native_recip yet, I've just stuck an actual division here.
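
As a rough illustration of the substitution described above, here is a plain C sketch. The names recip_divide and plain_divide are made up for this example and are not in the patch. On the host, 1.0f / y only stands in for native_recip, which on a GPU may be less accurate than this.

```c
/* Hypothetical stand-in for native_divide: multiply x by a reciprocal of y.
 * Assumption: 1.0f / y approximates what native_recip(y) would return. */
static float recip_divide(float x, float y)
{
    float r = 1.0f / y;   /* reciprocal step */
    return x * r;         /* second rounding happens here */
}

/* What the patch uses instead: a single, ordinary division. */
static float plain_divide(float x, float y)
{
    return x / y;
}
```

The reciprocal form rounds twice (once for the reciprocal, once for the product), so it can differ from the plain division in the last bits for some inputs.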

I've taken a shot at a replacement for FTZ(x), but feel free to suggest alternatives.
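
For reference, a hand-written flush-to-zero helper of the kind mentioned above could look roughly like the following in plain C. This is a hypothetical sketch, not the code from the patch: the names flush_to_zero and float_bits are invented here, and it assumes IEEE 754 binary32 floats.

```c
#include <stdint.h>
#include <string.h>

/* Return the raw IEEE 754 bit pattern of a float. */
static uint32_t float_bits(float x)
{
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);
    return bits;
}

/* Replace subnormal inputs with a zero of the same sign.
 * Normal numbers, zeros, infinities and NaNs pass through unchanged. */
static float flush_to_zero(float x)
{
    uint32_t bits = float_bits(x);

    /* A zero exponent field means the value is a zero or a subnormal. */
    if ((bits & 0x7F800000u) == 0u)
        bits &= 0x80000000u;   /* keep only the sign bit */

    memcpy(&x, &bits, sizeof x);
    return x;
}
```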

Tested via piglit on a Radeon HD 7850 using the tests just sent to that list.

v2: Use __builtin_canonicalizef(float) instead of custom flush-to-zero function

Signed-off-by: Aaron Watry <awatry@gmail.com>

This fails conformance for me:

errors.txt (18.6 KB)

Interestingly enough some (but not all) of the test inputs start to pass
when I forcefully enable subnormal support in libclc
(--enable-runtime-subnormal and also updating generic/lib/shared/
subnormal_config.cl to enable 32-bit subnormals).

I'll be playing with this a bit as time permits.


I’m guessing you need to somehow access one of the other division implementations. By default, fdiv will be getting the !fpmath 2.5 ULP metadata. Can you try adding -cl-fp32-correctly-rounded-divide-sqrt to the build of this file, or calling a wrapper IR function which avoids the metadata?

Although this probably doesn’t explain the f64 failures.