[PATCH] math: Implement remainder(x, y)

Mostly ported from the amd-builtins branch.

The amd-builtins branch uses __amdil_improved_fdiv_f32 and FTZ, neither of which is available in generic CLC.

__amdil_improved_fdiv_f32 maps to native_divide, which computes native_recip(y)*x.

Since we don't have native_divide or native_recip yet, I've just stuck an actual division here.
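
For illustration only, here is a small host-side C sketch of why the two forms are not interchangeable. The helper names recip_divide and true_divide are hypothetical and are not part of libclc or the amd-builtins branch; the sketch assumes IEEE-754 single precision with round-to-nearest. Multiplying by a separately rounded reciprocal rounds twice, so it can land one ULP away from the plain division.

```c
#include <stdio.h>

/* Hypothetical stand-in for the reciprocal-multiply form described above.
 * The reciprocal is rounded to float first, then the product is rounded
 * again. 'volatile' keeps the intermediate in single precision. */
static float recip_divide(float x, float y)
{
    volatile float r = 1.0f / y;
    return x * r;
}

/* Plain division: a single correctly rounded operation on IEEE hardware. */
static float true_divide(float x, float y)
{
    return x / y;
}

/* Print both results so a one-ULP difference is visible. */
static void show(float x, float y)
{
    printf("%g / %g: recip=%.9g true=%.9g\n",
           x, y, recip_divide(x, y), true_divide(x, y));
}
```

Calling show(9.0f, 10.0f) should print two different values under those assumptions, while show(1.0f, 4.0f) should print the same value twice because 1/4 is exactly representable.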

I've taken a shot at a replacement for FTZ(x), but feel free to suggest alternatives.

Tested via piglit on a Radeon HD 7850 using the tests just sent to that list.

v2: Use __builtin_canonicalizef(float) instead of custom flush-to-zero function
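
As a rough picture of what a hand-written flush-to-zero helper could look like, here is a minimal C sketch. It is not the v1 code from this patch; the name flush_to_zero is hypothetical, and it assumes "flush" means replacing any subnormal float with a zero of the same sign. The builtin used in v2 leaves that decision to the target's denormal mode instead of hard-coding it.

```c
#include <float.h>

/* Hypothetical helper: replace subnormal inputs with a signed zero.
 * Zeros, normal values, infinities and NaN are returned unchanged
 * (every comparison with NaN is false, so NaN falls through). */
static float flush_to_zero(float x)
{
    if (x == 0.0f)
        return x;                       /* keep the sign of +0.0 / -0.0 */
    if (x > -FLT_MIN && x < FLT_MIN)    /* magnitude below smallest normal */
        return x < 0.0f ? -0.0f : 0.0f;
    return x;
}
```

FLT_MIN is the smallest positive normal float, so anything strictly between -FLT_MIN and FLT_MIN other than zero is subnormal.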

Signed-off-by: Aaron Watry <awatry@gmail.com>

This fails conformance for me:

errors.txt (18.6 KB)

Interestingly enough, some (but not all) of the test inputs start to pass
when I forcefully enable subnormal support in libclc
(--enable-runtime-subnormal, plus updating
generic/lib/shared/subnormal_config.cl to enable 32-bit subnormals).

I'll be playing with this a bit as time permits.

--Aaron

I’m guessing you need to somehow access one of the other division implementations. By default, fdiv will be getting the !fpmath 2.5 ULP metadata. Can you try adding -cl-fp32-correctly-rounded-divide-sqrt to the build of this file, or calling a wrapper IR function which avoids the metadata?

Although this probably doesn’t explain the f64 failures.
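
To make the concern about division accuracy concrete, here is a hedged C sketch. It is not the algorithm in the patch; remainder_from_quotient is a hypothetical helper that follows the textbook definition r = x - n*y, where n is the quotient rounded to nearest with ties to even, and it is only meaningful while the quotient fits in an int. The quotient is passed in separately so that the effect of a slightly inaccurate division can be seen directly.

```c
/* Hypothetical helper, for illustration only: compute the remainder from
 * a precomputed quotient q that approximates x / y. */
static float remainder_from_quotient(float x, float y, float q)
{
    int n = (int)q;               /* truncate toward zero */
    float frac = q - (float)n;    /* fractional part, same sign as q */

    /* Round to nearest integer; exact halves go to the even neighbour. */
    if (frac > 0.5f || (frac == 0.5f && (n & 1)))
        n += 1;
    else if (frac < -0.5f || (frac == -0.5f && (n & 1)))
        n -= 1;

    return x - (float)n * y;
}
```

With x = 5 and y = 2, the exact quotient 2.5f is a tie that rounds to n = 2, giving a remainder of 1. Passing 2.5000002f instead, which is one ULP above 2.5 in single precision, rounds to n = 3 and gives -1. A quotient that is only accurate to a couple of ULP can therefore flip the result near ties.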