[PATCH] math: Add fdim implementation

Based on the amd-builtin, but explicitly vectorized for all sizes (not just
float4), and includes a vectorized double implementation.

Passes piglit (float) tests on pitcairn.

Signed-off-by: Aaron Watry <awatry@gmail.com>

Based on the amd-builtin, but explicitly vectorized for all sizes (not just
float4), and includes a vectorized double implementation.

I'm not a big fan of copying bit magic from amd-builtins. In this case
it only avoids a branch while using the same number of instructions (on
amdgcn there are a few more scalar instructions, so the branch version
might end up being faster).
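
To make the trade-off concrete, here is a minimal sketch in plain C rather
than OpenCL C. It is not the code from the patch or from amd-builtins; the
names fdim_branch and fdim_mask are made up for illustration, and whether a
compiler really emits branch-free code for the second variant depends on
the target.

```c
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Straightforward version: one explicit NaN check and one select. */
static float fdim_branch(float x, float y)
{
    if (isnan(x) || isnan(y))
        return NAN;
    return x > y ? x - y : 0.0f;
}

/* Mask-based version (illustrative only): compute x - y unconditionally,
 * then AND its bit pattern with an all-ones or all-zeros mask. */
static float fdim_mask(float x, float y)
{
    float d = x - y;
    uint32_t bits;
    memcpy(&bits, &d, sizeof bits);

    /* All ones when x > y or when either input is NaN (d is NaN in that
     * case and must be kept); zero otherwise, which yields +0.0f. */
    uint32_t keep = -(uint32_t)((x > y) | (x != x) | (y != y));
    bits &= keep;

    memcpy(&d, &bits, sizeof d);
    return d;
}
```

Both functions should agree on ordinary inputs, NaN inputs, and the
inf - inf case, so the choice between them is about instruction count and
readability rather than results.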

That said, I think the patch is OK with a few comments to make it
friendlier for quick eyeballing. The code follows the 'naive'
implementation pretty closely. With a few comments added:

Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu>

Passes piglit (float) tests on pitcairn.

Signed-off-by: Aaron Watry <awatry@gmail.com>
---
I did test the double implementation on my pitcairn as well, but those
tests aren't in piglit.

I just copied/pasted the generated float tests, enabled the fp64 pragma,
and did s/float/double in the test file.

I hadn't planned on sending those to piglit unless someone really
wants them.

I agree, extending generators to generate double variants would be
preferable to introducing individual generated tests.

Jan

PS: sorry for a bit of a rant. I have attached my testing files if you
are interested.

fdim_test.tgz (3.05 KB)