Implement tgamma and some of its dependencies

This series started from my attempt to implement tgamma so that I can use
it eventually for hcc. The amd-builtin implementation of tgamma uses lgamma,
which itself uses lgamma_r. So, in the end, all 3 of these functions have
been ported/implemented.

Note: I'm fairly sure that _CLC_V_V_VP_VECTORIZE was previously unused;
      as far as I can tell, it has never been used. I'm also fairly
      confident about the address space addition. If anyone has concerns,
      please let me know.

This macro is currently unused, but I plan to use it shortly.

The previous form cast pointers without specifying an address space, which
doesn't work for CL 1.x, where there is no generic address space.

Signed-off-by: Aaron Watry <awatry@gmail.com>

Ported from the amd-builtins branch, which is itself based on the
Sun Microsystems implementation.

Signed-off-by: Aaron Watry <awatry@gmail.com>

Implement lgamma by calling lgamma_r and ignoring the sign value returned
through the second argument.

Signed-off-by: Aaron Watry <awatry@gmail.com>

Signed-off-by: Aaron Watry <awatry@gmail.com>

LGTM for the series.

-Tom

I created and used _CLC_V_V_VP_VECTORIZE in the first iteration of a
series to implement the modf builtin based on the AMD builtin library.
However, in the second try I used a much simpler approach based on the
reference implementation, which no longer used the macro. So I guess it
was probably pushed by accident. Feel free to fix it or use it any way you
want.

And yeah, I'm quite sure that the address space part was broken...

Pavel

These pass the conformance test for me on Bonaire