[PATCH 1/2] Fix implementation of sqrt

Passing values less than 0 to the llvm.sqrt() intrinsic results in
undefined behavior, so we need to check the input and return NaN if
is is less than 0.

This patch does not compile for me, details inlined.

Other than that LGTM. I guess NaN is a good for implementation defined
value.

jan

Passing values less than 0 to the llvm.sqrt() intrinsic results in
undefined behavior, so we need to check the input and return NaN if
is is less than 0.
---
generic/include/clc/math/sqrt.h | 9 +++------
generic/include/math/clc_sqrt.h | 29 +++++++++++++++++++++++++++++
generic/include/math/clc_sqrt.inc | 23 +++++++++++++++++++++++
generic/lib/SOURCES | 2 ++
generic/lib/math/clc_sqrt.cl | 37 +++++++++++++++++++++++++++++++++++++
generic/lib/math/clc_sqrt.inc | 38 ++++++++++++++++++++++++++++++++++++++
generic/lib/math/sqrt.cl | 35 +++++++++++++++++++++++++++++++++++
7 files changed, 167 insertions(+), 6 deletions(-)
create mode 100644 generic/include/math/clc_sqrt.h
create mode 100644 generic/include/math/clc_sqrt.inc
create mode 100644 generic/lib/math/clc_sqrt.cl
create mode 100644 generic/lib/math/clc_sqrt.inc

having two files named clc_sqrt.inc with the same include path
(math/)created

create mode 100644 generic/lib/math/sqrt.cl

diff --git a/generic/include/clc/math/sqrt.h b/generic/include/clc/math/sqrt.h
index f69de84..ccde974 100644
--- a/generic/include/clc/math/sqrt.h
+++ b/generic/include/clc/math/sqrt.h
@@ -1,6 +1,3 @@
-#undef sqrt
-#define sqrt __clc_sqrt
-
-#define __CLC_FUNCTION __clc_sqrt
-#define __CLC_INTRINSIC "llvm.sqrt"
-#include <clc/math/unary_intrin.inc>
+#define __CLC_BODY <clc/math/sqrt.inc>

clc/math/sqrt.inc is not present nor added in this patch

---
amdgcn/lib/SOURCES | 1 +
amdgcn/lib/math/sqrt.cl | 59 +++++++++++++++++++++++++++++++++++++++++++++++++

I see you created new folder for amdgcn, is the implementation expected
to be different for fp64 enabled EG and NI asics?

> ---
> amdgcn/lib/SOURCES | 1 +
> amdgcn/lib/math/sqrt.cl | 59 +++++++++++++++++++++++++++++++++++++++++++++++++

I see you created new folder for amdgcn, is the implementation expected
to be different for fp64 enabled EG and NI asics?

No, I can move it into the r600 directory.

-Tom

> > ---
> > amdgcn/lib/SOURCES | 1 +
> > amdgcn/lib/math/sqrt.cl | 59
+++++++++++++++++++++++++++++++++++++++++++++++++
>
> I see you created new folder for amdgcn, is the implementation expected
> to be different for fp64 enabled EG and NI asics?
>

No, I can move it into the r600 directory.

with that change LGTM

Passing values less than 0 to the llvm.sqrt() intrinsic results in
undefined behavior, so we need to check the input and return NaN if
is is less than 0.

This broke compilation with LLVM 3.6:
In file included from ./generic/lib/math/clc_sqrt.cl:37:
In file included from ./generic/include/clc/math/gentype.inc:80:
./generic/lib/math/clc_sqrt_impl.inc:34:23: error: passing '__constant char *'
to parameter of type 'const char *' changes address space of pointer
  return val < ZERO ? __CLC_NAN : __clc_llvm_intr_sqrt(val);
                      ^~~~~~~~~
./generic/lib/math/clc_sqrt_impl.inc:27:33: note: expanded from macro
'__CLC_NAN'
#define __CLC_NAN __builtin_nan("")
                                ^~

EdB

Hi,

I've created a tag called 36_COMPATIBLE to mark the last
working commit with 3.6. Now that LLVM is on 3.8, I'm not
sure we want to keep supporting 3.6 in trunk.

-Tom