Geometric builtin pages v2

Hi,

Here is version 2 of my geometric builtin series. I've omitted
all the patches from v1 which received LGTM reviews.

-Tom

v2:
  - Move common code into a macro
  - Use the same constant for all vector types.

This implementation was ported from the AMD builtin library
and has been tested with piglit, OpenCV, and the ocl conformance tests.

v2:
  - Remove unnecessary copyright.

This is a generic implementation which just calls sqrt. Targets should
override this if they want a faster implementation.

v2:
  - Alphabetize SOURCES

The new implementation was ported from the AMD builtin library
and has been tested with piglit, OpenCV, and the ocl conformance tests.

This is a generic implementation which just calls rsqrt.
Targets should override this if they want a faster implementation.

v2:
  - Alphabetize SOURCES

This implementation was ported from the AMD builtin library
and has been tested with piglit, OpenCV, and the ocl conformance tests.

The double versions shouldn’t have the f suffix on the constants.
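
To illustrate why the suffix matters, here is a small plain-C sketch (the function names and the constant 0.1 are made up for the example, not taken from the patch). A literal with an `f` suffix is rounded to float precision before it is widened to double, so the double computation inherits the float rounding error.

```c
/* Constant carries an f suffix: 0.1 is rounded to float, then widened. */
static double example_scale_float_constant(double x) {
    return x * 0.1f;
}

/* Constant has no suffix: 0.1 is a double literal from the start. */
static double example_scale_double_constant(double x) {
    return x * 0.1;
}
```

With `x = 1.0` the two results differ by roughly 1.5e-9, which is far larger than double rounding error.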

This implementation was ported from the AMD builtin library
and has been tested with piglit, OpenCV, and the ocl conformance tests.

v2:
  - Remove f suffix from constant in double implementations.
  - Consolidate implementations using the .cl/.inc approach.
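
For readers unfamiliar with the .cl/.inc approach: the .cl file defines `__CLC_BODY` and includes a driver such as floatn.inc, which re-includes the body once per type. The single-file C sketch below only imitates that idea with a function-like macro, since a standalone example cannot re-include a second file; `EXAMPLE_BODY` and the `example_square_*` functions are hypothetical.

```c
/* Stand-in for the .inc body: one definition, parameterized on the type. */
#define EXAMPLE_BODY(TYPE, SUFFIX) \
    static TYPE example_square_##SUFFIX(TYPE x) { return x * x; }

/* Stand-in for the driver, which instantiates the body once per type. */
EXAMPLE_BODY(float, f32)
EXAMPLE_BODY(double, f64)
```

This yields `example_square_f32` and `example_square_f64` from a single body, which is the consolidation the changelog entry refers to.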

This implementation was ported from the AMD builtin library
and has been tested with piglit, OpenCV, and the ocl conformance tests.

v2:
- Remove f suffix from constant in double implementations.
- Consolidate implementations using the .cl/.inc approach.
---
generic/include/clc/clc.h | 1 +
generic/include/clc/geometric/fast_normalize.h | 24 ++++++++++++++
generic/include/clc/geometric/fast_normalize.inc | 24 ++++++++++++++
generic/include/clc/geometric/floatn.inc | 8 +++++
generic/lib/SOURCES | 1 +
generic/lib/geometric/fast_normalize.cl | 40 ++++++++++++++++++++++++
generic/lib/geometric/fast_normalize.inc | 39 +++++++++++++++++++++++
7 files changed, 137 insertions(+)
create mode 100644 generic/include/clc/geometric/fast_normalize.h
create mode 100644 generic/include/clc/geometric/fast_normalize.inc
create mode 100644 generic/lib/geometric/fast_normalize.cl
create mode 100644 generic/lib/geometric/fast_normalize.inc

diff --git a/generic/include/clc/clc.h b/generic/include/clc/clc.h
index 77e0af9..b4409a7 100644
--- a/generic/include/clc/clc.h
+++ b/generic/include/clc/clc.h
@@ -123,6 +123,7 @@
#include <clc/geometric/dot.h>
#include <clc/geometric/fast_distance.h>
#include <clc/geometric/fast_length.h>
+#include <clc/geometric/fast_normalize.h>
#include <clc/geometric/length.h>
#include <clc/geometric/normalize.h>

diff --git a/generic/include/clc/geometric/fast_normalize.h b/generic/include/clc/geometric/fast_normalize.h
new file mode 100644
index 0000000..cba69c2
--- /dev/null
+++ b/generic/include/clc/geometric/fast_normalize.h
@@ -0,0 +1,24 @@
+/*
+ * Copyright (c) 2014,2015 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+
+#define __CLC_BODY <clc/geometric/fast_normalize.inc>
+#include <clc/geometric/floatn.inc>
diff --git a/generic/include/clc/geometric/fast_normalize.inc b/generic/include/clc/geometric/fast_normalize.inc
new file mode 100644
index 0000000..3ef8f86
--- /dev/null
+++ b/generic/include/clc/geometric/fast_normalize.inc
@@ -0,0 +1,24 @@
+/*
+ * Copyright (c) 2014,2015 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+
+
+_CLC_OVERLOAD _CLC_DECL __CLC_FLOATN fast_normalize(__CLC_FLOATN p);
diff --git a/generic/include/clc/geometric/floatn.inc b/generic/include/clc/geometric/floatn.inc
index fb7a9ae..f9e3d30 100644
--- a/generic/include/clc/geometric/floatn.inc
+++ b/generic/include/clc/geometric/floatn.inc
@@ -1,8 +1,11 @@
#define __CLC_FLOAT float
+#define __CLC_FP32

In generic/include/clc/math/gentype.inc __CLC_FPSIZE is defined to achieve something similar. Maybe we should keep this consistent across the codebase?

#define __CLC_FLOATN float
+#define __CLC_SCALAR
#include __CLC_BODY
#undef __CLC_FLOATN
+#undef __CLC_SCALAR

#define __CLC_FLOATN float2
#include __CLC_BODY
@@ -17,14 +20,18 @@
#undef __CLC_FLOATN

#undef __CLC_FLOAT
+#undef __CLC_FP32

#ifdef cl_khr_fp64

#define __CLC_FLOAT double
+#define __CLC_FP64

Same as above.

Jeroen

This implementation was ported from the AMD builtin library
and has been tested with piglit, OpenCV, and the ocl conformance tests.

v2:
  - Remove f suffix from constant in double implementations.
  - Consolidate implementations using the .cl/.inc approach.

v3:
- Use __CLC_FPSIZE instead of __CLC_FP{32,64}
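
A sketch of why a single size macro can be more convenient than one flag per precision: the body can branch with `#if ... == 32` / `#elif ... == 64` and reject anything else. `EXAMPLE_FPSIZE`, `example_real_t` and `example_epsilon` are hypothetical names standing in for whatever the driver file defines; this is plain C, not the libclc source.

```c
#include <float.h>

/* Hypothetical stand-in for the value the driver would define (32 or 64). */
#define EXAMPLE_FPSIZE 64

#if EXAMPLE_FPSIZE == 32
typedef float example_real_t;
#define EXAMPLE_EPSILON FLT_EPSILON
#elif EXAMPLE_FPSIZE == 64
typedef double example_real_t;
#define EXAMPLE_EPSILON DBL_EPSILON
#else
#error "unsupported EXAMPLE_FPSIZE"
#endif

/* Returns the machine epsilon matching the selected precision. */
static example_real_t example_epsilon(void) {
    return EXAMPLE_EPSILON;
}
```

With per-precision flags, the same selection would need separate `#ifdef` checks and could not catch an unexpected configuration as directly.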

This implementation was ported from the AMD builtin library
and has been tested with piglit, OpenCV, and the ocl conformance tests.

v2:
- Remove f suffix from constant in double implementations.
- Consolidate implementations using the .cl/.inc approach.

v3:
- Use __CLC_FPSIZE instead of __CLC_FP{32,64}
---
generic/include/clc/clc.h | 1 +
generic/include/clc/geometric/fast_normalize.h | 24 ++++++++++++++
generic/include/clc/geometric/fast_normalize.inc | 24 ++++++++++++++
generic/include/clc/geometric/floatn.inc | 8 +++++
generic/lib/SOURCES | 1 +
generic/lib/geometric/fast_normalize.cl | 40 ++++++++++++++++++++++++
generic/lib/geometric/fast_normalize.inc | 39 +++++++++++++++++++++++
7 files changed, 137 insertions(+)
create mode 100644 generic/include/clc/geometric/fast_normalize.h
create mode 100644 generic/include/clc/geometric/fast_normalize.inc
create mode 100644 generic/lib/geometric/fast_normalize.cl
create mode 100644 generic/lib/geometric/fast_normalize.inc

diff --git a/generic/include/clc/clc.h b/generic/include/clc/clc.h
index 77e0af9..b4409a7 100644
--- a/generic/include/clc/clc.h
+++ b/generic/include/clc/clc.h
@@ -123,6 +123,7 @@
#include <clc/geometric/dot.h>
#include <clc/geometric/fast_distance.h>
#include <clc/geometric/fast_length.h>
+#include <clc/geometric/fast_normalize.h>
#include <clc/geometric/length.h>
#include <clc/geometric/normalize.h>

diff --git a/generic/include/clc/geometric/fast_normalize.h b/generic/include/clc/geometric/fast_normalize.h
new file mode 100644
index 0000000..cba69c2
--- /dev/null
+++ b/generic/include/clc/geometric/fast_normalize.h
@@ -0,0 +1,24 @@
+/*
+ * Copyright (c) 2014,2015 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+
+#define __CLC_BODY <clc/geometric/fast_normalize.inc>
+#include <clc/geometric/floatn.inc>
diff --git a/generic/include/clc/geometric/fast_normalize.inc b/generic/include/clc/geometric/fast_normalize.inc
new file mode 100644
index 0000000..3ef8f86
--- /dev/null
+++ b/generic/include/clc/geometric/fast_normalize.inc
@@ -0,0 +1,24 @@
+/*
+ * Copyright (c) 2014,2015 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+
+
+_CLC_OVERLOAD _CLC_DECL __CLC_FLOATN fast_normalize(__CLC_FLOATN p);
diff --git a/generic/include/clc/geometric/floatn.inc b/generic/include/clc/geometric/floatn.inc
index fb7a9ae..2c8e2d8 100644
--- a/generic/include/clc/geometric/floatn.inc
+++ b/generic/include/clc/geometric/floatn.inc
@@ -1,8 +1,11 @@
#define __CLC_FLOAT float
+#define __CLC_FPSIZE 32

#define __CLC_FLOATN float
+#define __CLC_SCALAR
#include __CLC_BODY
#undef __CLC_FLOATN
+#undef __CLC_SCALAR

#define __CLC_FLOATN float2
#include __CLC_BODY
@@ -17,14 +20,18 @@
#undef __CLC_FLOATN

#undef __CLC_FLOAT
+#undef __CLC_FPSIZE

#ifdef cl_khr_fp64

#define __CLC_FLOAT double
+#define __CLC_FPSIZE 64

#define __CLC_FLOATN double
+#define __CLC_SCALAR
#include __CLC_BODY
#undef __CLC_FLOATN
+#undef __CLC_SCALAR

#define __CLC_FLOATN double2
#include __CLC_BODY
@@ -39,6 +46,7 @@
#undef __CLC_FLOATN

#undef __CLC_FLOAT
+#undef __CLC_FPSIZE

#endif

diff --git a/generic/lib/SOURCES b/generic/lib/SOURCES
index e01194f..9367c24 100644
--- a/generic/lib/SOURCES
+++ b/generic/lib/SOURCES
@@ -38,6 +38,7 @@ geometric/distance.cl
geometric/dot.cl
geometric/fast_distance.cl
geometric/fast_length.cl
+geometric/fast_normalize.cl
geometric/length.cl
geometric/normalize.cl
integer/abs.cl
diff --git a/generic/lib/geometric/fast_normalize.cl b/generic/lib/geometric/fast_normalize.cl
new file mode 100644
index 0000000..d3d7846
--- /dev/null
+++ b/generic/lib/geometric/fast_normalize.cl
@@ -0,0 +1,40 @@
+/*
+ * Copyright (c) 2014,2015 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+
+#include <clc/clc.h>
+
+_CLC_OVERLOAD _CLC_DEF float fast_normalize(float p) {
+ return normalize(p);
+}
+
+#ifdef cl_khr_fp64
+
+#pragma OPENCL EXTENSION cl_khr_fp64 : enable
+
+_CLC_OVERLOAD _CLC_DEF double fast_normalize(double p) {
+ return normalize(p);
+}

Shouldn’t this be:

#include <clc/clc.h>

#ifdef cl_khr_fp64
#pragma OPENCL EXTENSION cl_khr_fp64 : enable
#endif

#define __CLC_BODY <fast_normalize.inc>
#include <clc/geometric/floatn.inc>

Jeroen

I did it this way because the scalar implementations are different. The
other way to do this would be to move the scalar implementations into
the .inc file and guard them with #ifdef __CLC_SCALAR. Which do you prefer?

-Tom

Ah, sorry, I missed that. The current choice is fine.

Is there any particular reason why the scalar version is different?

Jeroen

>>> diff --git a/generic/include/clc/geometric/fast_normalize.inc b/generic/include/clc/geometric/fast_normalize.inc
>>> new file mode 100644
>>> index 0000000..3ef8f86
>>> --- /dev/null
>>> +++ b/generic/include/clc/geometric/fast_normalize.inc
>>> @@ -0,0 +1,24 @@
>>> +/*
>>> + * Copyright (c) 2014,2015 Advanced Micro Devices, Inc.
>>> + *
>>> + * Permission is hereby granted, free of charge, to any person obtaining a copy
>>> + * of this software and associated documentation files (the "Software"), to deal
>>> + * in the Software without restriction, including without limitation the rights
>>> + * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
>>> + * copies of the Software, and to permit persons to whom the Software is
>>> + * furnished to do so, subject to the following conditions:
>>> + *
>>> + * The above copyright notice and this permission notice shall be included in
>>> + * all copies or substantial portions of the Software.
>>> + *
>>> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
>>> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
>>> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
>>> + * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
>>> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
>>> + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
>>> + * THE SOFTWARE.
>>> + */
>>> +
>>> +
>>> +_CLC_OVERLOAD _CLC_DECL __CLC_FLOATN fast_normalize(__CLC_FLOATN p);
>>> diff --git a/generic/include/clc/geometric/floatn.inc b/generic/include/clc/geometric/floatn.inc
>>> index fb7a9ae..2c8e2d8 100644
>>> --- a/generic/include/clc/geometric/floatn.inc
>>> +++ b/generic/include/clc/geometric/floatn.inc
>>> @@ -1,8 +1,11 @@
>>> #define __CLC_FLOAT float
>>> +#define __CLC_FPSIZE 32
>>>
>>> #define __CLC_FLOATN float
>>> +#define __CLC_SCALAR
>>> #include __CLC_BODY
>>> #undef __CLC_FLOATN
>>> +#undef __CLC_SCALAR
>>>
>>> #define __CLC_FLOATN float2
>>> #include __CLC_BODY
>>> @@ -17,14 +20,18 @@
>>> #undef __CLC_FLOATN
>>>
>>> #undef __CLC_FLOAT
>>> +#undef __CLC_FPSIZE
>>>
>>> #ifdef cl_khr_fp64
>>>
>>> #define __CLC_FLOAT double
>>> +#define __CLC_FPSIZE 64
>>>
>>> #define __CLC_FLOATN double
>>> +#define __CLC_SCALAR
>>> #include __CLC_BODY
>>> #undef __CLC_FLOATN
>>> +#undef __CLC_SCALAR
>>>
>>> #define __CLC_FLOATN double2
>>> #include __CLC_BODY
>>> @@ -39,6 +46,7 @@
>>> #undef __CLC_FLOATN
>>>
>>> #undef __CLC_FLOAT
>>> +#undef __CLC_FPSIZE
>>>
>>> #endif
>>>
>>> diff --git a/generic/lib/SOURCES b/generic/lib/SOURCES
>>> index e01194f..9367c24 100644
>>> --- a/generic/lib/SOURCES
>>> +++ b/generic/lib/SOURCES
>>> @@ -38,6 +38,7 @@ geometric/distance.cl
>>> geometric/dot.cl
>>> geometric/fast_distance.cl
>>> geometric/fast_length.cl
>>> +geometric/fast_normalize.cl
>>> geometric/length.cl
>>> geometric/normalize.cl
>>> integer/abs.cl
>>> diff --git a/generic/lib/geometric/fast_normalize.cl b/generic/lib/geometric/fast_normalize.cl
>>> new file mode 100644
>>> index 0000000..d3d7846
>>> --- /dev/null
>>> +++ b/generic/lib/geometric/fast_normalize.cl
>>> @@ -0,0 +1,40 @@
>>> +/*
>>> + * Copyright (c) 2014,2015 Advanced Micro Devices, Inc.
>>> + *
>>> + * Permission is hereby granted, free of charge, to any person obtaining a copy
>>> + * of this software and associated documentation files (the "Software"), to deal
>>> + * in the Software without restriction, including without limitation the rights
>>> + * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
>>> + * copies of the Software, and to permit persons to whom the Software is
>>> + * furnished to do so, subject to the following conditions:
>>> + *
>>> + * The above copyright notice and this permission notice shall be included in
>>> + * all copies or substantial portions of the Software.
>>> + *
>>> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
>>> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
>>> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
>>> + * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
>>> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
>>> + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
>>> + * THE SOFTWARE.
>>> + */
>>> +
>>> +#include <clc/clc.h>
>>> +
>>> +_CLC_OVERLOAD _CLC_DEF float fast_normalize(float p) {
>>> + return normalize(p);
>>> +}
>>> +
>>> +#ifdef cl_khr_fp64
>>> +
>>> +#pragma OPENCL EXTENSION cl_khr_fp64 : enable
>>> +
>>> +_CLC_OVERLOAD _CLC_DEF double fast_normalize(double p) {
>>> + return normalize(p);
>>> +}
>>
>> Shouldn’t this be:
>>
>> #include <clc/clc.h>
>>
>> #ifdef cl_khr_fp64
>> #pragma OPENCL EXTENSION cl_khr_fp64 : enable
>> #endif
>>
>> #define __CLC_BODY <fast_normalize.inc>
>> #include <clc/geometric/floatn.inc>
>>
>
> I did it this way because the scalar implementations are different. The
> other way to do this would be to move the scalar implementations into
> the .inc file and guard them with #ifdef __CLC_SCALAR. Which do you prefer?

Ah, sorry, I missed that. The current choice is fine.

Is there any particular reason why the scalar version is different?

It's because the function operates on the vector component-wise:

a * (1 / sqrt(a.x^2 + a.y^2 + a.z^2))

So the scalar case is trivial.

-Tom

Of course.

Ok, LGTM.

Jeroen

These first 3 all look fine to me.