New OpenMP runtime release

For your Christmas hacking pleasure.

This release aligns with Intel(r) Composer XE 2013 SP1 Product Update 2

New features
* The library can now be built with clang (though with some
  limitations since clang does not support 128 bit floats, so the library built with
  clang cannot be used with a different compiler that does have such support.)
* Support for Vtune analysis of load imbalance
* Code contribution from Steven Noonan to build the runtime for ARM*
  architecture processors
* First implementation of runtime API for OpenMP cancellation

Bug Fixes
* Fixed hang on Windows (only) when using KMP_BLOCKTIME=0

-- Jim

James Cownie <james.h.cownie@intel.com>
SSG/DPD/TCAR (Technical Computing, Analyzers and Runtimes)
Tel: +44 117 9071438

For your Christmas hacking pleasure.

This release aligns with Intel(r) Composer XE 2013 SP1 Product Update 2

New features
* The library can now be built with clang (though with some
  limitations since clang does not support 128 bit floats, so the library built with
  clang cannot be used with a different compiler that does have such support.)
* Support for Vtune analysis of load imbalance
* Code contribution from Steven Noonan to build the runtime for ARM*
  architecture processors

Cool. Thanks, Jim!

It looks like there's still a minor correction that needs to happen. I
didn't get an opportunity to test the merged changes until they went
public, and there's actually a compile error in there now:

  ../../src/z_Linux_asm.s: Assembler messages:
  ../../src/z_Linux_asm.s:1590: Error: junk at end of line, first unrecognized character is `,'

With a quick bit of Googling, I found this:

    http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=422971

In that bug report's comments, Daniel Jacobowitz clarifies that '@' is
an ARM comment marker, so using '%' is the preferred annotation. I've
compile and run-tested this on both x86_64 and ARM, and both work at
this point:

diff --git a/runtime/src/z_Linux_asm.s b/runtime/src/z_Linux_asm.s
index 1f1ba1b..bba3b14 100644
--- a/runtime/src/z_Linux_asm.s
+++ b/runtime/src/z_Linux_asm.s
@@ -1587,5 +1587,5 @@ __kmp_unnamed_critical_addr:

#if defined(__linux__)
-.section .note.GNU-stack,"",@progbits
+.section .note.GNU-stack,"",%progbits
#endif

Now to conquer the build issues on Mac OS X...

That change is almost certainly incorrect (we've had to fix that same problem in libunwind and openblas before)

Take a look at this and see if something similar works for you
https://github.com/xianyi/OpenBLAS/blob/develop/driver/others/memory.c

OpenBLAS/driver/others/memory.c: asm ("\t.section\t.ctors,\"aw\",@progbits; .align 8; .quad gotoblas_init; .section .text");

If you really can't support this for some crazy reason then please hide it further under an ARM specific #if

Thanks

For your Christmas hacking pleasure.

This release aligns with Intel(r) Composer XE 2013 SP1 Product Update 2

New features
* The library can now be built with clang (though with some
   limitations since clang does not support 128 bit floats, so the library built with
   clang cannot be used with a different compiler that does have such support.)
* Support for Vtune analysis of load imbalance
* Code contribution from Steven Noonan to build the runtime for ARM*
   architecture processors

Cool. Thanks, Jim!

It looks like there's still a minor correction that needs to happen. I
didn't get an opportunity to test the merged changes until they went
public, and there's actually a compile error in there now:

        ../../src/z_Linux_asm.s: Assembler messages:
        ../../src/z_Linux_asm.s:1590: Error: junk at end of line, first unrecognized character is `,'

With a quick bit of Googling, I found this:

     http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=422971

In that bug report's comments, Daniel Jacobowitz clarifies that '@' is
an ARM comment marker, so using '%' is the preferred annotation. I've
compile and run-tested this on both x86_64 and ARM, and both work at
this point:

diff --git a/runtime/src/z_Linux_asm.s b/runtime/src/z_Linux_asm.s
index 1f1ba1b..bba3b14 100644
--- a/runtime/src/z_Linux_asm.s
+++ b/runtime/src/z_Linux_asm.s
@@ -1587,5 +1587,5 @@ __kmp_unnamed_critical_addr:

  #if defined(__linux__)
-.section .note.GNU-stack,"",@progbits
+.section .note.GNU-stack,"",%progbits
  #endif

That change is almost certainly incorrect (we've had to fix that same problem in libunwind and openblas before)

With a bit more searching around, I think the correct answer is actually this:

#if defined (__linux__)
#if KMP_ARCH_ARM
.section .note.GNU-stack,"",%progbits
#else
.section .note.GNU-stack,"",@progbits
#endif
#endif

Based on binutils documentation:
https://sourceware.org/binutils/docs-2.18/as/Section.html

"Note on targets where the @ character is the start of a comment (eg ARM) then
another character is used instead. For example the ARM port uses the %
character."

So on x86, @ is correct, while on ARM % is correct.

Better patch:

diff --git a/runtime/src/z_Linux_asm.s b/runtime/src/z_Linux_asm.s
index 1f1ba1b..0a96d38 100644
--- a/runtime/src/z_Linux_asm.s
+++ b/runtime/src/z_Linux_asm.s
@@ -1587,5 +1587,9 @@ __kmp_unnamed_critical_addr:

#if defined(__linux__)
+#if KMP_ARCH_ARM
+.section .note.GNU-stack,"",%progbits
+#else
.section .note.GNU-stack,"",@progbits
#endif
+#endif

Take a look at this and see if something similar works for you
https://github.com/xianyi/OpenBLAS/blob/develop/driver/others/memory.c

OpenBLAS/driver/others/memory.c: asm ("\t.section\t.ctors,\"aw\",@progbits; .align 8; .quad gotoblas_init; .section .text");

That also uses the @, which makes ARM 'as' sad.

2 nits while we're here

#1 (typically?) .S indicates the asm file needs to be pre-processed and .s doesn't. I'd suggest to rename the file

#2 It may arguably be better long term design to have something like
runtime/src/arm/z_Linux_asm.s
runtime/src/amd64/z_Linux_asm.s

this may add some duplication, but avoids yucky macro hell since that file is likely to be a bane of portability. I wonder if Hal/Jeff (who worked on the BGQ port) have any comments

(Sorry for top post. On phone.)

What I’m proposing is just the minimal change to get the thing building as designed.

I would agree that refactoring this would be a good idea, but that would be a logically separate change. Let’s not conflate the two topics or gate the patch on refactoring. Can we start another thread about refactoring? I have some ideas about the subject (particularly on cleaning up the build system), but haven’t acted on them yet.

I do care about the file rename (minimal change), but adding a directory hierarchy is less an issue (non-blocker)

Thanks for the ARM port and sorry about the nits