Some more patches

These patches have more meat to them.

two_task_teams.patch - The tt_state flag is removed and replaced by an array of two task_team pointers in the team struct. Each thread’s local th_task_team points to one of those two task_teams on the team struct, and the thread’s th_task_state indicates which one. With this change, inside a barrier, the master thread can now change out the task_team on the team struct at any time, because the threads will switch to look at the other task_team. This gives us some flexibility in implementing barrier algorithms that don’t have separate gather and release phases, like dissemination barriers.

backtrace_fix_cfi.patch - Inline assembly breaks back trace for OpenMP applications. This change adds CFI directives for stack tracing.

omp_is_initial_device.patch - omp_is_initial_device() was implemented based on operating system:

We use _Offload_get_device_number() on Linux.

And on other operating systems, omp_is_initial_device() always returns 1.

debugger_struct_update.patch – updates the kmp_omp_struct_info_t for debuggers. Kmp_omp_struct_info_t is a structure that was prepared and exported as a static symbol

to be used for interfacing with debuggers. In this change, kmp_omp_struct_info_t structure was fixed to reflect the current OpenMP RTL structures.

loop_out_of_bounds_fix.patch – small patch to fix a loop-out-of-bounds error for Fortran.

Applying patches (all independent):

$ patch –p0 < two_task_teams.patch

$ patch –p0 < backtrace_fix_cfi.patch

$ patch –p0 < omp_is_initial_device.patch

$ patch –p0 < debugger_struct_update.patch

$ patch –p0 < loop_out_of_bounds_fix.patch

– Johnny

two_task_teams.patch (37.6 KB)

backtrace_fix_cfi.patch (6.7 KB)

omp_is_initial_device.patch (7.86 KB)

debugger_struct_update.patch (6.07 KB)

loop_out_of_bounds_fix.patch (516 Bytes)

There has been no replies to these. Are we ok to commit?

– Johnny

Hi Johnny,

First, it would be really nice if, in the future, you could post full-context patches to reviews.llvm.org. It makes it much easier (for me, at least) to figure out what is going on. See: http://llvm.org/docs/Phabricator.html#requesting-a-review-via-the-web-interface

Comments on patches inline...

From: openmp-dev-bounces@cs.uiuc.edu
[mailto:openmp-dev-bounces@cs.uiuc.edu] On Behalf Of Peyton,
Jonathan L
Sent: Friday, January 30, 2015 5:42 PM
To: openmp-dev@dcs-maillist2.engr.illinois.edu
Subject: [Openmp-dev] Some more patches

These patches have more meat to them.

two_task_teams.patch - The tt_state flag is removed and replaced by
an array of two task_team pointers in the team struct. Each thread's
local th_task_team points to one of those two task_teams on the team
struct, and the thread's th_task_state indicates which one. With
this change, inside a barrier, the master thread can now change out
the task_team on the team struct at any time, because the threads
will switch to look at the other task_team. This gives us some
flexibility in implementing barrier algorithms that don’t have
separate gather and release phases, like dissemination barriers.

I've not really looked at the algorithm in detail, so here are a few cosmetic remarks:

- else {
+ //else
         // All threads have reported in, and no tasks were spawned

Remove commented-out else and adjust the indentation of the comment below it

+ for (tt_idx=0; tt_idx<2; ++tt_idx) {
+ // We don't know which of the two task teams workers are waiting on, so deactivate both.
+ kmp_task_team_t *task_team = team->t.t_task_team[tt_idx];
                 if ( ( task_team != NULL ) && TCR_SYNC_4(task_team->tt.tt_active) ) {

we should fixup the indentation here (a follow-up commit is fine if it makes the change clearer in this patch) -- there are a number of other places where this is true too.

+ // Signal worker threads (esp. the extra ones) to stop looking for tasks while spin waiting.
+ // The task teams are reference counted and will be deallocated by the last worker thread.
                 KMP_DEBUG_ASSERT( hot_team->t.t_nproc > 1 );
                 TCW_SYNC_4( task_team->tt.tt_active, FALSE );
                 KMP_MB();

Indentation is odd here too.

From by brief look, it does not seem like this change affects where memory barriers need to be placed; is that accurate?

Please go ahead and commit.

backtrace_fix_cfi.patch - Inline assembly breaks back trace for
OpenMP applications. This change adds CFI directives for stack
tracing.

LGTM. However, I don't understand this choice:

+# if __MIC__ || __MIC2__
+# define KMP_LABEL(x) L_##x // local label
+# else
+# define KMP_LABEL(x) .L_##x // local label hidden from backtraces
+# endif // __MIC__ || __MIC2__

this, at least, deserves an explanation. If there's no reason for it, I propose that we remove the MIC special case.

omp_is_initial_device.patch - omp_is_initial_device() was implemented
based on operating system:

We use _Offload_get_device_number() on Linux.

And on other operating systems, omp_is_initial_device() always
returns 1.

LGTM.

debugger_struct_update.patch – updates the kmp_omp_struct_info_t for
debuggers. Kmp_omp_struct_info_t is a structure that was prepared
and exported as a static symbol

to be used for interfacing with debuggers. In this change,
kmp_omp_struct_info_t structure was fixed to reflect the current
OpenMP RTL structures.

LGTM.

loop_out_of_bounds_fix.patch – small patch to fix a
loop-out-of-bounds error for Fortran.

LGTM. However, please add a comment explaining what is going on. Where else is depth being modified? How does setting it to 1 avoid a loop-out-of-bounds error? Generally speaking, we should say where else depth is being modified (is it only in init() and in __kmp_get_hierarchy?

Thanks again,
Hal

5 patches committed; details/comments below.

1. two_task_teams.patch - The usage of tt_state flag is replaced by an array of two task_team pointers.
  Committed, rev. 228718.

- else {
+ //else
         // All threads have reported in, and no tasks were spawned

Remove commented-out else and adjust the indentation of the comment below it

The comment below "//else" describes things in case the preceding if condition is not true,
thus removing "//else" and changing indentation is not recommended.

I've fixed the indentation problems visible in the rest of the patch (8 cases or so).

2. backtrace_fix_cfi.patch - Added CFI directives to asm code in order to have correct backtraces in gdb.
  Committed, rev. 228721.

+# if __MIC__ || __MIC2__
+# define KMP_LABEL(x) L_##x // local label
+# else
+# define KMP_LABEL(x) .L_##x // local label hidden from backtraces
+# endif // __MIC__ || __MIC2__

We cannot remove MIC case because .L* syntax is rejected by MIC assembler.
For Linux all labels unified to use .L* syntax that allows to see correct backtraces in gdb.

I added comment on the format of local labels (separate commit - rev. 228727).

3. omp_is_initial_device.patch - OpenMP 4.0 standard function omp_is_initial_device() implemented.
  Committed, rev. 228730.

4. debugger_struct_update.patch - Updated the kmp_omp_struct_info_t structure used by debuggers.
  Committed, rev. 228734.

5. loop_out_of_bounds_fix.patch - Fixed memory corruption problem.
  Committed, rev. 228736.

I also included comment into this patch:

+ /* Added explicit initialization of the depth here to prevent usage of dirty value
+ observed when static library is re-initialized multiple times (e.g. when
+ non-OpenMP thread repeatedly launches/joins thread that uses OpenMP). */
+ depth = 1;

Thanks,
Andrey