OT: intel darwin losing primary target status

I realize this is off-topic for the list, but I thought
all the darwin developers here might want to be aware of
this. The current regressions in gcc trunk regarding
exception handling have been escalated to a P1 in order to
attract darwin developers to the issue...

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41260#c31

If these regressions aren't fixed before gcc 4.5's release,
it appears that *-*-darwin will be removed from the primary
target list for FSF gcc. This would be rather unfortunate
since it would eventually compromise the quality of fortran
compilers that darwin users have access to. Hopefully the
current darwin maintainers listed for FSF gcc can find some
approach acceptable to their management where the other
FSF gcc developers can be guided through debugging and
fixing this regression.
                Jack

This may be because the libgcc_s.dylib-based unwinder is incompatible with the darwin unwinder. You cannot mix and match the two. One of the lines from the bugzilla comments shows:
/sw/lib/gcc4.5/lib/libgcc_s.1.dylib (compatibility version 1.0.0,
being used. That will not work. All of the libgcc_s.dylib functionality has been subsumed into libSystem.dylib on SnowLeopard (darwin10). The gcc compiler that shipped with SnowLeopard leaves the -lgcc_s off the link line when targeting SnowLeopard. If there is a newer libgcc_s with new functions added, then the link line needs to change to "-lSystem -lgcc_s". That way the linker will find most routines in libSystem.dylib and only the new functions in libgcc_s.dylib. Thus all linkage units will use the same unwinder.
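
A minimal sketch of the link-order argument, as a toy model of plain first-match symbol resolution. This is not how ld is implemented, and Darwin's linker has additional behaviour it does not model. The export lists are assumptions for illustration: libSystem is assumed to export the _Unwind_* routines, and libgcc_s is assumed to export those plus one newer function, here __emutls_get_address.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy model of a library: a name plus the symbols it exports.
struct Library {
    std::string name;
    std::set<std::string> exports;
};

// Resolve each wanted symbol to the first library on the link line that
// exports it. Symbols that no library exports are left out of the result.
std::map<std::string, std::string>
resolve(const std::vector<Library>& link_line,
        const std::vector<std::string>& wanted) {
    assert(!link_line.empty());
    std::map<std::string, std::string> found_in;
    for (const std::string& sym : wanted) {
        for (const Library& lib : link_line) {
            if (lib.exports.count(sym) != 0) {
                found_in[sym] = lib.name;
                break;  // first match wins
            }
        }
    }
    return found_in;
}

// Hypothetical export lists modelling "-lSystem -lgcc_s" (illustration only).
std::vector<Library> system_then_gcc_s() {
    Library lib_system{"libSystem.dylib",
                       {"_Unwind_Resume", "_Unwind_RaiseException"}};
    Library lib_gcc_s{"libgcc_s.dylib",
                      {"_Unwind_Resume", "_Unwind_RaiseException",
                       "__emutls_get_address"}};
    return {lib_system, lib_gcc_s};
}
```

With libSystem listed first, the unwinder entry points resolve to libSystem.dylib and only the symbol that libSystem lacks resolves to libgcc_s.dylib, which is the outcome described above.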

-Nick

Nick,
    How exactly do you envision this being done? Looking at the contents
of config/darwin.h, I see...

/* Support -mmacosx-version-min by supplying different (stub) libgcc_s.dylib
   libraries to link against, and by not linking against libgcc_s on
   earlier-than-10.3.9.

   Note that by default, -lgcc_eh is not linked against! This is
   because in a future version of Darwin the EH frame information may
   be in a new format, or the fallback routine might be changed; if
   you want to explicitly link against the static version of those
   routines, because you know you don't need to unwind through system
   libraries, you need to explicitly say -static-libgcc.

   If it is linked against, it has to be before -lgcc, because it may
   need symbols from -lgcc. */
#undef REAL_LIBGCC_SPEC
#define REAL_LIBGCC_SPEC \
   "%{static-libgcc|static: -lgcc_eh -lgcc; \
      shared-libgcc|fexceptions|fgnu-runtime: \
       %:version-compare(!> 10.5 mmacosx-version-min= -lgcc_s.10.4) \
       %:version-compare(>= 10.5 mmacosx-version-min= -lgcc_s.10.5) \
       -lgcc; \
      :%:version-compare(>< 10.3.9 10.5 mmacosx-version-min= -lgcc_s.10.4) \
       %:version-compare(>= 10.5 mmacosx-version-min= -lgcc_s.10.5) \
       -lgcc}"
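
A rough model of what the default (last) branch of this spec selects, assuming my reading of the version-compare operators is right: ">< A B" means "A or later, and earlier than B", and ">= A" means "A or later". The function names and the simplified version parsing are hypothetical; gcc's own comparison code is more careful.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Parse "10.3.9" into {10, 3, 9}. Simplified: assumes well-formed input.
std::vector<int> parse_version(const std::string& text) {
    assert(!text.empty());
    std::vector<int> parts;
    std::stringstream in(text);
    std::string piece;
    while (std::getline(in, piece, '.')) {
        parts.push_back(std::stoi(piece));
    }
    return parts;
}

// True if version a is earlier than version b; missing components count as 0.
bool version_less(const std::string& a, const std::string& b) {
    std::vector<int> x = parse_version(a);
    std::vector<int> y = parse_version(b);
    const std::size_t n = std::max(x.size(), y.size());
    x.resize(n, 0);
    y.resize(n, 0);
    return x < y;  // lexicographic comparison of the components
}

// Model of the default branch of REAL_LIBGCC_SPEC for a given
// -mmacosx-version-min value.
std::vector<std::string> default_libgcc_libs(const std::string& version_min) {
    std::vector<std::string> libs;
    // %:version-compare(>< 10.3.9 10.5 mmacosx-version-min= -lgcc_s.10.4)
    if (!version_less(version_min, "10.3.9") && version_less(version_min, "10.5")) {
        libs.push_back("-lgcc_s.10.4");
    }
    // %:version-compare(>= 10.5 mmacosx-version-min= -lgcc_s.10.5)
    if (!version_less(version_min, "10.5")) {
        libs.push_back("-lgcc_s.10.5");
    }
    libs.push_back("-lgcc");  // always linked in this branch
    return libs;
}
```

Under this model a 10.6 target still gets "-lgcc_s.10.5 -lgcc", since nothing in the spec distinguishes 10.6 from 10.5.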

Would it be as simple as adding...

%:version-compare(>= 10.6 mmacosx-version-min= -lgcc_s.10.5) \
       -lSystem -lgcc}"

or could we even use...

%:version-compare(>= 10.6 mmacosx-version-min= -lgcc_s.10.5) \
       -lSystem -lgcc_s}"

I think the second case would solve the outstanding issue of the
TLS emutls not being linked in on darwin...

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39888

which would definitely be nice and take a lot of the testsuite
out of the unsupported result mode.
   If anyone would like to propose a specific patch, I would be
more than happy to test it against current gcc trunk.
                    Jack

Jack,

I think there is an extra dimension to darwin that might be confusing things. Darwin uses two-level-name-space. That means that at build time the linker records where it found each dylib (SO) symbol. (It records the path the dylib supplied as its "install name" - not just the leafname as ELF's DT_NEEDED does.)

On a SnowLeopard system you *can* link against /usr/lib/libgcc_s.10.5.dylib, but the linker will not record any symbols coming from it. In fact, the link order does not matter. That is because /usr/lib/libgcc_s.10.5.dylib has magic symbols in it that say if you are targeting 10.6 then _Unwind_Resume (and other symbols) are not in that dylib, so the linker looks elsewhere and finds them in libSystem.B.dylib. In other words, the compiler changes for SnowLeopard to omit or re-order the linking with -lgcc_s when targeting 10.6 were just an optimization and not required.

So, when these test cases are run, is the binary linked against /usr/lib/libgcc_s.10.5.dylib? Or against some just-built libgcc_s.10.5.dylib? Or against some just-built libgcc_s.dylib? If either of the latter, then if you changed the FSF build of libgcc_s for darwin to have the right magic symbols, then when targeting 10.6, the linker will ignore those dylibs and record that the symbols should be found in /usr/lib/libSystem.B.dylib at runtime.

That does mean you will *not* be implicitly testing the libgcc just built (since at runtime the OS implementation of libgcc functions in libSystem.dylib will be used), but I think this test suite is supposed to be testing the compiler. So that should be OK.

It also means any new symbols introduced in main line libgcc *will* be recorded as coming from /custom/path/libgcc_s.1.dylib. Which again is what you want (as long as the functions are independent and don't need to be used in sets). And very similar to the libgcc_ext.dylib idea from http://gcc.gnu.org/bugzilla/show_bug.cgi?id=3988

-Nick

I'll double check but I believe that the testsuite always links the
libgcc_s.10.5.dylib built by the FSF gcc build. The problem with the
emutls symbols if I recall correctly is that those are not exposed
through libgcc_s.10.5 but rather libgcc_s directly. For now, it would
be better to ignore the emutls issues and focus on what needs to be
done to fix the unwinding. I guess you are saying we need to move
the unwinder's symbols out into a libgcc_ext and use that in the
linkage on 10.6 or later so the FSF unwinder is used instead of
the system one?
                  Jack

The important thing is that only one unwinder is used. The _Unwind_Context data structure is different between the darwin and FSF implementations, so you can't pass it between two different implementations. Since darwin uses two-level namespace, swapping in a new libgcc_s.dylib at runtime is not going to affect the various OS dylibs that are looking for _Unwind_* routines in libSystem.dylib.

Even if you managed to get the test suite to run where everything is consistently using the just built libgcc_s.dylib, how will this work for end users of the gcc-4.5+ who want to ship an app built with gcc-4.5+? Their code needs to use the OS unwinder.

So it seems to me you should be testing the compiler against the OS unwinder - not against the just built unwinder.

Has something changed in the FSF unwinder that clients of gcc will want?

-Nick

P.S. One minor pet peeve of mine is that the exception handling is well layered (e.g. the __cxa_* layer sits on top of the _Unwind_* layer, which on darwin is now built upon the libunwind layer), except for one glaring problem. The compiler mostly emits calls to __cxa_* functions, but it makes one call to _Unwind_Resume. Ah!! And those are in different dylibs! If instead the compiler made some call like __cxa_resume() in libstdc++.dylib which turned around and called _Unwind_Resume() in libgcc_s.dylib, then much of the problem above would never happen.
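
A minimal sketch of the kind of code where that direct call shows up. When an exception passes through a frame that has a local object with a destructor but no handler, the compiler has to run the destructor and then continue unwinding; with GCC that continuation is typically the call to _Unwind_Resume. The type and function names below are hypothetical, and the sketch only demonstrates the observable ordering, not the generated calls.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Appends to a log on construction and destruction so the order is visible.
struct Tracer {
    std::string& log;
    explicit Tracer(std::string& l) : log(l) { log += "ctor;"; }
    ~Tracer() { log += "dtor;"; }
};

void thrower() {
    throw std::runtime_error("boom");
}

// The exception passes through this frame without being caught here.
// The cleanup code for this frame runs ~Tracer() and then resumes
// unwinding toward the caller's handler.
void middle(std::string& log) {
    Tracer t(log);
    thrower();
}

// Returns the order in which construction, cleanup, and the handler ran.
std::string run_cleanup_demo() {
    std::string log;
    try {
        middle(log);
    } catch (const std::runtime_error&) {
        log += "caught;";
    }
    assert(!log.empty());
    return log;
}
```

Compiling middle() to assembly with g++ -S should show the reference to _Unwind_Resume in its cleanup path, next to the __cxa_* calls in the other functions.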

I am not sure if some of these problems were latent and only
exposed by the change, but the mass regressions in the g++
and libjava testsuites were triggered by the commit...

Author: rth
New Revision: 147995

URL: gcc.gnu.org Git - gcc.git/commit
Log:
        * cfgcleanup.c (try_crossjump_to_edge): Only skip past
        NOTE_INSN_BASIC_BLOCK.
        * cfglayout.c (duplicate_insn_chain): Copy epilogue insn marks.
        Duplicate NOTE_INSN_EPILOGUE_BEG notes.
        * cfgrtl.c (can_delete_note_p): Allow NOTE_INSN_EPILOGUE_BEG
        to be deleted.
        * dwarf2out.c (struct cfa_loc): Change indirect field to bitfield,
        add in_use field.
        (add_cfi): Disable check redefining cfa away from drap.
        (lookup_cfa_1): Add remember argument; handle remember/restore.
        (lookup_cfa): Pass remember argument.
        (cfa_remember): New.
        (compute_barrier_args_size_1): Remove sibcall check.
        (dwarf2out_frame_debug_def_cfa): New.
        (dwarf2out_frame_debug_adjust_cfa): New.
        (dwarf2out_frame_debug_cfa_offset): New.
        (dwarf2out_frame_debug_cfa_register): New.
        (dwarf2out_frame_debug_cfa_restore): New.
        (dwarf2out_frame_debug): Handle REG_CFA_* notes.
        (dwarf2out_begin_epilogue): New.
        (dwarf2out_frame_debug_restore_state): New.
        (dw_cfi_oprnd1_desc): Handle DW_CFA_remember_state,
        DW_CFA_restore_state.
        (output_cfi_directive): Likewise.
        (convert_cfa_to_fb_loc_list): Likewise.
        (dw_cfi_oprnd1_desc): Handle DW_CFA_restore.
        * dwarf2out.h: Update.
        * emit-rtl.c (try_split): Don't split RTX_FRAME_RELATED_P.
        (copy_insn_1): Early out for null.
        * final.c (final_scan_insn): Call dwarf2out_begin_epilogue
        and dwarf2out_frame_debug_restore_state.
        * function.c (prologue, epilogue, sibcall_epilogue): Remove.
        (prologue_insn_hash, epilogue_insn_hash): New.
        (free_after_compilation): Adjust freeing accordingly.
        (record_insns): Create hash table if needed; push insns into
        hash instead of array.
        (maybe_copy_epilogue_insn): New.
        (contains): Search hash table instead of array.
        (sibcall_epilogue_contains): Remove.
        (thread_prologue_and_epilogue_insns): Split eh_return insns
        and mark them as epilogues.
        (reposition_prologue_and_epilogue_notes): Rewrite epilogue
        scanning in terms of basic blocks.
        * insn-notes.def (CFA_RESTORE_STATE): New.
        * jump.c (returnjump_p_1): Accept EH_RETURN.
        (eh_returnjump_p_1, eh_returnjump_p): New.
        * reg-notes.def (CFA_DEF_CFA, CFA_ADJUST_CFA, CFA_OFFSET,
        CFA_REGISTER, CFA_RESTORE): New.
        * rtl.def (EH_RETURN): New.
        * rtl.h (eh_returnjump_p, maybe_copy_epilogue_insn): Declare.

        * config/bfin/bfin.md (UNSPEC_VOLATILE_EH_RETURN): Remove.
        (eh_return_internal): Use eh_return rtx; split w/ epilogue.

        * config/i386/i386.c (gen_push): Update cfa state.
        (pro_epilogue_adjust_stack): Add set_cfa argument. When true,
        add a CFA_ADJUST_CFA note.
        (ix86_dwarf_handle_frame_unspec): Remove.
        (ix86_expand_prologue): Update cfa state.
        (ix86_emit_restore_reg_using_pop): New.
        (ix86_emit_restore_regs_using_pop): New.
        (ix86_emit_leave): New.
        (ix86_emit_restore_regs_using_mov): Add CFA_RESTORE notes.
        (ix86_expand_epilogue): Add notes for unwinding the epilogue.
        * config/i386/i386.h (struct machine_cfa_state): New.
        (ix86_cfa_state): New.
        * config/i386/i386.md (UNSPEC_EH_RETURN): Remove.
        (eh_return_internal): Merge from eh_return_<mode>,
        use eh_return rtx, split w/ epilogue.

Nick,
   I've always built FSF gcc as a stock build (which defaults to the
FSF libgcc for linking). The last time I can recall seeing anyone
build FSF gcc against the system libgcc was related to the thread...

http://gcc.gnu.org/ml/gcc/2008-11/msg00280.html

I believe this was being done with --with-slibdir='\$\${prefix}/lib'.
Tonight I'll try building like that under SL and see what happens. It
is very non-standard though for how everyone builds FSF gcc on darwin.
                  Jack

Nick,
   So is this basically a depreciation of libgcc for darwin10 and
later? I was wondering about that very issue a while ago (as in
exactly what relationship clang would have to libgcc). If
so, perhaps the correct answer is to see if FSF gcc would accept
changing the default build for darwin10 and later to not use
the FSF libgcc and instead move any additional symbols into a
libgcc-ext.
                        Jack

Nick,
  So is this basically a depreciation of libgcc for darwin10 and
later?

Hi Jack,

I'm not sure what you mean by depreciation here. Some perspective:

Darwin (like Windows) has its own system exception handling mechanism and GCC shouldn't try to replace it. Darwin has always been extremely conservative about ABI/API changes: we don't want to break our customers' apps. Any changes to libgcc that would break "old" unwinder functionality would be unacceptable on our platform, regardless of whether the unwinder is part of libSystem or not.

libgcc is still useful for adding other functionality (like emutls as you mentioned) as well as other arithmetic support libraries. I do NOT think that "libgcc shouldn't be used on darwin", I just don't think the EH pieces should be.

-Chris

I dug into this. Based on the .s files in bugzilla, the latest gcc is now adding dwarf unwind info to describe the function epilog. If you run dwarfdump --eh-frame on the .o files made with the new compiler, you'll see extra dwarf unwind instructions at the end like:

                 ...
                 DW_CFA_advance_loc4 (64) #<-- advance to near end of function
                 DW_CFA_restore (rbp)
                 DW_CFA_def_cfa (rsp, 8)
                 DW_CFA_nop

The linker's conversion to compact unwind "runs" the dwarf unwind info for a function and then records the state at the end. Adding unwind info for the epilog breaks this. In the long term, I can add heuristics to the linker to detect what looks like unwind info for the epilog and stop processing the dwarf instructions there.

The short term fix for gcc is to *not* add epilog unwind information for Darwin.

Epilog unwind information is never needed for exception processing. Its only use is for debugging or sampling when you want to asynchronously make a stack back trace.
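
A toy model of why "run everything and record the state at the end" goes wrong once epilog info is present. This is only a sketch: the state and instruction types are hypothetical, the prologue sequence is an assumed typical x86_64 frame-pointer prologue, and the real linker tracks far more than this.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Simplified unwind state for one function.
struct FrameState {
    std::string cfa_reg = "rsp";  // register the CFA is computed from
    int cfa_offset = 8;           // assumed x86_64 entry state: CFA = rsp + 8
    bool rbp_saved = false;       // is the caller's rbp saved on the stack?
};

enum class Op { DefCfa, DefCfaOffset, DefCfaRegister, OffsetRbp, RestoreRbp };

struct Insn {
    Op op;
    std::string reg;  // used by DefCfa and DefCfaRegister
    int offset;       // used by DefCfa and DefCfaOffset
};

// "Run" every instruction and return the state at the end. Location
// advances are omitted because they do not change the register state.
FrameState final_state(const std::vector<Insn>& program) {
    FrameState s;
    for (const Insn& i : program) {
        switch (i.op) {
            case Op::DefCfa:         s.cfa_reg = i.reg; s.cfa_offset = i.offset; break;
            case Op::DefCfaOffset:   s.cfa_offset = i.offset; break;
            case Op::DefCfaRegister: s.cfa_reg = i.reg; break;
            case Op::OffsetRbp:      s.rbp_saved = true; break;
            case Op::RestoreRbp:     s.rbp_saved = false; break;
        }
    }
    return s;
}

// Assumed prologue: push rbp; mov rsp into rbp.
std::vector<Insn> prologue_only() {
    return {{Op::DefCfaOffset, "", 16},
            {Op::OffsetRbp, "", 0},
            {Op::DefCfaRegister, "rbp", 0}};
}

// Same program plus the epilog instructions from the dump above:
// DW_CFA_restore (rbp) and DW_CFA_def_cfa (rsp, 8).
std::vector<Insn> with_epilog() {
    std::vector<Insn> p = prologue_only();
    p.push_back({Op::RestoreRbp, "", 0});
    p.push_back({Op::DefCfa, "rsp", 8});
    return p;
}
```

Without the epilog instructions the final state describes the function body (CFA from rbp, rbp saved). With them, the final state describes the frame after it has been torn down, which is the wrong thing to record for the whole function.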

-Nick

I thought of another workaround. The FSF gcc driver can implicitly add -no_compact_unwind to the link line. This tells the linker to not produce compact unwind information from the dwarf unwind info in .o files. Then at runtime the darwin unwinder will fall back and use the slow dwarf unwind info.

-Nick

Nick,
   Thanks! I have asked Richard to propose a patch to disable the
additional epilog info on darwin. While we are on the topic of
eh issues on darwin, could you take a look at PR37012? On darwin9/10,
we have the following remaining failures in gcc-4.4.1 at -m32...

FAIL: g++.dg/torture/stackalign/eh-alloca-1.C -O3 -g execution test
FAIL: g++.dg/torture/stackalign/eh-vararg-1.C -O3 -g execution test
FAIL: g++.dg/torture/stackalign/eh-vararg-2.C -O3 -g execution test
FAIL: g++.dg/torture/stackalign/throw-1.C -O3 -g execution test
FAIL: g++.dg/torture/stackalign/throw-2.C -O3 -g execution test
FAIL: g++.dg/torture/stackalign/throw-3.C -O3 -g execution test

These eh failures are really weird because they are triggered by the
-g option. Without -g, the resulting test cases pass their execution
tests fine. The problem doesn't exist for -m64 (with or without -g).
I posted assembly files for these testcases at the various
combinations (-m32 -O3, -m32 -O3 -g, -m64 -O3 -g, -m64 -O3)...

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37012#c48
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37012#c49
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37012#c50
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37012#c51

Perhaps if you diff the assembly files, you might recognize the
problem in that one as well.
              Jack

Nick,
   I can confirm that passing "-Wl,-no_compact_unwind" to the failing
testcase for g++.dg/torture/stackalign/eh-vararg-2.C eliminates the
run-time error. I'd run the entire testsuite with that approach but
I don't know how to suppress the comma in...

make -k check RUNTESTFLAGS="--target_board=unix'{-Wl,-no_compact_unwind}'"

so that it runs as a single test passing "-Wl,-no_compact_unwind".
             Jack

Nick,
   FYI, executing...

make -k check RUNTESTFLAGS="--target_board=unix/-Wl,-no_compact_unwind"

reveals that this approach in fact eliminates all of the eh regressions
in gcc trunk. Unfortunately, it plays havoc with the gcc.dg/pch and
g++.dg/pch test cases causing hundreds of new failures of the form...

FAIL: ./common-1.h -O0 -g (test for excess errors)
FAIL: gcc.dg/pch/common-1.c -O0 -g
FAIL: gcc.dg/pch/common-1.c -O0 -g assembly comparison

Executing on host: /sw/src/fink.build/gcc45-4.4.999-20090914/darwin_objdir/gcc/xgcc -B/sw/src/fink.build/gcc45-4.4.999-20090914/darwin_objdir/gcc/ ./common-1.h -O0 -g -Wl,-no_compact_unwind -o common-1.h.gch (timeout = 300)
Undefined symbols:
  "_main", referenced from:
      start in crt1.10.5.o
ld: symbol(s) not found
collect2: ld returned 1 exit status
compiler exited with status 1
output is:
Undefined symbols:
  "_main", referenced from:
      start in crt1.10.5.o
ld: symbol(s) not found
collect2: ld returned 1 exit status

FAIL: ./common-1.h -O0 -g (test for excess errors)
Excess errors:
Undefined symbols:
  "_main", referenced from:
      start in crt1.10.5.o
ld: symbol(s) not found

pch file 'common-1.h.gch' missing
FAIL: gcc.dg/pch/common-1.c -O0 -g
assembly file 'common-1.s' missing
FAIL: gcc.dg/pch/common-1.c -O0 -g assembly comparison

                 Jack

Perhaps try using -Xlinker?

- Daniel

Nick,
   The second approach of passing -no_compact_unwind produced excellent
results...

http://gcc.gnu.org/ml/gcc-testresults/2009-09/msg01761.html

...eliminating all of the eh regressions in gcc trunk as well as a
few failures in the libjava testsuite present since gcc 4.4. I've
submitted this change for gcc 4.5...

http://gcc.gnu.org/ml/gcc-patches/2009-09/msg01327.html

Hopefully FSF won't break backward compatibility with the non-compact
unwind any time soon.
        Jack
ps How does the llvm project intend to handle this issue on the linux
side for llvm-gcc-4.2 and clang? Eventually the libgcc used by the
llvm projects and FSF gcc will fork irreversibly, no? I guess one would
have to resort to an llvm compiler-plugin for FSF gcc and not attempt
to mix code with clang/llvm-gcc-4.2 in that case.