Porting libcxxabi / Unwind to Windows / MinGW 32 bit

Hello,

I’m trying to make libcxxabi / Unwind work on Windows / MinGW 32 bit. Attached is a patch that makes it compile without warnings and run without crashing.

However, the functionality isn’t working yet: when an exception is thrown, the IP seems to be completely outside the range covered by the FDE tables. Could this be related to the memory model or the compilation options?

Help is welcome.

Yaron

unwind-mingw.diff (6.18 KB)

Yes, the one from Apple.

Yaron

Take a look at UnwindRegisterRestore.s. When an exception is thrown, unw_getcontext() is called which saves all registers and the return address into a register struct. Then the FDE info is used to step through each frame and modify the register set. At which step is the IP register out of bounds?

It might be that the C files need to be compiled with -fexceptions so that they have unwind tables.

-Nick

Hi Nick,

I looked at the disassembly: unw_getcontext() gets the correct IP into the context, and later setInfoBasedOnIPRegister gets it right into “pc”. This is OK.

The trouble is, the whole address space is nowhere near the fde values. For example,

p = 0x66179070 <libunwind::DwarfFDECache<libunwind::LocalAddressSpace>::_initialBuffer>
pc = 0x6604e7d9
fde = 0x40801c0
p->ip_start = 0x4080010
p->ip_end = 0x408010c

p is consistent with pc and local vars and addresses in unwind.dll (0x66YYYYYY)
fde is consistent with the ip ranges and with the caller of unwind.dll (the throwing function) address space (0x40800YY)

They live in different memory ranges: the fde addresses are in the caller’s range, whereas the unwind addresses are in the DLL’s range. So when the unwinder tries to find its IP in the fde tables, it can’t find it.
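The mismatch described above can be reproduced with a small model. This is only a sketch: FdeRange and findFde are hypothetical names, not libunwind code, and the table holds just the one range quoted in the debugger output.

```cpp
#include <cstdint>
#include <vector>

// Simplified stand-in for one FDE: the half-open pc range [ipStart, ipEnd) it covers.
struct FdeRange {
    std::uintptr_t ipStart;
    std::uintptr_t ipEnd;
};

// Linear search for the FDE covering pc. Returns nullptr when no entry matches.
const FdeRange* findFde(const std::vector<FdeRange>& table, std::uintptr_t pc) {
    for (const FdeRange& fde : table) {
        if (pc >= fde.ipStart && pc < fde.ipEnd) {
            return &fde;
        }
    }
    return nullptr;
}
```

With a table containing only the range 0x4080010 to 0x408010c, a lookup for pc = 0x6604e7d9 returns nullptr, while any pc inside the caller's range is found.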

It’s as if the DLL were supposed to load in a different memory range. Is some linker option required?

Yaron

Yaron,

Each separately linked image (program, DLLs, etc.) has its own set of FDEs (in its .eh_frame section). The method LocalAddressSpace::findUnwindSections() needs to be updated in an OS-specific way to: 1) find the image for a given pc value, and 2) find the .eh_frame section bounds for that image.
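The two steps can be sketched in a portable way as follows. This is a model under stated assumptions, not the real findUnwindSections(): ImageInfo, UnwindSections and findUnwindSectionsSketch are hypothetical names, and a real port would fill the image list from OS-specific loader data rather than from a hand-built vector.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical description of one loaded image (the program or a DLL).
struct ImageInfo {
    std::uintptr_t base;           // load address of the image
    std::size_t    size;           // size of the mapped image in bytes
    std::uintptr_t ehFrameStart;   // start of the image's .eh_frame data
    std::size_t    ehFrameLength;  // length of that data in bytes
};

// Hypothetical result type: the .eh_frame bounds for the image containing pc.
struct UnwindSections {
    std::uintptr_t ehFrameStart;
    std::size_t    ehFrameLength;
};

// Step 1: find the image containing pc.
// Step 2: report that image's .eh_frame bounds through out.
bool findUnwindSectionsSketch(const std::vector<ImageInfo>& images,
                              std::uintptr_t pc, UnwindSections& out) {
    for (const ImageInfo& image : images) {
        if (pc >= image.base && pc - image.base < image.size) {
            out.ehFrameStart = image.ehFrameStart;
            out.ehFrameLength = image.ehFrameLength;
            return true;
        }
    }
    return false;  // pc is not inside any known image
}
```

Returning false for an unknown pc, instead of leaving the output untouched, lets the caller tell "no unwind info" apart from stale values.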

-Nick

Hi,

You are exactly correct: LocalAddressSpace::findUnwindSections does nothing under #if !__APPLE__, so the output value is random.

LocalAddressSpace::findOtherFDE and LocalAddressSpace::findFunctionName also currently do nothing. Are these also required for regular exceptions?

Yaron

No.

findOtherFDE() is only needed if the OS has an alternate way to locate FDEs, such as those for JITed code.

findFunctionName() is used in debug builds to print out the names of functions as the unwinding happens.

-Nick

I see. It is different than the gcc runtime libgcc.

libgcc always uses the set of FDEs that was registered by __register_frame. This data is private to each process, since the unwinder runs in user space as part of the process.

Here, the OS first locates which set of FDEs is relevant.
Does this mean that on OS X the unwinder runs in system space and holds FDE data for all processes?

Yaron

Reviewing the code, the problem is not with processes. Since Windows provides no DWARF support, the MinGW (Windows) libgcc loads the EH frame data itself. First, it places a label at the start of the EH frame data:

/* Stick a label at the beginning of the frame unwind info so we can
   register/deregister it with the exception handling library code. */
static EH_FRAME_SECTION_CONST char __EH_FRAME_BEGIN__[]
  __attribute__((used, section(EH_FRAME_SECTION_NAME), aligned(4)))
  = { };

Then the linker merges all EH frames together, with the above placed first, so that at run time __EH_FRAME_BEGIN__ points to the start of the EH frame data.

Finally at runtime the CRT init code registers the frame:

register_frame_fn (__EH_FRAME_BEGIN__, &obj);

The root of our problem is that __EH_FRAME_BEGIN__ is registered with the MinGW (gcc) exception runtime library but not with our libunwind, so it does not know how to unwind its own functions. That’s why the IP is outside the range of the FDE tables.
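The effect of registering the frame data with one unwinder but not the other can be modelled as below. This is a simplified sketch: FrameRegistry and its methods are hypothetical stand-ins for a registration-based unwinder, and a covered pc range stands in for the parsed frame data.

```cpp
#include <cstdint>
#include <vector>

// One registered frame table, reduced to the pc range it covers.
struct RegisteredFrames {
    std::uintptr_t ipStart;  // first address covered by the table
    std::uintptr_t ipEnd;    // one past the last covered address
};

// Model of a registration-based unwinder: it only knows the frame tables
// that were explicitly registered with *this* instance.
class FrameRegistry {
public:
    // Stand-in for a __register_frame-style call made by startup code.
    void registerFrames(std::uintptr_t ipStart, std::uintptr_t ipEnd) {
        RegisteredFrames entry = {ipStart, ipEnd};
        tables_.push_back(entry);
    }

    // True if some registered table covers pc.
    bool canUnwind(std::uintptr_t pc) const {
        for (const RegisteredFrames& t : tables_) {
            if (pc >= t.ipStart && pc < t.ipEnd) {
                return true;
            }
        }
        return false;
    }

private:
    std::vector<RegisteredFrames> tables_;
};
```

If the startup code calls registerFrames() on one registry only, a second, independent registry answers false for every pc, which matches the behaviour seen here.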

So, findUnwindSections needs to know about __EH_FRAME_BEGIN__. I’ll try this.

Yaron

On OS X, to find the FDE for a given pc, libunwind does the equivalent of dladdr() to find the base address of the DSO (or program) containing that pc. Then it calls a function that walks the Mach-O data structures from the base address to find the __eh_frame section. There is no up-front registration (__register_frame) of each __eh_frame section; the lookup is done lazily.
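The contrast between up-front registration and lazy lookup can be sketched as follows. All names here are hypothetical, the callback merely stands in for walking an image's headers, and remembering the result per image is an assumption of this sketch rather than something stated in this thread.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <utility>

// Hypothetical .eh_frame bounds for one image.
struct EhFrameBounds {
    std::uintptr_t start;
    std::size_t    length;
};

// Looks up section bounds per image base, computing them only on first use.
class LazySectionCache {
public:
    // Stand-in for walking the image's headers from its base address.
    typedef EhFrameBounds (*SectionWalker)(std::uintptr_t imageBase);

    explicit LazySectionCache(SectionWalker walker) : walker_(walker), walks_(0) {}

    EhFrameBounds boundsFor(std::uintptr_t imageBase) {
        std::map<std::uintptr_t, EhFrameBounds>::iterator it = cache_.find(imageBase);
        if (it == cache_.end()) {
            ++walks_;  // the headers are walked only on a cache miss
            it = cache_.insert(std::make_pair(imageBase, walker_(imageBase))).first;
        }
        return it->second;
    }

    // Number of times the walker callback has been invoked so far.
    int walkCount() const { return walks_; }

private:
    SectionWalker walker_;
    int walks_;
    std::map<std::uintptr_t, EhFrameBounds> cache_;
};
```

Nothing is computed until the first exception needs an image's unwind info, so images that never throw cost nothing.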

-Nick

Yes, it’s nice when the OS does the work.

With MinGW, I see a problem using our unwind library.

__EH_FRAME_BEGIN__ is a local symbol with no external visibility
outside the libgcc DLL. There is no API exposing it either.
Only code compiled inside libgcc can access it.

One solution is to place our unwind library inside libgcc
itself, so that our unwinder gets the EH frame at initialization just
like the gcc unwinder does. Another would be to add an API to libgcc exposing it.
Both are unlikely to happen.

The right solution would be to have a full replacement for libgcc based on our
unwind code and use it instead. In addition to the unwinding library, libgcc also
supplies conversion and arithmetic functions, so we’ll need alternative versions
of these too.

Isn’t that what compiler-rt does?

http://compiler-rt.llvm.org


cfe-dev mailing list
cfe-dev@cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev

– Jean-Daniel

Yes, it's nice when the OS does the work.

With MingW, I see a problem using our unwind library.
__EH_FRAME_BEGIN__ is a local symbol with no external visibility
outside the libgcc DLL. There is no API exposing it either.
Only code compiled inside libgcc can access it.

In fact, no. We can (surely with linker support) put all the EH info
into a dedicated section in the PE file and then do something pretty
similar, but by walking the sections of the PE file.
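Walking a section table by name might look roughly like this. It is a sketch only: SectionHeader is a simplified stand-in rather than the real Windows IMAGE_SECTION_HEADER, findSection is a hypothetical helper, and the assumption that an over-long name such as ".eh_frame" is stored cut to the 8-byte field depends on the linker.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Simplified stand-in for a PE section header. The name occupies a fixed
// 8-byte field and is not necessarily NUL-terminated.
struct SectionHeader {
    char          name[8];
    std::uint32_t virtualAddress;  // offset of the section from the image base
    std::uint32_t virtualSize;     // size of the section in memory
};

// Scans the section table for the wanted name, comparing at most 8 bytes
// because that is all the name field can hold. Returns nullptr if absent.
const SectionHeader* findSection(const SectionHeader* sections, std::size_t count,
                                 const char* wanted) {
    for (std::size_t i = 0; i < count; ++i) {
        if (std::strncmp(sections[i].name, wanted, sizeof(sections[i].name)) == 0) {
            return &sections[i];
        }
    }
    return nullptr;
}
```

Adding the matching header's virtualAddress to the image base would give the start of the EH data, and virtualSize its length.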

The right solution would be to have a full replacement to libgcc based on our
unwind code and use it instead. In addition to unwinding library, libgcc also
supplies conversion and arithmetic functions so we'll need alternative versions
of these too.

http://compiler-rt.llvm.org/ ? :)

Hi Anton,

On Windows with MinGW, libgcc ships pre-compiled as both a shared library (DLL) and a static library. I’m not sure the ld linker could help when starting from a compiled binary DLL.
In any case, instead of hacking around libgcc, completely replacing it with libcxxabi/Unwind + compiler-rt (thanks Jean!) seems to be the better alternative.

Yaron

I don't think I got a reply last time I asked this:

Please can we move libcxxabi/Unwind into the compiler-rt umbrella? The code there implements the language-agnostic part of the unwind library and so is useful for any language using the Itanium exception model. Having it in libc++abi is likely to cause confusion when people try to mix libgcc_eh.so / libgcc_s and libc++abi, both of which will be trying to provide the same symbols and have their own mechanisms for registering exceptions.

We'd very much like to replace the libgcc family in FreeBSD 11 (and have the replacement as an option some time in the 10.x timeframe) and having all of the required code in a single place would make our life easier. The compiler-rt project seems the obvious place for it.

David

Hi David,

It’s a good idea to have compiler-rt + unwind project in one project as a libgcc replacement. We can name the project “libclang”.

While at it, the rest of libcxxabi (the __cxa_* functions) may be merged into libcxx as a complete replacement for libstdc++.

This project structure matches the gcc library organization and thus will be easier for people (and toolchains) to use them.

Yaron

Hi David,

It's a good idea to have compiler-rt + unwind project in one project as a libgcc replacement. We can name the project "libclang".

The libclang name is already taken, but so is libcompiler_rt for this purpose...

While at it, the rest of libcxxabi (the __cxa_* functions) may be merged into libcxx as a complete replacement for libstdc++.

We already ship the combination of libc++ and libcxxrt on FreeBSD as a complete replacement for libstdc++.

This project structure matches the gcc library organization and thus will be easier for people (and toolchains) to use them.

Actually, the gcc structure has libstdc++ and libsupc++ as separate libraries. On FreeBSD, we shipped them as separate dynamic libraries in 9.x so that you can stick libcxxrt under libstdc++ and mix code that uses libc++ and libstdc++. OS X did something similar. This setup allows people to use a newer libstdc++ (which we don't ship in the base system, but which you can get from ports) in a library used by an application that uses libc++, without causing errors (unless STL symbols are used on library interfaces).

Merging libc++ and libc++abi would require anyone downstream who cared about interoperability to go and unmerge them again.

David

libstdc++ and libsupc++ are merged in MinGW Windows distributions, so I assumed that was the default structure of the gcc libs. Where that is not the case, it would indeed make no sense to merge them.
It’s best to follow the existing conventions to make it easier for users replacing the libs.

Yaron