Handling of the x18 register in Wine on AArch64

Hi,

I'm sending this to both wine-devel and llvm-dev, to keep the discussion open to both sides and to try to find a workable compromise. This was already discussed preliminarily on llvm-dev a few weeks ago.

One of the major unresolved issues with Wine for AArch64 is how to handle the platform-specific register x18 (https://bugs.winehq.org/show_bug.cgi?id=38780). As Windows on AArch64 has been commercially available for over a year now, it'd be nice to get this issue settled by at least some sort of compromise.

Background:

On AArch64, x18 is a platform-specific register; some platforms reserve it for platform-specific use, while others (Linux) don't and treat it as just another free temporary register. On Windows, it holds the TEB pointer.
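
For illustration, this is roughly how code can read x18 with GCC/Clang inline assembly. It is only a sketch: read_x18 is a hypothetical helper name (not a Wine or Windows API), and on architectures other than AArch64 it simply returns 0.

```c
#include <stdint.h>

/* Hypothetical helper: returns the current value of x18 on AArch64.
   On Windows this value would be the TEB pointer; on Linux it is
   whatever the compiler last left in that temporary register. */
static inline uintptr_t read_x18(void)
{
#if defined(__aarch64__)
    uintptr_t value;
    __asm__ volatile("mov %0, x18" : "=r"(value));
    return value;
#else
    return 0;   /* no x18 on this architecture */
#endif
}
```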

When calling Wine builtin functions from the Windows native code, the Wine builtin functions can clobber the x18 register, as the Wine code is built for the ABI and calling conventions used on Linux. This part is easy to work around by compiling Wine with the flag -ffixed-x18, which makes the compiler treat this register as reserved, avoiding clobbering it.

However, the Wine code may end up calling functions in the surrounding host environment (mainly the libc, but potentially also other feature libraries that are used), and these libraries have not been built with the -ffixed-x18 flag, so they can still clobber x18.

There are a few different options for going about this matter:

1) Rebuild the surrounding host environment (the whole linux distribution) with -ffixed-x18. This would be the perfect solution, but is highly impractical for general use of Wine by regular users in their existing setups. (Convincing major distributions to start configuring their compilers in this way also doesn't seem to be happening.)

2) Force relays on entry to Wine builtins, where the relay can restore x18 to the right TEB pointer on return. This is pretty simple and straightforward, but not very elegant. One benefit is that this doesn't need to back up the original value of x18 but can always refetch the right TEB pointer value and set it. A patch to this effect has been sent before (https://source.winehq.org/patches/data/137759) but wasn't picked up.

One major limitation is that while this ensures the right value in x18 on return from the Wine functions, it doesn't help restoring x18 to the right value before calling callback functions.

I've run with this approach for nearly two years, without running into any issues with it (with the limited amount of code I run in wine; I don't think any of the command line executables I run in wine use callbacks though).

3) Enhance the compiler to automatically back up x18 on entry to functions marked with __attribute__((ms_abi)) and restore it on return. This would have the same effect as 2) above (but more elegantly), but also with the same limitations wrt callbacks.

(Using this with Wine requires changing where x18 is initialized, because when starting a process, x18 can get clobbered after signal_init_thread is called, before handing control over to the native code. A patch for this has been sent at https://source.winehq.org/patches/data/164651.)

A proof of concept/RFC patch for LLVM to implement this feature has been sent at https://reviews.llvm.org/D61892.

4) Enhance the compiler to back up x18 in every function (if necessary) and restore it after every function call. (Except calls to functions in the same translation unit, which can be expected not to clobber the register.) This should achieve almost everything; x18 keeps the right value throughout the Wine code as far as possible, including when doing callbacks.

If callbacks to native code are made from within a callback from an external library (e.g. qsort in libc), x18 can be lost though. I guess this isn't a common pattern for libc functions at least, but I have no idea if it's more common for other external feature libraries that Wine uses.

A proof of concept/RFC patch for LLVM to implement this feature has been sent at https://reviews.llvm.org/D61894. Do note that this approach can be pretty controversial to upstream to LLVM.

This also requires the same Wine patch as 3), but for a different reason: when signal_init_thread is called to initialize x18, the caller will restore the register on return. The most robust way to initialize it is right before handing control over to native code.

5) Enclose every callback call in Wine with a wrapper/thunk that sets up the register correctly. This would be a perfect solution, but is practically unfeasible. As far as I know, this is the approach that was used for Win16 back in the day, calling WOWCallback16Ex every time Wine code should call back into Win16 code. Given the size of Wine today and the number of different places where callbacks are made (where the function pointers are called without any extra wrapping), this is unfeasible (and I have a very hard time seeing such a patch accepted into Wine).
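
To make the difference between options 2/3 and option 4 a bit more concrete, here is a minimal model in C. It is only a sketch under loud assumptions: a plain variable stands in for the x18 register, TEB_ADDR and CLOBBERED are arbitrary marker values, and all function names are made up for illustration; none of this is Wine or LLVM code.

```c
#include <stdint.h>

/* Model only: a plain variable stands in for the x18 register. */
static uintptr_t x18;

/* Hypothetical marker values; any two distinct values would do. */
#define TEB_ADDR   ((uintptr_t)0x7eb0)
#define CLOBBERED  ((uintptr_t)0xdead)

/* Value of x18 observed by the native callback on its most recent call. */
static uintptr_t seen_by_callback;

/* Stands in for Windows native code, which expects the TEB pointer in x18. */
static void native_callback(void) { seen_by_callback = x18; }

/* Stands in for a host library function built without -ffixed-x18. */
static void host_library_call(void) { x18 = CLOBBERED; }

/* A Wine builtin that uses the host library and then calls back. */
static void builtin_plain(void (*cb)(void))
{
    host_library_call();
    cb();                       /* x18 is still clobbered here */
}

/* The same builtin as option 4 would compile it: x18 is saved before
   the external call and restored right after it. */
static void builtin_option4(void (*cb)(void))
{
    uintptr_t saved = x18;
    host_library_call();
    x18 = saved;
    cb();                       /* x18 holds the TEB pointer again */
}

/* Options 2 and 3: x18 is put back only when the builtin returns
   (refetched in option 2, restored from a backup in option 3). */
static void call_with_restore_on_return(void (*builtin)(void (*)(void)))
{
    x18 = TEB_ADDR;             /* native caller enters with a valid x18 */
    builtin(native_callback);
    x18 = TEB_ADDR;             /* fixed up on the way out */
}
```

In this model, call_with_restore_on_return(builtin_plain) leaves x18 correct for the native caller, but the callback has seen the clobbered value. With builtin_option4 the callback sees the TEB value as well.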

What are the potential acceptable compromises for taking this matter forward?

// Martin

Martin Storsjö <martin@martin.st> writes:

> 1) Rebuild the surrounding host environment (the whole linux
> distribution) with -ffixed-x18. This would be the perfect solution,
> but is highly impractical for general use of Wine by regular users in
> their existing setups. (Convincing major distributions to start
> configuring their compilers in this way also doesn't seem to be
> happening.)

Any chance that this could be made the compiler default, so that distros
wouldn't need to do anything?

> 5) Enclose every callback call in Wine with a wrapper/thunk that sets
> up the register correctly. This would be a perfect solution, but is
> practically unfeasible. As far as I know, this is the approach that
> was used for Win16 back in the day, calling WOWCallback16Ex every time
> Wine code should call back into Win16 code. Given the size of Wine
> today and the number of different places where callbacks are made
> (where the function pointers are called without any extra wrapping),
> this is unfeasible (and I have a very hard time seeing such a patch
> accepted into Wine).

I don't think it's feasible to do this at the Windows/Wine boundary, but
with the PE cross-compilation support, we could conceivably build most
of Wine as PE and add wrappers at the PE/Unix boundary.

The wrappers could then be generated, or we could use a variant of your
option 4) that would have the compiler save/restore x18 when calling a
non-ms_abi function from an ms_abi one.
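
As a rough model of such a boundary wrapper, the sketch below saves and restores a stand-in for x18 only at the single place where PE-side code calls down into a Unix function. Everything here is an assumption made for illustration: the x18 variable is a plain global, not the real register, and call_unix, unix_library_add_one, pe_helper and pe_function are hypothetical names.

```c
#include <stdint.h>

/* Model only: a plain variable stands in for the x18 register. */
static uintptr_t x18;

/* Counts how many save/restore pairs were executed. */
static unsigned save_restore_pairs;

typedef int (*unix_func)(int);

/* Stands in for a Unix library function that may clobber x18. */
static int unix_library_add_one(int v)
{
    x18 = 0xdead;
    return v + 1;
}

/* Boundary wrapper: the only place where x18 is saved and restored. */
static int call_unix(unix_func fn, int arg)
{
    uintptr_t saved = x18;
    int ret = fn(arg);
    x18 = saved;
    save_restore_pairs++;
    return ret;
}

/* Stands in for other PE-side (ms_abi) code, which never clobbers x18. */
static int pe_helper(int v) { return v * 2; }

/* Stands in for a PE-side function that needs one Unix library call. */
static int pe_function(int v)
{
    int doubled = pe_helper(v);     /* ms_abi to ms_abi: no wrapper needed */
    return call_unix(unix_library_add_one, doubled);
}
```

In this model only the one call that crosses the boundary pays for a save/restore pair, whichever way the wrapper is produced (generated, or emitted by the compiler).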

>> 1) Rebuild the surrounding host environment (the whole linux
>> distribution) with -ffixed-x18. This would be the perfect solution,
>> but is highly impractical for general use of Wine by regular users in
>> their existing setups. (Convincing major distributions to start
>> configuring their compilers in this way also doesn't seem to be
>> happening.)
> Any chance that this could be made the compiler default, so that distros
> wouldn't need to do anything?

The register usage is specified by the ABI. So even if the compiler
reserved x18 by default, there might still be lots of legacy or
hand-written code around that uses it.

I think getting most code built with that flag is probably the worst
possible solution, whether by changing the compiler default or
convincing distros to do it. No single project should dictate (or even
try to dictate) the ABI of an entire platform.

Cheers.

Tim.

> Martin Storsjö <martin@martin.st> writes:
>
>> 5) Enclose every callback call in Wine with a wrapper/thunk that sets
>> up the register correctly. This would be a perfect solution, but is
>> practically unfeasible. As far as I know, this is the approach that
>> was used for Win16 back in the day, calling WOWCallback16Ex every time
>> Wine code should call back into Win16 code. Given the size of Wine
>> today and the number of different places where callbacks are made
>> (where the function pointers are called without any extra wrapping),
>> this is unfeasible (and I have a very hard time seeing such a patch
>> accepted into Wine).
>
> I don't think it's feasible to do this at the Windows/Wine boundary, but
> with the PE cross-compilation support, we could conceivably build most
> of Wine as PE and add wrappers at the PE/Unix boundary.

Hmm, that could work... Am I following things correctly that this is, in general, a direction that Wine is heading in (compiling more of Wine with a PE cross compiler, as the tests already have moved over)?

> The wrappers could then be generated, or we could use a variant of your
> option 4) that would have the compiler save/restore x18 when calling a
> non-ms_abi function from an ms_abi one.

Hmm, only saving/restoring when calling a non-ms_abi function from an ms_abi one could be a good optimization of that approach.

(Currently it does have a rather significant overhead; that patch grows lib64/wine from 450 to 455 MB.) That would require that every place where a callback is called is within an ms_abi function though. Is that the case currently? (Currently most of the Wine internals use Unix calling conventions, and only the publicly visible entry points are ms_abi, right?)

// Martin

Martin Storsjö <martin@martin.st> writes:

>> Martin Storsjö <martin@martin.st> writes:
>>
>>> 5) Enclose every callback call in Wine with a wrapper/thunk that sets
>>> up the register correctly. This would be a perfect solution, but is
>>> practically unfeasible. As far as I know, this is the approach that
>>> was used for Win16 back in the day, calling WOWCallback16Ex every time
>>> Wine code should call back into Win16 code. Given the size of Wine
>>> today and the number of different places where callbacks are made
>>> (where the function pointers are called without any extra wrapping),
>>> this is unfeasible (and I have a very hard time seeing such a patch
>>> accepted into Wine).
>>
>> I don't think it's feasible to do this at the Windows/Wine boundary, but
>> with the PE cross-compilation support, we could conceivably build most
>> of Wine as PE and add wrappers at the PE/Unix boundary.
>
> Hmm, that could work... Am I following things correctly that this is,
> in general, a direction that Wine is heading in (compiling more of
> Wine with a PE cross compiler, as the tests already have moved over)?

Yes, that's the goal.

>> The wrappers could then be generated, or we could use a variant of your
>> option 4) that would have the compiler save/restore x18 when calling a
>> non-ms_abi function from an ms_abi one.
>
> Hmm, only saving/restoring when calling a non-ms_abi function from an
> ms_abi one could be a good optimization of that approach.
>
> (Currently it does have a rather significant overhead; that patch
> grows lib64/wine from 450 to 455 MB.) That would require that every
> place where a callback is called is within an ms_abi function
> though. Is that the case currently?

That's not the case currently, but it's what a PE build would address,
since everything would automatically be ms_abi except a few designated
places where we explicitly call down to Unix libraries.