32-bit pointers and calls from 64-bit code

Resending to the new mailing list address…

Hi Charles,

It sounds like you already have some of this implemented? How different is it from -mx32 ? It sounds kinda different, but with some overlap?

Thanks,

JF

From: Charles Davis <cdavis5x@gmail.com>
Date: Thu, Aug 2, 2018 at 5:59 PM
Subject: 32-bit pointers and calls from 64-bit code
To: Clang Developers List <cfe-dev@cs.uiuc.edu>

Hello,

With Apple’s impending 32-bit deprecation, my new employer is having a bit of an existential crisis. We think we’ve found a way forward, but we need a little help from the compiler to accomplish this.

This is a pretty big project to be throwing together like this, but good luck.

Basically, what we want to do is have code that is technically x86-64 but that can use 32-bit pointers and that can make calls to 32-bit code.

Sure, this is a relatively well-understood problem. The most well-known modern example is Linux’s support for “x32”.

In a separate thread on llvmdev, I asked about LLVM changes we need for this. Here are the corresponding C language extensions we need:

  • (Possibly) a preprocessor macro to be defined when we’re building 64-bit code to be called by 32-bit code. This code, while still technically being 64-bit, needs to look and act like 32-bit code to our clients–so we’d like to define __i386__ et al. as though we were building real 32-bit code.

__i386__ generally means, well, the 32-bit Intel target; muddying this by pretending that your variant target is i386 seems like a bad decision. Can you ask your clients to switch to __LP64__ or some other portable pointer-size macro instead?
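
The suggestion above, testing pointer size rather than __i386__, might look roughly like this on the client side. This is only a sketch: pointer_bits and is_lp64 are hypothetical helper names, not anything from this thread or from Clang. __LP64__ is the macro GCC and Clang predefine on LP64 targets; the UINTPTR_MAX branch is an assumed fallback for compilers that do not define it.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

static_assert(CHAR_BIT == 8, "this sketch assumes 8-bit bytes");

/* Width of a data pointer in bits, taken from the type itself rather
   than inferred from an architecture macro such as __i386__. */
int pointer_bits(void) {
    return (int)(sizeof(void *) * CHAR_BIT);
}

/* Returns 1 when both long and pointers are 64 bits wide (LP64). */
int is_lp64(void) {
#if defined(__LP64__)
    return 1;                     /* predefined by GCC and Clang on LP64 */
#elif UINTPTR_MAX > 0xFFFFFFFFu
    return sizeof(long) == 8;     /* 64-bit pointers; long decides LP64 vs LLP64 */
#else
    return 0;                     /* 32-bit pointers */
#endif
}
```

Code written this way keeps working if the variant target defines neither __i386__ nor __LP64__, because it asks about the property it cares about instead of the architecture name.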

  • Microsoft’s __ptr32 and __ptr64 keywords–to distinguish 32-bit and 64-bit pointers.

I don’t know if we actually support those keywords, but conceptually they’re just address spaces so it’s not a problem.

  • A pragma to choose the default pointer size–with this, we can avoid littering our code with lots of __ptr32 keywords.

Doable. You’ll want something that works like the ARC auditing pragmas and tries to force the pragma to stay bounded to a single header, I think.

  • Attributes for various 32-bit calling conventions. We can’t use the existing ones, because they are defined not to do anything in 64-bit code. I’ll probably define new ones that are just the old ones suffixed with ‘32’ (e.g. __attribute__((cdecl32))).

Why do you want this? You’re recompiling all of your code in this new mode, and the default x86_64 CC is more efficient than all of the specialized i386 CCs. Why not just ignore the attributes?

  • An attribute to declare that a function pointer must be called far, and to declare the segment selector to use when calling it. We need this to be able to transition to a 32-bit code segment. I’m currently leaning towards (ab)using Microsoft’s __based extension (which originally supported something like this, I believe) for this purpose.

Are you really messing around with segments, or are you just trying to be able to distinguish 32-bit from 64-bit function pointers?

Is there something specifically pushing you towards using the MS extensions for these things?

Are you planning to actually map all your memory into the first 4GB of address space?

John.

Wait, I just put a few things together. Are you planning to perform a far call to existing, i386-compiled code in the same process? That is not going to work without operating system support, and I strongly doubt that macOS has that support — probably not today, and especially not after i386 support is removed.

Something like an x32 ABI can help if you have the ability to recompile the code and are just concerned about it not being 64-bit-safe or can’t afford the memory overhead of 64-bit pointers, but it won’t let you actually interoperate with i386 code.

John.

From: Charles Davis <cdavis5x@gmail.com>
Date: Thu, Aug 2, 2018 at 5:59 PM
Subject: 32-bit pointers and calls from 64-bit code
To: Clang Developers List <cfe-dev@cs.uiuc.edu>

Hello,

With Apple's impending 32-bit deprecation, my new employer is having a bit
of an existential crisis. We think we've found a way forward, but we need a
little help from the compiler to accomplish this.

This is a pretty big project to be throwing together like this, but good
luck.

Basically, what we want to do is have code that is technically x86-64 but
that can use 32-bit pointers and that can make calls to 32-bit code.

Sure, this is a relatively well-understood problem. The most well-known
modern example is Linux's support for "x32".

In a separate thread on llvmdev, I asked about LLVM changes we need for
this. Here are the corresponding C language extensions we need:

   - (Possibly) a preprocessor macro to be defined when we're building
   64-bit code to be called by 32-bit code. This code, while still technically
   being 64-bit, needs to look and act like 32-bit code to our clients--so
   we'd like to define __i386__ et al. as though we were building real
   32-bit code.

__i386__ generally means, well, the 32-bit Intel target; muddying this by
pretending that your variant target is i386 seems like a bad decision.
Can you ask your clients to switch to __LP64__ or some other portable
pointer-size macro instead?

   - Microsoft's __ptr32 and __ptr64 keywords--to distinguish 32-bit and
   64-bit pointers.

I don't know if we actually support those keywords, but conceptually
they're just address spaces so it's not a problem.

   - A pragma to choose the default pointer size--with this, we can avoid
   littering our code with lots of __ptr32 keywords.

Doable. You'll want something that works like the ARC auditing pragmas
and tries to force the pragma to stay bounded to a single header, I think.

   - Attributes for various 32-bit calling conventions. We can't use the
   existing ones, because they are defined not to do anything in 64-bit code.
   I'll probably define new ones that are just the old ones suffixed with '32'
   (e.g. __attribute__((cdecl32))).

Why do you want this? You're recompiling all of your code in this new
mode, and the default x86_64 CC is more efficient than all of the
specialized i386 CCs. Why not just ignore the attributes?

   - An attribute to declare that a function pointer must be called far,
   and to declare the segment selector to use when calling it. We need this to
   be able to transition to a 32-bit code segment. I'm currently leaning
   towards (ab)using Microsoft's __based extension (which originally
   supported something like this, I believe) for this purpose.

Are you really messing around with segments, or are you just trying to be
able to distinguish 32-bit from 64-bit function pointers?

Wait, I just put a few things together. Are you planning to perform a far
call to existing, i386-compiled code in the same process?

Well... Not exactly.

We plan to use the Hypervisor framework to create a VM with 32-bit and
64-bit code segments. We'll then make the entirety of the host process's
memory visible inside the guest--the intent is to reduce the number of
expensive VM exits. But inside this VM... yes, we intend to make far calls
directly to existing i386 code. (We also intend to get called by existing
i386 code.) We're trying not to implement an entire OS for this purpose--we
want to use the host OS to support as much as we can, which is why we're
mapping the host process memory into the VM's physical memory.

That is not going to work without operating system support, and I
strongly doubt that macOS has that support — probably not today, and
especially not after i386 support is removed.

I thought so too, but it turns out you don't need a call gate to do this.
All you need is the 32-bit segment selector--and on most OSes, that almost
never changes, since it's an entry in the GDT. We're well aware that macOS
won't even have a 32-bit CS in 10.15--that's why we want to use the
Hypervisor framework.
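
To make the "all you need is the segment selector" point concrete, here is a rough sketch of the data an indirect far call consumes on x86: a 16-bit selector (descriptor index, table indicator, requested privilege level) stored after a 32-bit offset. The function names and the GDT index used in the usage note are made up for illustration; the real selector would come from whatever GDT the guest sets up.

```c
#include <assert.h>
#include <stdint.h>

/* Build an x86 segment selector:
   bits 15..3 = descriptor index, bit 2 = table indicator (0 = GDT, 1 = LDT),
   bits 1..0  = requested privilege level. */
uint16_t make_selector(uint16_t index, unsigned use_ldt, unsigned rpl) {
    return (uint16_t)((index << 3) | ((use_ldt & 1u) << 2) | (rpl & 3u));
}

/* Lay out an m16:32 far pointer as it sits in memory: the 4-byte offset
   first, then the 2-byte selector, both little-endian. */
void encode_far_ptr32(uint8_t out[6], uint32_t offset, uint16_t selector) {
    out[0] = (uint8_t)(offset & 0xFFu);
    out[1] = (uint8_t)((offset >> 8) & 0xFFu);
    out[2] = (uint8_t)((offset >> 16) & 0xFFu);
    out[3] = (uint8_t)((offset >> 24) & 0xFFu);
    out[4] = (uint8_t)(selector & 0xFFu);
    out[5] = (uint8_t)((selector >> 8) & 0xFFu);
}
```

For example, make_selector(3, 0, 3) describes GDT entry 3 at ring 3 (an invented index), and encode_far_ptr32 fills the 6-byte operand an indirect far call would read. A compiler attribute for far function pointers would presumably have to carry this selector alongside the 32-bit offset.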

Something like an x32 ABI can help if you have the ability to recompile
the code and are just concerned about it not being 64-bit-safe or can't
afford the memory overhead of 64-bit pointers, but it won't let you
actually interoperate with i386 code.

John.

Chip

I don’t know anything about the Hypervisor framework, but what happens if you take an interrupt while in 32-bit code?

John.

Oops, forgot to reply all…

I think I understand the architecture here a little better, thanks. So you have a guest VM running in a host process, and within the guest you have a minimal kernel and a user process that includes both some uncontrolled i386-compiled payload code and some amount of VM-aware support code that you do control. Since you want the guest to map the host process’s full address space, you need at least some of the guest support code to be 64-bit, but it also needs to be able to interoperate directly with the payload code; hence the special target which generates 64-bit code but can perform layout, accesses, etc. with 32-bit data pointers and which can also perform a far call when calling a 32-bit function pointer.

Address spaces definitely seem like the right language tool for this.

John.
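
The layout half of the target described above (64-bit code that performs layout and accesses with 32-bit data pointers) can be approximated in portable C, without any compiler support, as a sketch of what an address-space or __ptr32 qualifier would otherwise do for you. Everything named here (ptr32_t, node32, narrow_to_ptr32, widen_from_ptr32) is hypothetical and only illustrates the idea.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a 32-bit data pointer: an address in the
   low 4 GiB, stored in 4 bytes. */
typedef uint32_t ptr32_t;

/* A record laid out the way i386-compiled code would see it:
   two 4-byte pointer fields, 8 bytes in total. */
struct node32 {
    ptr32_t next;
    ptr32_t payload;
};

static_assert(sizeof(struct node32) == 8, "node32 must match the i386 layout");

/* Narrow a 64-bit address to 32 bits. Fails, leaving *out untouched,
   when the address lies above 4 GiB and cannot be represented. */
bool narrow_to_ptr32(uint64_t addr, ptr32_t *out) {
    if (addr > UINT32_MAX) {
        return false;
    }
    *out = (ptr32_t)addr;
    return true;
}

/* Widen a 32-bit pointer back to a 64-bit address by zero-extension. */
uint64_t widen_from_ptr32(ptr32_t p) {
    return (uint64_t)p;
}
```

A language-level pointer qualifier would make the compiler insert the narrowing and widening implicitly; the explicit check here just shows why anything shared with the 32-bit payload has to live below 4 GiB.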