MinGW/MSVC++ use different ABIs for sret

Let's go straight to the example:

struct S {
  double dummy1;
  double dummy2;
};

S bar();

S foo() {
  return bar();
}

This is the result of g++ -c -S -O2 (focus on the final `ret'):

__Z3foov:
LFB0:
  pushl %ebp
LCFI0:
  movl %esp, %ebp
LCFI1:
  pushl %ebx
LCFI2:
  subl $20, %esp
LCFI3:
  movl 8(%ebp), %ebx
  movl %ebx, (%esp)
  call __Z3barv
  pushl %eax
  movl %ebx, %eax
  movl -4(%ebp), %ebx
  leave
  ret $4

This is the result of cl -O2 -c -Fa (again, focus on the final `ret'):

PUBLIC ?foo@@YA?AUS@@XZ ; foo
EXTRN ?bar@@YA?AUS@@XZ:PROC ; bar
; Function compile flags: /Ogtpy
; COMDAT ?foo@@YA?AUS@@XZ
_TEXT SEGMENT
$T2548 = -16 ; size = 16
$T2546 = 8 ; size = 4
?foo@@YA?AUS@@XZ PROC ; foo, COMDAT
; File c:\dev\exp\bar.cpp
; Line 8
  sub esp, 16 ; 00000010H
; Line 9
  lea eax, DWORD PTR $T2548[esp+16]
  push eax
  call ?bar@@YA?AUS@@XZ ; bar
  mov ecx, DWORD PTR $T2546[esp+16]
  mov edx, DWORD PTR [eax]
  mov DWORD PTR [ecx], edx
  mov edx, DWORD PTR [eax+4]
  mov DWORD PTR [ecx+4], edx
  mov edx, DWORD PTR [eax+8]
  mov eax, DWORD PTR [eax+12]
  mov DWORD PTR [ecx+8], edx
  mov DWORD PTR [ecx+12], eax
  mov eax, ecx
; Line 10
  add esp, 20 ; 00000014H
  ret 0
?foo@@YA?AUS@@XZ ENDP ; foo

Please note how g++ pops 4 bytes from the stack on return (`ret $4'),
while cl doesn't (`ret 0'). This is reflected in the call to `bar' too,
where the caller takes that into account.

LLVM generates code that follows the gcc behaviour. As a result, after
LLVM code calls a VC++ function that returns a struct, the stack is
corrupted. The "solution" is to never mark external VC++ functions as
sret, but this breaks if the external function was compiled by gcc, or
if you pass an LLVM callback that returns a struct to a VC++ function,
etc.

I filed a bug yesterday ( http://llvm.org/bugs/show_bug.cgi?id=5046 )
and Anton kindly explained that LLVM is doing the right thing as per the
ABI (the GCC ABI, I'll add).

1. Is there an LLVM way of dealing with this without using separate code
for VC++ and GCC?

2. Is there a document that thoroughly explains the ABI used by VC++?
The documentation on MSDN is quite vague
( Argument Passing and Naming Conventions | Microsoft Docs ).

3. Is it a bug that LLVM does not distinguish between GCC and VC++ sret
handling?

4. Why the heck do GCC and VC++ follow different ABIs on the same
platform?

The last question is rhetorical.

> I filed a bug yesterday ( http://llvm.org/bugs/show_bug.cgi?id=5046 )
> and Anton kindly explained that LLVM is doing the right thing as per the
> ABI (the GCC ABI, I'll add).
>
> 1. Is there an LLVM way of dealing with this without using separate code
> for VC++ and GCC?

I'm not sure what you mean... LLVM can distinguish between MinGW and
MSVC targets. If we want to, it shouldn't be too hard to make
X86TargetLowering::LowerFormalArguments and
X86TargetLowering::LowerCall account for the difference.

> 3. Is it a bug that LLVM does not distinguish between GCC and VC++ sret
> handling?

Probably... see GCC bug 36834, "structure return ABI for windows targets differs from native MSVC".

-Eli

Hello Eli.

Eli Friedman <eli.friedman@gmail.com> writes:

>> I filed a bug yesterday ( http://llvm.org/bugs/show_bug.cgi?id=5046 )
>> and Anton kindly explained that LLVM is doing the right thing as per the
>> ABI (the GCC ABI, I'll add).
>>
>> 1. Is there an LLVM way of dealing with this without using separate code
>> for VC++ and GCC?
>
> I'm not sure what you mean... LLVM can distinguish between MinGW and
> MSVC targets. If we want to, it shouldn't be too hard to make
> X86TargetLowering::LowerFormalArguments and
> X86TargetLowering::LowerCall account for the difference.

Yes, automatically switching to the MSVC ABI when the target triple is
*-pc-win32 seems the Right Thing.

>> 3. Is it a bug that LLVM does not distinguish between GCC and VC++ sret
>> handling?
>
> Probably... see GCC bug 36834, "structure return ABI for windows targets differs from native MSVC".

Great! I googled for two hours and found neither that report nor the
documents linked from it :-/

I'll file an LLVM bug report about this issue.

BTW, it's even worse, as aggregates passed by value are, well... passed
by value, contrary to the 386 unix ABI which uses pointers. I'm afraid
that this has no solution as easy as the one for the sret issue. Is
there any LLVM target where aggregates are "really" passed by value?

Thanks Eli.

Óscar Fuentes <ofv@wanadoo.es> writes:

> BTW, it's even worse, as aggregates passed by value are, well... passed
> by value, contrary to the 386 unix ABI which uses pointers. I'm afraid
> that this has no solution as easy as the one for the sret issue. Is
> there any LLVM target where aggregates are "really" passed by value?

This is not entirely true. MSVC never uses pointers for passing a
class/struct by value; GCC uses a pointer if the class/struct is not a
POD.

This is listed in http://www.agner.org/optimize/calling_conventions.pdf,
page 19.

Hi,

is there any news about this problem?
How hard will it be to resolve this compatibility problem with Visual Studio
functions?

This is a really big problem for my project, which has to call many different
functions compiled by Visual Studio!

Here is another problem that looks similar:
http://llvm.org/bugs/show_bug.cgi?id=5046

DevOllvm <DevOmem@web.de> writes:

> Hi,
>
> is there any news about this problem?
> How hard will it be to resolve this compatibility problem with Visual Studio
> functions?

Fixing the sret issue is easy: LLVM should simply ignore the sret
attribute for *-pc-win32 targets. You can work around the problem in
your compiler by not using the sret attribute. That's what I did.

The problem with C++ non-POD classes as function arguments will be much
harder to fix. Most probably it will require extending LLVM to accept
and call copy constructors and destructors on `byval' arguments.

> This is a really big problem for my project, which has to call many different
> functions compiled by Visual Studio!
>
> Here is another problem that looks similar:
> http://llvm.org/bugs/show_bug.cgi?id=5046

The bug id for the sret issue is 5058. The other one, about C++ non-POD
function arguments, is 5064. Please note that the workaround suggested
there is not correct: apart from the other potential issues I mention,
it does exactly the same as `byval' does now.