x86_64-pc-win32 ABI var arg code gen bug? Is the bitcode correct? Or is it the code gen?

It looks like the compiler does not generate the correct code for x86_64-pc-win32. Is the spill of the argument registers to the 32-byte shadow space on the caller's stack missing from the bitcode?

I have some code (attached as v.c):

int
ShellPrintHiiEx (
int Col,
int Row,
const char *Language,
const void *HiiFormatStringId,
const void *HiiFormatHandle,
...
)
{
VA_LIST Marker;
int Value;

VA_START (Marker, HiiFormatHandle);
Value = ReturnMarker (Marker);
VA_END(Marker);

return Value;
}

clang -ccc-host-triple x86_64-pc-win32 -emit-llvm -S v.c

declare void @llvm.va_start(i8*) nounwind

declare void @llvm.va_end(i8*) nounwind

define i32 @ShellPrintHiiEx(i32 %Col, i32 %Row, i8* %Language, i8* %HiiFormatStringId, i8* %HiiFormatHandle, ...) nounwind {
%1 = alloca i32, align 4
%2 = alloca i32, align 4
%3 = alloca i8*, align 8
%4 = alloca i8*, align 8
%5 = alloca i8*, align 8
%Marker = alloca i8*, align 8
%Value = alloca i32, align 4
store i32 %Col, i32* %1, align 4
store i32 %Row, i32* %2, align 4
store i8* %Language, i8** %3, align 8
store i8* %HiiFormatStringId, i8** %4, align 8
store i8* %HiiFormatHandle, i8** %5, align 8
%6 = bitcast i8** %Marker to i8*
call void @llvm.va_start(i8* %6)
%7 = load i8** %Marker, align 8
%8 = call i32 @ReturnMarker(i8* %7)
store i32 %8, i32* %Value, align 4
%9 = bitcast i8** %Marker to i8*
call void @llvm.va_end(i8* %9)
%10 = load i32* %Value, align 4
ret i32 %10
}

So for x86_64-pc-win32, Col (%rcx), Row (%rdx), Language (%r8), and HiiFormatStringId (%r9) should be spilled to the 32-byte space allocated on the caller's stack? It looks like they are being spilled locally instead.

Does this mean the bitcode needs to be generated differently for x86_64-pc-win32, or does magic occur when code is generated and there is a bug in that chunk of code?

clang -ccc-host-triple x86_64-pc-win32 -S v.c

.globl ShellPrintHiiEx
.align 16, 0x90
ShellPrintHiiEx: # @ShellPrintHiiEx

# BB#0:

pushq %rbp
.Ltmp4:
movq %rsp, %rbp
.Ltmp5:
subq $80, %rsp
.Ltmp6:
movq 48(%rbp), %rax
movl %ecx, -4(%rbp)
movl %edx, -8(%rbp)
movq %r8, -16(%rbp)
movq %r9, -24(%rbp)
movq %rax, -32(%rbp)
leaq 48(%rbp), %rax
movq %rax, -40(%rbp)
movq %rax, %rcx
callq ReturnMarker
movl %eax, -44(%rbp)
addq $80, %rsp
popq %rbp
ret

Col (%rcx), Row (%rdx), Language (%r8), and HiiFormatStringId (%r9) are spilled to the wrong location.

Thanks,

Andrew Fish

v.c (3.64 KB)

Andrew,

That is not a clang issue.

I think, in practice, {rcx, rdx, r8, r9} might not need to be spilled
to the "home area" in that case,
because va_arg would not touch the first 4 args.
Let me know if you have issues.

I know it must be suboptimal: the "home area" would be left vacant in any case, afaik.
It would be better if the 4 args were spilled into the home area.
Working on this might be harder, I guess. Thank you.
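
To illustrate the point about va_arg not touching the first 4 args, here is a minimal C sketch with the same shape as ShellPrintHiiEx. The names five_named_then_varargs and first_vararg are hypothetical and not taken from v.c; first_vararg only stands in for ReturnMarker, whose body is not shown in this thread. Since va_start is anchored on the fifth named parameter, which the Win64 convention passes on the stack, the list should begin past the home area, so the four register arguments are never read through it.

```c
#include <stdarg.h>

/* Hypothetical stand-in for ReturnMarker: reads one int from the list. */
static int first_vararg(va_list marker)
{
    return va_arg(marker, int);
}

/* Same shape as ShellPrintHiiEx: five named parameters, then varargs. */
int five_named_then_varargs(int col, int row, const char *language,
                            const void *string_id, const void *handle, ...)
{
    va_list marker;
    int value;

    /* The four register-passed arguments are not needed to walk the list. */
    (void)col;
    (void)row;
    (void)language;
    (void)string_id;

    va_start(marker, handle);      /* list starts after the fifth argument */
    value = first_vararg(marker);  /* consumes the first variadic argument */
    va_end(marker);

    return value;
}
```

Calling five_named_then_varargs(1, 2, "en", 0, 0, 42) should return 42 on any conforming target, whether or not the first four arguments were written to the home area.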

...Takumi

I'm seeing a code gen issue with x86_64-pc-win32-darwin for this test case, so I was looking at the assembly for code gen and ABI issues. I agree that for this test case the spill is code that could be optimized out.

It looks like my bug was related to sizeof (va_list) being incorrect for my triple; it was probably set to struct __va_list_tag and not char*. I tried a local fix, but I think that fix was broken. I noticed that top of tree has fixed the issue.
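
A quick way to check which va_list layout a given triple selects is to look at its size. The sketch below is only an assumption about how one might test this, and the helper names va_list_size and va_list_is_pointer_sized are made up for illustration. Under the Win64 convention va_list is a plain char*, so it is pointer sized; under the System V x86-64 convention it is an array of one struct __va_list_tag, which is 24 bytes.

```c
#include <stdarg.h>
#include <stddef.h>

/* Size of va_list as the compiler sees it for the current target. */
size_t va_list_size(void)
{
    return sizeof(va_list);
}

/* Nonzero when va_list has the size of a plain pointer (the char* layout). */
int va_list_is_pointer_sized(void)
{
    return sizeof(va_list) == sizeof(char *);
}
```

Compiling this for the triple in question and printing va_list_size() should show 8 if the char* layout is in effect on a 64-bit Windows target.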

Thank you for the quick response, it was helpful for me to fully understand what is going on.

Thanks,

Andrew