>> In the case of X86-64, llvm-gcc does use aggregate return (for the
>> interesting cases which return things in registers) and it does do the
>
> I don't follow. By "aggregate return" do you mean "structs as first class
> values?" That is, llvm-gcc generates a return of a struct by value?

Yes, consider:
struct foo { double X; long Y; };
struct foo test(double *P1, long *P2) {
  struct foo F;
  F.X = *P1;
  F.Y = *P2;
  return F;
}
we compile this to:
%struct.foo = type { double, i64 }
define %struct.foo @test(double* %P1, i64* %P2) nounwind {
entry:
  load double* %P1, align 8 ; <double>:0 [#uses=1]
  load i64* %P2, align 8 ; <i64>:1 [#uses=1]
  %mrv3 = insertvalue %struct.foo undef, double %0, 0 ; <%struct.foo> [#uses=1]
  %mrv4 = insertvalue %struct.foo %mrv3, i64 %1, 1 ; <%struct.foo> [#uses=1]
  ret %struct.foo %mrv4
}
which was previously (before first class aggregates got enabled yesterday):
define %struct.foo @test(double* %P1, i64* %P2) nounwind {
entry:
  load double* %P1, align 8 ; <double>:0 [#uses=1]
  load i64* %P2, align 8 ; <i64>:1 [#uses=1]
  ret double %0, i64 %1
}
and both produce this machine code:
_test:
        movq (%rsi), %rax
        movsd (%rdi), %xmm0
        ret
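
For the caller's side of the same small-struct case, here is a minimal C sketch. The names make_foo and sum_fields are hypothetical, made up for illustration; only the struct layout comes from the example above. On x86-64 one would expect the caller to pick the two fields up from %xmm0 and %rax, as in the listing, but that depends on the compiler and ABI in use.

```c
#include <assert.h>

struct foo { double X; long Y; };

/* Same shape as test() above; make_foo is a hypothetical name chosen
   to avoid clashing with anything else in the translation unit. */
struct foo make_foo(const double *P1, const long *P2) {
  struct foo F;
  F.X = *P1;
  F.Y = *P2;
  return F;            /* returned by value: no out-pointer in the source */
}

/* Hypothetical caller: consumes the returned aggregate field by field. */
double sum_fields(double D, long L) {
  struct foo R = make_foo(&D, &L);
  assert(R.Y == L);    /* the integer field round-trips unchanged */
  return R.X + (double)R.Y;
}
```

Nothing in the C source says which registers carry the result; that mapping is exactly the part the front-end and code generator have to agree on.
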
>> right thing. However, returning a {i64, i64, i64, i64} by value and
>> having it automatically be returned "by pointer" is less interesting,
>
> What do you mean by "less interesting?"

There are already other ways to handle this, rather than returning the entire aggregate by value. For example, we compile:
struct foo { double X; long Y, Z; };
struct foo test(double *P1, long *P2) {
  struct foo F;
  F.X = *P1;
  F.Y = *P2;
  return F;
}
into:
%struct.foo = type { double, i64, i64 }
define void @test(%struct.foo* noalias sret %agg.result, double* %P1, i64* %P2) nounwind {
entry:
  load double* %P1, align 8 ; <double>:0 [#uses=1]
  load i64* %P2, align 8 ; <i64>:1 [#uses=1]
  getelementptr %struct.foo* %agg.result, i32 0, i32 0 ; <double*>:2 [#uses=1]
  store double %0, double* %2, align 8
  getelementptr %struct.foo* %agg.result, i32 0, i32 1 ; <i64*>:3 [#uses=1]
  store i64 %1, i64* %3, align 8
  ret void
}
which has no first class aggregates. When the struct is very large (e.g., containing an array), you REALLY REALLY do not want to use first-class aggregate return; you want to return explicitly by pointer so the memcpy is explicit in the IR.
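
At the source level, the "return explicitly by pointer" shape looks roughly like the sketch below. This is only an illustration: struct big, make_big, and demo_big are hypothetical names, and the 256-element array is an arbitrary stand-in for "very large".

```c
#include <assert.h>
#include <string.h>

/* Hypothetical large aggregate; the name and the array size are
   illustrative only. */
struct big { double X; long Data[256]; };

/* Return by explicit pointer: the caller owns the storage and the callee
   writes straight into it, so every bulk memory operation is visible. */
void make_big(struct big *Result, const double *P1, const long *P2) {
  memset(Result, 0, sizeof *Result);   /* the one bulk write, stated explicitly */
  Result->X = *P1;
  Result->Data[0] = *P2;
}

/* Small usage check for the sketch above. */
void demo_big(void) {
  double D = 1.5;
  long L = 42;
  struct big B;
  make_big(&B, &D, &L);
  assert(B.X == 1.5);
  assert(B.Data[0] == 42 && B.Data[255] == 0);
}
```

This mirrors the sret form shown above: the destination is an ordinary pointer argument, and nothing is copied behind your back.
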
>> AFAIK, llvm-gcc/g++ does an *extremely* good job of matching the
>> X86-64 ABI on mainline.
>
> But that's all implemented within llvm-gcc. LLVM codegen right now
> does not implement the ABI correctly.

Getting the ABI right requires the front-end to do target-specific work. Without exposing the entire C (and every other language) type through to the code generator, there is no good solution for this. We are working to incrementally improve things though. Expecting the code generator to just magically handle all your ABI issues for you is wishful thinking.
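
To make "target-specific work" a bit more concrete, here is a deliberately oversimplified C sketch of the kind of decision a front-end makes for x86-64. It is not how llvm-gcc implements it: returns_via_sret and demo_classify are hypothetical names, and the rule used (anything over 16 bytes goes through memory) ignores vector types, x87 long double, unaligned fields and other cases the real ABI classification handles. It does agree with the two examples above: the 16-byte struct comes back in registers and the 24-byte one goes through sret.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* The two layouts from the examples above, written with fixed-width
   integers so the sizes do not depend on the platform's `long`. */
struct foo2 { double X; int64_t Y; };        /* 16 bytes */
struct foo3 { double X; int64_t Y, Z; };     /* 24 bytes */

/* Hypothetical helper, simplified assumption: on x86-64 an aggregate
   larger than 16 bytes is returned through a hidden pointer (sret);
   smaller ones may be returned in registers. */
bool returns_via_sret(size_t SizeInBytes) {
  return SizeInBytes > 16;
}

/* Checks the rule against the cases discussed in this thread. */
void demo_classify(void) {
  assert(!returns_via_sret(sizeof(struct foo2)));   /* registers */
  assert(returns_via_sret(sizeof(struct foo3)));    /* sret */
  assert(returns_via_sret(4 * sizeof(int64_t)));    /* {i64, i64, i64, i64} */
}
```

Even this toy version needs the source-level type (here just its size); the real rules need the field types too, which is why the decision sits in the front-end.
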
-Chris