Deviation from x86-64 calling convention

I was implementing the classification algorithm from the x86-64 ABI in my compiler, and I found a corner case where clang’s calling convention appears to deviate from the letter of the specification. The spec says that if a chunk of an argument type is classified X87UP but the preceding chunk is not classified X87, the entire argument should be passed in memory. However, clang appears to pass the argument chunk as if it were classified SSE instead. For example, the following snippet:

union nutty_t {
  int x;
  long double y;
};

union nutty_t foo(union nutty_t x) { return x; }

gets compiled to:

%0 = type { i64, double }
%union.nutty_t = type { x86_fp80 }

define %0 @foo(i64 %x.coerce0, double %x.coerce1) nounwind uwtable ssp {
%1 = alloca %union.nutty_t, align 16
%x = alloca %union.nutty_t, align 16
%2 = bitcast %union.nutty_t* %x to %0*
%3 = getelementptr %0* %2, i32 0, i32 0
store i64 %x.coerce0, i64* %3
%4 = getelementptr %0* %2, i32 0, i32 1
store double %x.coerce1, double* %4
%5 = bitcast %union.nutty_t* %1 to i8*
%6 = bitcast %union.nutty_t* %x to i8*
call void @llvm.memcpy.p0i8.p0i8.i64(i8* %5, i8* %6, i64 16, i32 16, i1 false)
%7 = getelementptr %union.nutty_t* %1, i32 0, i32 0
%8 = bitcast x86_fp80* %7 to %0*
%9 = load %0* %8, align 1
ret %0 %9
}

According to my understanding, the type should be passed byval and returned sret instead.
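For comparison, here is a hand-written sketch (not actual compiler output) of the IR I would expect under a strict reading of the spec, with the union passed byval on the stack and the result returned through an sret pointer:

```llvm
%union.nutty_t = type { x86_fp80 }

; sketch only: argument in memory (byval), result via hidden sret pointer
define void @foo(%union.nutty_t* noalias sret %agg.result,
                 %union.nutty_t* byval align 16 %x) nounwind {
  %1 = bitcast %union.nutty_t* %agg.result to i8*
  %2 = bitcast %union.nutty_t* %x to i8*
  call void @llvm.memcpy.p0i8.p0i8.i64(i8* %1, i8* %2, i64 16, i32 16, i1 false)
  ret void
}
```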


Yes... unfortunately, that case wasn't clear in the original version of
the standard, and versions of gcc that predate the clarification generate
the same code you're seeing from clang. clang uses the old gcc
behavior on OSX, and the spec-compliant behavior on other platforms.
If you're interested in the relevant code in clang, search for
honorsRevision0_98 in lib/CodeGen/TargetInfo.cpp.


Thanks for the reference, Eli. Out of curiosity, is this platform-specific behavior documented anywhere other than the gcc and clang source? Apple’s documentation states that Mac OS X follows AMD’s spec, without qualification.


For the following code,

class A {
   int x;
public:
   A(int v) : x(v) {}
};

void f() {
   A a(1);
}

FunctionDecl::isInlined() returns true for A's ctor A(int), because it is a CXXMethodDecl and it is not out-of-line.
Here is the implementation of Decl::isOutOfLine():

virtual bool isOutOfLine() const {
  return getLexicalDeclContext() != getDeclContext();
}

I suspect the logic is not correct for a CXXMethodDecl.

Decl::isOutOfLine is missing documentation, so it's hard to be sure what the
intent is -- it may be intended to apply to this declaration of the function
(for which it gives correct answers) rather than to the function in general.
You can get the behavior you want by asking the definition of the function
(and assuming that functions with no definition are out-of-line).
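For what it's worth, a minimal sketch of that suggestion (the helper name is my own; it relies on FunctionDecl::getDefinition(), which returns null when no redeclaration is a definition):

```cpp
// Sketch of Richard's suggestion: query the definition rather than an
// arbitrary redeclaration of the function.
static bool isFunctionOutOfLine(const clang::FunctionDecl *FD) {
  if (const clang::FunctionDecl *Def = FD->getDefinition())
    return Def->isOutOfLine();
  return true; // no definition anywhere: assume out-of-line
}
```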

- Richard

On 2011/12/27 1:00, Richard Smith wrote:

We’re aware of the discrepancy and are making an effort to resolve it.