Expected return value when making an Objective-C method call on a nil object

I was wondering if there is an agreed behaviour for Clang-compiled
Objective-C when making a method call on a nil object? If so, does
it define exactly what should be returned in all cases, or are any
cases left "undefined"?

With Clang releases 3.1, 3.2 and 3.3 I've been finding that the behaviour
varies between simple cases that differ only very subtly, when the method
called returns a C structure by value.

For example, given this class:

    #import <Foundation/Foundation.h>

    typedef struct {
        int x;
        int y;
        int z;
    } Vec3;

    @interface Fuzz : NSObject {
        Vec3 data;
    }
    @property Vec3 data;
    @end

    @implementation Fuzz
    @synthesize data;
    @end

This code will leave the variable "vec" uninitialised:

    Fuzz * obj = nil;
    Vec3 vec = obj.data;

Whereas this code will initialise the fields of "vec" all to zero:

    Fuzz * obj = nil;
    Vec3 vec;
    vec = obj.data;

I've studied the assembly code generated for x86 and ARM both in
unoptimised and optimised builds, and it would appear that return
value optimisation is applied regardless in the first example (even
with -O0).

This results in "vec" remaining uninitialised: although the compiler
generates an initialised temporary on the stack, it is never copied into
"vec", nor is "vec" explicitly initialised to zero if obj is nil.

If obj is not nil, obviously the returned values go straight into vec
as one would expect with RVO.
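
Until the behaviour is settled, one defensive option is to avoid relying on the nil-receiver result at all: zero the destination first and only read from the object when it is non-nil. The sketch below models that in plain C (which is also valid Objective-C). The `Fuzz` struct here is a hypothetical plain-C stand-in for the class above, and `fuzz_data_or_zero` is an illustrative helper name, not an existing API.

```c
#include <stddef.h>

typedef struct { int x, y, z; } Vec3;

/* Hypothetical plain-C stand-in for the Fuzz class in the post. */
typedef struct { Vec3 data; } Fuzz;

/* Illustrative helper (an assumption, not an existing API):
 * returns obj->data, or an all-zero Vec3 when obj is NULL. */
static Vec3 fuzz_data_or_zero(const Fuzz *obj) {
    Vec3 result = {0, 0, 0};   /* destination is zeroed before anything else */
    if (obj != NULL) {
        result = obj->data;    /* only read from a non-nil receiver */
    }
    return result;
}
```

In Objective-C the same idea would amount to initialising "vec" to zeros and guarding the property read with an explicit nil check, so the outcome no longer depends on how the compiler handles the nil path.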

In reply to Nick Tuckett <nick.tuckett@...>:

I have been investigating this. It only happens with -fgnu-runtime;
the NeXT runtime appears to handle it correctly in both the two-line and
three-line cases (in so far as the aggregate is cleared when the method is
called with a nil pointer).

The fact that it works correctly in the three-line case with the GNU runtime
appears to be an unintended side effect of uncertainty about pointer aliasing.
Specifically, AggExprEmitter::EmitMoveFromReturnSlot in CGExprAgg.cpp
emits directly into the return slot unless the pointer could be aliased,
as it could be in the three-line case.

This optimisation is fine, except in the case of ObjC nil receivers.
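
To make the failure mode concrete, here is a minimal plain-C model of the situation described above. It is only a sketch under stated assumptions, not Clang's actual code generation: the struct result is written through a caller-supplied return slot, and the "message send" is modelled as a nil check around the call. All names (`Fuzz` as a plain struct, `fuzz_get_data`, `send_data_skipping`, `send_data_zeroing`) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct { int x, y, z; } Vec3;

/* Hypothetical plain-C stand-in for the Fuzz class in the post. */
typedef struct { Vec3 data; } Fuzz;

/* Models the getter: the aggregate is written through a
 * caller-supplied return slot rather than returned in registers. */
static void fuzz_get_data(const Fuzz *self, Vec3 *ret_slot) {
    assert(self != NULL);      /* the method body never runs for a nil receiver */
    *ret_slot = self->data;
}

/* Models the reported behaviour: the nil branch skips the call
 * and never touches the slot the caller is about to read. */
static void send_data_skipping(const Fuzz *receiver, Vec3 *ret_slot) {
    if (receiver != NULL) {
        fuzz_get_data(receiver, ret_slot);
    }
}

/* Models the expected behaviour: the nil branch clears the slot. */
static void send_data_zeroing(const Fuzz *receiver, Vec3 *ret_slot) {
    if (receiver != NULL) {
        fuzz_get_data(receiver, ret_slot);
    } else {
        memset(ret_slot, 0, sizeof *ret_slot);
    }
}
```

If the caller passes the address of "vec" itself as the return slot, the skipping variant leaves whatever was already in "vec" untouched for a nil receiver, while the zeroing variant clears it. That is the difference between emitting directly into the destination with and without handling the nil path.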

BTW: I notice that even in cases where the copy is elided and the result is
emitted directly into the destination, space is still redundantly reserved
and cleared for a copy of the aggregate (GNU runtime), which is then never
used. Clearly the assumption is that this copy is always supposed to be used.