Comparison of pointers between Objc subclasses

Hi,

I'm not sure if this has already been discussed earlier, sorry if that's the case.

I noticed that clang (-fsyntax-only) produces the following warning:

MyClass.m:15:11: warning: comparison of distinct pointer types ('NSMutableArray *' and 'NSArray *')
if (array != newArray)
     ~~~~~ ^ ~~~~~~~~

Since NSMutableArray is a subclass of NSArray, the warning doesn't seem really appropriate.

Thanks,
Thomas

#import <Foundation/Foundation.h>

@interface MyClass
{
  NSMutableArray *array;
}
- (void)setArray:(NSArray *)newArray;
@end

@implementation MyClass
- (void)setArray:(NSArray *)newArray
{
  if (array != newArray)
  {
    [array release];
    array = [newArray mutableCopy];
  }
}
@end

Probably caused by the fact that ASTContext::typesAreCompatible isn't
commutative for ObjC types; clang currently considers (Base, Derived)
to be compatible, while (Derived, Base) isn't. I actually have a
patch that will fix that, but I'm not sure about some of the
expectations for type compatibility and ObjC.

Considering two ObjC pointers compatible in the C99 sense makes a lot
of things legal which probably shouldn't be legal: for example,
redeclaring a pointer to a base type as a pointer to a derived type is
legal (illegal in C++), assigning a base type pointer to a derived
type pointer is legal (requires a static_cast in C++), and assigning a
pointer to a derived type pointer to a pointer to a base type pointer
is legal (requires a reinterpret_cast in C++). Because of situations
like these, C++ doesn't have type compatibility in the same sense;
legal combinations of operands are defined on a case-by-case basis.
What exactly are the rules supposed to be for ObjC? (Or, if the rules
aren't written down anywhere, what does gcc do?)

-Eli

At least for GCC:

   id generic;
   NSObject *base = nil;
   NSArray *array = nil;
   NSMutableArray *subArray = nil;

   base = array; // OK
   array = base; // Warning: assignment from distinct Obj-C Type

   array = subArray; // OK
   subArray = array; // Warning: assignment from distinct Obj-C Type

   generic = array; // OK
   array = generic; // OK

Are pointer type compatibility checks for assignments and conversions being handled in the same way? I can understand why assigning a base type to a derived type is dangerous, but it seems to me that comparisons like these should be legal. I'm only raising this point because the code snippet with the warning involved a comparison, not an assignment.

Assignment (from C99 6.5.16.1):
One of the following shall hold:
<snip>
both operands are pointers to qualified or unqualified versions of
compatible types, and the type pointed to by the left has all the
qualifiers of the type pointed to by the right;
<snip>

Comparison (from C99 6.5.9):
One of the following shall hold:
<snip>
both operands are pointers to qualified or unqualified versions of
compatible types;
<snip>

And currently, we (inconsistently) consider ObjC base types and
derived types compatible in the C99 sense, which is implemented in
ASTContext::typesAreCompatible.

-Eli

Okay... would you mind checking few more potentially interesting cases?
Redeclaration (i.e. nsarray*a;nsmutablearray*a;)
Relational operators (<= and friends)
Subtraction (i.e. nsarray-nsmutablearray)
Assigning pointer to pointer (i.e. nsmutablearray**=nsarray**, and vice versa)

I'd do it myself, but I don't know enough objc to write a testcase myself.

-Eli

Thanks Eli.

C99 obviously doesn't stipulate the compatibility of pointers to Objective-C classes, which gets back to the original issue. In this regards, C99 6.5.9 might not even apply when comparing Objective-C types ("compatibility" is defined by an extended type system that is well beyond the scope of C99). This of course was the original issue. Because there is a subclassing relationship here between NSArray and NSMutableArray (which is a subtyping relationship from a PL theory standpoint), it makes sense to me that the pointer types are compatible, but I'm not an Objective-C lawyer.

Incidentally, comparing a pointer to a derived type is with a pointer to a base type is okay in C++ (at least as far as C++ classes and g++ -pedantic is concerned; I'm not certain what the standard says):

  $ cat t.cpp

  class A {};
  class B : public A {};

  bool f(A* a, B* b) { return a == b; }

  $ g++ -pedantic -fsyntax-only t.cpp
  $

This example appears to be essentially the same as the one Thomas presented in his original email, except that we are dealing with C++ classes, not Objective-C classes. Incidentally, the following Objective-C example also passes with gcc -pedantic:

  $ cat t.m

  @interface A { int x; } @end
  @interface B : A { int y; } @end

  int f(A* a, B* b) {
    return a == b;
  }

  $ gcc -pedantic -fsytnax-only t.m
  $

If we remove the subtyping relationship between B and A, however, gcc issues a warning:

  $ cat t2.m

  @interface A { int x; } @end
  @interface B { int y; } @end

  int f(A* a, B* b) {
    return a == b;
  }

  $ gcc -fsyntax-only -pedantic t2.m
  t2.m: In function ‘f’:
  t2.m:5: warning: comparison of distinct Objective-C types lacks a cast
  $

Now lets look at assignments (t3.m is the same as t.m except the comparison is now an assignment). Notice that B subclasses A:

  $ cat t3.m

  @interface A { int x; } @end
  @interface B : A { int y; } @end

   A* f(A* a, B* b) {
     return a = b;
   }

  $ gcc -pedantic -fsyntax-only t3.m
  $

What happens if we assign a to b (where B still subclasses A)?

  $ cat t4.m

  @interface A { int x; } @end
  @interface B : A { int y; } @end

   A* f(A* a, B* b) {
    return b = a;
   }

  $ gcc -pedantic -fsyntax-only t4.m
  t4.m: In function ‘f’:
  t4.m:5: warning: assignment from distinct Objective-C type

GCC's diagnostics are horrible, but I think this experiment illustrates that two Objective-C pointer types can be compared if one pointee subclasses the other (a subtyping relationship), and in an assignment two types are compatible if the type of the right expression subtypes the left expression. This makes sense because B* can be treated as an A*, but an A* cannot always be treated as a B* (since B is more specialized, and that A* might actually be a C*, where C subclasses A, but is neither a parent or subclass of B).

So, I think the answers we are looking for are outside the letter of C99 standard. The Objective-C and C++ object type systems are a completely different animal.

Okay, so in that case typesAreCompatible should say that non-identical
ObjC types aren't compatible, and the necessary handling for
comparison/assignment/conditionals/whatever else is needed for ObjC
pointers should move into SemaExpr.

-Eli

That sounds like a reasonable plan. Incidentally, what does:

   __builtin_type_compatible_p

do for these pointer types? I've never even used this builtin, so I'm not quite certain how to use it in practice. I only ask because we use ASTContext::typesAreCompatible to implement Expr::isIntegerConstantExpr for TypesCompatibleExpr.

If we can cross-check GCC's behavior with this builtin, that should give a definitive answer on whether or not NSArray* and NSMutableArray* should be considered compatible types (and hence put the logic in ASTContext::typesAreCompatible) or, if not, put (as you suggest) the necessary handling for ObjC pointers in SemaExpr.

Comparisons between ObjC pointers should contain an implicit cast to id. I tweaked CodeGen to do this and sent the patch a month or so ago, but it wasn't an ideal solution since the problem is really in Sema, and it only fixed a few cases where Sema was doing almost the right thing but omitting the implicit cast.

Currently there are lots of Objective-C type-system related bugs in clang. I hope to have some time in the next few weeks to be able to take a look at it. The Objective-C type system doesn't map very cleanly to the LLVM type system, unfortunately, and there are lots of places where implicit casts need to be emitted but aren't.

David

Are pointer type compatibility checks for assignments and conversions
being handled in the same way? I can understand why assigning a base
type to a derived type is dangerous, but it seems to me that
comparisons like these should be legal. I'm only raising this point
because the code snippet with the warning involved a comparison, not
an assignment.

Comparisons between ObjC pointers should contain an implicit cast to
id. I tweaked CodeGen to do this and sent the patch a month or so
ago, but it wasn't an ideal solution since the problem is really in
Sema, and it only fixed a few cases where Sema was doing almost the
right thing but omitting the implicit cast.

Wouldn't this implicit cast prevent the "comparison of distinct Objective-C types lacks a cast" warning ?

I think gcc emit this warning, because if gcc is right (the two pointer does not have compatible type), the comparaison will always be false.

It 's like when you try to compare a negative signed value with an unsigned value. considere this:

unsigned i;
...
if (-1 == i) ...

In this case, it's safe/legal to compare the two values, and the condition may even be true, but gcc emit a warning (but it look like clang does not says anything about it).
Note that I have no idea about what standard says about this case.

I also saw that "clang -pedantic" properly handles the "incompatible pointer types assigning…" warning.

No. The warning is just a warning. Objective-C type tags are informative, nothing more. The type of an Objective-C object is a property of the object, not of the variable holding the object. The warning says 'you think these two objects have different types, but you are still comparing them, are you sure that's sensible?' The implicit cast should be inserted so that CodeGen will generate the correct code, rather than failing because you are comparing distinct pointer types.

The same is true of an assignment. It is perfectly legal to assign any object pointer to any other object pointer in Objective-C, it just isn't usually sensible, which is why you get a warning. In some situations, however, it is useful. You might have something like this:

- (void) method:(A*)anObject
{
  if ([anObject isKindOfClass:[B class]])
  {
    [self methodExpectingB:(B*)anObject];
  }
  else
    ...
}

This can be required because Objective-C objects have two types - the implicit type determined by their signature and their structural type. The tag type carries both implicitly. You can safely pass object B to something expecting A if it implements every method of A. You may also want to override a method which expects an A, and have a special case for objects of type B, which either accesses their instance variables directly, or uses some methods only present in B. Of course, this is better done with protocols, but that requires you to have access to the class which you are subclassing's source code.

David

Sorry, didn't notice it was about implicit cast into CodeGen (and so should not have any impact on parse and Sema, and so on warning).
Of course, I fully agree with you. The Obj-C runtime don't care about what kind of object you pass, as long as it's first field is an isa pointer (at least for the Apple runtime) and it can dynamically dispatch method on it (even if the object does not directly handle the message and uses forwarding or dynamic method generation to handle it).