[PATCH]es: Objective-C lightweight generics

Hi all,

Last month, Apple introduced a new lightweight generics system into Objective-C. The attached patch series provides the complete implementation of this feature, the related “kindof” types feature, supporting warnings, test cases, and so on. It should get us back to the point where we can parse the headers for Apple’s latest SDKs.

For those of you wondering what this whole lightweight generics thing is, I suggest either watching the following WWDC video, starting at 21:45:

https://developer.apple.com/videos/wwdc/2015/?id=401

or reading on for an explanation.

Lightweight generics allows Objective-C classes to be parameterized, e.g., NSArray can be parameterized on the type of its elements:

@interface NSArray<ObjectType> : NSObject
-(ObjectType)objectAtIndex:(NSInteger)index;
@end

Objective-C classes, categories, extensions, and forward declarations can all specify type parameters. Naturally, the type parameters across different declarations associated with the same class must be consistent.

For a given parameterized class, one can provide type arguments, e.g., to state the type of the elements of an NSArray:

NSArray<NSString *> *strings;
NSArray<NSNumber *> *numbers;

Naturally, messaging an object of specialized type substitutes the type arguments for the type parameters, e.g.,

[strings objectAtIndex: 0]; // produces an NSString *

[numbers objectAtIndex: 0]; // produces an NSNumber *

One can, of course, leave the type arguments unspecified:

NSArray *array;

NSArray<NSString *>, NSArray<NSNumber *>, and NSArray are each distinct types. There are implicit conversions, in either direction, between the “specialized” types (NSArray<NSString *>, NSArray<NSNumber *>) and the “unspecialized” type (NSArray *), but not between different specialized types. For example:

array = strings; // okay, dropping type arguments
numbers = array; // okay, adding type arguments
strings = numbers; // not okay! NSArray<NSString *> and NSArray<NSNumber *> are incompatible types

The “lightweight” in lightweight generics comes from the implementation model, which is based on type erasure (à la Java generics). All of the type argument information is completely erased by IR generation, so this feature requires no runtime or metadata changes. Together with the implicit conversion rules above, this means that some “obvious” errors will be missed by the type checker (and won’t be enforced by the runtime). For example, the sequence “array = strings; numbers = array;” is obviously bogus: the elements cannot be both NSStrings and NSNumbers. Sema won’t catch this, just like Sema wouldn’t catch an assignment chain that converts an NSNumber to an NSString by way of ‘id’.
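To make the erasure point concrete, here is a minimal sketch of the “array = strings; numbers = array;” sequence. The function name ErasedElementIsStillAString is made up for illustration and is not part of the patches. Under the conversion rules above this should compile cleanly, and because nothing is checked at runtime, the element that comes back out is still an NSString.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper for illustration only: returns YES if the object we
// read through an NSArray<NSNumber *> is really an NSString at runtime.
static BOOL ErasedElementIsStillAString(void) {
  NSArray<NSString *> *strings = @[ @"hello" ];
  NSArray *array = strings;              // okay, dropping type arguments
  NSArray<NSNumber *> *numbers = array;  // okay, adding type arguments (unchecked)

  // Statically an NSNumber *, dynamically whatever was stored in the array.
  NSNumber *notReallyANumber = [numbers objectAtIndex:0];
  return [notReallyANumber isKindOfClass:[NSString class]];
}
```

Neither Sema nor the runtime objects here; the mismatch only shows up if the caller later sends the object a message that NSString does not implement.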

Type parameters have a few advanced features. First of all, they can have upper bounds, such that any type argument must be a subtype of the upper bound of the corresponding type parameter. For example:

@interface MyMap<KeyType : id, ObjectType>
-(KeyType)firstKeyForObject:(ObjectType)object; // yes, this is a weird example; explanation below
@end

It’s an error to form a MyMap<T, U> where T isn’t a subtype of id. When the type bound is omitted, it is “id”, i.e., all type parameters/arguments are Objective-C object types. Type bounds are also important when using unspecialized types (e.g., MyMap *); we’ll come back to that later.

Type parameters can also be co- or contra-variant, similarly to C#’s generic type parameters (see, e.g., https://msdn.microsoft.com/en-us/library/dd799517(v=vs.110).aspx). This is particularly useful for the immutable collection classes, which we’ll use for exposition purposes. NSArray is actually declared as:

@interface NSArray<__covariant ObjectType> : NSObject
-(ObjectType)objectAtIndex:(NSInteger)index;
@end

Because ObjectType is covariant, NSArray<T> is implicitly convertible to NSArray<U> when T is a subtype of U, which allows, e.g.,

NSArray<NSMutableString *> *mutStrings;
NSArray<NSString *> *strings;
strings = mutStrings; // okay, because ObjectType is covariant (and NSArray is an immutable collection)

Contravariant type parameters go the opposite way, e.g., if ObjectType were declared __contravariant, NSArray<T> would be implicitly convertible to NSArray<U> when U is a subtype of T.
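As a rough sketch of both directions side by side, the following uses two classes invented for this example (Producer and Consumer are not in any SDK), and assumes ARC. A class that only hands values out can take a covariant parameter, and one that only takes values in can take a contravariant one.

```objc
#import <Foundation/Foundation.h>

// Producer and Consumer are hypothetical classes invented for this sketch.

// Only hands values out, so its parameter can safely be covariant.
@interface Producer<__covariant T> : NSObject
- (instancetype)initWithValue:(T)value;
- (T)value;
@end

// Type parameters are not in scope in the @implementation, so use id there.
@implementation Producer {
  id _value;
}
- (instancetype)initWithValue:(id)value {
  if ((self = [super init])) {
    _value = value;
  }
  return self;
}
- (id)value {
  return _value;
}
@end

// Only takes values in, so its parameter can safely be contravariant.
@interface Consumer<__contravariant T> : NSObject
- (void)consume:(T)value;
- (NSUInteger)consumedCount;
@end

@implementation Consumer {
  NSUInteger _consumedCount;
}
- (void)consume:(id)value {
  _consumedCount++;
}
- (NSUInteger)consumedCount {
  return _consumedCount;
}
@end

static BOOL VarianceDemo(void) {
  NSMutableString *text = [NSMutableString stringWithString:@"hi"];

  Producer<NSMutableString *> *mutProducer = [[Producer alloc] initWithValue:text];
  Producer<NSString *> *producer = mutProducer;  // okay: T is covariant

  Consumer<NSString *> *stringConsumer = [[Consumer alloc] init];
  Consumer<NSMutableString *> *mutConsumer = stringConsumer;  // okay: T is contravariant

  [mutConsumer consume:[mutProducer value]];
  return [[producer value] isEqualToString:@"hi"] &&
         [stringConsumer consumedCount] == 1;
}
```

Swapping either assignment around (e.g., assigning a Producer<NSString *> * to a Producer<NSMutableString *> *) should be diagnosed as an incompatible pointer conversion.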

Objective-C lightweight generics also involves a related feature called “kindof” types, which strike a balance between “id” and specific Objective-C pointer types. For example, given:

__kindof NSString *kindofStr;

we can implicitly convert to supertypes and subtypes, but not unrelated types. For example:

NSObject *object = kindofStr; // okay, implicitly converting to supertype
NSMutableString *mutString = kindofStr; // okay, implicitly converting to subtype
NSNumber *number = kindofStr; // not okay! NSString and NSNumber are incompatible

Naturally, the features compose: one can create an array of kindof types, e.g.,

NSArray<__kindof NSValue *> *values;

which can be very helpful when adopting lightweight generics. Kindof types are also important when working with unspecialized types. For example, let’s invent a class that only wants to work with subclasses of ‘View’:

@interface SomeClass<T : View *> // T must be View or a subclass thereof
- (T)view;
@end

As noted earlier, when we message SomeClass, we substitute in the type arguments for the type parameters, e.g.,

SomeClass<Button *> *sc;
[sc view]; // produces a “Button *”

But what happens if we don’t have type arguments, because we’re messaging an object of type ‘SomeClass *’? We produce a __kindof type from the type bound, which gives us good static type information without forcing the user to introduce a large number of casts:

SomeClass *sc;
[sc view]; // produces a “__kindof View *”
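For anyone who wants to try this, here is a self-contained sketch of the example. View, Button, and SomeClass are the invented classes from above; the stub implementations, the initWithView: initializer, and the KindofDemo function are assumptions added only so that the snippet compiles on its own (ARC assumed).

```objc
#import <Foundation/Foundation.h>

// Stub versions of the invented classes; the bodies are assumptions.
@interface View : NSObject
@end
@implementation View
@end

@interface Button : View
@end
@implementation Button
@end

@interface SomeClass<T : View *> : NSObject  // T must be View or a subclass thereof
- (instancetype)initWithView:(T)view;
- (T)view;
@end

// Inside the @implementation the type parameter is not in scope, so the
// methods are written in terms of the bound, View *.
@implementation SomeClass {
  View *_view;
}
- (instancetype)initWithView:(View *)view {
  if ((self = [super init])) {
    _view = view;
  }
  return self;
}
- (View *)view {
  return _view;
}
@end

static BOOL KindofDemo(void) {
  Button *button = [[Button alloc] init];

  SomeClass<Button *> *specialized = [[SomeClass alloc] initWithView:button];
  Button *fromSpecialized = [specialized view];      // produces a "Button *"

  SomeClass *unspecialized = specialized;            // okay, dropping type arguments
  Button *fromUnspecialized = [unspecialized view];  // "__kindof View *": no cast needed

  return fromSpecialized == button && fromUnspecialized == button;
}
```

Without the __kindof behavior, the last assignment would need an explicit (Button *) cast, since a plain View * does not implicitly convert to a subclass pointer.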

That’s most of it! I’ll be happy to answer any questions. Actual coherent documentation is forthcoming, but we wanted to get this implementation out there for people to play with.

The patches are fairly large, since this is a nontrivial feature. I’ve broken it up into logical pieces, which are, roughly:

  1. Type parameter parsing, ASTs, etc.
  2. Type argument handling
  3. Substitution of type arguments for parameters when using specialized types
  4. Reworking our handling of the ternary operator on Objective-C pointer types
  5. Interaction between C++ templates and Objective-C lightweight generics
  6/7. Warnings for lightweight generics
  8. Kindof types
  9. Co- and contra-variant type parameters
  10. Annoying workaround for old Clang versions that we apparently still need to build with

Detailed patch review should go to cfe-commits, so we can keep the discussion here more high-level.

  - Doug

0001-Parsing-semantic-analysis-and-AST-for-Objective-C-ty.patch (101 KB)

0002-Handle-Objective-C-type-arguments.patch (125 KB)

0003-Substitute-type-arguments-into-uses-of-Objective-C-i.patch (115 KB)

0004-Improve-the-Objective-C-common-type-computation-used.patch (22.5 KB)

0005-C-support-for-Objective-C-lightweight-generics.patch (108 KB)

0006-Warn-when-an-Objective-C-collection-literal-element-.patch (8.66 KB)

0007-Warn-when-an-intended-Objective-C-specialization-was.patch (11.7 KB)

0008-Implement-the-Objective-C-__kindof-type-qualifier.patch (68.9 KB)

0009-Implement-variance-for-Objective-C-type-parameters.patch (32.8 KB)

0010-Factor-the-simpleTransform-visitor-so-that-it-is-not.patch (26.5 KB)

Hi Doug,

Would there be interest in a compilation mode that inserted run-time checks for these cases? We discussed previously doing it on every down cast and cast-from-id in Objective-C, but came to the conclusion that the false positive rate would be too high. For generics, I suspect that people either want to not break the rules, or will just use the non-generic version.

Adding an objc_assert_type(id, Class) runtime function (and variations for co/contravariant generic types) and a call on every parameter/return value for a type-generic method would be fairly simple and I’d be happy to put together a patch if this is something that would be of interest.

The patch series seems to be missing a __has_feature() flag to detect lightweight generics support.

David

Hi David,

> Hi Doug,
>
> Would there be interest in a compilation mode that inserted run-time checks for these cases? We discussed previously doing it on every down cast and cast-from-id in Objective-C, but came to the conclusion that the false positive rate would be too high. For generics, I suspect that people either want to not break the rules, or will just use the non-generic version.

My general impression here is that people will still want to break the rules, e.g., by continuing to use proxies that don't properly implement -isKindOfClass:, so I’ve not pursued this.

> Adding an objc_assert_type(id, Class) runtime function (and variations for co/contravariant generic types) and a call on every parameter/return value for a type-generic method would be fairly simple and I’d be happy to put together a patch if this is something that would be of interest.

I don’t know how much interest there is. This feature has been available in Apple’s compiler for about a month publicly, and for quite a while longer internally, and IIRC nobody has asked for additional runtime checking. Now, this may be an artifact of the presentation of this feature—we’ve tended to emphasize “no code generation changes” because we don’t want developers to fear that adopting this feature will break their existing Objective-C code.

There have, however, been numerous requests for better static checking, e.g., a warning for unspecialized -> specialized conversions (where one is adding type arguments that could, conceivably, be wrong). We think that the static-checking angle is better addressed by the static analyzer, where we can get a better handle on the type arguments as they flow through the system.

> The patch series seems to be missing a __has_feature() flag to detect lightweight generics support.

It’s __has_feature(objc_generics). It went in with the commit that introduced C++ support for this feature, since that’s more-or-less when the feature became complete.
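One way a header might use this check to stay compatible with older compilers is sketched below. MY_GENERIC and CountStrings are made-up names for illustration; only __has_feature(objc_generics) comes from the patches.

```objc
#import <Foundation/Foundation.h>

// Fall back gracefully on compilers that lack __has_feature entirely.
#ifndef __has_feature
#define __has_feature(x) 0
#endif

// MY_GENERIC is a hypothetical macro: it expands to a type argument list
// when lightweight generics are available, and to nothing otherwise.
#if __has_feature(objc_generics)
#define MY_GENERIC(...) <__VA_ARGS__>
#else
#define MY_GENERIC(...)
#endif

// Declared as NSArray<NSString *> * with generics, plain NSArray * without.
static NSUInteger CountStrings(NSArray MY_GENERIC(NSString *) *strings) {
  return [strings count];
}
```

The variadic macro is there so that multi-argument lists (e.g., a key type and an object type) pass through despite the comma.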

  - Doug

My thought was that it would behave in the same way as some of the other non-fatal errors in Objective-C, printing a message along the lines of “*** Cast from Foo to Bar disallowed by generics. Put breakpoint on objc_generics_whatever() to debug”. You’d turn this off in production, but when debugging (-fsanitize=objc-generics?) you’d be able to catch it early.
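A rough sketch of what such a check might boil down to is below. To be clear, this is hypothetical: no such function exists in any runtime today, the name objc_assert_type_sketch is invented here, and a real version would be a compiler-inserted runtime call rather than a static function.

```objc
#import <Foundation/Foundation.h>

// Hypothetical, non-fatal type check in the spirit of objc_assert_type(id, Class).
// Returns YES when the object is acceptable, NO (after logging) when it is not.
static BOOL objc_assert_type_sketch(id object, Class expected) {
  // nil is always acceptable; otherwise defer to -isKindOfClass:.
  if (object == nil || [object isKindOfClass:expected]) {
    return YES;
  }
  NSLog(@"*** Cast from %@ to %@ disallowed by generics.", [object class], expected);
  return NO;
}
```

Because the check relies on -isKindOfClass:, it would report failures for proxies that do not forward that method properly, which is the false-positive concern raised earlier in the thread.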

David

Yes, I absolutely agree that this could be useful for finding such problems. And perhaps my concerns about the prevalence of proxies not overriding -isKindOfClass: properly are overblown… only an experimental implementation can tell :)

  - Doug