Devirtualizing local methods (messages) in final classes in Objective-C
Local methods (messages passed to ‘self’) of a final class may be devirtualized safely. In Objective-C it is possible to define methods in a translation unit without declaring them in the interface in a header file (see localMethod and NoDecl_localMethod in the example). Local methods are usually implemented to abstract away some functionality and are only invoked via self.
It is possible to ‘devirtualize’ calls to local methods. Devirtualization is a compiler technique where a dynamic dispatch (message passed to an object) is converted to a direct function call when the compiler can determine the callee (for a specific callsite) at compile time.
In the example shown, calls to ‘NoDecl_localMethod’ and ‘localMethod’ may be safely devirtualized.
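//------------------------------------ Header file (a.h)
@interface MyObject : NSObject
- (void)publicMethod:(int)i idType:(id)ExtObject;
@end
//------------------------------------ Source file (a.mm)
@interface MyObject ()
- (void)localMethod; // declared only in the class extension
@end
@implementation MyObject
// no declaration required.
- (void)NoDecl_localMethod { /* ... */ }
- (void)localMethod { /* ... */ }
- (void)publicMethod:(int)i idType:(id)ExtObject {
  [self NoDecl_localMethod]; // devirtualize to MyObject::NoDecl_localMethod
  [self localMethod]; // devirtualize to MyObject::localMethod
  [ExtObject NoDecl_localMethod]; // don’t devirtualize
}
@end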
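To make the transformation concrete outside the Objective-C runtime, here is a minimal toy model in plain C++. It is only a sketch: Object, Imp, msgSend, MyObject_localMethod, callDynamic and callDirect are hypothetical names, the selector table merely stands in for the runtime's method lookup, and none of this is the real objc_msgSend or the proposed pass.

```cpp
#include <cassert>
#include <map>
#include <string>

// Toy model only: a "class" is a table from selector name to implementation.
struct Object;
using Imp = int (*)(Object *self);
struct Object {
  const std::map<std::string, Imp> *methods; // stands in for the class's method list
};

// Stand-in for objc_msgSend: look the selector up at run time, then call it.
int msgSend(Object *receiver, const std::string &selector) {
  assert(receiver && receiver->methods);
  return receiver->methods->at(selector)(receiver);
}

// Stand-in for the body of -[MyObject localMethod].
int MyObject_localMethod(Object *) { return 42; }

const std::map<std::string, Imp> kMyObjectMethods = {
    {"localMethod", MyObject_localMethod}};

// Before: the message to self goes through the run-time lookup.
int callDynamic(Object *self) { return msgSend(self, "localMethod"); }

// After devirtualization: the callee is known at compile time, so call it directly.
int callDirect(Object *self) { return MyObject_localMethod(self); }
```

Both functions return the same value for a MyObject instance; the direct version simply skips the lookup, which is the saving the optimization is after.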
Pass organization and algorithm:
Overview of the steps:
clang front-end
- Find all the local methods (methods which are defined only in the .m/.mm file).
- Set an attribute on local methods in the front end, something like Attribute::ObjCLocalMethod, to help identify them in LLVM IR.
llvm middle-end (IPO Module level pass)
- Get list of local methods in a module by iterating over the method list
- Get the method name (at an objc_msgSend callsite) by parsing the selector name and inspecting the global data structures set up for tracking the function pointer for each method.
- Devirtualize for only those callsites which are messages to ‘self’
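The per-callsite decision in the middle-end steps above could be sketched roughly as follows, in plain C++ with no LLVM dependency. This is an illustration under assumptions, not the actual pass: CallSite, LocalMethods and devirtualizedCallee are hypothetical names, the callsite summary is assumed to have been computed already, and only instance methods are modelled.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <utility>

// Hypothetical summary of one objc_msgSend callsite, as a pass might record it.
struct CallSite {
  std::string callerClass; // class of the method that contains the call
  std::string selector;    // selector name recovered at the callsite
  bool receiverIsSelf;     // true if the receiver is the caller's own self
};

// (class, selector) pairs that the front end marked as local methods.
using LocalMethods = std::set<std::pair<std::string, std::string>>;

// Returns the name of the direct callee, or an empty string to keep objc_msgSend.
std::string devirtualizedCallee(const CallSite &cs, const LocalMethods &locals) {
  if (!cs.receiverIsSelf)
    return ""; // receiver type unknown: leave the dynamic dispatch alone
  if (locals.count(std::make_pair(cs.callerClass, cs.selector)) == 0)
    return ""; // not a local method of the caller's class
  return "-[" + cs.callerClass + " " + cs.selector + "]";
}
```

With the example from earlier, a message to self with selector localMethod inside MyObject yields "-[MyObject localMethod]", while the same selector sent to ExtObject yields an empty string and stays a normal message send.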
How to track messages to self:
- Verify that the caller is a method (its name is prefixed with “\01”). This seems like a hack, but it is consistent with the clang front end’s translation of Objective-C declarations.
- For a method, the first argument is the pointer to self unless the method has StRet, in which case the struct-return pointer comes first and self follows it.
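As a rough illustration of the two checks above, the following C++ sketch works on a plain description of the caller rather than on real LLVM IR. CallerInfo, isObjCMethodName and selfArgIndex are hypothetical names, and the hasStRet flag is assumed to have been read off the function's first parameter beforehand.

```cpp
#include <cassert>
#include <string>

// Hypothetical description of the function that contains the callsite.
struct CallerInfo {
  std::string name; // IR-level name, e.g. "\01-[MyObject publicMethod:idType:]"
  bool hasStRet;    // true if the first parameter is a struct-return slot
};

// True if the name looks like an Objective-C method as emitted by clang:
// a '\01' byte followed by "-[" (instance method) or "+[" (class method).
bool isObjCMethodName(const std::string &name) {
  return name.size() > 3 && name[0] == '\01' &&
         (name[1] == '-' || name[1] == '+') && name[2] == '[';
}

// Index of the parameter that holds self, or -1 if the caller is not a method.
int selfArgIndex(const CallerInfo &caller) {
  if (!isObjCMethodName(caller.name))
    return -1;
  return caller.hasStRet ? 1 : 0; // the struct-return slot, if any, precedes self
}
```

A callsite then counts as a message to self only when its receiver operand is exactly the caller's parameter at that index.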
This is a performance optimization in the sense that it avoids a call to ‘objc_msgSend’. It also saves a little code size in some cases. It is possible to devirtualize even more messages for ‘final’ classes, but that would require more engineering effort, e.g., inferring the type of the receiver object statically, synthesizing the declaration of the callee in the caller’s translation unit, etc.
The current implementation only devirtualizes local messages passed to self, because type inference of the receiver object is not required and the declaration and definition are readily available in the same translation unit. Moreover, if we can guarantee (via a design decision) that a subclass does not override a local method of a parent class, then all local methods (invoked via self) can still be devirtualized with this approach.
I would like Objective-C experts to point out any gotchas with this approach or to give feedback for further improvement.
Final class was introduced in Objective-C in: https://reviews.llvm.org/D25993