Objective-C Message Send Generation

Hi Everyone,

I've started working on Objective-C code generation. The attached diff contains my initial work. I've added a few hooks into the code generation code for handling the Objective-C AST nodes.

I've created a CodeGenObjCRuntime abstract class which encapsulates the runtime-specific components and a CodeGenObjCGNU concrete subclass for the GNU runtime. Currently, this only handles message sends (and assumes an id return type, which needs fixing later). It's also completely unoptimised as yet, and does slightly silly things like look up the selector every message send, rather than caching it after the first lookup (I'm not sure if this should be done here or if a later pass should know that sel_get_uid is a constant function and do the caching as an optimisation).

I am trying to keep the CodeGenObjC* classes free from clang-specific classes because I intend to use them with a Smalltalk and Self compiler (which won't use the clang AST representations) as well.

David

P.S. I've not used C++ for almost five years and not looked at the llvm/clang sources before Sunday, so I expect lots of comments.

clang.diff (10.7 KB)

Hi David,

Hi Everyone,

I've started working on Objective-C code generation. The attached diff contains my initial work.

very good!

I've added a few hooks into the code generation code for handling the Objective-C AST nodes.

I've created a CodeGenObjCRuntime abstract class which encapsulates the runtime-specific components and a CodeGenObjCGNU concrete subclass for the GNU runtime.

That's good idea.

Few nitpicks:
- I'd shorten class names by replacing "CodeGen" with CG.
- Please use vertical spaces (empty lines) appropriately to make code easy to read.
- Stay within 80 cols.

Currently, this only handles message sends (and assumes an id return type, which needs fixing later). It's also completely unoptimised as yet, and does slightly silly things like look up the selector every message send, rather than caching it after the first lookup

OK.

(I'm not sure if this should be done here or if a later pass should know that sel_get_uid is a constant function and do the caching as an optimisation).

I am trying to keep the CodeGenObjC* classes free from clang-specific classes because I intend to use them with a Smalltalk and Self compiler (which won't use the clang AST representations) as well.

David

P.S. I've not used C++ for almost five years and not looked at the llvm/clang sources before Sunday, so I expect lots of comments.

+ //TODO: Make this selectable at runtime
+ ObjC = ObjCDefaultRuntime(this);
+}

Rename ObjC as Runtime and propagate it down stream (e.g. see how Builder is propagated)

+Value *CodeGenObjCGNU::generateMessageSend(llvm::LLVMFoldingBuilder &B,
+ llvm::Value * receiver,
+ std::string &selector,
+ llvm::Value** ArgV,
+ size_t ArgC) {

Consistently use capital letter, e.g. Receive and Selector.
Please use Builder instead of just B.

+ const llvm::Type *selType = llvm::PointerType::get((llvm::Type*)llvm::StructType::get(str2), 0);

Avoid (llvm::Type*) casts. Instead
    const llvm::Type *STy = llvm::StructType::get(str2);
    const llvm::Type *SelType = llvm::PointerType::get(STy, 0);

Be consistent in naming local variables, (str2, FT, Idx0, index_vector ..)

+ } else if (dyn_cast<ObjCClassDecl>(D)){
+ //Forward declaration. Only used for type checking.
+ } else if (dyn_cast<ObjCProtocolDecl>(D)){
+ // TODO: Generate Protocol object.
+ } else if (dyn_cast<ObjCCategoryDecl>(D)){
+ //Only used for typechecking.
+ } else if (dyn_cast<ObjCCategoryImplDecl>(D)){
+ // TODO: Generate methods, attach to class structure
+ } else if (dyn_cast<ObjCImplementationDecl>(D)){
+ // TODO: Generate methods, attach to class structure
+ } else if (dyn_cast<ObjCInterfaceDecl>(D)){
+ // TODO: Set up class structure
+ } else if (dyn_cast<ObjCMethodDecl>(D)){
+ // TODO: Emit method, add method pointer to class structure.

We want to use assert() or Assert1() to emit an explicit "not implemented ... " message instead of silent TODOs.

Overall, very good start.

Hi David,

I've added a few hooks into the code generation code for handling
the Objective-C AST nodes.

I've created a CodeGenObjCRuntime abstract class which encapsulates
the runtime-specific components and a CodeGenObjCGNU concrete
subclass for the GNU runtime.

That's good idea.

Few nitpicks:
- I'd shorten class names by replacing "CodeGen" with CG.

Done.

- Please use vertical spaces (empty lines) appropriately to make code
easy to read.

I've separated semantic blocks with some white space.

- Stay within 80 cols.

+ //TODO: Make this selectable at runtime
+ ObjC = ObjCDefaultRuntime(this);
+}

Rename ObjC as Runtime and propagate it down stream (e.g. see how
Builder is propagated)

Done, I think. Please check I have done the propagation correctly. I am not sure if it should be propagated via CodeGenFunction?

+Value *CodeGenObjCGNU::generateMessageSend(llvm::LLVMFoldingBuilder &B,
+ llvm::Value * receiver,
+ std::string &selector,
+ llvm::Value** ArgV,
+ size_t ArgC) {

Consistently use capital letter, e.g. Receive and Selector.
Please use Builder instead of just B.

Ooops. I switched to using the LLVM style part way through and didn't update everything. I think I've fixed that now.

+ const llvm::Type *selType =
llvm::PointerType::get((llvm::Type*)llvm::StructType::get(str2), 0);

Avoid (llvm::Type*) casts. Instead
   const llvm::Type *STy = llvm::StructType::get(str2);
   const llvm::Type *SelType = llvm::PointerType::get(STy, 0);

Be consistent in naming local variables, (str2, FT, Idx0,
index_vector ..)

That's what happens when you try to switch coding styles part way though...
Should be fixed now.

+ } else if (dyn_cast<ObjCClassDecl>(D)){
+ //Forward declaration. Only used for type checking.
+ } else if (dyn_cast<ObjCProtocolDecl>(D)){
+ // TODO: Generate Protocol object.
+ } else if (dyn_cast<ObjCCategoryDecl>(D)){
+ //Only used for typechecking.
+ } else if (dyn_cast<ObjCCategoryImplDecl>(D)){
+ // TODO: Generate methods, attach to class structure
+ } else if (dyn_cast<ObjCImplementationDecl>(D)){
+ // TODO: Generate methods, attach to class structure
+ } else if (dyn_cast<ObjCInterfaceDecl>(D)){
+ // TODO: Set up class structure
+ } else if (dyn_cast<ObjCMethodDecl>(D)){
+ // TODO: Emit method, add method pointer to class structure.

We want to use assert() or Assert1() to emit an explicit "not
implemented ... " message instead of silent TODOs.

I'll keep these in my local copy because they let me compile simple Objective-C programs without hitting aborts, and add the missing bits, but I've removed them from the diff. I've also removed the inclusion of iostream that I was using for debugging and shouldn't be there anymore.

David

clang.diff (11 KB)

Ok! Thanks again for working on this by the way.

Here are some more picky details :slight_smile:

--- CGObjCRuntime.h (revision 0)
+++ CGObjCRuntime.h (revision 0)
@@ -0,0 +1,46 @@

+#ifndef __CODEGENOBJC_H_INCLUDED__
+#define __CODEGENOBJC_H_INCLUDED__

Please use a name that isn't in the implementation namespace, for example CLANG_CODEGEN_OBCJRUNTIME_H like the other headers.

+#include "clang/AST/AST.h"
+#include "clang/AST/Expr.h"

I'll keep these in my local copy because they let me compile simple Objective-C programs without hitting aborts, and add the missing bits, but I've removed them from the diff.

Ok! Thanks again for working on this by the way.

Here are some more picky details :slight_smile:

Keep them coming - I'm still trying to remember how C++ works...

--- CGObjCRuntime.h (revision 0)
+++ CGObjCRuntime.h (revision 0)
@@ -0,0 +1,46 @@

+#ifndef __CODEGENOBJC_H_INCLUDED__
+#define __CODEGENOBJC_H_INCLUDED__

Please use a name that isn't in the implementation namespace, for example CLANG_CODEGEN_OBCJRUNTIME_H like the other headers.

Fixed.

+#include "clang/AST/AST.h"
+#include "clang/AST/Expr.h"
+
+#include "llvm/Constants.h"
+#include "llvm/Function.h"
+#include "llvm/GlobalVariable.h"
+#include "llvm/Intrinsics.h"
+#include "llvm/Support/Compiler.h"
+#include "llvm/Support/LLVMBuilder.h"
+#include "llvm/Module.h"

Please don't #include headers redundantly (Module.h pulls in function.h, for example). Also, please use forward definitions of classes to avoid #including as much as possible. In this case, you should be able to fwd declare all the llvm classes you use.

I think that's fixed now. Let me know if I've missed any.

+using llvm::Value;

Headers shouldn't have "using", this pollutes the namespace of the file that includes it.

Ooops. Fixed.

+ //TODO: Make this selectable at runtime
+ Runtime = ObjCDefaultRuntime(M);
+}

This object needs to be deleted in ~CodeGenModule().

This destructor didn't seem to exist. I've created it, but let me know if I've put it in the wrong place...

+ Function *SelFunction = TheModule.getFunction("sel_get_uid");
+ // If we haven't got the selector lookup function, look it up now.
+ // TODO: Factor this out and use it to implement @selector() too.
+ if(SelFunction == 0) {
+ std::vector<const llvm::Type*> Str(1,
....
+ }

This code can be simplified through the use of the Module::getOrInsertFunction method. Something like this should work:

PtrToInt8Ty = llvm::PointerType::getUnqual(llvm::Type::Int8Ty);
llvm::Constant *SelFunction = TheModule.getOrInsertFunction("sel_get_uid", PtrToInt8Ty, NULL);

A similar approach can simplify creation of the objc_msg_lookup function.

Ah, I thought there was probably a simpler way of doing this (not quite like that - the return type should be a selector). Since that does implicit type coercion of the function it will also make it easier to handle the return types.

+ // Call the method
+ for(size_t i=0 ; i<ArgC ; i++) {
+ lookupArgs.push_back(ArgV[i]);
+ }

This loop can be replaced with:

lookupArgs.insert(lookupArgs.end(), ArgV, ArgV+ArgC);

Ah yes. Fixed.

Here's a new version of the diff. I've also tidied up a few of the type definitions. Let me know if I've missed anything important...

David

clang.diff (11.1 KB)

And here's another one that handles non-id returns correctly.

David

clang.diff (11.4 KB)

Very nice, I applied it with some minor changes:
http://lists.cs.uiuc.edu/pipermail/cfe-commits/Week-of-Mon-20080225/004502.html
http://lists.cs.uiuc.edu/pipermail/cfe-commits/Week-of-Mon-20080225/004503.html

The main changes were to reduce #includage a little bit, avoid some temporary vectors, add some vertical whitespace, and renamed the creation function to CreateObjCRuntime. I also added it to the xcode project, added a virtual dtor to CGObjCRuntime (to silence warnings) and some other minor stuff.

Thanks David!

-Chris