Objective-C: methods, ivars, and cleanup

Hi,

This is a fairly big diff, so I've tried to split it into a few separate parts.

globals.diff is a simple addition to the global_ctors code which avoids emitting empty global_ctors.

etoile_runtime.diff contains a subclass of CGObjCRuntime which implements the runtime-specific parts for the Étoilé runtime. Other parts do not depend on this unless output for this runtime is selected (the runtime to use is currently hard-coded). I am now writing the GNU and Étoilé versions at the same time to make sure the interfaces are sufficiently abstract. Hopefully this will make adding support for the Apple runtimes easy for whoever gets that job.

obj_struct.diff handles conversion from Objective-C pointers to LLVM types. This is needed for accessing instance variables statically and is used in the ivar access parts. This patch is also self-contained, and does not depend on any other changes.

runtime_if.diff contains changes to the runtime interface. This probably needs to be committed at the same time as the changes to the GNU runtime, or there might be some conflicts. The Étoilé patch may also conflict with the old definitions if it is committed before this.

gnu_runtime.diff contains the changes made to the GNU runtime to adopt the new interface. It also adds methods for generating method functions and some things related to creating class structures which are not used yet.

method_gen.diff contains the code required to generate Objective-C methods (although it does not yet attach them to class structures.

Finally, ivar.diff contains the code required for accessing and modifying Objective-C instance variables. This depends on the stuff in objc_struct.diff for all runtimes that do not use late-bound instance variable addresses (i.e. all of the currently supported ones).

Have fun,

David

globals.diff (3.64 KB)

gnu_runtime.diff (14.4 KB)

ivar.diff (5.99 KB)

method_gen.diff (6.82 KB)

obj_struct.diff (3.97 KB)

runtime_if.diff (3.57 KB)

etoile_runtime.diff (9.28 KB)

Hi David,

Thanks for splitting patch into separate diffs! It helps.

Hi,

This is a fairly big diff, so I've tried to split it into a few separate parts.

globals.diff is a simple addition to the global_ctors code which avoids emitting empty global_ctors.

- // Populate the array
- std::vector<llvm::Constant*> CtorValues;
- llvm::Constant *MagicNumber =
- llvm::ConstantInt::get(llvm::Type::Int32Ty, 65535, false);
- std::vector<llvm::Constant*> StructValues;
- for (std::vector<llvm::Constant*>::iterator I = GlobalCtors.begin(),
- E = GlobalCtors.end(); I != E; ++I) {
- StructValues.clear();
- StructValues.push_back(MagicNumber);
- StructValues.push_back(*I);
+ // Populate the array
+ std::vector<llvm::Constant*> CtorValues;
+ llvm::Constant *MagicNumber = llvm::ConstantInt::get(llvm::IntegerType::Int32Ty,
+ 65535,
+ false);
+ for (std::vector<llvm::Constant*>::iterator I = GlobalCtors.begin(),
+ E = GlobalCtors.end(); I != E; ++I) {
+ std::vector<llvm::Constant*> StructValues;
+ StructValues.push_back(MagicNumber);

Hoist StructValues outside the for loop and use clear(), as it was done originally. Looks OK otherwise.

etoile_runtime.diff contains a subclass of CGObjCRuntime which implements the runtime-specific parts for the Étoilé runtime. Other parts do not depend on this unless output for this runtime is selected (the runtime to use is currently hard-coded). I am now writing the GNU and Étoilé versions at the same time to make sure the interfaces are sufficiently abstract. Hopefully this will make adding support for the Apple runtimes easy for whoever gets that job.

OK. Few nit-picks

1) Add doxygen style comments for class methods and variables. Right now, I don't see any comment.

2)
+#define SET(structure, index, value) do {\
+ llvm::Value *element_ptr = Builder.CreateStructGEP(structure, index);\
+ Builder.CreateStore(value, element_ptr);} while(0)

Avoid macros like this.

3)
+// BIG FAT WARNING: Much of this code will need factoring out later.

Just use, FIXME: Much of this code ...

4) Now, I regularly see one-parameter-per-line in your patches. For example,

+ // Slot type
+ SlotTy = llvm::StructType::get(IntTy,
+ IMPTy,
+ PtrToInt8Ty,
+ llvm::Type::Int32Ty,
+ 0);

I'd avoid this myself. Clang coding style does not require this (nor explicitly prohibits this).

obj_struct.diff handles conversion from Objective-C pointers to LLVM types. This is needed for accessing instance variables statically and is used in the ivar access parts. This patch is also self-contained, and does not depend on any other changes.

+ void CollectIvarTypes(ObjCInterfaceDecl *ObjCClass,
+ std::vector<const llvm::Type*> *IvarTypes);

This is Objective-C specific class method, so rename it as CollectObjCIvarTypes.
Prefer SmallVector instead of std::vector. Pass it as a reference instead of a pointer.

    /// getLLVMFieldNo - Return llvm::StructType element number
    /// that corresponds to the field FD.
    unsigned getLLVMFieldNo(const FieldDecl *FD);
+ unsigned getLLVMFieldNo(QualType ObjectTy, const ObjCIvarDecl *Decl);

Why do you need new method here ?

+ // Warning: Use of this is strongly discouraged. Late binding of instance
+ // variables is supported on some runtimes and so using static binding can
+ // break code when libraries are updated. Only use this if you have
+ // previously checked that the ObjCRuntime subclass in use does not support
+ // late-bound ivars.

Please add assert() to check late-bound of ivars support.

runtime_if.diff contains changes to the runtime interface. This probably needs to be committed at the same time as the changes to the GNU runtime, or there might be some conflicts. The Étoilé patch may also conflict with the old definitions if it is committed before this.

OK

-CGObjCRuntime *CreateObjCRuntime(llvm::Module &M);
+CGObjCRuntime *CreateObjCRuntime(llvm::Module &M,
+ const llvm::Type *LLVMIntType,
+ const llvm::Type *LLVMLongType);

Is not it possible to derive LLVMIntType and LLVMLongType info based on Module ?

gnu_runtime.diff contains the changes made to the GNU runtime to adopt the new interface. It also adds methods for generating method functions and some things related to creating class structures which are not used yet.

OK. Please add doxygen style comments for each method.

method_gen.diff contains the code required to generate Objective-C methods (although it does not yet attach them to class structures.

Index: lib/CodeGen/CGStmt.cpp

Hi,

This is a fairly big diff, so I've tried to split it into a few separate parts.

Great, thanks!

globals.diff is a simple addition to the global_ctors code which avoids emitting empty global_ctors.

A couple things here:

Instead of nesting the entire body of the function, please just use:

if (GlobalCtors.empty()) return;

Second, you're implicitly reverting a cleanup fix:

- llvm::StructType* CtorStructTy =
- llvm::StructType::get(llvm::Type::Int32Ty, FPType, NULL);

is better than:

+ std::vector<const llvm::Type*> CtorFields;
+ CtorFields.push_back(llvm::IntegerType::get(32));
+ // i32, function type pair
+ CtorFields.push_back(llvm::PointerType::getUnqual(CtorFuncTy));
+ llvm::StructType* CtorStructTy = llvm::StructType::get(CtorFields, false);

Likewise the fix to inline GlobalCtorsVar and several other things.

Finally, please make sure the code fits in 80 columns.

etoile_runtime.diff contains a subclass of CGObjCRuntime which implements the runtime-specific parts for the Étoilé runtime. Other parts do not depend on this unless output for this runtime is selected (the runtime to use is currently hard-coded). I am now writing the GNU and Étoilé versions at the same time to make sure the interfaces are sufficiently abstract. Hopefully this will make adding support for the Apple runtimes easy for whoever gets that job.

Ok. I'll let you worry about the correctness of this code :). Please make sure you follow coding conventions though. For example, don't include commented out code:

+/*
+clang::CodeGen::CGObjCRuntime *clang::CodeGen::CreateObjCRuntime(
+ llvm::Module &M,
+ const llvm::Type *LLVMIntType,
+ const llvm::Type *LLVMLongType) {
+ return new CGObjCEtoile(M, LLVMIntType, LLVMLongType);
+}
+*/

Please fit in 80 cols, please don't use macros:

+#define SET(structure, index, value) do {\
+ llvm::Value *element_ptr = Builder.CreateStructGEP(structure, index);\
+ Builder.CreateStore(value, element_ptr);} while(0)

(use a static function instead)

Stuff like this won't work on 64-bit platforms:

   llvm::Constant *SelFunction =
+ TheModule.getOrInsertFunction("lookup_typed_selector",
+ SelectorTy,
+ PtrToInt8Ty,
+ 0);

Hi,

This is a fairly big diff, so I've tried to split it into a few separate parts.

globals.diff is a simple addition to the global_ctors code which avoids emitting empty global_ctors.

- // Populate the array
- std::vector<llvm::Constant*> CtorValues;
- llvm::Constant *MagicNumber =
- llvm::ConstantInt::get(llvm::Type::Int32Ty, 65535, false);
- std::vector<llvm::Constant*> StructValues;
- for (std::vector<llvm::Constant*>::iterator I = GlobalCtors.begin(),
- E = GlobalCtors.end(); I != E; ++I) {
- StructValues.clear();
- StructValues.push_back(MagicNumber);
- StructValues.push_back(*I);
+ // Populate the array
+ std::vector<llvm::Constant*> CtorValues;
+ llvm::Constant *MagicNumber = llvm::ConstantInt::get(llvm::IntegerType::Int32Ty,
+ 65535,
+ false);
+ for (std::vector<llvm::Constant*>::iterator I = GlobalCtors.begin(),
+ E = GlobalCtors.end(); I != E; ++I) {
+ std::vector<llvm::Constant*> StructValues;
+ StructValues.push_back(MagicNumber);

Hoist StructValues outside the for loop and use clear(), as it was done originally. Looks OK otherwise.

Fixed.

etoile_runtime.diff contains a subclass of CGObjCRuntime which implements the runtime-specific parts for the Étoilé runtime. Other parts do not depend on this unless output for this runtime is selected (the runtime to use is currently hard-coded). I am now writing the GNU and Étoilé versions at the same time to make sure the interfaces are sufficiently abstract. Hopefully this will make adding support for the Apple runtimes easy for whoever gets that job.

OK. Few nit-picks

1) Add doxygen style comments for class methods and variables. Right now, I don't see any comment.

Done.

2)
+#define SET(structure, index, value) do {\
+ llvm::Value *element_ptr = Builder.CreateStructGEP(structure, index);\
+ Builder.CreateStore(value, element_ptr);} while(0)

Avoid macros like this.

Can you suggest an alternative? I use this pattern in a really large number of places (and will use it in a lot more by the time I'm finished. Expanding it everywhere makes the code a lot less readable and is likely to cause copy-and-paste bugs.

3)
+// BIG FAT WARNING: Much of this code will need factoring out later.

Just use, FIXME: Much of this code ...

Done.

4) Now, I regularly see one-parameter-per-line in your patches. For example,

+ // Slot type
+ SlotTy = llvm::StructType::get(IntTy,
+ IMPTy,
+ PtrToInt8Ty,
+ llvm::Type::Int32Ty,
+ 0);

I'd avoid this myself. Clang coding style does not require this (nor explicitly prohibits this).

I've only done this where putting them all on the same line would make the line over 80 characters long. I find it more readable than just wrapping at 80 characters.

obj_struct.diff handles conversion from Objective-C pointers to LLVM types. This is needed for accessing instance variables statically and is used in the ivar access parts. This patch is also self-contained, and does not depend on any other changes.

+ void CollectIvarTypes(ObjCInterfaceDecl *ObjCClass,
+ std::vector<const llvm::Type*> *IvarTypes);

This is Objective-C specific class method, so rename it as CollectObjCIvarTypes.
Prefer SmallVector instead of std::vector. Pass it as a reference instead of a pointer.

Fixed passing by pointer. Kept as std::vector because StructType::get() takes a std::vector.

  /// getLLVMFieldNo - Return llvm::StructType element number
  /// that corresponds to the field FD.
  unsigned getLLVMFieldNo(const FieldDecl *FD);
+ unsigned getLLVMFieldNo(QualType ObjectTy, const ObjCIvarDecl *Decl);

Why do you need new method here ?

This gets the field number from an LLVM structure created from an Objective-C class, rather than one created from a C structure. Possibly it should have a different name? Suggestions welcome.

+ // Warning: Use of this is strongly discouraged. Late binding of instance
+ // variables is supported on some runtimes and so using static binding can
+ // break code when libraries are updated. Only use this if you have
+ // previously checked that the ObjCRuntime subclass in use does not support
+ // late-bound ivars.

Please add assert() to check late-bound of ivars support.

I'd like to do this, but GodeGenTypes does not have a reference to anything that has a reference to anything that has a reference to the ObjCRuntime subclass, so it's non-trivial. Suggestions welcome.

runtime_if.diff contains changes to the runtime interface. This probably needs to be committed at the same time as the changes to the GNU runtime, or there might be some conflicts. The Étoilé patch may also conflict with the old definitions if it is committed before this.

OK

-CGObjCRuntime *CreateObjCRuntime(llvm::Module &M);
+CGObjCRuntime *CreateObjCRuntime(llvm::Module &M,
+ const llvm::Type *LLVMIntType,
+ const llvm::Type *LLVMLongType);

Is not it possible to derive LLVMIntType and LLVMLongType info based on Module ?

Maybe? Can I have a hint please?

gnu_runtime.diff contains the changes made to the GNU runtime to adopt the new interface. It also adds methods for generating method functions and some things related to creating class structures which are not used yet.

OK. Please add doxygen style comments for each method.

Is three slashes right for Doxygen?

method_gen.diff contains the code required to generate Objective-C methods (although it does not yet attach them to class structures.

Index: lib/CodeGen/CGStmt.cpp

--- lib/CodeGen/CGStmt.cpp (revision 48734)
+++ lib/CodeGen/CGStmt.cpp (working copy)
@@ -332,9 +332,6 @@
  // Emit the result value, even if unused, to evalute the side effects.
  const Expr *RV = S.getRetValue();

- QualType FnRetTy = CurFuncDecl->getType().getCanonicalType();
- FnRetTy = cast<FunctionType>(FnRetTy)->getResultType();
-
  if (FnRetTy->isVoidType()) {
    // If the function returns void, emit ret void.
    Builder.CreateRetVoid();

Is this intentional ?

Yes. These have been moved to instance methods variables and created earlier because they can not be inferred in the same way for Objective-C methods as for C functions.

+ // Skip the hidden arguments.
+ ++AI; ++AI;

Name two hidden arguments in the comment.

Done.

void CodeGenFunction::GenerateCode(const FunctionDecl *FD) {
  LLVMIntTy = ConvertType(getContext().IntTy);
  LLVMPointerWidth = static_cast<unsigned>(
    getContext().getTypeSize(getContext().getPointerType(getContext().VoidTy)));

  CurFuncDecl = FD;
+ FnRetTy = CurFuncDecl->getType().getCanonicalType();
+ FnRetTy = cast<FunctionType>(FnRetTy)->getResultType();
+

Is this intentional ?

Yes. See above. These are set in GenerateCode and GenerateObjCMethod now, so that they are the only two methods that need to know whether a C function or an Objective-C method is being generated.

Finally, ivar.diff contains the code required for accessing and modifying Objective-C instance variables. This depends on the stuff in objc_struct.diff for all runtimes that do not use late-bound instance variable addresses (i.e. all of the currently supported ones).

@@ -341,6 +345,17 @@
      return LValue::MakeAddr(CGM.GetAddrOfGlobalVar(VD, false));
    else {
      llvm::Value *V = LocalDeclMap[D];
+ // Check if it's an implicit argument
+ if (!V) {
+ std::string VarName(D->getName());
+ llvm::Function::arg_iterator AI = CurFn->arg_begin();
+ while(AI) {
+ if(VarName == AI->getName()) {
+ return LValue::MakeAddr(AI);
+ }
+ }
+ }
      assert(V && "BlockVarDecl not entered in LocalDeclMap?");
      return LValue::MakeAddr(V);
    }

I'm getting dense here. Please explain what is "implicit argument" in this code? Thanks!

In Objective-C, a method is declared like this:

- (id) aMethodTakingAnInteger:(int)anInt andAnObject:(id)anObject

This is then converted to a C function like this:

id method(id self, SEL _cmd, int anInt, id anObject)

The two first arguments are implicit - they are not specified explicitly in the code, but are inserted in both the caller and callee by the compiler. They are not always self and _cmd (they are in the GNU and Apple runtimes, but not in the Étoilé one).

globals.diff is a simple addition to the global_ctors code which avoids emitting empty global_ctors.

A couple things here:

Instead of nesting the entire body of the function, please just use:

if (GlobalCtors.empty()) return;

Done.

Second, you're implicitly reverting a cleanup fix:

- llvm::StructType* CtorStructTy =
- llvm::StructType::get(llvm::Type::Int32Ty, FPType, NULL);

is better than:

+ std::vector<const llvm::Type*> CtorFields;
+ CtorFields.push_back(llvm::IntegerType::get(32));
+ // i32, function type pair
+ CtorFields.push_back(llvm::PointerType::getUnqual(CtorFuncTy));
+ llvm::StructType* CtorStructTy = llvm::StructType::get(CtorFields, false);

Likewise the fix to inline GlobalCtorsVar and several other things.

Finally, please make sure the code fits in 80 columns.

Ooops. svn said it was a conflict when I updated and I thought it was just being wrong. I've reverted this to the version in svn and reapplied the fixes and now it is more sensible.

etoile_runtime.diff contains a subclass of CGObjCRuntime which implements the runtime-specific parts for the Étoilé runtime. Other parts do not depend on this unless output for this runtime is selected (the runtime to use is currently hard-coded). I am now writing the GNU and Étoilé versions at the same time to make sure the interfaces are sufficiently abstract. Hopefully this will make adding support for the Apple runtimes easy for whoever gets that job.

Ok. I'll let you worry about the correctness of this code :). Please make sure you follow coding conventions though. For example, don't include commented out code:

+/*
+clang::CodeGen::CGObjCRuntime *clang::CodeGen::CreateObjCRuntime(
+ llvm::Module &M,
+ const llvm::Type *LLVMIntType,
+ const llvm::Type *LLVMLongType) {
+ return new CGObjCEtoile(M, LLVMIntType, LLVMLongType);
+}
+*/

Ooops. I just have that for testing. At some point I need to add a command-line option for selecting the runtime so I don't need a recompile every time I want to switch. Want to give me any hints about which bits of code I need to look at to do this?

Please fit in 80 cols, please don't use macros:

+#define SET(structure, index, value) do {\
+ llvm::Value *element_ptr = Builder.CreateStructGEP(structure, index);\
+ Builder.CreateStore(value, element_ptr);} while(0)

(use a static function instead)

Fixed.

Stuff like this won't work on 64-bit platforms:

llvm::Constant *SelFunction =
+ TheModule.getOrInsertFunction("lookup_typed_selector",
+ SelectorTy,
+ PtrToInt8Ty,
+ 0);
+

Use NULL instead of 0. Likewise for StructType::get and other variadic functions.

Fixed everywhere I spotted it. Let me know if I've missed any...

+#include <stdarg.h>

Ooops. That's left over from a helper function that's now in LLVM.

I don't think you need this, but if you do, please use <cstdarg> instead.

+//===------- CGObjCEtoile.cpp - Emit LLVM Code from ASTs for a Module --------===//

This should follow the formating of the other files more closely.

Slightly confused - I copied the formatting directly from one of the other files...

The attached diff contains all of the changes mentioned in this email, and a few minor cleanups, but no completely new code.

David

clang.diff (45.6 KB)

+ void CollectIvarTypes(ObjCInterfaceDecl *ObjCClass,
+ std::vector<const llvm::Type*> *IvarTypes);

This is Objective-C specific class method, so rename it as CollectObjCIvarTypes.
Prefer SmallVector instead of std::vector. Pass it as a reference instead of a pointer.

Fixed passing by pointer. Kept as std::vector because StructType::get() takes a std::vector.

/// getLLVMFieldNo - Return llvm::StructType element number
/// that corresponds to the field FD.
unsigned getLLVMFieldNo(const FieldDecl *FD);
+ unsigned getLLVMFieldNo(QualType ObjectTy, const ObjCIvarDecl *Decl);

Why do you need new method here ?

This gets the field number from an LLVM structure created from an Objective-C class, rather than one created from a C structure. Possibly it should have a different name? Suggestions welcome.

In this method you do a field lookup by iterating over all ObjectTy ivars. Would not it be more efficient to keep a map just like normal C struct fields ?

+ // Warning: Use of this is strongly discouraged. Late binding of instance
+ // variables is supported on some runtimes and so using static binding can
+ // break code when libraries are updated. Only use this if you have
+ // previously checked that the ObjCRuntime subclass in use does not support
+ // late-bound ivars.

Please add assert() to check late-bound of ivars support.

I'd like to do this, but GodeGenTypes does not have a reference to anything that has a reference to anything that has a reference to the ObjCRuntime subclass, so it's non-trivial. Suggestions welcome.

You may want to collect runtime info while constructing CodeGenTypes.

runtime_if.diff contains changes to the runtime interface. This probably needs to be committed at the same time as the changes to the GNU runtime, or there might be some conflicts. The Étoilé patch may also conflict with the old definitions if it is committed before this.

OK

-CGObjCRuntime *CreateObjCRuntime(llvm::Module &M);
+CGObjCRuntime *CreateObjCRuntime(llvm::Module &M,
+ const llvm::Type *LLVMIntType,
+ const llvm::Type *LLVMLongType);

Is not it possible to derive LLVMIntType and LLVMLongType info based on Module ?

Maybe? Can I have a hint please?

IIUC, these are target specific int and long types. In that case, TargetInfo should provide you this info. I guess this OK for now.

Is three slashes right for Doxygen?

yes.

Index: lib/CodeGen/CGStmt.cpp

--- lib/CodeGen/CGStmt.cpp (revision 48734)
+++ lib/CodeGen/CGStmt.cpp (working copy)
@@ -332,9 +332,6 @@
// Emit the result value, even if unused, to evalute the side effects.
const Expr *RV = S.getRetValue();

- QualType FnRetTy = CurFuncDecl->getType().getCanonicalType();
- FnRetTy = cast<FunctionType>(FnRetTy)->getResultType();
-
if (FnRetTy->isVoidType()) {
   // If the function returns void, emit ret void.
   Builder.CreateRetVoid();

Is this intentional ?

Yes. These have been moved to instance methods variables and created earlier because they can not be inferred in the same way for Objective-C methods as for C functions.

Aha.. ok

Finally, ivar.diff contains the code required for accessing and modifying Objective-C instance variables. This depends on the stuff in objc_struct.diff for all runtimes that do not use late-bound instance variable addresses (i.e. all of the currently supported ones).

@@ -341,6 +345,17 @@
     return LValue::MakeAddr(CGM.GetAddrOfGlobalVar(VD, false));
   else {
     llvm::Value *V = LocalDeclMap[D];
+ // Check if it's an implicit argument
+ if (!V) {
+ std::string VarName(D->getName());
+ llvm::Function::arg_iterator AI = CurFn->arg_begin();
+ while(AI) {
+ if(VarName == AI->getName()) {
+ return LValue::MakeAddr(AI);
+ }
+ }
+ }
     assert(V && "BlockVarDecl not entered in LocalDeclMap?");
     return LValue::MakeAddr(V);
   }

I'm getting dense here. Please explain what is "implicit argument" in this code? Thanks!

In Objective-C, a method is declared like this:

- (id) aMethodTakingAnInteger:(int)anInt andAnObject:(id)anObject

This is then converted to a C function like this:

id method(id self, SEL _cmd, int anInt, id anObject)

The two first arguments are implicit - they are not specified explicitly in the code, but are inserted in both the caller and callee by the compiler. They are not always self and _cmd (they are in the GNU and Apple runtimes, but not in the Étoilé one).

Thanks! Make sure while(AI) is not an infinite loop.

+ void CollectIvarTypes(ObjCInterfaceDecl *ObjCClass,
+ std::vector<const llvm::Type*> *IvarTypes);

This is Objective-C specific class method, so rename it as
CollectObjCIvarTypes.
Prefer SmallVector instead of std::vector. Pass it as a reference
instead of a pointer.

Fixed passing by pointer. Kept as std::vector because
StructType::get() takes a std::vector.

/// getLLVMFieldNo - Return llvm::StructType element number
/// that corresponds to the field FD.
unsigned getLLVMFieldNo(const FieldDecl *FD);
+ unsigned getLLVMFieldNo(QualType ObjectTy, const ObjCIvarDecl
*Decl);

Why do you need new method here ?

This gets the field number from an LLVM structure created from an
Objective-C class, rather than one created from a C structure.
Possibly it should have a different name? Suggestions welcome.

In this method you do a field lookup by iterating over all ObjectTy
ivars. Would not it be more efficient to keep a map just like normal C
struct fields ?

You'd need to generate the map at some point, possibly the first time it's called? Since it works as-is, I'd rather do this later if profiling indicates this is a bottleneck.

+ // Warning: Use of this is strongly discouraged. Late binding
of instance
+ // variables is supported on some runtimes and so using static
binding can
+ // break code when libraries are updated. Only use this if
you have
+ // previously checked that the ObjCRuntime subclass in use
does not support
+ // late-bound ivars.

Please add assert() to check late-bound of ivars support.

I'd like to do this, but GodeGenTypes does not have a reference to
anything that has a reference to anything that has a reference to
the ObjCRuntime subclass, so it's non-trivial. Suggestions welcome.

You may want to collect runtime info while constructing CodeGenTypes.

Doing this would require major changes to the structure of CodeGenModule, since its Types member would have to be initialised after the runtime was constructed. It's also not quite semantically correct - you should still be able to get the static structure of the object using @defs() at compile time, and the compiler should emit a warning that you are introducing stronger constraints on the ABI that the runtime requires (and thus susceptible to the fragile base class problem).

runtime_if.diff contains changes to the runtime interface. This
probably needs to be committed at the same time as the changes to
the GNU runtime, or there might be some conflicts. The Étoilé
patch may also conflict with the old definitions if it is
committed before this.

OK

-CGObjCRuntime *CreateObjCRuntime(llvm::Module &M);
+CGObjCRuntime *CreateObjCRuntime(llvm::Module &M,
+ const llvm::Type *LLVMIntType,
+ const llvm::Type *LLVMLongType);

Is not it possible to derive LLVMIntType and LLVMLongType info
based on Module ?

Maybe? Can I have a hint please?

IIUC, these are target specific int and long types. In that case,
TargetInfo should provide you this info. I guess this OK for now.

This whole interface will probably be tweaked a few more times before it's finalised, but since it's only called in one place it's easy to fix later.

Thanks! Make sure while(AI) is not an infinite loop.

Well spotted. It is now not an infinite loop...

David

clang.diff (45.1 KB)

In this method you do a field lookup by iterating over all ObjectTy
ivars. Would not it be more efficient to keep a map just like normal C
struct fields ?

You'd need to generate the map at some point, possibly the first time it's called?

The map is collected for C structure while determining its layout (in ConvertType()).

Since it works as-is, I'd rather do this later if profiling indicates this is a bottleneck.

A class is defined once but its fields are referenced multiple times, so it makes sense to avoid lookups during field access. That's what we do for C structures. Let's not diverge here.

+ assert(!verifyFunction(*CurFn));

nit-pick, add a message here.

+ // Warning: Use of this is strongly discouraged. Late binding
of instance
+ // variables is supported on some runtimes and so using static
binding can
+ // break code when libraries are updated. Only use this if
you have
+ // previously checked that the ObjCRuntime subclass in use
does not support
+ // late-bound ivars.

Please add assert() to check late-bound of ivars support.

I'd like to do this, but GodeGenTypes does not have a reference to
anything that has a reference to anything that has a reference to
the ObjCRuntime subclass, so it's non-trivial. Suggestions welcome.

You may want to collect runtime info while constructing CodeGenTypes.

Doing this would require major changes to the structure of CodeGenModule, since its Types member would have to be initialised after the runtime was constructed. It's also not quite semantically correct - you should still be able to get the static structure of the object using @defs() at compile time, and the compiler should emit a warning that you are introducing stronger constraints on the ABI that the runtime requires (and thus susceptible to the fragile base class problem).

hmm.. OK for now.

Otherwise patch looks OK. Very nice!
Thanks,

In this method you do a field lookup by iterating over all ObjectTy
ivars. Would not it be more efficient to keep a map just like normal C
struct fields ?

You'd need to generate the map at some point, possibly the first time it's called?

The map is collected for C structure while determining its layout (in ConvertType()).

Since it works as-is, I'd rather do this later if profiling indicates this is a bottleneck.

A class is defined once but its fields are referenced multiple times, so it makes sense to avoid lookups during field access. That's what we do for C structures. Let's not diverge here.

I have added a map from decls to field numbers which is filled in when the LLVM type of the object is generated. This isn't quite ideal, since subclasses have the same ivars at the same locations and this is not taken into account, but is simpler (and might be faster) than calculating the offset every lookup.

+ assert(!verifyFunction(*CurFn));

nit-pick, add a message here.

That code was copied-and-pasted from the GenerateCode method used for generating functions. For completeness I've added a message in to where I originally copied it from as well.

David

clang.diff (45.6 KB)

Great!. I'll apply this later today (or go ahead if you've commit access).

+ assert(!verifyFunction(*CurFn));

nit-pick, add a message here.

That code was copied-and-pasted from the GenerateCode method used for generating functions. For completeness I've added a message in to where I originally copied it from as well.

Hi David,

Some more minor nit-picky stuff (this is why small patches are better than big ones ;-):

+++ lib/CodeGen/CGObjCGNU.cpp (working copy)

+#include <stdarg.h>

Use <cstdarg> and move it to the end of the #include list if it is needed.

+ llvm::Value *GenerateIvarList(llvm::LLVMFoldingBuilder &Builder,
+ std::vector<llvm::Constant*> MethodNames,
+ std::vector<llvm::Constant*> MethodTypes,
+ std::vector<llvm::Constant*> MethodIMPs);

You're passing these vectors by value, I'd suggest passing them by (const) reference unless you really really want a copy made. likewise in a few other places.

+ // Get the selector Type.
+ std::vector<const llvm::Type*> Str2(2, PtrToInt8Ty);
+ const llvm::Type *SelStructTy = llvm::StructType::get(Str2);

Please use the variadic version of STructType::get to avoid the vector.

+ // C string type. Used in lots of places.
+ PtrToInt8Ty =
+ llvm::PointerType::getUnqual(llvm::Type::Int8Ty);
+ // Get the selector Type.
+ std::vector<const llvm::Type*> Str2(2, PtrToInt8Ty);
+ const llvm::Type *SelStructTy = llvm::StructType::get(Str2);
+ SelectorTy = llvm::PointerType::getUnqual(SelStructTy);
+ PtrToIntTy = llvm::PointerType::getUnqual(IntTy);
+ PtrTy = llvm::PointerType::getUnqual(llvm::Type::Int8Ty);

You call PointerType::getUnqual(llvm::Type::Int8Ty) twice here, please eliminate one.

+ // Object type
+ llvm::OpaqueType *OpaqueObjTy = llvm::OpaqueType::get();
+ llvm::Type *OpaqueIdTy = llvm::PointerType::getUnqual(OpaqueObjTy);
+ IdTy = llvm::PointerType::getUnqual(llvm::StructType::get(OpaqueIdTy, NULL));
+ OpaqueObjTy->refineAbstractTypeTo(IdTy);

This code is unsafe, you need to use a PATypeHolder as described here:
http://llvm.org/docs/ProgrammersManual.html#TypeResolve

Likewise in GenerateMethodList and other places you use refineAbstractTypeTo

+static void SetField(llvm::LLVMFoldingBuilder &Builder,
+ llvm::Value *Structure,
+ unsigned Index,
+ llvm::Value *Value) {
+ llvm::Value *element_ptr = Builder.CreateStructGEP(Structure, Index);
+ Builder.CreateStore(Value, element_ptr);

Nice, much better than a macro :wink: ;-). Please only indent the body by 2, not 4 though. Also, you could just use:

+ Builder.CreateStore(Value, Builder.CreateStructGEP(Structure, Index));

at which point it might make more sense to just inline this everywhere it is used.

+ if(SelTypes == 0) {
+ llvm::Constant *SelFunction =
+ TheModule.getOrInsertFunction("sel_get_uid", SelectorTy, PtrToInt8Ty, NULL);
+ cmd = Builder.CreateCall(SelFunction, SelName);

80 cols.

+ llvm::SmallVector<llvm::Value*, 2> Args;
+ Args.push_back(SelName);
+ Args.push_back(SelTypes);
+ cmd = Builder.CreateCall(SelFunction, Args.begin(), Args.end());

There is no need to use a SmallVector when the size is fixed, I'd just use:

    llvm::Value *Args = { SelName, SelTypes };
    cmd = Builder.CreateCall(SelFunction, Args, Args+2);

+ for(unsigned int i=0 ; i<1 ; i++) {

urr? A single iteration loop?

+ std::vector<const llvm::Type*> Args;
+ Args.push_back(SelfTy);
+ Args.push_back(SelectorTy);
+ for (unsigned i=0; i<ArgC ; i++) {
+ Args.push_back(ArgTy[i]);
+ }

Instead of the loop, you can use:

+ Args.insert(Args.end(), ArgTy, ArgTy+ArgC);

+ llvm::FunctionType *MethodTy = llvm::FunctionType::get(ReturnTy, Args, isVarArg);

80 cols

+ FnRetTy = OMD->getResultType(); ///.getCanonicalType();

Please remove this comment or improve it.

+ FnRetTy = CurFuncDecl->getType().getCanonicalType();
+ FnRetTy = cast<FunctionType>(FnRetTy)->getResultType();

This would be more idiomatic to write as:

FnRetTy = CurFuncDecl->getType()->getAsFunctionType()->getResultType();

+Value *ScalarExprEmitter::VisitObjCIvarRefExpr(ObjCIvarRefExpr *E) {

A lot of this code will be independent of whether the IVar is a scalar, complex or aggregate. I'd suggest moving the "get the address of an ivar" piece into CGObjC*.cpp and just have VisitObjCIvarRefExpr be a simple function which calls into the "form address" part and then just does a load. This can be done as a follow on patch, but will be required to get Complex and aggregate (e.g. struct) ivars working. ... actually, isn't this function just "EmitObjCIvarRefLValue" ?

@@ -341,6 +345,17 @@
        return LValue::MakeAddr(CGM.GetAddrOfGlobalVar(VD, false));
      else {
        llvm::Value *V = LocalDeclMap[D];
+ // Check if it's an implicit argument
+ // TODO: Other runtimes may set non-argument hidden variables
+ if (!V) {
+ std::string VarName(D->getName());
+ llvm::Function::arg_iterator AI = CurFn->arg_begin();
+ do {
+ if(VarName == AI->getName()) {
+ return LValue::MakeAddr(AI);
+ }
+ } while(++AI);
+ }
        assert(V && "BlockVarDecl not entered in LocalDeclMap?");

This is incredibly inefficient, what are you trying to accomplish here? I'd suggest removing this and handling it as part of a follow-on patch.

-Chris

One more try...

@@ -341,6 +345,17 @@
       return LValue::MakeAddr(CGM.GetAddrOfGlobalVar(VD, false));
     else {
       llvm::Value *V = LocalDeclMap[D];
+ // Check if it's an implicit argument
+ // TODO: Other runtimes may set non-argument hidden variables
+ if (!V) {
+ std::string VarName(D->getName());
+ llvm::Function::arg_iterator AI = CurFn->arg_begin();
+ do {
+ if(VarName == AI->getName()) {
+ return LValue::MakeAddr(AI);
+ }
+ } while(++AI);
+ }
       assert(V && "BlockVarDecl not entered in LocalDeclMap?");

This is incredibly inefficient, what are you trying to accomplish
here?

I have now fixed this (I hope) by adding the self decl from the method to the local decls map. That gets rid if this hack for accessing instance variables on self. It does not work for _cmd (or _call in the Étoilé runtime). I suggest the best way of doing this is to have a companion class for CGObjCRuntime in Sema which provides the decls for the runtime in use at the time (pulling the code out of ObjCActOnStartOfMethodDef). If you agree that's a sensible approach, then I'll add this in the next patch.

I've also removed the three methods in CGObjCGNU that are not yet used.

David

clang.diff (38.4 KB)

:slight_smile:

Applied!
http://lists.cs.uiuc.edu/pipermail/cfe-commits/Week-of-Mon-20080324/004863.html

Sorry for the delay!

-Chris

There were two issues with the last ObjC CodeGen patch. This should address one of them (assignment to self).

I have modified CodeGenFunction so that CurFuncDecl is now a Decl (not a FunctionDecl) and stores either the FunctionDecl for C functions or the ObjCMethodDecl for Objective-C methods. In future, it will presumably store method decls for C++ methods as well. I have also added dyn_casts wherever it is used (only in one place, at the moment) to check it is the correct type. (Note: __func__ and friends do not currently work correctly for Objective-C).

I have also added a LoadObjCSelf method which returns an llvm::Value containing the current value of self. This is needed because self is accessed from outside CodeGenFunction.

Modifying the value of self now works and both ivar accesses and message sends use the correct one.

objc_ivar.diff (4.77 KB)

There were two issues with the last ObjC CodeGen patch. This should address one of them (assignment to self).

I have modified CodeGenFunction so that CurFuncDecl is now a Decl (not a FunctionDecl) and stores either the FunctionDecl for C functions or the ObjCMethodDecl for Objective-C methods. In future, it will presumably store method decls for C++ methods as well. I have also added dyn_casts wherever it is used (only in one place, at the moment) to check it is the correct type. (Note: __func__ and friends do not currently work correctly for Objective-C).

I have also added a LoadObjCSelf method which returns an llvm::Value containing the current value of self. This is needed because self is accessed from outside CodeGenFunction.

Modifying the value of self now works and both ivar accesses and message sends use the correct one.

objc_ivar.diff (4.77 KB)

Looks great, applied!

BTW, I didn't drop your new-runtime codegen support intentionally, I probably just didn't svn add it. Please resend and I'll check it in.

-Chris