Objective-C tidy up

Hi,

Here's the next installment of my ObjC changes. This is fairly minor, although the diff is quite large. The biggest change is that I've moved common code used to generate Objective-C methods and C functions out into a separate method in CodeGenFunction and I've moved GenerateObjCMethod from CodeGenFunction.cpp into CGObjC.cpp where it should probably have been from the start.

This diff also includes small changes to Sema to make self and _cmd implicit parameters. At some point, I imagine these will be pulled out into a superclass for method decls that can be used for both Objective-C and C++ methods.

David

objc_tidy.diff (18.1 KB)

Why should self and _cmd be ParmVarDecls? Shouldn't they be PredefinedExprs?

+++ include/clang/AST/DeclObjC.h (working copy)
@@ -91,7 +91,8 @@
    // The following are only used for method definitions, null otherwise.
    // FIXME: space savings opportunity, consider a sub-class.
    Stmt *Body;
- ParmVarDecl *SelfDecl;
+ // Decls for implicit parameters
+ llvm::SmallVector<ParmVarDecl *, 2> ImplicitParamInfo;

If there are always exactly two of these, just use:

ParmVarDecl *ImplicitParamInfo[2];

Otherwise, looks reasonable to me if it causes no testsuite regressions.

-Chris

Hi,

Here's the next installment of my ObjC changes. This is fairly minor, although the diff is quite large. The biggest change is that I've moved common code used to generate Objective-C methods and C functions out into a separate method in CodeGenFunction and I've moved GenerateObjCMethod from CodeGenFunction.cpp into CGObjC.cpp where it should probably have been from the start.

This diff also includes small changes to Sema to make self and _cmd implicit parameters. At some point, I imagine these will be pulled out into a superclass for method decls that can be used for both Objective-C and C++ methods.

Why should self and _cmd be ParmVarDecls? Shouldn't they be PredefinedExprs?

That's what I did to start with, but PredefinedExprs need special handling in codegen, while ParmVarDecls are automatically turned into loads of %var.addr. Since they are parameters, and since it simplifies the rest of the implementation, it seemed like a ParmVarDecl would be the right thing to choose.

+++ include/clang/AST/DeclObjC.h (working copy)
@@ -91,7 +91,8 @@
  // The following are only used for method definitions, null otherwise.
  // FIXME: space savings opportunity, consider a sub-class.
  Stmt *Body;
- ParmVarDecl *SelfDecl;
+ // Decls for implicit parameters
+ llvm::SmallVector<ParmVarDecl *, 2> ImplicitParamInfo;

If there are always exactly two of these, just use:

ParmVarDecl *ImplicitParamInfo[2];

There will be three for the Étoilé runtime. Someone expressed interest in porting Objective-C to the COLA runtime from the Viewpoints Research Institute as well, and this would have somewhere between one and four depending on how they did it.

David

The codegen is not the only consumer of the ASTs. We should go with the solution that makes more sense in the general case.

To me, having parameters being ParamVarDecls did make more sense in the general case. It is also more flexible - if they are PredefinedExprs, then you are going to need extra special cases in all of these AST consumers for each runtime that exposes others. If they are ParamVarDecls then you use the same handling in your consumer for implicit and explicit parameters unless you explicitly need to know the difference.

David

This makes a lot of sense to me. I can see a similar argument for "this" in C++.

Chris: What was your main insight on using PredefinedExprs instead?

Hi,

Here's the next installment of my ObjC changes. This is fairly minor, although the diff is quite large. The biggest change is that I've moved common code used to generate Objective-C methods and C functions out into a separate method in CodeGenFunction and I've moved GenerateObjCMethod from CodeGenFunction.cpp into CGObjC.cpp where it should probably have been from the start.

This diff also includes small changes to Sema to make self and _cmd implicit parameters. At some point, I imagine these will be pulled out into a superclass for method decls that can be used for both Objective-C and C++ methods.

Why should self and _cmd be ParmVarDecls? Shouldn't they be PredefinedExprs?

That's what I did to start with, but PredefinedExprs need special handling in codegen, while ParmVarDecls are automatically turned into loads of %var.addr. Since they are parameters, and since it simplifies the rest of the implementation, it seemed like a ParmVarDecl would be the right thing to choose.

ParmVarDecl is an artifact of common implementations, not the way the language works. They are not parameters, as they obviously don't have location info etc.

+++ include/clang/AST/DeclObjC.h (working copy)
@@ -91,7 +91,8 @@
// The following are only used for method definitions, null otherwise.
// FIXME: space savings opportunity, consider a sub-class.
Stmt *Body;
- ParmVarDecl *SelfDecl;
+ // Decls for implicit parameters
+ llvm::SmallVector<ParmVarDecl *, 2> ImplicitParamInfo;

If there are always exactly two of these, just use:

ParmVarDecl *ImplicitParamInfo[2];

There will be three for the Étoilé runtime. Someone expressed interest in porting Objective-C to the COLA runtime from the Viewpoints Research Institute as well, and this would have somewhere between one and four depending on how they did it.

Please use a fixed size array. That is cheaper than a smallvector<4>.

-Chris

Why should self and _cmd be ParmVarDecls? Shouldn't they be
PredefinedExprs?

That's what I did to start with, but PredefinedExprs need special
handling in codegen, while ParmVarDecls are automatically turned
into loads of %var.addr. Since they are parameters, and since it
simplifies the rest of the implementation, it seemed like a
ParmVarDecl would be the right thing to choose.

ParmVarDecl is an artifact of common implementations, not the way the
language works. They are not parameters, as they obviously don't have
location info etc.

They are artefacts of both all existing implementations (GNU, Apple, and POC) and the language philosophy ('no magic'), and is something relied on by a lot of existing code. You can promote them to explicit parameters by requesting a pointer to the function implementing the method (something supported by both Brad Cox's original design and implementation, and by OpenStep) and calling this as a C function. The fact that they have no source location is merely syntactic sugar - they are real parameters and Objective-C programmers expect to be able to treat them as such. Treating them as something special seems like an unnecessary layer of abstraction which reflects neither how they are used nor how they are implemented.

This is in contrast to super, which is an alias for self with different semantics (i.e. sending a message to super sends a message where the destination class is resolved at compile time, not at runtime).

+++ include/clang/AST/DeclObjC.h (working copy)
@@ -91,7 +91,8 @@
// The following are only used for method definitions, null
otherwise.
// FIXME: space savings opportunity, consider a sub-class.
Stmt *Body;
- ParmVarDecl *SelfDecl;
+ // Decls for implicit parameters
+ llvm::SmallVector<ParmVarDecl *, 2> ImplicitParamInfo;

If there are always exactly two of these, just use:

ParmVarDecl *ImplicitParamInfo[2];

There will be three for the Étoilé runtime. Someone expressed
interest in porting Objective-C to the COLA runtime from the
Viewpoints Research Institute as well, and this would have somewhere
between one and four depending on how they did it.

Please use a fixed size array. That is cheaper than a smallvector<4>.

I've made this change, and included the updated diff.

David

objc.diff (18.2 KB)

ParmVarDecl is an artifact of common implementations, not the way the
language works. They are not parameters, as they obviously don't have
location info etc.

They are artefacts of both all existing implementations (GNU, Apple, and POC) and the language philosophy ('no magic'), and is something relied on by a lot of existing code.

I fail to see the point. "this" in C++ has the exact same behavior and should also be modeled with PredefinedExpr.

This is in contrast to super, which is an alias for self with different semantics (i.e. sending a message to super sends a message where the destination class is resolved at compile time, not at runtime).

The clang AST currently captures something that is somewhat close to a parse tree. This helps ensure that we don't bog down nodes with extraneous data that they shouldn't have. Creating multiple extra decls for each ObjC method is wasteful and pointless. This really is an artifact of the implementation, even if the implementation shows through in certain language constructs.

-Chris

Why should self and _cmd be ParmVarDecls? Shouldn't they be
PredefinedExprs?

That's what I did to start with, but PredefinedExprs need special
handling in codegen, while ParmVarDecls are automatically turned
into loads of %var.addr. Since they are parameters, and since it
simplifies the rest of the implementation, it seemed like a
ParmVarDecl would be the right thing to choose.

ParmVarDecl is an artifact of common implementations, not the way the
language works. They are not parameters, as they obviously don't have
location info etc.

They are artefacts of both all existing implementations (GNU, Apple, and POC) and the language philosophy ('no magic'), and is something relied on by a lot of existing code. You can promote them to explicit parameters by requesting a pointer to the function implementing the method (something supported by both Brad Cox's original design and implementation, and by OpenStep) and calling this as a C function.

When Brad and I were developing ObjC @ PPI, we never considered having multiple ABI's and runtime's.

Today, it's clearer that the mechanism for calling a C function (that represents a method) is ABI dependent.

I don't see any reason why the front-end needs to commit to these details (especially for _cmd, which I regret, since it's so uncommon).

snaroff

There is one important difference with this. In C++, this is a keyword, so the following is an invalid C++ program:

int main(void)
{
     int this = 0;
     return this;
}

In contrast, the following is a valid Objective-C program:

int main(void)
{
     int self = 0;
     return self;
}

Any function in Objective-C can use self and _cmd as identifier names. Any occurrence of the 'this' token in a C++ program refers to an instance of an object with a type defined by the context. Occurrences of self / _cmd in an Objective-C program refer to some variable defined in some way in some scope, with an implicit declaration if the scope happens to be a method. This means that, unless you want to complicate parsing a lot, self and _cmd want to be some kind of decl refs, when encountered in the source code (this was originally done with super, where it was a decl ref with an implicit cast, but I had to change it because super really does require special handling). By the way, it is not uncommon to use self and _cmd symbols in Objective-C programs when you write method-like functions that you want to call directly, without going via the dynamic dispatch mechanism (the equivalent of C++ non-virtual member functions).

So, the question is not what should occurrences of the self and _cmd tokens be turned into (although that might be a separate question), it's what the entries in the decl map should be populated with. As far as my code goes, this can be pretty much any sort of decl. It needs to have a type associated with it for Sema to be able to be able to do type checking (for self, this is a pointer to an instance of the current class, e.g. NSObject*, for _cmd it is SEL). For CodeGen, it needs to have a name.

Steve - I don't intend to make any requirements on the calling convention. For the Étoilé runtime, _cmd is a field of a structure passed as the second argument, not the second argument. As long as CodeGen can get the argument from somewhere and create %self.addr and %_cmd.addr in the function preamble, it won't be a problem (currently I do this in the runtime-agnostic part. I plan on moving this into the CGObjCRuntime class, so that all of the traditional runtimes can do it this way, but the Étoilé runtime and potentially COLA or a future Apple runtime can do it a different way).

David

I fail to see the point. "this" in C++ has the exact same behavior and should also be modeled with PredefinedExpr.

There is one important difference with this. In C++, this is a keyword, so the following is an invalid C++ program:

int main(void)
{
   int this = 0;
   return this;
}

In contrast, the following is a valid Objective-C program:

int main(void)
{
   int self = 0;
   return self;
}

Ahhh, excellent point. Okay, you successfully convinced me that these should be decls, otherwise they can't participate in name lookup.

Any function in Objective-C can use self and _cmd as identifier names. Any occurrence of the 'this' token in a C++ program refers to an instance of an object with a type defined by the context.

Does 'super' in ObjC work the same way?

So, the question is not what should occurrences of the self and _cmd tokens be turned into (although that might be a separate question), it's what the entries in the decl map should be populated with. As far as my code goes, this can be pretty much any sort of decl. It needs to have a type associated with it for Sema to be able to be able to do type checking (for self, this is a pointer to an instance of the current class, e.g. NSObject*, for _cmd it is SEL). For CodeGen, it needs to have a name.

Gotcha. What do you think of a new subclass of ValueDecl?

It would be really nice to lazily allocate these objects when they are first used. This would allow us to avoid allocating the _cmd decl for almost every ObjC method, and avoid 'self' on many of them.

Thanks for being patient with me David!

-Chris

Any function in Objective-C can use self and _cmd as identifier
names. Any occurrence of the 'this' token in a C++ program refers
to an instance of an object with a type defined by the context.

Does 'super' in ObjC work the same way?

Yes. Objective-C is a pure superset of C, so anything that is valid C is valid Objective-C. Which means my current handling of super is wrong...

I'm not sure the best way to handle this, since super has very specific semantics in Objective-C (it is only valid as the receiver for messages) and so can't just have a phantom decl added. Suggestions welcome...

So, the question is not what should occurrences of the self and _cmd
tokens be turned into (although that might be a separate question),
it's what the entries in the decl map should be populated with. As
far as my code goes, this can be pretty much any sort of decl. It
needs to have a type associated with it for Sema to be able to be
able to do type checking (for self, this is a pointer to an instance
of the current class, e.g. NSObject*, for _cmd it is SEL). For
CodeGen, it needs to have a name.

Gotcha. What do you think of a new subclass of ValueDecl?

Do we need a new subclass? I'm not actually sure what this needs that a ValueDecl doesn't provide, so we could possibly just use a ValueDecl.

It would be really nice to lazily allocate these objects when they are
first used. This would allow us to avoid allocating the _cmd decl for
almost every ObjC method, and avoid 'self' on many of them.

That would be ideal. I'm not sure where the best place to put it would be. Ideally I would want failed lookups in the decl map to insert it, but I think a few places access the decl map. One other option would be to allocate the self decl on a per-class basis (since it's the same for every method in a class - a pointer to an instance of that class) and the _cmd once per module. It would probably easier to allocate them both once per class.

It is very rare to have a class which does not contain at least one reference to self. Most classes will override -init, and the common pattern for this is to start the method with:

if (nil == (self = [self init])) return nil;

References to _cmd are less common - they are generally only used for message forwarding (which tends to only be done in a root class) or for debugging. I'm not sure if we would get any noticeable speedup from allocating this on use over allocating it in the ObjCImplementationDecl and ObjCCategoryImplDecl[1].

David

[1] These should probably have a common superclass, since they are almost identical - in fact an ObjCImplementationDecl could inherit from ObjCCategoryImplDecl, since a class is effectively a category which is allowed to declare instance variables.

In fact, handling of super is wrong, but only slightly. I am testing if the identifier is "super" in only if CurMethodDecl is set, so the predefined expr is only create inside an objective-c method. There is a bug, however. I am testing for the identifier using strncmp, which means that super_class and other identifiers of this form will be treated as being 'super,' which is clearly wrong.

Can someone please change:

     if (SD == 0 && !strncmp(II.getName(), "super", 5)) {

to

     if (SD == 0 && !strcmp(II.getName(), "super")) {

On lib/Sema/SemaExpr.cpp:102.

David

Any function in Objective-C can use self and _cmd as identifier
names. Any occurrence of the 'this' token in a C++ program refers
to an instance of an object with a type defined by the context.

Does 'super' in ObjC work the same way?

Yes. Objective-C is a pure superset of C, so anything that is valid C is valid Objective-C. Which means my current handling of super is wrong...

I'm not sure the best way to handle this, since super has very specific semantics in Objective-C (it is only valid as the receiver for messages) and so can't just have a phantom decl added. Suggestions welcome...

Ok. I think we converged on this. the parsing code is nearly right for super (please check in your fix) and modeling it as predefined expr is great.

So, the question is not what should occurrences of the self and _cmd
tokens be turned into (although that might be a separate question),
it's what the entries in the decl map should be populated with. As
far as my code goes, this can be pretty much any sort of decl. It
needs to have a type associated with it for Sema to be able to be
able to do type checking (for self, this is a pointer to an instance
of the current class, e.g. NSObject*, for _cmd it is SEL). For
CodeGen, it needs to have a name.

Gotcha. What do you think of a new subclass of ValueDecl?

Do we need a new subclass? I'm not actually sure what this needs that a ValueDecl doesn't provide, so we could possibly just use a ValueDecl.

If that works, please go for it. Please add a big comment above te ValueDecl class indicating that ObjC uses it this way though.

It would be really nice to lazily allocate these objects when they are
first used. This would allow us to avoid allocating the _cmd decl for
almost every ObjC method, and avoid 'self' on many of them.

That would be ideal. I'm not sure where the best place to put it would be. Ideally I would want failed lookups in the decl map to insert it, but I think a few places access the decl map. One other option would be to allocate the self decl on a per-class basis (since it's the same for every method in a class - a pointer to an instance of that class) and the _cmd once per module. It would probably easier to allocate them both once per class.

Take a look at how name lookup is currently working and how builtins are lazily created, you should be able to do something similar.

It is very rare to have a class which does not contain at least one reference to self. Most classes will override -init, and the common pattern for this is to start the method with:

if (nil == (self = [self init])) return nil;

Sure, but we'd need to make one decl for self in *every method* that references it, not once per class.

References to _cmd are less common - they are generally only used for message forwarding (which tends to only be done in a root class) or for debugging.

Sure.

[1] These should probably have a common superclass, since they are almost identical - in fact an ObjCImplementationDecl could inherit from ObjCCategoryImplDecl, since a class is effectively a category which is allowed to declare instance variables.

I think it would be great to unify some of this stuff. There is a lot of awkward code in the objc front-end for groking this stuff that should be unified, but I haven't figured out how. Suggestions welcome :slight_smile:

-chris

Any function in Objective-C can use self and _cmd as identifier
names. Any occurrence of the 'this' token in a C++ program refers
to an instance of an object with a type defined by the context.

Does 'super' in ObjC work the same way?

Yes. Objective-C is a pure superset of C, so anything that is valid
C is valid Objective-C. Which means my current handling of super is
wrong...

I'm not sure the best way to handle this, since super has very
specific semantics in Objective-C (it is only valid as the receiver
for messages) and so can't just have a phantom decl added.
Suggestions welcome...

Ok. I think we converged on this. the parsing code is nearly right
for super (please check in your fix) and modeling it as predefined
expr is great.

I think super is right now.

So, the question is not what should occurrences of the self and _cmd
tokens be turned into (although that might be a separate question),
it's what the entries in the decl map should be populated with. As
far as my code goes, this can be pretty much any sort of decl. It
needs to have a type associated with it for Sema to be able to be
able to do type checking (for self, this is a pointer to an instance
of the current class, e.g. NSObject*, for _cmd it is SEL). For
CodeGen, it needs to have a name.

Gotcha. What do you think of a new subclass of ValueDecl?

Do we need a new subclass? I'm not actually sure what this needs
that a ValueDecl doesn't provide, so we could possibly just use a
ValueDecl.

If that works, please go for it. Please add a big comment above te
ValueDecl class indicating that ObjC uses it this way though.

ValueDecl's constructor is currently private, so I've added a public Create() method. This seems to work. I wasn't using anything specific to the param decl, so (after fixing a load of compiler complaints) it is working again.

It would be really nice to lazily allocate these objects when they
are
first used. This would allow us to avoid allocating the _cmd decl
for
almost every ObjC method, and avoid 'self' on many of them.

That would be ideal. I'm not sure where the best place to put it
would be. Ideally I would want failed lookups in the decl map to
insert it, but I think a few places access the decl map. One other
option would be to allocate the self decl on a per-class basis
(since it's the same for every method in a class - a pointer to an
instance of that class) and the _cmd once per module. It would
probably easier to allocate them both once per class.

Take a look at how name lookup is currently working and how builtins
are lazily created, you should be able to do something similar.

This is fine for codegen. Will other AST consumers object if implicit parameter decls aren't there if they are not referenced?

It is very rare to have a class which does not contain at least one
reference to self. Most classes will override -init, and the common
pattern for this is to start the method with:

if (nil == (self = [self init])) return nil;

Sure, but we'd need to make one decl for self in *every method* that
references it, not once per class.

Can we not have one decl object instance per class, but one decl map entry (pointing to this instance)?

References to _cmd are less common - they are generally only used
for message forwarding (which tends to only be done in a root class)
or for debugging.

Sure.

[1] These should probably have a common superclass, since they are
almost identical - in fact an ObjCImplementationDecl could inherit
from ObjCCategoryImplDecl, since a class is effectively a category
which is allowed to declare instance variables.

I think it would be great to unify some of this stuff. There is a lot
of awkward code in the objc front-end for groking this stuff that
should be unified, but I haven't figured out how. Suggestions
welcome :slight_smile:

I still have about 2000 lines of uncommitted diffs, but when my tree has sync'd up with trunk a bit better then I'll take a look at the front end a bit and see what can be unified. There are a few other strange things there, such as the 'isArrow' property on ivar refs which, unless I am missing something important, can never be false.

David

Ooops - forgot to attach the updated diff.

objc.diff (21 KB)

Do we need a new subclass? I'm not actually sure what this needs

that a ValueDecl doesn't provide, so we could possibly just use a
ValueDecl.

If that works, please go for it. Please add a big comment above te
ValueDecl class indicating that ObjC uses it this way though.

ValueDecl's constructor is currently private, so I've added a public Create() method. This seems to work. I wasn't using anything specific to the param decl, so (after fixing a load of compiler complaints) it is working again.

Ok, this looks good. You should probably use a specific Decl enum value for them though. Would it be useful to add a subclass of ValueDecl just to support easy use of :

if (isa<FooDecl>(X))

That would check for the enum? In fact, I really do think this would be preferable, just because the name ValueDecl really does seem like an abstract type. What do you think?

Take a look at how name lookup is currently working and how builtins
are lazily created, you should be able to do something similar.

This is fine for codegen. Will other AST consumers object if implicit parameter decls aren't there if they are not referenced?

Builtins are correct in the ast. They are lazily generated by sema, but sema updates the AST and namelookup appropriately.

ry rare to have a class which does not contain at least one
reference to self. Most classes will override -init, and the common
pattern for this is to start the method with:

if (nil == (self = [self init])) return nil;

Sure, but we'd need to make one decl for self in *every method* that
references it, not once per class.

Can we not have one decl object instance per class, but one decl map entry (pointing to this instance)?

As we discussed on irc, this is a totally awesome idea. Lets talk about this as a followup patch though.

[1] These should probably have a common superclass, since they are
almost identical - in fact an ObjCImplementationDecl could inherit
from ObjCCategoryImplDecl, since a class is effectively a category
which is allowed to declare instance variables.

I think it would be great to unify some of this stuff. There is a lot
of awkward code in the objc front-end for groking this stuff that
should be unified, but I haven't figured out how. Suggestions
welcome :slight_smile:

I still have about 2000 lines of uncommitted diffs, but when my tree has sync'd up with trunk a bit better then I'll take a look at the front end a bit and see what can be unified. There are a few other strange things there, such as the 'isArrow' property on ivar refs which, unless I am missing something important, can never be false.

Ok! We welcome cleanup for the ObjC ASTs!

Some other thoughts about your patch:

+++ include/clang/AST/DeclObjC.h (working copy)
@@ -91,7 +91,9 @@
    // The following are only used for method definitions, null otherwise.
    // FIXME: space savings opportunity, consider a sub-class.
    Stmt *Body;
- ParmVarDecl *SelfDecl;
+ // Decls for implicit parameters
+ ValueDecl *ImplicitParamInfo[4];
+ unsigned NumImplicitParams;

Please turn this into SelfDecl/CmdDecl/etc with explicit getters/setters instead of providing iteration with a count.

+++ lib/CodeGen/CodeGenFunction.cpp (working copy)
@@ -27,7 +27,11 @@

  CodeGenFunction::CodeGenFunction(CodeGenModule &cgm)
    : CGM(cgm), Target(CGM.getContext().Target), SwitchInsn(NULL),
- CaseRangeBlock(NULL) {}
+ CaseRangeBlock(NULL) {
+ LLVMIntTy = ConvertType(getContext().IntTy);
+ LLVMPointerWidth = static_cast<unsigned>(

Do we need a new subclass? I'm not actually sure what this needs

that a ValueDecl doesn't provide, so we could possibly just use a
ValueDecl.

If that works, please go for it. Please add a big comment above te
ValueDecl class indicating that ObjC uses it this way though.

ValueDecl's constructor is currently private, so I've added a public
Create() method. This seems to work. I wasn't using anything
specific to the param decl, so (after fixing a load of compiler
complaints) it is working again.

Ok, this looks good. You should probably use a specific Decl enum
value for them though. Would it be useful to add a subclass of
ValueDecl just to support easy use of :

if (isa<FooDecl>(X))

That would check for the enum? In fact, I really do think this would
be preferable, just because the name ValueDecl really does seem like
an abstract type. What do you think?

I've created ImplicitParamDecl as a subclass of ValueDecl (which declares no new state or methods) and moved the static constructor into this class.

ry rare to have a class which does not contain at least one
reference to self. Most classes will override -init, and the common
pattern for this is to start the method with:

if (nil == (self = [self init])) return nil;

Sure, but we'd need to make one decl for self in *every method* that
references it, not once per class.

Can we not have one decl object instance per class, but one decl map
entry (pointing to this instance)?

As we discussed on irc, this is a totally awesome idea. Lets talk
about this as a followup patch though.

Sure - let me get the stuff I've already done tidied and committed before I start on new stuff...

[1] These should probably have a common superclass, since they are
almost identical - in fact an ObjCImplementationDecl could inherit
from ObjCCategoryImplDecl, since a class is effectively a category
which is allowed to declare instance variables.

I think it would be great to unify some of this stuff. There is a
lot
of awkward code in the objc front-end for groking this stuff that
should be unified, but I haven't figured out how. Suggestions
welcome :slight_smile:

I still have about 2000 lines of uncommitted diffs, but when my tree
has sync'd up with trunk a bit better then I'll take a look at the
front end a bit and see what can be unified. There are a few other
strange things there, such as the 'isArrow' property on ivar refs
which, unless I am missing something important, can never be false.

Ok! We welcome cleanup for the ObjC ASTs!

Some other thoughts about your patch:

+++ include/clang/AST/DeclObjC.h (working copy)
@@ -91,7 +91,9 @@
   // The following are only used for method definitions, null
otherwise.
   // FIXME: space savings opportunity, consider a sub-class.
   Stmt *Body;
- ParmVarDecl *SelfDecl;
+ // Decls for implicit parameters
+ ValueDecl *ImplicitParamInfo[4];
+ unsigned NumImplicitParams;

Please turn this into SelfDecl/CmdDecl/etc with explicit getters/
setters instead of providing iteration with a count.

Done.

+++ lib/CodeGen/CodeGenFunction.cpp (working copy)
@@ -27,7 +27,11 @@

CodeGenFunction::CodeGenFunction(CodeGenModule &cgm)
   : CGM(cgm), Target(CGM.getContext().Target), SwitchInsn(NULL),
- CaseRangeBlock(NULL) {}
+ CaseRangeBlock(NULL) {
+ LLVMIntTy = ConvertType(getContext().IntTy);
+ LLVMPointerWidth = static_cast<unsigned>(
+
getContext
().getTypeSize(getContext().getPointerType(getContext().VoidTy)));
+}

Watch out for 80 columns. You can directly get the size of a pointer
from TargetData instead of cons'ing up a pointer type and asking for
its size.

I moved this code here from GenerateCode() (someone else wrote it). I've now modified it to get the number from Target - please check it is actually doing the right thing...

+const llvm::Type *CodeGenTypes::ConvertReturnType(QualType T) {
+ if (T->isVoidType())
+ return llvm::Type::VoidTy; // Result of function uses llvm
void.
+ else
+ return ConvertType(T);
+}

Please indent by 2.

Fixed.

David

And here's the diff that was meant to go with the last email...

objc.diff (22.7 KB)