[PATCH] Ordinary name lookup for class and enum names in C++

The patch introduces support for ordinary name lookup of class and
enum names in C++. For example:

  class C { };
  C c;

Incidentally, it also introduces support for declaring classes with
"class" in C++, but that's rather trite.

Interesting bits in this patch:
  - Sema::ActOnTag: In C++, we introduce an implicit typedef into the
same scope as the tag type. This allows the name of a class or enum to
be found via ordinary name lookup. Note that we don't introduce this
implicit typedef if there is already a declaration with that name,
because the existing declaration takes precedence, e.g.,
      int D(int);
      class D { };
      // ordinary name lookup of "D" finds the function. Use "class D"
to get to the class.

  - Sema::ActOnDeclarator: Introducing a declaration for something
that has the same name as a class is okay; the new declaration hides
the class name. For example:
      class D { };
      int D(int);
      // ordinary name lookup of "D" finds the function. Use "class D"
to get to the class.
    To deal with this, we just go ahead and delete the implicit
typedef when a new declaration comes in.

I briefly considered eliminating the notion of the implicit typedef,
and instead modifying name lookup to look into both places in C++...
but this approach is cleaner.

  - Doug

clang-class-names.patch (13.2 KB)

The patch introduces support for ordinary name lookup of class and
enum names in C++. For example:

class C { };
C c;

Incidentally, it also introduces support for declaring classes with
"class" in C++, but that's rather trite.

Great!

I briefly considered eliminating the notion of the implicit typedef,
and instead modifying name lookup to look into both places in C++...
but this approach is cleaner.

Lets talk about this a bit. I always assumed that we would modify the name lookup machinery to handle this instead of adding a typedef... however I admit that I haven't fully thought through the implications and so I may not see all the problems with this approach.

It seems that it would be possible to handle this by making the scope lookup take a bitmask of different namespaces to search. When looking up 'class foo' only the tag namespace would be searched, but when looking up 'foo' in C++, both the tag and normal namespace would be searched.

I'm sure there are more details to it than this, but avoiding creation of a dummy typedef decl for every struct/class/union is a pretty big win. What drawbacks do you see?

-Chris

> I briefly considered eliminating the notion of the implicit typedef,
> and instead modifying name lookup to look into both places in C++...
> but this approach is cleaner.
>

Lets talk about this a bit. I always assumed that we would modify the name
lookup machinery to handle this instead of adding a typedef... however I
admit that I haven't fully thought through the implications and so I may not
see all the problems with this approach.

It seems that it would be possible to handle this by making the scope
lookup take a bitmask of different namespaces to search. When looking up
'class foo' only the tag namespace would be searched, but when looking up
'foo' in C++, both the tag and normal namespace would be searched.

Yeah, that's the first implementation that came to mind, but I didn't
think it would pan out...

I'm sure there are more details to it than this, but avoiding creation of a
dummy typedef decl for every struct/class/union is a pretty big win. What
drawbacks do you see?

When I started coding it that way, I found that it broke an invariant
that I really like. When we lookup what an identifier means, and we
find out that it is a type (e.g., with isTypeName), we always get a
typedef in C and C++. (In Objective C, we can get a ObjCInterfaceDecl,
although I'd rather get a TypedefDecl that refers to that
ObjCInterfaceDecl, so this is a real invariant). That's handy: it
means that we can use ASTContext::getTypedefType to get at the actual
type easily, rather than dealing with TypedefDecls, RecordDecls,
EnumDecls, and ObjCInterfaceDecls separately.

Also, from an implementation standpoint, it gets a little weird to
cope with ordinary name lookup that can find multiple things in the
same scope and then choose among them. Say we have:

  void D(int);
  class D;

When performing ordinary name lookup for 'D', we'll see the class 'D'
first. But that's not the result of name lookup; we need to skip it
and find the function 'D' that's in the same scope. But don't consider
things in an outer scope:

  void D(int);

  namespace N {
    class D;
  }

At this point, I decided that while it is implementable in this way,
my gut feeling is that things are going to get messy once we get to
dealing with namespace scopes, class scopes, etc. Implicit typedefs
are conceptually simple and they have the right semantics.

  - Doug

Yeah, that's the first implementation that came to mind, but I didn't
think it would pan out...

I'm sure there are more details to it than this, but avoiding creation of a
dummy typedef decl for every struct/class/union is a pretty big win. What
drawbacks do you see?

When I started coding it that way, I found that it broke an invariant
that I really like. When we lookup what an identifier means, and we
find out that it is a type (e.g., with isTypeName), we always get a
typedef in C and C++. (In Objective C, we can get a ObjCInterfaceDecl,
although I'd rather get a TypedefDecl that refers to that
ObjCInterfaceDecl, so this is a real invariant). That's handy: it
means that we can use ASTContext::getTypedefType to get at the actual
type easily, rather than dealing with TypedefDecls, RecordDecls,
EnumDecls, and ObjCInterfaceDecls separately.

Ok, I agree that it is a nice invariant (one we don't currently have, at least with objc), but it seems that this can be provided with a nicer API. It sounds like you really want "isTypeName()" and "gimmeTheType(identifier)" which returns a QualType. This could be done in one place, which would handle any nasties, and let the clients have a similar clean interface.

Also, from an implementation standpoint, it gets a little weird to
cope with ordinary name lookup that can find multiple things in the
same scope and then choose among them. Say we have:

void D(int);
class D;

When performing ordinary name lookup for 'D', we'll see the class 'D'
first. But that's not the result of name lookup; we need to skip it
and find the function 'D' that's in the same scope.

Right. This namespace lookup would have to handle this case correctly. Since there are a small number of namespaces, this extra check is pretty cheap. We already have to do similar things in a couple places the scope stuff Argiris recently touched.

But don't consider things in an outer scope:

void D(int);

namespace N {
   class D;
}

At this point, I decided that while it is implementable in this way,
my gut feeling is that things are going to get messy once we get to
dealing with namespace scopes, class scopes, etc.

This is a pretty different case. Here, the two D's are really in two different Scope objects: this is a shadowing case, not a synonym-in-different-namespace case. This is more analogous to:

  void D(int);
  void foo() {
    class D;
}

or:

class D;
void foo() {
   int D;
}

The two D's are very different.

-Chris

Doug Gregor wrote:

Also, from an implementation standpoint, it gets a little weird to
cope with ordinary name lookup that can find multiple things in the
same scope and then choose among them. Say we have:

  void D(int);
  class D;

When performing ordinary name lookup for 'D', we'll see the class 'D'
first. But that's not the result of name lookup; we need to skip it
and find the function 'D' that's in the same scope. But don't consider
things in an outer scope:

  void D(int);

  namespace N {
    class D;
  }
  
How about modifying IdentifierResolver::AddDecl so that, when in C++ and decl is a RecordDecl, it looks for 'D' in normal namespace and in the same scope,
if a decl exists, it puts the RecordDecl on top of it in the chain:

class D
^- void D()

That way non-tag decls get precedence when searching in both tag and normal namespaces.

-Argiris

I went ahead with this approach. The attached patch allows searching
in both the tag and ordinary namespaces in C++, so that we get the
right name lookup for tag names without introducing implicit typedefs.
Argiris, you might want to take a look at the IdentifierResolver bits.

Interesting bits:
  - IdentifierResolver::AddDecl does exactly what Argiris suggested:
it makes sure that if there is both a tag and an ordinary name in the
same scope, the ordinary name will be on top.

  - IdentifierResolver::Lookup allows NS to be a bitmask, so that we
can search multiple namespaces.

  - Sema::LookupDecl tweaks ordinary name lookup to also look for tags
(C++ only)

  - Sema::ActOnDeclarator ignores any previous declaration it finds in
the tag namespace, because any other declarations will override them
anyway. We *may* need to do the same thing somewhere for
Objective-C++, but I didn't touch that.

  - ASTContext::getTypeDeclType retrieves the QualType for any
TypeDecl. This is the "gimmeTheType" operation that makes it look like
we have the named-types-are-typedefs invariant, even though we don't..
Note that it builds RecordType and EnumType nodes for class types and
enums, rather than TagType nodes. I've switched to using
getTypeDeclType in those places in the front end where we can end up
finding a named type; longer-term, I'd like to remove getTypedefType,
getObjCInterfaceType, and getTagDeclType from ASTContext and have all
clients call getTypeDeclType instead.

The "-rel" version of the patch is relative against the "class" tag
handling patch, here:
  http://lists.cs.uiuc.edu/pipermail/cfe-dev/2008-April/001411.html

The other version of the patch is against the latest Clang from
Subversion; it includes the class tag handling changes.

  - Doug

clang-class-names-bitmask.patch (15.9 KB)

clang-class-names-bitmask-rel.patch (12 KB)

How about modifying IdentifierResolver::AddDecl so that, when in C++ and
decl is a RecordDecl, it looks for 'D' in normal namespace and in the same
scope,
if a decl exists, it puts the RecordDecl on top of it in the chain:

class D
^- void D()

That way non-tag decls get precedence when searching in both tag and normal
namespaces.

I went ahead with this approach.

Great!

The attached patch allows searching
in both the tag and ordinary namespaces in C++, so that we get the
right name lookup for tag names without introducing implicit typedefs.
Argiris, you might want to take a look at the IdentifierResolver bits.

Yes, Argiris please do review this.

Interesting bits:
- ASTContext::getTypeDeclType retrieves the QualType for any
TypeDecl. This is the "gimmeTheType" operation that makes it look like
we have the named-types-are-typedefs invariant, even though we don't..

Sounds good.

Note that it builds RecordType and EnumType nodes for class types and
enums, rather than TagType nodes. I've switched to using
getTypeDeclType in those places in the front end where we can end up
finding a named type; longer-term, I'd like to remove getTypedefType,
getObjCInterfaceType, and getTagDeclType from ASTContext and have all
clients call getTypeDeclType instead.

Sounds good. In the short term, it looks like all clients of ASTContext::getTagDeclType can directly change to calling getTypeDeclType now, want to remove it?

The "-rel" version of the patch is relative against the "class" tag
handling patch, here:
http://lists.cs.uiuc.edu/pipermail/cfe-dev/2008-April/001411.html

Looks great to me, please apply!

-Chris