C++ PATCH: Basic inheritance syntax, semantics

This patch adds very basic support for parsing and type-checking class
inheritance in C++. It'll parse the base-specifier list, e.g.,

  class D : public B1, virtual public B2 { };

and do some of the simpler semantic checks (B1 and B2 are classes;
they aren't unions or incomplete types, etc). This patch *does not*
introduce any representation of base classes into the AST. That (along
with the semantic checks it enables) will be a follow-on patch.

The only interesting part of this patch is that I've completely
removed Parser::ParseStructUnionSpecifier in favor of the new
Parser::ParseClassSpecifier. The latter parses C++ class specifiers,
which are a superset of the C99 struct-or-union-specifier. So there's
no reason to keep around the C-only ParseStructOrUnionSpecifier.

This patch depends on my pending patches to add support for parsing
classes and for name lookup of class names.

  - Doug

clang-inherit.patch (19.6 KB)

This patch adds very basic support for parsing and type-checking class
inheritance in C++. It'll parse the base-specifier list, e.g.,

class D : public B1, virtual public B2 { };

Nice. One non-technical thing I should mention: Argiris is scheduled to implement class-related (methods, instance vars, etc) parsing and sema for google summer of code, which should start in a couple(?) weeks. I (as his mentor) have no problem with him working on other stuff if he prefers, but you guys should probably sync up so that you don't step on each others toes.

and do some of the simpler semantic checks (B1 and B2 are classes;
they aren't unions or incomplete types, etc). This patch *does not*
introduce any representation of base classes into the AST. That (along
with the semantic checks it enables) will be a follow-on patch.

Sounds good.

The only interesting part of this patch is that I've completely
removed Parser::ParseStructUnionSpecifier in favor of the new
Parser::ParseClassSpecifier. The latter parses C++ class specifiers,
which are a superset of the C99 struct-or-union-specifier. So there's
no reason to keep around the C-only ParseStructOrUnionSpecifier.

Makes sense. Please add a comment that indicates that it is used in C though:

+++ include/clang/Parse/Parser.h (working copy)

+ //

> This patch adds very basic support for parsing and type-checking class
> inheritance in C++. It'll parse the base-specifier list, e.g.,
>
> class D : public B1, virtual public B2 { };
>

Nice. One non-technical thing I should mention: Argiris is scheduled to
implement class-related (methods, instance vars, etc) parsing and sema for
google summer of code, which should start in a couple(?) weeks. I (as his
mentor) have no problem with him working on other stuff if he prefers, but
you guys should probably sync up so that you don't step on each others toes.

I'm happy to leave the class-related bits to Argiris. Thanks for the heads-up.

Makes sense. Please add a comment that indicates that it is used in C
though:

+++ include/clang/Parse/Parser.h (working copy)

+
//===--------------------------------------------------------------------===//
+ // C++ 9: classes [class]
+ void ParseClassSpecifier(DeclSpec &DS);

Something like:

+ // C++ 9: classes [class] and C struct/union bodies.

or something.

Okay.

Also, perhaps ParseStructUnionSpecifier/ParseClassSpecifier
should be renamed ParseStructUnionClassSpecifier, though that's pretty long
:). ParseSUCSpecifier?

Yuck. In C++-speak, it's just a "class specifier" because classes,
structs, and unions are all class types. That's why I prefer the name
ParseClassSpecifier (but the comment is helpful).

+ AccessSpecifier IsAccessSpecifier();

should be 'const'. Also, for better or worse, other 'is' predicates in
Parser (e.g. isDeclarationSpecifier) use a lowercase 'i', please be
consistent with that. Also, it's somewhat strange for an 'is' method to
return something other than bool. Maybe this should be:
"getAccessSpecifierIfPresent()" or something like that?

I like getAccessSpecifierIfPresent; let's go with that.

+void Parser::ParseBaseClause(DeclTy *ClassDecl)
+{
+ assert(Tok.is(tok::colon) && "Not a base clause");
+ SourceLocation ColonLoc = ConsumeToken();
+
+ Actions.ActOnBaseClause(ClassDecl, ColonLoc);
+ while (true) {
+ // Parse a base-specifier.
+ if (ParseBaseSpecifier(ClassDecl)) {
...

I know the code is unfinished, but what do you anticipate using
'ActOnBaseClause' for? Right now it provides a Loc for the colon to sema
and does checking for unions. If we don't currently care about the colon
loc, would it make sense to do the union check in ActOnBaseSpecifier? It's
obviously not a big deal, but one fewer action is nice.

I don't anything else planned for ActOnBaseClause... the union check
could move into ActOnBaseSpecifier.

+/// ParseBaseSpecifier - Parse a C++ base-specifier.
+///

Please explain a bit more about what a base-specifier is to give quick
intuition. This is important for people who aren't fully immersed in the
grammar but want to understand what the code is doing quickly. Something
like:

+/// ParseBaseSpecifier - Parse a C++ base-specifier. base-specifier is
+/// one entry in the base class list of a class specifier, for example:
+/// class foo : public bar, private baz {
+/// 'public bar' and 'private baz' are each base-specifiers.

We don't do this consistently in the parser, but it is very useful where we
do (for me specifically, I highly value similar comments in the ObjC parts
:wink:

Okay.

+ // We have an identifier; check whether it is actually a type.
+ if (DeclTy *BaseType = Actions.isTypeName(*Tok.getIdentifierInfo(),
+ CurScope)) {
...

To reduce nesting, how about:

+ // We have an identifier; check whether it is actually a type.
+ DeclTy *BaseType = Actions.isTypeName(*Tok.getIdentifierInfo(),
CurScope);

if (BaseType == 0) {
  error
  return;
}
... handle good case...

Okay.

+ // Find the complete source range for the base-specifier.
+ // FIXME: Is there a better way to
+ SourceRange Range(StartLoc);
+ unsigned RawSpecifierEnd
+ = Tok.getLocation().getRawEncoding() + Tok.getLength();
+ Range.setEnd(SourceLocation::getFromRawEncoding(RawSpecifierEnd));

You shouldn't need to do this. The end location of a source range actually
points to the first character of the last token.

... and that token is still highlighted with the source range. I
didn't realize that. Thanks!

FWIW, the second patch I sent just added the regression test and fixed
a typo or two. Nothing that requires a separate review.

  - Doug

Also, perhaps ParseStructUnionSpecifier/ParseClassSpecifier
should be renamed ParseStructUnionClassSpecifier, though that's pretty long
:). ParseSUCSpecifier?

Yuck. In C++-speak, it's just a "class specifier" because classes,
structs, and unions are all class types. That's why I prefer the name
ParseClassSpecifier (but the comment is helpful).

Ok, fair enough!

+ AccessSpecifier IsAccessSpecifier();

should be 'const'. Also, for better or worse, other 'is' predicates in
Parser (e.g. isDeclarationSpecifier) use a lowercase 'i', please be
consistent with that. Also, it's somewhat strange for an 'is' method to
return something other than bool. Maybe this should be:
"getAccessSpecifierIfPresent()" or something like that?

I like getAccessSpecifierIfPresent; let's go with that.

Ok.

+ // Find the complete source range for the base-specifier.
+ // FIXME: Is there a better way to
+ SourceRange Range(StartLoc);
+ unsigned RawSpecifierEnd
+ = Tok.getLocation().getRawEncoding() + Tok.getLength();
+ Range.setEnd(SourceLocation::getFromRawEncoding(RawSpecifierEnd));

You shouldn't need to do this. The end location of a source range actually
points to the first character of the last token.

... and that token is still highlighted with the source range. I
didn't realize that. Thanks!

np, it is very non-obvious, but is an important simplification for clients as you noticed :wink:

FWIW, the second patch I sent just added the regression test and fixed
a typo or two. Nothing that requires a separate review.

Ok, sounds great!

-Chris

Doug Gregor wrote:

Nice. One non-technical thing I should mention: Argiris is scheduled to
implement class-related (methods, instance vars, etc) parsing and sema for
google summer of code, which should start in a couple(?) weeks. I (as his
mentor) have no problem with him working on other stuff if he prefers, but
you guys should probably sync up so that you don't step on each others toes.
    
I'm happy to leave the class-related bits to Argiris. Thanks for the heads-up.
  
Thanks Doug! Just FYI, here's the deliverables of the proposal:

-classes with nested types (nested classes/typedefs)
-basic method support
---inline member function definitions
---member function declarations
---member function definitions outside of class definition
---use of instance variables and class nested types inside a member function
-access specifier for class members
-forbidding use of private/protected data member references outside of class methods

I also have a C++ namespaces patch on queue waiting until I sort out some name-lookup things first.

I'm really glad that there's another developer pushing for C++ support too! :slight_smile:

-Argiris

Thanks Doug! Just FYI, here's the deliverables of the proposal:

-classes with nested types (nested classes/typedefs)
-basic method support
---inline member function definitions
---member function declarations
---member function definitions outside of class definition
---use of instance variables and class nested types inside a member
function
-access specifier for class members
-forbidding use of private/protected data member references outside of
class methods

That looks like a reasonable chunk of work for the summer. Good luck!
I'll make sure not to step on those parts of the front end.

I also have a C++ namespaces patch on queue waiting until I sort out some
name-lookup things first.

Oh, good. I didn't know if anyone was picking this up.

I'm really glad that there's another developer pushing for C++ support too!
:slight_smile:

As time permits... anyway, I'm off playing with overloading at the
moment, so I don't expect that I'll be stepping on anyone's toes.

  - Doug

Hi,

A summary of the attached patch is this:

-Added TranslationUnitDecl class to serve as top declaration context
-ASTContext gets a TUDecl member and a getTranslationUnitDecl() function
-All ScopedDecls get the TUDecl as DeclContext when declared at global scope

Note that I didn't touch the objc decls, so TUDecl doesn't apply for them (getParent() of ObjCMethod will return null).
Someone more familiar with objc should probably sort out the DeclContext chains for them.

-Argiris

tudecl.patch (20.3 KB)

Looks great to me, please apply!

-Chris

Chris Lattner wrote:

Hi,

A summary of the attached patch is this:

-Added TranslationUnitDecl class to serve as top declaration context
-ASTContext gets a TUDecl member and a getTranslationUnitDecl() function
-All ScopedDecls get the TUDecl as DeclContext when declared at global scope

Note that I didn't touch the objc decls, so TUDecl doesn't apply for them (getParent() of ObjCMethod will return null).
Someone more familiar with objc should probably sort out the DeclContext chains for them.

Looks great to me, please apply!

Applied:
http://lists.cs.uiuc.edu/pipermail/cfe-commits/Week-of-Mon-20080414/005288.html