[PATCH]: Parsing for C++ classes

Hi,

The attached patch adds C++ classes support to the Parser. I'll provide the Sema support in a separate patch.
Here's what can be parsed:

-Access specifiers
-method declarations and inline method definitions
-static members
-typedefs, enums, and nested classes
-'this' keyword

Inline method definitions are lexed and their tokens stored for later parsing, after the class is complete.

This test is representative of the new features:

// RUN: clang -parse-noop -verify %s
class C {

protected:
  typedef int A,B;
  static int sf(), u;

  auto int sx; // expected-error {{error: storage class specified for a member declaration}}
  register int sy; // expected-error {{error: storage class specified for a member declaration}}
  extern int sz; // expected-error {{error: storage class specified for a member declaration}}

  struct S {};
  enum {};
  int; // expected-error {{error: declaration does not declare anything}}

  int b : 1, w : 2;
  typedef int E : 1; // expected-error {{error: cannot declare 'E' to be a bit-field type}}
  static int sb : 1; // expected-error {{error: static member 'sb' cannot be a bit-field}}

  int i = 0; // expected-error {{error: member 'i' can be initialized only if it is a static const integral data member}}

public:
  void m() {
    int l = 2;
  }
private:
  int x,f(),y,g();
};

int x = this; // expected-error {{error: invalid use of 'this' at top level}}

-Argiris

cxx-classes-parser.patch (36.1 KB)

I've attached patches against the latest trunk of Parser and Sema changes for C++ classes support.

The Parser->Sema diff for "cxx-sema.patch" was kinda manually created to make reviewing just the Sema changes easier, so you cannot apply it. If you want to try out Parser+Sema changes you need to apply the "cxx-parser-sema.patch", which contains the unified diff.

Here's a few more details in addition to the summary for the parser changes which I mentioned in the previous post:

When the parser encounters an inline method definition, it lexes and stores its tokens so that it can parse it just after the parsing of the topmost class. I.e:

class C {
   void m1() {}
     class NC {
       void m2() {}
   };
};

The parsing of the above class is something like this:

class C {
   void m1();
     class NC {
       void m2();
   };
};

void m1() {} // of class C
void m2() {} // of class C::NC

For local classes, it would be like a function is defined inside another function. The Action module is supposed to be aware of this and allow it if the function being defined is an inline class method.

-Argiris

Argiris Kirtzidis wrote:

cxx-parser.patch (35.9 KB)

cxx-parser-sema.patch (50.8 KB)

cxx-sema.patch (16.6 KB)

Some rough comments looking over the code:

In Parser::ParseCXXMemberSpecification:
+ AccessSpecifier AS = getAccessSpecifierIfPresent();
+ if (AS != AS_none) {
+ // We encountered a C++ access specifier.
+ CurAS = AS;
+ ConsumeToken();

I've attached patches against the latest trunk of Parser and Sema changes for C++ classes support.

Wow, this is looking very nice Argiris. Most of the stuff below is nit picky details, you're making excellent progress.

The Parser->Sema diff for "cxx-sema.patch" was kinda manually created to make reviewing just the Sema changes easier, so you cannot apply it. If you want to try out Parser+Sema changes you need to apply the "cxx-parser-sema.patch", which contains the unified diff.

I'll start with the parser pieces. Once those go in I'll take a look at sema pieces, thanks! Incidentally, the plan is to get more caught up on recent clang development this weekend. Here's hoping :slight_smile:

Here's a few more details in addition to the summary for the parser changes which I mentioned in the previous post:

When the parser encounters an inline method definition, it lexes and stores its tokens so that it can parse it just after the parsing of the topmost class. I.e:

class C {
void m1() {}
  class NC {
     void m2() {}
};
};

The parsing of the above class is something like this:

class C {
void m1();
  class NC {
     void m2();
};
};

void m1() {} // of class C
void m2() {} // of class C::NC

Right.

For local classes, it would be like a function is defined inside another function. The Action module is supposed to be aware of this and allow it if the function being defined is an inline class method.

Makes sense.

Details:

+DIAG(err_member_initialization, ERROR,
+ "member '%0' can be initialized only if it is a static const integral data member")

80 cols please :slight_smile:

+++ include/clang/Parse/Parser.h (working copy)

   template <typename ConsumeFunc>
+ bool ConsumeUntil(tok::TokenKind T, ConsumeFunc &ConFn, bool StopAtSemi = true,
+ bool DontConsume = false) {
+ return ConsumeUntil(&T, 1, ConFn, StopAtSemi, DontConsume);
+ }

In Sema::ActOnIdentifierExpr:
+ // FIXME: Should this use a new expr for a direct reference, instead of
+ // turning it into "this->field" ? Or else a MemberExpr with null base ?
+ ExprResult ThisExpr = ActOnCXXThis(Loc);
+ return new MemberExpr(static_cast<Expr*>(ThisExpr.Val),
+ true, FD, Loc, FD->getType());

I'd say this is fine, except that you should use an empty
SourceLocation for the artificial "this" so it's distinguishable from
a real MemberExpr. clang currently uses this technique for
initializer lists.

Great catch. I'd strongly prefer to represent "field" and "this- >field" differently. The former should be a DeclRefExpr and the later should be a member expr. We are still trying to untangle the ObjC front-end from this early mistake.

-Chris

Hi Eli, thanks for reviewing.

Eli Friedman wrote:

In Parser::ParseCXXMemberSpecification:
+ AccessSpecifier AS = getAccessSpecifierIfPresent();
+ if (AS != AS_none) {
+ // We encountered a C++ access specifier.
+ CurAS = AS;
+ ConsumeToken();
+
+ if (Tok.is(tok::colon))
+ ConsumeToken();
+ else
+ Diag(Tok.getLocation(), diag::err_expected_colon);

I think you forgot to put a "continue;" here. (Consider the testcase
"class A {public:private:int a;public:};".)

+ }
  
Oops, thanks for noticing!

In Parser::ParseCXXClassMemberDeclaration:
You mention the pure specifier in the comments, but it looks like you
forgot to actually implement it (or at least put in a FIXME where it
should be implemented).
  
I've put a [TODO] for the pure specifier on the top of ParseCXXClassMemberDeclaration. And I've put

        // Defer all other checks to Sema::AddInitializerToDecl.

which is where it is supposed to be handled. Shall I add a more specific comment for pure specifier ?
Also note that initializers are not handled at the moment.

In Parser::SkipUntil:
Standard large patch advice: split out changes that can be easily
split out. (Here, the change isn't especially useful without the rest
of the patch, but it makes review easier and the svn history a bit
cleaner.)
  
I'll commit the change to SkipUntil in a separate patch if the consensus is that this is the way to go for lexing the inline method definitions.

In Parser::TryParseFunctionOrCXXMethodDef:
+ Tok.is(tok::colon) || // int X(): -> not a function def
Hmm? Doesn't a colon guarantee that this is a function definition
with a ctor-initializer? Or are you planning on handling this case
some other way?
  
I think that constructors will be handled in some other way; as I see it, this method will be used on declarators with declaration specifiers.
If it turns out that it'd be useful for constructors to go through this method, we can always change it later.

- isDeclarationSpecifier())) { // int X(f) int f; {}
+ (!getLang().CPlusPlus &&
+ isDeclarationSpecifier()))) { // int X(f) int f; {}
This change should be split out.
  
Ok, this will be in a different commit.

About the Sema Decl context stuff:
It is in fact possible to nest Function/MethodDecls with C++. Simple example:
int a() {
  static int b;
  class C {
    public:
    int a() {
      return b;
    }
  } r;
  return r.a();
}

If I'm not mistaken, member() should be parsed in the context of
func(). (This is somewhat evil/not especially useful, but as far as I
can tell, it's legal.)
  
You are referring to
    assert(CurFunctionDecl == 0 && "Confused parsing.");

right ?
I agree, thanks again for noticing!

-Argiris

Hi Chris,

Chris Lattner wrote:

+++ include/clang/Parse/Parser.h (working copy)

  template <typename ConsumeFunc>
+ bool ConsumeUntil(tok::TokenKind T, ConsumeFunc &ConFn, bool StopAtSemi = true,
+ bool DontConsume = false) {
+ return ConsumeUntil(&T, 1, ConFn, StopAtSemi, DontConsume);
+ }
+
+ template <typename ConsumeFunc>
+ bool ConsumeUntil(const tok::TokenKind *Toks, unsigned NumToks,
+ ConsumeFunc &ConFn, bool StopAtSemi, bool DontConsume) {

I'm fairly uncomfortable with making ConsumeUntil completely templated. In addition to inviting code bloat, this requires a bunch of implementation details to be in the header. Can this be changed to take a function pointer (possibly with a void* data pointer) instead?

It looks like you're using this for "method body eating". It would also be reasonable to have a specialized version of this that just counts braces/parens etc. I consider SkipUntil to really be part of the diagnostic machinery... having it be used by the parser is a bit strange to me, and could make future extensions to it more difficult.

I don't quite see how the specialized version will be different from SkipUntil. You still need to consider braces,parens,brackets, strings etc. Basically the implementation of SkipUntil will be copied.
Is it ok to duplicate the code for this ?

+ // C++ 9.2p6: A member shall not be declared to have automatic storage
+ // duration (auto, register) or with the extern storage-class-specifier.
+ switch (DS.getStorageClassSpec()) {
+ case DeclSpec::SCS_unspecified:

Should this code live in parser or in sema? It seems cleaner to keep the 'analysis' code in sema as much as possible. I know we aren't necessarily clean about this everywhere, but I think it would be useful to avoid this if possible.

Ok, I was under the impression that if the parser has the necessary information to determine an error, it should emit the error.
Is the parser supposed to defer to Sema as many checks as possible ?

+ } else if (DS.getStorageClassSpec() == DeclSpec::SCS_static) {
+ // "static member 'A' cannot be a bit-field"
+ Diag(colonLoc, diag::err_static_not_bitfield,
+ DeclaratorInfo.getIdentifier()->getName());
+ SkipUntil(tok::comma, true, true);
+
+ } else {
+ assert(0 && "Didn't we cover all member kinds?");

How about:

+ } else {
+ assert(DS.getStorageClassSpec() == DeclSpec::SCS_static &&
+ "Didn't we cover all member kinds?");
+ // "static member 'A' cannot be a bit-field"
+ Diag(colonLoc, diag::err_static_not_bitfield,
+ DeclaratorInfo.getIdentifier()->getName());
+ SkipUntil(tok::comma, true, true);
+
+ }

which is faster in release-asserts mode and doesn't have a false path.

+ // C++ 11p3: Members of a class defined with the keyword class are private
+ // by default. Members of a class defined with the keywords struct or union
+ // are public by default.
+ AccessSpecifier CurAS =
+ TagType == DeclSpec::TST_struct ||
+ TagType == DeclSpec::TST_union ? AS_public
+ : AS_private;

The precedence here is non-obvious, please use an if/then/else instead of ?: or parenthesize.

+ if (Tok.is(tok::colon))
+ ConsumeToken();
+ else
+ Diag(Tok.getLocation(), diag::err_expected_colon);

How about:
  ExpectAndConsume(tok::colon, diag::err_expected_colon);

Ok to all.

+ // C++ 9.2p2: Within the class member-specification, the class is regarded as
+ // complete within function bodies, default arguments,
+ // exception-specifications, and constructor ctor-initializers (including
+ // such things in nested classes).
+ //
+ // FIXME: Only function bodies are parsed correctly, fix the rest.
+ if (!(CurScope->getFlags() & Scope::CXXClassScope)) {

Is this check sufficient for nested classes? If not, please add a fixme.

Yes, the check handles nested classes too.

Very picky but:

+ assert(I->Toks.size() > 0 && "Empty body!");

How about !I->Toks.empty()

Ok.

+ // Append the current token at the end of the new token stream so that it
+ // doesn't get lost.
+ I->Toks.push_back(Tok);
+ PP.EnterTokenStream(&I->Toks.front(), I->Toks.size(), true, false);

Very nice! If there are multiple methods, should this push Tok for all of them?

Yes, after the method is parsed, the current token is the previously pushed token, and it needs to be pushed again.

+Parser::ExprResult Parser::ParseCXXThis() {
+ assert(Tok.is(tok::kw_this) && "Not 'this'!");
+ SourceLocation ThisLoc = ConsumeToken();
+ if (CurScope->getFnParent() == 0) {
+ Diag(ThisLoc, diag::err_invalid_this_at_top_level);
+ return ExprResult(true);
+ }

Should this also check for static methods? What about C functions? Alternatively, does sema handle these checks? If so, should Sema handle all the checks?

Static methods and C functions are checked on Sema since the parser don't have this kind of information.
This applies to my previous question about where should the checks be handled.

+/// TryParseFunctionOrCXXMethodDef - Check the parsed declarator and continue
+/// parsing if it is a function definition.
+/// If it is a function definition it will return true and 'Res' will hold the
+/// declaration.
+/// If it is not a function definition it will return false.
+/// If a parsing error occured, it will return true and 'Res' will hold 0.

Please mention that on a parse error, no tokens will be consumed.

+bool Parser::TryParseFunctionOrCXXMethodDef(Declarator &DeclaratorInfo,
+ DeclTy *&Res) {

This returns a Decl plus a "success" bool. Should this return a "DeclResult" like the existing ExprResult/StmtResult etc stuff used by actions?

Ok, good idea.

-Argiris

In Parser::ParseCXXClassMemberDeclaration:
You mention the pure specifier in the comments, but it looks like you
forgot to actually implement it (or at least put in a FIXME where it
should be implemented).

I've put a [TODO] for the pure specifier on the top of
ParseCXXClassMemberDeclaration. And I've put

       // Defer all other checks to Sema::AddInitializerToDecl.

which is where it is supposed to be handled. Shall I add a more specific
comment for pure specifier ?

Nevermind; I was thinking it might be difficult to handle this from
Sema because the AST doesn't know the difference between different
ways of writing 0, but I don't think it'll be an issue.

In Parser::TryParseFunctionOrCXXMethodDef:
+ Tok.is(tok::colon) || // int X(): -> not a function def
Hmm? Doesn't a colon guarantee that this is a function definition
with a ctor-initializer? Or are you planning on handling this case
some other way?

I think that constructors will be handled in some other way; as I see it,
this method will be used on declarators with declaration specifiers.
If it turns out that it'd be useful for constructors to go through this
method, we can always change it later.

Okay, makes sense.

You are referring to
assert(CurFunctionDecl == 0 && "Confused parsing.");
right ?

Yes.

A couple more potential issues that I found while re-reading the patch:
+ bool isInstField = (DS.getStorageClassSpec() ==
DeclSpec::SCS_unspecified &&
+ !DeclaratorInfo.isFunctionDeclarator());

This check isn't reliable. Simple example: "typedef int func(); class
C {func a,b;};".

+ typedef std::list<LexedInlineMethod> LexedInlineMethodsTy;
+ /// LexedInlineMethods - During parsing of a C++ class, its inline method
+ /// definitions and the inline method definitions of its nested classes are
+ /// lexed and stored here.
+ LexedInlineMethodsTy LexedInlineMethods;
[...]
+ if (!(CurScope->getFlags() & Scope::CXXClassScope)) {
+ // We are no longer inside a C++ class. This class and its nested classes
+ // are complete and we can parse the lexed inline method definitions.
+ for (LexedInlineMethodsTy::iterator I = LexedInlineMethods.begin();
+ I != LexedInlineMethods.end(); ++I) {

This isn't quite right; consider the following:
class X {

  int outer() {
   static int b;
   class C {
     public:
     int inner() {
       return b;
     }
   } r;
   return r.inner();
  };

  int other() {return b;}

  int x;
};

With the current code, at least according to my reading, first X gets
parsed, then we start parsing outer, then we finish parsing class C,
then we attempt to start parsing outer again, then bad things happen.

I'd suggest allocating LexedInlineMethods in
ParseCXXMemberSpecification, rather than making it a member of Parser.

-Eli

Chris Lattner wrote:

+/// TryParseFunctionOrCXXMethodDef - Check the parsed declarator and continue
+/// parsing if it is a function definition.
+/// If it is a function definition it will return true and 'Res' will hold the
+/// declaration.
+/// If it is not a function definition it will return false.
+/// If a parsing error occured, it will return true and 'Res' will hold 0.

Please mention that on a parse error, no tokens will be consumed.

+bool Parser::TryParseFunctionOrCXXMethodDef(Declarator &DeclaratorInfo,
+ DeclTy *&Res) {

This returns a Decl plus a "success" bool. Should this return a "DeclResult" like the existing ExprResult/StmtResult etc stuff used by actions?

Actually, I don't think a "DeclResult" is appropriate. The semantic of DeclResult is that on "success" it holds a valid value, otherwise the value is invalid. The return semantics of this method are slightly different, "success" means "I took care of the parsing and here's the result (the result could be valid or invalid)"; "failure" means "not a function definition, continue parsing it as a declaration".
So "success" can result to an invalid value as well, unlike DeclResult.

How about I split this method into "isDeclaration" and "ParseFunctionOrCXXMethodDecl" so that instead of

  DeclTy *Res;
  bool wasParsed = TryParseFunctionOrCXXMethodDef(DeclaratorInfo, Res);
  if (wasParsed) return Res;

we have

  if (!isDeclaration(DeclaratorInfo))
     return ParseFunctionOrCXXMethodDef(DeclaratorInfo, Res);

?

-Argiris

Chris Lattner wrote:

In Sema::ActOnIdentifierExpr:
+ // FIXME: Should this use a new expr for a direct reference, instead of
+ // turning it into "this->field" ? Or else a MemberExpr with null base ?
+ ExprResult ThisExpr = ActOnCXXThis(Loc);
+ return new MemberExpr(static_cast<Expr*>(ThisExpr.Val),
+ true, FD, Loc, FD->getType());

I'd say this is fine, except that you should use an empty
SourceLocation for the artificial "this" so it's distinguishable from
a real MemberExpr. clang currently uses this technique for
initializer lists.

Great catch. I'd strongly prefer to represent "field" and "this->field" differently. The former should be a DeclRefExpr and the later should be a member expr. We are still trying to untangle the ObjC front-end from this early mistake.

DeclRefExpr is "A reference to a declared variable, function, enum, etc." and accepts a ValueDecl. Is it appropriate to change it to accept a NamedDecl so that a CXXFieldDecl can be passed ?

-Argiris

Eli Friedman wrote:

A couple more potential issues that I found while re-reading the patch:
+ bool isInstField = (DS.getStorageClassSpec() ==
DeclSpec::SCS_unspecified &&
+ !DeclaratorInfo.isFunctionDeclarator());

This check isn't reliable. Simple example: "typedef int func(); class
C {func a,b;};".
  
Ah, good find!
How about adding a Action::isFunctionTypeDeclarator for the parser to get this kind of information:

    // typedef int func();
    // class C {func a,b;};
    bool isTypedefFunc = (!DeclaratorInfo.isFunctionDeclarator() &&
                          Actions.isFunctionTypeDeclarator(DeclaratorInfo, CurScope));
    bool isInstField = (DS.getStorageClassSpec() == DeclSpec::SCS_unspecified &&
                        !DeclaratorInfo.isFunctionDeclarator() &&
                        !isTypedefFunc);

Do you have some other suggestion ?

+ typedef std::list<LexedInlineMethod> LexedInlineMethodsTy;
+ /// LexedInlineMethods - During parsing of a C++ class, its inline method
+ /// definitions and the inline method definitions of its nested classes are
+ /// lexed and stored here.
+ LexedInlineMethodsTy LexedInlineMethods;
[...]
+ if (!(CurScope->getFlags() & Scope::CXXClassScope)) {
+ // We are no longer inside a C++ class. This class and its nested classes
+ // are complete and we can parse the lexed inline method definitions.
+ for (LexedInlineMethodsTy::iterator I = LexedInlineMethods.begin();
+ I != LexedInlineMethods.end(); ++I) {

This isn't quite right; consider the following:
class X {

  int outer() {
   static int b;
   class C {
     public:
     int inner() {
       return b;
     }
   } r;
   return r.inner();
  };

  int other() {return b;}

  int x;
};

With the current code, at least according to my reading, first X gets
parsed, then we start parsing outer, then we finish parsing class C,
then we attempt to start parsing outer again, then bad things happen.

I'd suggest allocating LexedInlineMethods in
ParseCXXMemberSpecification, rather than making it a member of Parser.
  
Ok, I now use a "stack<LexedInlineMethodsTy>" as member of Parser and ParseCXXMemberSpecification pushes a new LexedInlineMethodsTy container for each non-nested class.
ParseInlineMethodDef pushes the new method to the top container.

Thanks for the catch!

-Argiris

Having the parser query Sema is no good; it's a layering violation.
The expectation is that the parser works the same way no matter what
action handler it's hooked up to. I'd suggest just verifying that the
form is legal per the grammar, and moving the logic that decides what
type we're dealing with into Sema (something like
ActOnMemberDeclarator). There's no good reason to force checks into
the parser that it can't handle easily.

-Eli

It looks like you're using this for "method body eating". It would also be reasonable to have a specialized version of this that just counts braces/parens etc. I consider SkipUntil to really be part of the diagnostic machinery... having it be used by the parser is a bit strange to me, and could make future extensions to it more difficult.

I don't quite see how the specialized version will be different from SkipUntil. You still need to consider braces,parens,brackets, strings etc. Basically the implementation of SkipUntil will be copied.
Is it ok to duplicate the code for this ?

Okay, you're right. I think using a "function pointer + void*" is the right tradeoff.

+ // C++ 9.2p6: A member shall not be declared to have automatic storage
+ // duration (auto, register) or with the extern storage-class-specifier.
+ switch (DS.getStorageClassSpec()) {
+ case DeclSpec::SCS_unspecified:

Should this code live in parser or in sema? It seems cleaner to keep the 'analysis' code in sema as much as possible. I know we aren't necessarily clean about this everywhere, but I think it would be useful to avoid this if possible.

Ok, I was under the impression that if the parser has the necessary information to determine an error, it should emit the error.
Is the parser supposed to defer to Sema as many checks as possible?

There is no one right answer unfortunately. When I initially started on clang, that is the direction I went because I didn't have sema and wanted to implement some of these checks.

Advantages of doing everything in sema:

1. All semantic checking is in one place.
2. It is much more obvious when checks are duplicated. Right now, it is easy to do something in sema that is already covered by the parser, or are only covered in some cases.
3. Moving sema checks into sema keeps the parser proper clean and easier to understand.
4. Clients that don't care about sema (e.g. things that just parse to get function names) get a small amount of "error tolerance" if the parser is only strict about stuff that affects the parse.

Advantages of doing it in parser:

1. Every action implementation doesn't need to reimplement the checks.

While we do checks in both parser and sema, I think we should migrate to doing everything in sema where possible. Seem reasonable?

+ // Append the current token at the end of the new token stream so that it
+ // doesn't get lost.
+ I->Toks.push_back(Tok);
+ PP.EnterTokenStream(&I->Toks.front(), I->Toks.size(), true, false);

Very nice! If there are multiple methods, should this push Tok for all of them?

Yes, after the method is parsed, the current token is the previously pushed token, and it needs to be pushed again.

Ok.

+Parser::ExprResult Parser::ParseCXXThis() {
+ assert(Tok.is(tok::kw_this) && "Not 'this'!");
+ SourceLocation ThisLoc = ConsumeToken();
+ if (CurScope->getFnParent() == 0) {
+ Diag(ThisLoc, diag::err_invalid_this_at_top_level);
+ return ExprResult(true);
+ }

Should this also check for static methods? What about C functions? Alternatively, does sema handle these checks? If so, should Sema handle all the checks?

Static methods and C functions are checked on Sema since the parser don't have this kind of information.
This applies to my previous question about where should the checks be handled.

I think this is a great example of something that would be cleaner/more obvious in sema, because it could handle all the cases uniformly in one place.

-Chris

Bleh, I forgot that C is ambiguous anyway if you can't figure out
whether an identifier declares a type. So this layering violation
exists anyway.

That said, it's a larger burden on additional Action implementations
to figure out the type of a identifier versus figuring out whether an
identifier refers to a type. Figuring out whether an identifier refers
to a type only requires limited resolution of declarations and the
type heirarchy (see MinimalAction.cpp). Figuring out whether a type
is a function type is a bit more complicated because the Action
implementation has to track whether a typedef type is a function type.

This also affects the GNU typeof extension; for a typeof, determining
whether the type is a function type essentially requires a complete
Sema implementation. (I guess it's worth noting for this that g++
doesn't accept "typeof(x)::a", which might complicate this depending
on how it were defined.)

-Eli

Sounds excellent!

-chris

Yeah, I think that's ok. If you think a new *Expr is appropriate, that would also be fine with me too.

Thanks Argiris,

-Chris

Hi,

I incorporated the excellent feedback by Chris and Eli and I'm posting a new Parser patch for C++ classes.
The main differences with the previous one are:

-The C++ class parser is more simplified and "naive", and almost all checks are deferred to Sema.
-There's a new Parser::ConsumeAndStoreUntil method, instead of the templated ConsumeUntil. SkipUntil is untouched.
-New Action methods: ActOnCXXMemberDeclarator and ActOnFinishCXXMemberSpecification.

-Argiris

class-parse-2.patch (24.8 KB)

This is looking great! Some questions:

+ /// LexedMethodStacks - During parsing of a top (non-nested) C++ class, its
+ /// inline method definitions and the inline method definitions of its nested
+ /// classes are lexed and stored here.
+ /// A new lexed methods stack is pushed for local classes in inline methods.
+ std::stack< std::stack<LexedMethod> > LexedMethodStacks;

Out of curiousity, why use a std::stack instead of std::deque or std::vector?

For clarity, it might be better to make a typedef for the inner stack, something like:

   typedef std::stack<LexedMethod> MethodTokensForOneClass;

or something? Use of that typedef would make the interfaces a bit more clear.

+ void PushLexedMethodStack() {
+ void PopLexedMethodStack() { LexedMethodStacks.pop(); }

These really push one 'nested class' worth of methods, right? It might be better to name these "PushClassStack" or something.

+++ lib/Parse/ParseCXXInlineMethods.cpp (revision 0)

A new file is a great idea.

+ Parser::ParseCXXInlineMethodDef(AccessSpecifier AS, Declarator &D) {
+ assert(D.getTypeObject(0).Kind == DeclaratorChunk::Function &&
+ "This isn't a function declarator!");

Hi Chris,

Chris Lattner wrote:

+ /// LexedMethodStacks - During parsing of a top (non-nested) C++ class, its
+ /// inline method definitions and the inline method definitions of its nested
+ /// classes are lexed and stored here.
+ /// A new lexed methods stack is pushed for local classes in inline methods.
+ std::stack< std::stack<LexedMethod> > LexedMethodStacks;

Out of curiousity, why use a std::stack instead of std::deque or std::vector?

About the outer stack, it's the last "MethodTokensForOneClass" pushed that is going to be used for adding lexed methods, so a stack seems appropriate.
About the inner stack, there won't be a difference whether it's a stack, list or deque, although std::vector might not be as appropriate since there's no need to have LexedMethods in contiguous memory.
Since a std::stack was already brought in for the outer stack, I figured I could use it again to avoid bringing in an additional container.

For clarity, it might be better to make a typedef for the inner stack, something like:

  typedef std::stack<LexedMethod> MethodTokensForOneClass;

or something? Use of that typedef would make the interfaces a bit more clear.

+ void PushLexedMethodStack() {
+ void PopLexedMethodStack() { LexedMethodStacks.pop(); }

These really push one 'nested class' worth of methods, right? It might be better to name these "PushClassStack" or something.

Ok to both suggestions.
Just a sidenote, I think these containers and methods will need refactoring to accomodate other types of "delayed parsing", like default parameters. So they are subject to change; probably a new class will be added for all this.

+++ lib/Parse/ParseCXXInlineMethods.cpp (revision 0)

A new file is a great idea.

+ Parser::ParseCXXInlineMethodDef(AccessSpecifier AS, Declarator &D) {
+ assert(D.getTypeObject(0).Kind == DeclaratorChunk::Function &&
+ "This isn't a function declarator!");
+
+ DeclTy *FnD = Actions.ActOnCXXMemberDeclarator(CurScope, AS, D, 0, 0, 0);
+
+ // We should have an opening brace now.
+ if (Tok.isNot(tok::l_brace)) {

The only caller of this knows that the current token is l_brace, so this can be an assert.

Ok.

In Parser::ConsumeAndStoreUntil, do you think it would make sense to specialize it for T=r_brace, or do you expect other clients?

This method calls itself recursively for braces and parentheses.

-Argiris

Chris Lattner wrote:

+ /// LexedMethodStacks - During parsing of a top (non-nested) C++ class, its
+ /// inline method definitions and the inline method definitions of its nested
+ /// classes are lexed and stored here.
+ /// A new lexed methods stack is pushed for local classes in inline methods.
+ std::stack< std::stack<LexedMethod> > LexedMethodStacks;

Out of curiousity, why use a std::stack instead of std::deque or std::vector?

About the outer stack, it's the last "MethodTokensForOneClass" pushed that is going to be used for adding lexed methods, so a stack seems appropriate.

Ok

About the inner stack, there won't be a difference whether it's a stack, list or deque, although std::vector might not be as appropriate since there's no need to have LexedMethods in contiguous memory.
Since a std::stack was already brought in for the outer stack, I figured I could use it again to avoid bringing in an additional container.

std::stack is actually not a container, it is an adaptor that wraps std::deque (by default).

These really push one 'nested class' worth of methods, right? It might be better to name these "PushClassStack" or something.

Ok to both suggestions.
Just a sidenote, I think these containers and methods will need refactoring to accomodate other types of "delayed parsing", like default parameters. So they are subject to change; probably a new class will be added for all this.

Makes sense, using a new struct (even with one member) is even better than a typedef!

In Parser::ConsumeAndStoreUntil, do you think it would make sense to specialize it for T=r_brace, or do you expect other clients?

This method calls itself recursively for braces and parentheses.

Ok!

-Chris