WIP patch for qualified-ids in class member access

Hi all. I've been following the clang project passively for a little while, and figured it was time to start contributing a bit. After hacking around for a while to get familiar with the code, I took a look at the open projects list and settled on getting qualified name lookup for class members working in C++.

In the interest of making sure I'm headed down the right path, and that I'm not violating any obvious stylistic conventions in clang, I thought it would be a good idea for other people to look at my code. With that in mind, attached is the beginnings of a patch to add support for this feature, along with a test for the parts that work as expected so far.

Currently, the patch does not support qualified-id lookup for dependent types or for classes with an overloaded operator->. The latter should be straightforward to add, but the former is a bit trickier, since I'm not familiar with how clang handles class templates yet. There are also likely some bugs with handling templates inside the qualified-id, but I haven't had a chance to look at that yet. Finally, I haven't handled the actual lookup of the members, only the nested-name-specifier part.

Hopefully this patch is more-or-less correct, even though it's not complete yet. Comments are, of course, welcome (in fact, they're requested!).

- Jim

qualified-ids.patch (7.84 KB)

qual-id-test.cpp (1.15 KB)

Hi James,

Hi all. I’ve been following the clang project passively for a little while, and figured it was time to start contributing a bit.

Great!

Currently, the patch does not support qualified-id lookup for dependent types or for classes with an overloaded operator->. The latter should be straightforward to add, but the former is a bit trickier, since I’m not familiar with how clang handles class templates yet. There are also likely some bugs with handling templates inside the qualified-id, but I haven’t had a chance to look at that yet. Finally, I haven’t handled the actual lookup of the members, only the nested-name-specifier part.

Okay. This is a rather non-trivial feature to tackle, so I don’t expect templates to work on the first try.

Hopefully this patch is more-or-less correct, even though it’s not complete yet. Comments are, of course, welcome (in fact, they’re requested!).

  • Jim
    Index: include/clang/Parse/Action.h
    ===================================================================
    — include/clang/Parse/Action.h (revision 74890)
    +++ include/clang/Parse/Action.h (working copy)
    @@ -235,6 +235,14 @@
    return 0;
    }
  • virtual bool ActOnCXXEnterMemberScope(Scope *S, ExprArg &exp) {
  • return false;
  • }
  • virtual void ActOnCXXExitMemberScope(Scope *S, ExprArg &exp,
  • CXXScopeSpec &SS) {
  • }

The idea here is that the “exp” argument isn’t really being consumed by ActOnCXXEnterMemberScope or ActOnCXXExitMemberScope, but that “exp” is there to help these actions enter or exit that scope, right? If so, I suggest passing an ExprTy* rather than a reference to an ExprArg. ExprArg is a smart pointer that’s meant to be passed by value, so the signatures above look like a mistake for those of us used to the smart pointers.

Please add some comments on these two actions, describing their purpose.

[snip]
Index: lib/Sema/SemaCXXScopeSpec.cpp

— lib/Sema/SemaCXXScopeSpec.cpp (revision 74890)
+++ lib/Sema/SemaCXXScopeSpec.cpp (working copy)
@@ -14,6 +14,7 @@
#include “Sema.h”
#include “clang/AST/ASTContext.h”
#include “clang/AST/DeclTemplate.h”
+#include “clang/AST/ExprCXX.h”
#include “clang/AST/NestedNameSpecifier.h”
#include “clang/Parse/DeclSpec.h”
#include “llvm/ADT/STLExtras.h”
@@ -278,6 +279,65 @@
T.getTypePtr());
}

+bool Sema::ActOnCXXEnterMemberScope(Scope *S, ExprArg &Base) {

  • Expr BaseExpr = static_cast<Expr>(Base.get());
  • if(BaseExpr == 0)
  • return false;
  • QualType BaseType = BaseExpr->getType();
  • if (BaseExpr->getType()->isPointerType())
  • BaseType = BaseType->getAsPointerType()->getPointeeType();

This will compute a base type even for something like:

class C { int foo; };
void f(C *ptr) {
return ptr.foo;
}

I think that ActOnCXXEnterMemberScope needs to know whether the member access is with a “->” or a “.”, so it knows whether to expect BaseExpr’s type to be a pointer or not. (It’s also important when you get to implementing support for an overloaded operator->).

  • if (BaseType->isDependentType()) {
  • // FIXME: handle dependent types
  • return false;

FIW, just creating a NestedNameSpecifier from the type pointer should suffice for dependent types (at least for now).

  • } else if (!BaseType->isRecordType()) {
  • Diag(BaseExpr->getExprLoc(),
  • diag::err_typecheck_member_reference_struct_union)
  • << BaseType << BaseExpr->getSourceRange();
  • return false;
  • }

Okay, good.

  • CXXScopeSpec SS;
  • SS.setRange(BaseExpr->getSourceRange());
  • SS.setScopeRep(
  • NestedNameSpecifier::Create(Context, 0, false, BaseType.getTypePtr())
  • );
  • ActOnCXXEnterDeclaratorScope(S,SS);
  • return true;
    +}

+void Sema::ActOnCXXExitMemberScope(Scope *S, ExprArg &Base, CXXScopeSpec &SS) {

  • ExitDeclaratorContext(S);

Okay.

  • Expr BaseExpr = static_cast<Expr>(Base.get());
  • assert(BaseExpr && “Unable to look up base expression”);
  • QualType BaseType = BaseExpr->getType();
  • if (BaseExpr->getType()->isPointerType())
  • BaseType = BaseType->getAsPointerType()->getPointeeType();
  • RecordDecl *Rec = BaseType->getAsRecordType()->getDecl();

It’s unfortunate that we have to recompute this base type information :frowning:

  • if (SS.isSet()) {
  • NestedNameSpecifier *Prefix
  • = static_cast<NestedNameSpecifier *>(SS.getScopeRep());
  • assert(Prefix && Prefix->getKind() == NestedNameSpecifier::TypeSpec
  • && “Nested name specifier does not name a type”);
  • Type *T = Prefix->getAsType();
  • DeclarationName Name = T->getAsRecordType()->getDecl()->getDeclName();
  • LookupResult Result = LookupQualifiedName(Rec, Name, LookupMemberName,
  • false);
  • if (!Result.getAsDecl()) {
  • Diag(SS.getBeginLoc(), diag::err_not_direct_base_or_virtual)
  • << Name.getAsIdentifierInfo() << BaseType;
  • }
  • }
    +}

This seems a little strange to me. Why isn’t this semantic check done in ActOnMemberExpr?

/// ActOnCXXEnterDeclaratorScope - Called when a C++ scope specifier (global
/// scope or nested-name-specifier) is parsed, part of a declarator-id.
/// After this method is called, according to [C++ 3.4.3p3], names should be
Index: lib/Sema/SemaExpr.cpp

— lib/Sema/SemaExpr.cpp (revision 74890)
+++ lib/Sema/SemaExpr.cpp (working copy)
@@ -2067,7 +2067,11 @@
Sema::ActOnMemberReferenceExpr(Scope *S, ExprArg Base, SourceLocation OpLoc,
tok::TokenKind OpKind, SourceLocation MemberLoc,
IdentifierInfo &Member,

  • DeclPtrTy ObjCImpDecl) {
  • DeclPtrTy ObjCImpDecl,const CXXScopeSpec *SS) {
  • // FIXME: handle the CXXScopeSpec for proper lookup of qualified-ids
  • if (SS && SS->isInvalid())
  • return ExprError();

Expr *BaseExpr = Base.takeAs();
assert(BaseExpr && “no record expression”);

Shouldn’t SS be used to perform name lookup for the Member somewhere in ActOnMemberReferenceExpr?

Index: lib/Parse/ParseExpr.cpp

— lib/Parse/ParseExpr.cpp (revision 74890)
+++ lib/Parse/ParseExpr.cpp (working copy)
@@ -915,6 +915,14 @@
tok::TokenKind OpKind = Tok.getKind();
SourceLocation OpLoc = ConsumeToken(); // Eat the “.” or “->” token.

  • CXXScopeSpec SS;
  • if (getLang().CPlusPlus) {
  • if( !Actions.ActOnCXXEnterMemberScope(CurScope,LHS) )
  • return ExprError();
  • ParseOptionalCXXScopeSpecifier(SS);
  • Actions.ActOnCXXExitMemberScope(CurScope,LHS,SS);
  • }

Oh, this is interesting. I completely expected to have something like:

  • CXXScopeSpec SS;
  • if (getLang().CPlusPlus) {
  • if( !Actions.ActOnCXXEnterMemberScope(CurScope,LHS) )
  • return ExprError();
  • ParseOptionalCXXScopeSpecifier(SS);

followed by a call to ActOnMemberReferenceExpr, then

if (getLang().CPlusPlus)
Actions.ActOnCXXExitMemberScope(CurScope,LHS,SS);

In other words, I expected that we would enter the member scope, process the member-reference expression within that scope (so that name lookup would be based on that scope), then exit the member scope at the end. \

// RUN: clang-cc -fsyntax-only -verify %s

Thanks for including test cases. You might also consider adding in some tests that involve templates (even if there are no dependent types in the expression).

This is a good start, and I think you’re on the right track. I think the biggest question at the moment is how to make ActOnMemberExpr interact with its CXXScopeSpec parameter.

  • Doug

Thinking about this further, I think you had the right idea after all. At least, I can see both ways working, and they need to produce the same results. The really, really tricky case is with a conversion function, e.g., x.operator T(). C++ [basic.lookup.classref]p7 says this:

If the id-expression is a conversion-function-id, its conversion-type-id shall denote the same type in both the context in which the entire postfix-expression occurs and in the context of the class of the object expression (or the class pointed to by the pointer expression).

Since the conversion-function-id is the only place where I could imagine your approach and my suggestion to differ, and in theory we have to do both, let’s go with your approach: it requires fewer changes, and we can think about diagnosing paragraph 7 later.

  • Doug

Douglas Gregor wrote:

Since the conversion-function-id is the only place where I could imagine your approach and my suggestion to differ, and in theory we have to do *both*, let's go with your approach: it requires fewer changes, and we can think about diagnosing paragraph 7 later.

I've actually already changed to your original suggestion, since it cleans things up a bit so that I'm not rechecking the underlying types of the operations quite so often. I haven't posted it to the list yet because I'm not done yet, and I want to write up some more tests (especially for overloaded operator->, which I had to change a bit).

It's also possible that a different approach entirely will be required, since in addition to paragraph 7, paragraph 6 also has some strange rules for name lookup:

"If the nested-name-specifier contains a class template-id (14.2), its template-arguments are evaluated in the context in which the entire postfix-expression occurs."

I'm almost certain that my patch will fail on this case, since it'll be lookup up the template-arguments in the class's context as well. I have a feeling that there's a solution that will handle both of these.

Incidentally, I think it's possible to cause the same issue you mentioned about [basic.lookup.classref]p7 with *any* qualified-id (p4). I'll see if I can write up a test that reveals this. The important part will be ensuring that identifier A refers to a (base) class in the class context and a namespace in the expression context. It'll probably work the same way with either approach, though.

- Jim

Douglas Gregor wrote:

Since the conversion-function-id is the only place where I could imagine
your approach and my suggestion to differ, and in theory we have to do
*both*, let's go with your approach: it requires fewer changes, and we
can think about diagnosing paragraph 7 later.

I've actually already changed to your original suggestion, since it
cleans things up a bit so that I'm not rechecking the underlying types
of the operations quite so often.

Oh, great! I was afraid that my original suggestion would have made it uglier. I'm happy with whatever is clean :slight_smile:

It's also possible that a different approach entirely will be required,
since in addition to paragraph 7, paragraph 6 also has some strange
rules for name lookup:

"If the nested-name-specifier contains a class template-id (14.2), its
template-arguments are evaluated in the context in which the entire
postfix-expression occurs."

I'm almost certain that my patch will fail on this case, since it'll be
lookup up the template-arguments in the class's context as well. I have
a feeling that there's a solution that will handle both of these.

Ah, right. There's similar wording in [class.qual]p1 when parsing nested-name-specifiers (that aren't for class member access), but in that case we don't try to "enter" the context of the class/namespace in the nested-name-specifier, so we should get this right. Perhaps that's what we missed: maybe we need special name-lookup routines for the identifier that follows the -> (to deal with all of those weird semantics), rather than trying to enter a different context.

Incidentally, I think it's possible to cause the same issue you
mentioned about [basic.lookup.classref]p7 with *any* qualified-id (p4).
I'll see if I can write up a test that reveals this.

The tests for this probably need to make sure that the class type of the object expression is in a disjoint namespace from the postfix-expression itself, so that we don't accidentally find the right thing (e.g., up at global scope).

  - Doug

I've updated my patch (attached). The main change is that this version supports qualified-ids with overloaded operator->. Other than that, most of the changes are merely organizational.

Douglas Gregor wrote:

Oh, great! I was afraid that my original suggestion would have made it uglier. I'm happy with whatever is clean :slight_smile:

Yeah, honestly my original code was organized that way just because that's how it looked when I had hacked the feature when I was still trying to figure out what was going on. That's where my curious use of passing ExprArg by reference came from too. :slight_smile:

That said, the new patch has a few things that are less-than-clean, like how I handle overloaded operator->, which needed to happen before parsing the nested-name-specifier, so it got moved into ActOnCXXEnterMemberScope.

Ah, right. There's similar wording in [class.qual]p1 when parsing nested-name-specifiers (that aren't for class member access), but in that case we don't try to "enter" the context of the class/namespace in the nested-name-specifier, so we should get this right. Perhaps that's what we missed: maybe we need special name-lookup routines for the identifier that follows the -> (to deal with all of those weird semantics), rather than trying to enter a different context.

That's what I'm thinking. Currently (Enter|Exit)DeclaratorContext is a black box to me, and I've just been using it because it "works", more-or-less.

- Jim

qualified-ids.patch (12.8 KB)

qual-id-test.cpp (2.05 KB)

James Porter wrote:

I've updated my patch (attached). The main change is that this version supports qualified-ids with overloaded operator->. Other than that, most of the changes are merely organizational.

Ping! If anyone has a chance to look over this to evaluate the changes to this patch, that'd be excellent. I think I ended up posting it right around when all the C++ folks were busy in Germany at the Frankfurt meeting.

- Jim

James Porter wrote:

I've updated my patch (attached). The main change is that this version
supports qualified-ids with overloaded operator->. Other than that, most
of the changes are merely organizational.

Ping! If anyone has a chance to look over this to evaluate the changes
to this patch, that'd be excellent. I think I ended up posting it right
around when all the C++ folks were busy in Germany at the Frankfurt meeting.

Your timing was perfect; I didn't even realize you'd posted a updated patch :slight_smile:

Some comments:

First of all, this patch breaks some tests that are currently working:

   test/SemaCXX/class.cpp
   test/SemaCXX/member-expr.cpp
   test/SemaTemplate/ext-vector-type.cpp
   test/SemaTemplate/instantiate-clang.cpp
   test/SemaTemplate/instantiate-expr-4.cpp

We'll need all tests running cleanly before we can commit this change.

+Action::OwningExprResult
+Sema::ActOnCXXEnterMemberScope(Scope *S, CXXScopeSpec &SS, ExprArg Base,
+ tok::TokenKind OpKind) {
+ Expr *BaseExpr = Base.takeAs<Expr>();
+ assert(BaseExpr && "no record expansion");

At this point, nobody "owns" the expression BaseExpr. So, if we return without explicitly destroying it, we'll leak the expression. I'd suggest letting Base own the expression. Then, code like this:

+ if (BaseType->isDependentType())
+ return Owned(BaseExpr);

can just move Base out to the result, e.g.,

   if (BaseType->isDependentType())
     return move(Base);

Then, when coping with an overloaded operator->:

+ Base = BuildOverloadedArrowExpr(S, BaseExpr, BaseExpr- >getExprLoc());
+ BaseExpr = Base.takeAs<Expr>();
+ if (BaseExpr == NULL)
+ return ExprError();

Replace the "takeAs" with "getAs", so we don't take ownership from Base. If Base always owns the expression, it'll always clean it up properly.

+ if (!BaseType->isRecordType()) {
+ return ExprError(Diag(BaseExpr->getExprLoc(),

Douglas Gregor wrote:

First of all, this patch breaks some tests that are currently working:

[snip]

We'll need all tests running cleanly before we can commit this change.

I should have these fixed. "make check" is successful, and I ran all the tests in SemaCXX and SemaTemplate. Is there a way to automatically run these tests along with the ones from "make check"? I ended up just writing a shell script to do the ones I mentioned.

+Action::OwningExprResult
+Sema::ActOnCXXEnterMemberScope(Scope *S, CXXScopeSpec &SS, ExprArg Base,
+ tok::TokenKind OpKind) {
+ Expr *BaseExpr = Base.takeAs<Expr>();
+ assert(BaseExpr && "no record expansion");

At this point, nobody "owns" the expression BaseExpr. So, if we return without explicitly destroying it, we'll leak the expression. I'd suggest letting Base own the expression.

Fixed, though I didn't see a "getAs" member, so I just used "get" and a C-cast. I should probably change that to a static_cast at least, unless I'm just mistaken about the existence of "getAs".

+ if (!BaseType->isRecordType()) {
+ return ExprError(Diag(BaseExpr->getExprLoc(),
+ diag::err_typecheck_member_reference_struct_union)
+ << BaseType << BaseExpr->getSourceRange());
+ }

We can also end up with various Objective-C types and ExtVectorTypes here, so this diagnostic comes too early. (That's why SemaTemplate/ext- vector-type.cpp is failing).

I removed this. It seems to work without it anyway.

I don't really understand what the lookup above is meant to do. I expected that it would take the nested-name-specifier in SS and try to find the member name (Member) within that scope-specifier, but instead it's looking for the name in the nested-name-specifier itself (?)

I had expected that the lookup for the member would do something like this:

    // Perform name lookup for the member
    DeclContext *DC = RDecl;
    if (SS && SS->isSet()) {
      // If the member name was a qualified-id, look into the
      // nested-name-specifier.
      DC = computeDeclContext(*SS, false);
      assert(DC && "Cannot handle non-computable dependent contexts in lookup");
    }

      LookupResult Result
        = LookupQualifiedName(DC, DeclarationName(&Member), LookupMemberName, false);

That would replace the current LookupQualifiedName call, which always looks into RDecl.

That was just me not fully understanding the usage of CXXScopeSpec. I changed the code to what you suggested, and added some checks to make sure the qualified-id is valid. It's still a little bit unhappy with template code, so I've had to be careful when making it not fail in currently-working scenarios.

Aside from eliminating my, er, "naive" code in ActOnMemberReferenceExpr, the biggest change is probably in TemplateExprInstantiator:: VisitCXXUnresolvedMemberExpr, which now "enters" the class's scope before acting on the member. Currently this is just to allow overloaded operator-> to work, but it's probably also necessary to get qualified name lookups working for templates.

There are a few other places in that file where ActOnMemberReferenceExpr is called. I might want to add the ActOnCXX(Enter|Exit)MemberScope methods around them too, but they weren't causing any test failures so I haven't looked at them too hard yet.

Thanks for taking the time to look over this patch. It's definitely been a good learning experience, and I'm getting considerably more comfortable with the clang codebase now (though I've probably got a ways to go yet!).

- Jim

qualified-ids.patch (14.9 KB)

qual-id-test.cpp (2.08 KB)