Hi Doug,
Thanks for review, and great comments!
I have addressed most of issues in attached version.
It also partially fixes other thing I have noticed, similar failure to
one from first version of patch with namespaces, now with tags.
Fix is not complete though, but behaviour is right. One commented out
in test is incorrectly diagnosed. I'll try send fix ASAP.
There is risk we don't check LookupResult for ambiguities in more
cases too... We need will need more tests. I have checked with gcc
testsuite, and we pass all tests that have 'using namespace', and
don't include standard library headers, use using-declaration, or
require strong-using now.
Hi Piotr,
Hi Dough,
I have played bit more with using-directives. Attached adds AST
support for using-directives (no serialization still), and makes
Sema::LookupName look ugly:) I may split it into two separated
patches(AST and lookup), if it is needed.
This is a big undertaking; thank you!
I'll do a detailed review later, but one thing in particular caught me
eye:
that sort_heap in CppLookupName is really going to hurt performance.
Could
you give me a short description of why we need it? (Or should I just wait
until I do that detailed review to get the big picture?)
For each Scope (namespace/translation-unit), we need to consider all
using-directives
which have common ancestor (of nominated namespace, and context
containing using-directive).
I see few ways to solve it:
- sort by common ancestor, use binary search to find range (we do
that now, we can check other sort algorithms, qsort might be bad,
since memory allocation often creates increasing addresses)
- use multiset/hashset
- use linear search (gcc does that)
Which is faster for real code? I don't know, have not messured it
honestly, I tend to belive it will be bit faster than gcc for larger
number of using-directives, or similar..., for small number we lose
now for sure.
I would like to play with it, and tweak it at later point, but I don't
think microbenchmarks are way to go.
Oh, I agree. I'm not concerned about the specific mechanism---sort_heap,
sort, whatever---but about the need to perform this kind of computation
every time we perform name lookup in C++.
Moreover, I consider caching this
sorted list in Scope structure (probably not updating online, just
invalidating), then it will always be a win over gcc in terms of time.
This is what I was looking for. We can cache the results of this operation
in the Scope structure or (in some cases) DeclContext. Performing this
operation once per LookupName call would be quite painful; but if the number
of times we need to do this is proportional instead to the number of
using-directives, that would be fine.
I would like to avoid making DeclContext heavier, without big
benfit... Scope sounds better to me.I have put some effort to avoid
that, because:
- C doesn't need it.
- There will be houndreds DeclContext's in typical code
- I will be surprised if there we were many more than 20 Scope objects
in typical
code at one time
- Scope has short pretty life, if we compare it to DeclContext.
- Other AST clients can't benefit from that information anyway...
There is still other issue, hurting lookup, not addressed by this
patch too, ie. looking in same DeclContext many times.
Oh, certainly. We have FIXME's for a bunch of ways we can make name lookup
faster for C++.
I would also want to also ask, if there is any master plan how to get
rid of OverloadedFunctionDecl yet, or anyone actively working on it
now?
The master plan is for LookupResult to provide an easy way to iterate
over
the set of overloaded functions that it found, which means providing a
LookupResult::iterator that understands how LookupResult stores
overloaded
functions (as a pair of iterators into a DeclContext or into an
IdentifierResolver chain). If it would help with using-directives, I
could
implement LookupResult::iterator relatively soon.
It is not essential now, but at later point I could avoid few
allocations! while merging multiple OverloadedFunctionDecls (now I
reuse first found OverloadedFunctionDecl, and free other).
Okay. I'll try to do this soon, anyway; it's always good to have an excuse
to do these kinds of architectural improvements.
Thanks!
Right now, OverloadedFunctionDecl is only created "on demand" by
LookupResult when the caller requests a Decl*. Callers should be migrated
to
look at the kind of LookupResult they receive from Lookup*Name, so that
they
can handle both ambiguities and overloaded functions. This change can
happen
gradually, and at some point none of the callers will need
OverloadedFunctionDecl. I haven't been actively working on doing this,
but
for the most part it also isn't hard. The trickiest part is deciding what
to
do with DeclRefExpr's that refer to overloaded functions.
Access to stored FunctionDecls avoiding allocation of
OverloadedFunctionDecl would improve performance in same cases, but
that would be not be big change.
Right.
Some comments on your patch:
+/// UsingDirectiveDecl - Represents C++ using-directive. For example:
+/// using namespace std;
+// NB: UsingDirectiveDecl should be Decl not NamedDecl, but we provide
+// name artificial name, for all using-directives in order to store
+// them in DeclContext effectivly.
+class UsingDirectiveDecl : public NamedDecl {
This will work, although I think we could do slightly better by altering
DeclContext's internal structure so that it can store UsingDirectiveDecls.
However, you don't need to worry about that for this patch: I can make that
DeclContext change after your patch goes in.
Also, typo "name artificial name".
+ UsingDirectiveDecl(DeclContext *DC, SourceLocation L,
+ SourceLocation NamespcLoc,
+ SourceLocation IdentLoc,
+ NamespaceDecl *Used)
+ : NamedDecl(Decl::UsingDirective, DC, L, getName()),
+ NamespaceLoc(NamespcLoc), IdentLoc(IdentLoc),
+ UsedNamespace(Used? Used->getOriginalNamespace() : 0) {
+ // Find enclosing context containing both using-directive and
+ // used namespace.
+ DeclContext *Ctx = Used;
+ while (Ctx && !Ctx->Encloses(getDeclContext()))
+ Ctx = Ctx->getParent();
+
+ CommonAncestor = Ctx;
+ }
I'd prefer that this computation be performed within Sema, and the
CommonAncestor value passed in to the constructor.
Done.
+ /// getUsedNamespace - Returns namespace nominated by using-directive.
+ NamespaceDecl *getUsedNamespace() { return UsedNamespace; }
+
+ const NamespaceDecl *getUsedNamespace() const {
+ return const_cast<UsingDirectiveDecl*>(this)->getUsedNamespace();
+ }
Why not use the name "getNominatedNamespace", since that's the standard
term?
True, just Used was easier, but changed that too ![:slight_smile: :slight_smile:](https://emoji.discourse-cdn.com/google/slight_smile.png?v=12)
+ // UsingDirectiveDecl's are not really NamedDecl's, and all have same
name.
+ // We want to keep them unconditionally.
+ if (getKind() == Decl::UsingDirective)
+ return false; //FIXME: We could conserve space in some rare cases
returning
+ //true if both decls point on same namespaces.
+
Isn't this FIXME resolved by just checking whether
cast<UsingDirectiveDecl>(this)->getUsedNamespace()->getPrimaryContext() ==
cast<UsingDirectiveDecl>(OldD)->getUsedNamespace()->getPrimaryContext()?
Indeed, I was up to fix that before sending that, but just forgot. Fixed now.
getUsedNamespace/getNominatedNamespace always returns original namespace, so
getPrimaryContext is not needed here.
+ } else if (UsingDirectiveDecl *UD = dyn_cast<UsingDirectiveDecl>(D)) {
+ // print using-directive decl (e.g. "using namespace x;")
+ const char *ns;
+ if (const IdentifierInfo *II = UD->getUsedNamespace()->getIdentifier())
+ ns = II->getName();
+ else
+ ns = "<anonymous>";
+ fprintf(F, "\"%s %s;\"",UD->getDeclKindName(), ns);
Is it even possible to have a using directive that nominates an anonymous
namespace? I guess it depends on how we implement unnamed namespaces.
That reveals, how I plan deal with anonymous namespaces for name lookup ![:slight_smile: :slight_smile:](https://emoji.discourse-cdn.com/google/slight_smile.png?v=12)
I wanted add implicit using-directive in enclosing context (as
Sebastian already pointed out). This should be enough, assuming that,
we work only with single translation unit.
In fact, I was planing to implement unqualified/qualified name lookup,
gcc strong-using/inline namespace and anonymous namespaces together,
but later it turned out to be too big for one patch!
+ bool hasBasePaths() const;
+
Do we need this? getBasePaths() could just return NULL if there are no base
paths.
Indeed, removed.
I like what you've done with LookupResult.
+ /// Name lookup results in an ambiguity because multiple definitions
+ /// of entity tat meet the lookup criteria were found in different
+ /// declaration contextes.
+ /// @code
+ /// namespace A {
+ /// int i;
+ /// namespace B { int i; }
+ /// int test() {
+ /// using namespace B;
+ /// return i; // error 'i' is found in namespace A and A::B
+ /// }
+ /// }
+ /// @endcode
+ AmbiguousReference
Two typos in that comment: "tat" and "contextes".
Fixed!
Should Sema::ActOnUsingDirective assert that the scope S it gets is a
DeclScope?
+ // Ensure, that decls are being added to orginal namespace.
+ // Ctx = Ctx->getPrimaryContext(); XXX: needless
+ Ctx->addDecl(UDir);
Right, we don't need that Ctx = Ctx->getPrimaryContext() here, but we will
need it later...
No we won't, using-directive should be added to lexical context, and
only be visible
in semantic context, just like other NamedDecls. This is how
DeclContext::addDecl works now, so we don't need anything special
here, right?
+ /// @brief Comapres UsingDirectiveDecl common ancestor with DeclContext.
+ bool operator () (UsingDirectiveDecl *U, const DeclContext *Ctx) const {
+ return U->getCommonAncestor() < Ctx;
+ }
Typo: "Comapres"
Fixed.
+/// AddNamespaceUsingDirectives - Adds all UsingDirectiveDecl's found in
+/// namespace NS including all found (recursivly) in nominated by them
+/// namespaces, to heap UDirs, (ordering by common ancestor).
+void AddNamespaceUsingDirectives(DeclContext *NS,
+ UsingDirectivesTy &UDirs,
+ NamespaceSet &Visited) {
Typo "recursivly", but that sentence also needs some work.
Fixed, OK now?
+ DeclContext *Parent = Ctx->getParent();
+
+ // FIXME: do we really need it, do we need add its parent too?
+ if (Parent && Ctx->getLexicalParent() != Parent) {
+ if (NamespaceDecl *NS = dyn_cast<NamespaceDecl>(Parent)) {
+ if (VisitedNS.count(NS)) return;
+ VisitedNS.insert(NS);
+ }
+ AddNamespaceUsingDirectives(Parent, UDirs, /*ref*/ VisitedNS);
+ }
Why are we looking into the parent context here? Using directives nominate
namespaces that are searched but, IIUC, their enclosing namespaces are not
searched.
It will be eventually, but true, we don't need it here.
+ Scope::udir_iterator
+ I = S->using_directives_begin(),
+ End = S->using_directives_end();
+
+ for (I, End; I != End; ++I) {
This could just be
for(; I != End; ++I) {
Ok, old habit, changed.
+/// FindUsingDirsByAncestors - Returns range [First, Last)
UsingDirectiveDecls
+/// which UsingDirectiveDecl::getCommonAncestor() == Ctx.
+/// It requires UDirs to be sorted by common ancestor.
+static std::pair<UsingDirectivesTy::iterator, UsingDirectivesTy::iterator>
+FindUsingDirsByAncestors(UsingDirectivesTy &UDirs, DeclContext *Ctx) {
+
+ UsingDirAncestorCompare Cmp;
+ UsingDirectivesTy::iterator First, Last;
+
+ First = std::lower_bound(UDirs.begin(), UDirs.end(), Ctx, Cmp);
+ Last = std::upper_bound(First, UDirs.end(), Ctx, Cmp);
+
+ return std::make_pair(First, Last);
+}
This whole function is just
return std::equal_range(UDirs.begin(), UDirs.end(), Ctx,
UsingDirAncestorCompare());
![:slight_smile: :slight_smile:](https://emoji.discourse-cdn.com/google/slight_smile.png?v=12)
Thanks!
+/// Merges together multiplie LookupResults dealing with duplicated Decl's.
+static Sema::LookupResult
+MergeLookupResults(ASTContext &Context, LookupResultsTy &Results) {
Typo: "multiplie"
Fixed.
+ LookupResultsTy::iterator I = Results.begin(), End = Results.end();
+ for (I; I != End; ++I) {
Could be:
for (; I != End; ++I) {
Ok.
+ case LResult::AmbiguousBaseSubobjectTypes:
+ case LResult::AmbiguousBaseSubobjects: {
+ // FIXME: Do we care about later possibly found
+ // AmbiguousBaseSubobjectTypes, or AmbiguousBaseSubobjects?
+ // Is it ever possible?
I don't believe this is ever possible, since using directives can't occur at
class scope. We should probably have an assert() here.
It shouldn't be possible, however it is, probably because we look more
than once in same DeclContext.
+ case LResult::FoundOverloaded:
+ if (FoundOverloaded) {
+ // We have one spare OverloadedFunctionDecl already, so we store
+ // its function decls and free it.
+ OverloadedFunctionDecl *Ovl =
+ cast<OverloadedFunctionDecl>(I->getAsDecl());
+
+ OverloadedFunctionDecl::function_iterator
+ FI = Ovl->function_begin(),
+ FEnd = Ovl->function_end();
+
+ for (FI; FI != FEnd; ++FI)
+ FoundDecls.push_back(*FI);
+
+ Ovl->Destroy(Context);
+ } else {
+ // First time we found OverloadedFunctionDecl, we want to conserve
+ // it, and possibly add other found Decls later.
+ FoundOverloaded = cast<OverloadedFunctionDecl>(I->getAsDecl());
+ }
+ break;
In the "else" branch, don't you need to add all of the decls to FoundDecls?
No, because I reuse first OverloadedFunctionDecl I find, so it has
them already, do you see otherwise?
+ // Remove duplicated Decl pointing at same Decl, this might be case
+ // for code like:
+ //
+ // namespace A { int i; }
+ // namespace B { using namespace A; }
+ // namespace C { using namespace A; }
+ //
+ // void foo() {
+ // using namespace B;
+ // using namespace C;
+ // ++i; // finds A::i, from both namespace B and C at global scope
+ // }
+ //
+ // FIXME: Needs quote from paper.
+ // This is not ambiguous lookup.
+ //
+ // FIXME: At this point happends too, because we are doing redundant
lookups.
The paragraph you're looking for is 3.4.3.2p3:
The same declaration found more than once is not an ambiguity
(because it is still a unique declaration).
There's a typo "happends" in the second FIXME.
Thanks! Ugh, time to make vim check my errors...
+ DeclsVecTy::iterator DI = FoundDecls.begin(), DEnd = FoundDecls.end();
+ std::sort(DI, DEnd);
+ DEnd = std::unique(DI, DEnd);
I have to wonder if it would be easier just to insert everything into a
SmallPtrSet and then just iterate over the results, especially given the
code below it:
+ if (FoundOverloaded) {
+ // We found overloaded functions result. We want to add any other
+ // found decls, that are not already in FoundOverloaded, and are
functions
+ // or methods.
+ OverloadedFunctionDecl::function_iterator
+ FBegin = FoundOverloaded->function_begin(),
+ FEnd = FoundOverloaded->function_end();
+ for (DI ; DI < DEnd; ++DI) {
+ FunctionDecl *Fun = dyn_cast<FunctionDecl>(*DI);
+ if (Fun && (std::find(FBegin, FEnd, Fun) != FEnd)) { /* Skip.*/ }
+ else DEnd = std::remove(DI, DEnd, *DI);
+ }
This is O(n^2) in [DI, DEnd), because remove() is linear time. Sure, [DI,
DIEnd) should be pretty small---and so should [FBegin, FEnd) which we search
through linearly---but I think this whole section would be cleaner (and no
less efficient) with SmallPtrSet.
Would it fair, to fix it once OverloadedFunctionDecl dies, from
exactly small detail you mentioned below... Decls need to be added in
same order we found it.
+ // We might found multiple RecordDecls pointing at same definition.
+ // FIXME: What about case like:
+ // class X; class X {}; X::X(X&) {}
+ if (RecordDecl *R = dyn_cast<RecordDecl>(*FoundDecls.begin())) {
+ // Q: Is that correct way to check RecordDecls equivalence?
+ RecordDecl *RDef = R->getDefinition(Context);
+ DeclsVecTy::iterator RI = FoundDecls.begin(), REnd = DEnd;
This will work when the class has been defined, but not if it's only been
forward-declared, since getDefinition() will return a NULL pointer in that
case. I think it's time that we finally add a function
TagDecl *ASTContext::getCanonicalDecl(TagDecl *Tag) const {
QualType T = getTypeDeclType(Tag);
return cast<TagDecl>(cast<TagType>(T)->getDecl());
}
because this is *exactly* the case where we need it. I also suggest making
this code work on TagDecls, rather than just RecordDecls, since eventually
we'll need to cope with forward-declared enums (for C++0x, GNU C).
Thanks, added getCanonicalDecl, and used it.
+ // We might find FunctionDecls in two (or more) distinct DeclContexts.
+ Decl *D = MaybeConstructOverloadSet(Context, DI, DEnd);
+ if ((FoundLen == 1) || isa<OverloadedFunctionDecl>(D))
+ return LResult::CreateLookupResult(Context, D);
It's very interesting that this implements 3.4.3.2p5 (a comment and quote
from the standard here would help!), although I think there's one issue:
MaybeConstructOverloadSet(Context, DI, DEnd) expects that, if the sequence
[DI, DEnd) contains both object and tag names, that the object names precede
the tag names. This won't necessarily be the case with [DI, DEnd), since we
put decls into FoundDecls in the order we found them.
Are you sure it is right, in last final draft I have 3.4.3.2.p5 is:
"During the lookup of a qualified namespace member name, if the lookup
finds more than one declaration of the member, and if one
declaration introduces a class name or enumeration name and the other
declarations either introduce the same object, the same enumerator or
a set of functions, the non-type name hides the class or enumeration
name if and only if the declarations are from the same namespace;
otherwise
(the declarations are from different namespaces), the program is ill-formed."
But we do unqualified lookup here.
I based it rather on 3.4.p1:
"... Name lookup may associate more than one declaration with a name
if it finds the name to be a function name; the declarations are said
to form a set of
overloaded functions (13.1). Overload resolution (13.3) takes place
after name lookup has succeeded."
Is that correct?
+// Return qualified name for Decl D, like A:
:i, for i being member of
+// namespace A::B.
+std::string
+getQualifiedName(Decl *D) {
This is used for diagnostics. It's certainly not something you need to do,
but eventually I think we'll pull this into the diagnostics machinery, and
teach it to print NamedDecls with qualified names with markup like "%q0".
I left it alone for now, I will try to implement it in next patch.
+ // Collect UsingDirectiveDecls in all scopes, and recursivly all
+ // nominated namespaces by those using-directives.
+ // UsingDirectives are pushed to heap, in common ancestor pointer
+ // value order.
+ UsingDirectivesTy UDirs;
+ for (Scope *SC = Initial; SC; SC = SC->getParent())
+ AddScopeUsingDirectives(SC, UDirs);
+
+ // Sort heapified UsingDirectiveDecls.
+ std::sort_heap(UDirs.begin(), UDirs.end());
Could we move this computation down below the part that looks in local
scopes for names? For example, if we have something like this:
namespace N { int i; }
using namespace N;
int f(int i) {
return i;
}
Now 'i' is found before sort for this case, we return result in line 562:
return std::make_pair(true, Result);
I have checked that in debugger, do you see otherwise?
When doing name lookup for "i", we'll never look into a namespace or global
scope, so there's no reason to go through the effort of looking for using
directives.
+ for (; S && isFunctionScope(S); S = S->getParent()) {
// ...
+ // NB: Icky, we need to look in function scope, but we need to check
its
+ // parent DeclContext, in case it is out of line method definition.
+ if (S->getEntity() && isFunctionScope(S))
+ break;
This doesn't look right... function scopes have entities (the FunctionDecl);
it's only when that entity is a C++ member function that we need to look
into its (semantic) parent *after* we've looked in function scope. This look
should exit out once we hit a scope S whose entity is a NamespaceDecl or
TranslationUnitDecl; after that, we need to start worrying about using
directives.
If we did not break here; Scope->getEntity() would be translation-unit
or namespace, for out of line member function definition, and we would
not perform member name lookup. It is true, we could do
LookupQualifiedName here, and stop later at translation-unit namespace
Scope. Is that what you wanted me to implement?
+ for (llvm::tie(UI, UEnd) =
+ FindUsingDirsByAncestors(UDirs, Ctx); UI != UEnd; ++UI) {
+ // FIXME: We will have to ensure, that we won't consider
+ // again using-directives during qualified name lookup!
+ // (Once using-directives support for qualified name lookup gets
+ // implemented).
+ if (LookupResult R = LookupQualifiedName((*UI)->getUsedNamespace(),
+ Name, NameKind, RedeclarationOnly))
+ LookupResults.push_back(R);
Regarding this FIXME, I think CppLookupName will become a bit easier if
LookupQualifiedName understood how to search for those names within its
DeclContext that should be found via using directives. In particular,
LookupQualifiedName could take an optional "UsingDirectivesTy const *UDirs"
parameter that it would use to determine which other namespaces to search
in. I think the end result would be clearer (and more functional).
LookupQualifiedName will need to get logic for using-directives, which
is quite different from what we want for unqualified name lookup. That
means, it would get 2 modes for:
- unqualified name lookup
- qualified name lookup
Qualified name lookup is quite complicated too unfortunately, involves
tracking visited
contexts, etc...
Beside we will probably need off switch for considering
using-directives for ADL, because of 3.4.2.p3.
If it is not essential to land this patch, I would like to implement
qualified name lookup first.
This is a surprisingly complicated feature (who knew?), but I definitely
think you're close to the right implementation.
I did not
But looking amount of code gcc needs...
Your approach in walking all
of the using-directives in scope once, then sorting it and using binary
search to subset the list appropriately for each DeclContext lookup, is very
interesting: I was thinking about a different, far less clever
implementation, but I really like yours. If we can fix up most of the issues
I mentioned above, and perhaps throw a few extra test-cases at it (I can
help write some more later), it'll be ready to go in.
Thank you!
-Doug
I also added FIXME for caching sorted list of using-directives, I
would like to wait
with that, to moment when I know what data is needed to get qualified
lookup fast...
Piotr
decl.udir.v3.diff (59.8 KB)