PATCH: C++ Function Overloading (v2)

Here's a slightly revised version of the function overloading patch.
It depends on the most recent version of the overloaded function
declarations patch, here:

  http://lists.cs.uiuc.edu/pipermail/cfe-dev/2008-September/002779.html

Changes from the previous version:
  - The various small fixes for C++ support (string literals, bool)
aren't in here; I've already committed those.
  - Integral promotion has been fixed, and integral bitfield promotion
is now implemented
  - Better tests
  - Better names for the new functions

There are still big pieces missing (derived-to-base pointer
conversions, reference binding, user-defined conversions and
converting constructors), but the framework is here and it overloads
C-ish types.

  - Doug

clang-overloading.patch (50 KB)

Hi Doug,

I recently thought about overload resolution again and I have some questions.
While I understand the necessity of the OverloadType, is OverloadedFunctionDecl really necessary ?

It seems to me that overload resolution is part of semantic analysis (the Sema), and OverloadedFunctionDecl is like implementation details of Sema "leaking" into the AST.
OverloadedFunctionDecl will primarily be used by Sema, and expressions referencing it will get replaced and resolved to one FunctionDecl except probably when the expression is part of a template.
In the case that the AST clients get an expression referencing OverloadedFunctionDecl, it's unlikely that it will be useful to them, without getting it resolved by the same semantic analysis as that of Sema's.

How about if a reference of overloaded function "f", results in a OverloadExpr (with OverloadType), that contains the "f" IdentifierInfo and a 'void*' opaque value meant to be used by Sema (or in general, the Action implementation that created the OverloadExpr) ?
The advantage of this is that Sema has full control and flexibility over the overload resolution mechanism, e.g. it can change it completely and it will have no impact on the AST or the AST clients.

-Argiris

Doug Gregor wrote:

Hi Argiris,

I recently thought about overload resolution again and I have some
questions.
While I understand the necessity of the OverloadType, is
OverloadedFunctionDecl really necessary ?

We certainly need some way to represent the notion of an expression
that refers to an overloaded function; that's a basic notion in C++,
because name lookup can return a set of overloaded functions. Maybe it
could be something other than a DeclRefExpr that references an
OverloadedFunctionDecl.

It seems to me that overload resolution is part of semantic analysis (the
Sema), and OverloadedFunctionDecl is like implementation details of Sema
"leaking" into the AST.
OverloadedFunctionDecl will primarily be used by Sema, and expressions
referencing it will get replaced and resolved to one FunctionDecl except
probably when the expression is part of a template.

Templates will need to store sets of overloaded functions.

In the case that the AST clients get an expression referencing
OverloadedFunctionDecl, it's unlikely that it will be useful to them,
without getting it resolved by the same semantic analysis as that of Sema's.

I can think of two kinds of clients that might want or need to deal
with OverloadedFunctionDecls.

First of all, any clients that are focused on parsing and don't need
to do much semantic analysis could certainly skip overload resolution,
and would therefore leave the OverloadedFunctionDecls in place since
they don't need to be resolved.

Second, clients that want to do some kind of speculative overload
resolution would need to query the overloads that show up in the
overload set. For example, an IDE that tries to help with overload
resolution by showing options while you type. Type "f(17, " and it
shows you all of the f's that could still be called. This client needs
to be able to ask the AST or Sema (it's not clear which) which "f"'s
are available, and then ask Sema which ones are still callable.

None of these require the exact OverloadedFunctionDecl formulation in
my patch, but they are AST clients to consider.

How about if a reference of overloaded function "f", results in a
OverloadExpr (with OverloadType), that contains the "f" IdentifierInfo and a
'void*' opaque value meant to be used by Sema (or in general, the Action
implementation that created the OverloadExpr) ?
The advantage of this is that Sema has full control and flexibility over the
overload resolution mechanism, e.g. it can change it completely and it will
have no impact on the AST or the AST clients.

The representation of the set of overloaded functions doesn't really
have anything to do with the overload resolution mechanism. Overload
resolution has to pass over the set of overloaded functions found by
name lookup and build its own data structures, collect other functions
found via argument-dependent lookup, etc., so I think it is very
unlikely even that two different Semas would need different
representations of the set of overloaded functions.

The disadvantage of having the 'void*' opaque value is that it makes
ownership harder: how does this void* get deallocated, serialized, or
de-serialized?

  - Doug

Doug Gregor wrote:

We certainly need some way to represent the notion of an expression
that refers to an overloaded function; that's a basic notion in C++,
because name lookup can return a set of overloaded functions. Maybe it
could be something other than a DeclRefExpr that references an
OverloadedFunctionDecl.
  
Yes, I question only the use of OverloadedFunctionDecl. Currently, IdentifierResolver can already provide the set of overloaded functions so the overhead of OverloadedFunctionDecl is not strictly needed.

Templates will need to store sets of overloaded functions.
  
Even now, with no trace of a template implementation in clang, the use of OverloadedFunctionDecl is questionable:

-First, consider this example:

void f(char); #1

template<class T> void g(T t)
{
  f(t); #2 //dependent
}

void f(int); #3

void h()
{
  g(2); #4 //should cause a call of f(int)
}

At #2 'f' does not have any overloads and no OverloadedFunctionDecl exists, how should this be represented ?
If #2 is regarded as a call to #1, then we miss #3 at instantiation time (#4) (as specified in C++ 14.6p9).
In order to pickup #3, should we turn 'f' into a OverloadedFunctionDecl, even if 'f' nevers gets an overload, just because it gets called in a templated expression ?

-Second, I see no reason why IdentifierResolver can't be made to eventually work with templates and be used to get the overload set for 'f' at #4 (if it turns out that there are overloads)

I can think of two kinds of clients that might want or need to deal
with OverloadedFunctionDecls.

First of all, any clients that are focused on parsing and don't need
to do much semantic analysis could certainly skip overload resolution,
and would therefore leave the OverloadedFunctionDecls in place since
they don't need to be resolved.

Second, clients that want to do some kind of speculative overload
resolution would need to query the overloads that show up in the
overload set. For example, an IDE that tries to help with overload
resolution by showing options while you type. Type "f(17, " and it
shows you all of the f's that could still be called. This client needs
to be able to ask the AST or Sema (it's not clear which) which "f"'s
are available, and then ask Sema which ones are still callable.

None of these require the exact OverloadedFunctionDecl formulation in
my patch, but they are AST clients to consider.
  
There was a discussion here:
http://lists.cs.uiuc.edu/pipermail/cfe-dev/2008-September/002841.html
about how clients should reason about symbols/identifiers. The consensus was that Sema builds up a lot of useful semantic information that later discards. It would be useful to somehow expose semantic information from Sema and avoid bloating the AST with semantic information (as would be the case with OverloadedFunctionDecl).
For example, I think that eventually IdentifierResolver will either be exposed by Sema to clients or a similar construct will be used. By your example, an IDE usually wants to inquire "I'm in this scope, what can this identifier mean in this context ? Give me a set of decls." which is exactly the functionality and purpose of IdentifierResolver.

How about if a reference of overloaded function "f", results in a
OverloadExpr (with OverloadType), that contains the "f" IdentifierInfo and a
'void*' opaque value meant to be used by Sema (or in general, the Action
implementation that created the OverloadExpr) ?
The advantage of this is that Sema has full control and flexibility over the
overload resolution mechanism, e.g. it can change it completely and it will
have no impact on the AST or the AST clients.
    
The representation of the set of overloaded functions doesn't really
have anything to do with the overload resolution mechanism. Overload
resolution has to pass over the set of overloaded functions found by
name lookup and build its own data structures, collect other functions
found via argument-dependent lookup, etc., so I think it is very
unlikely even that two different Semas would need different
representations of the set of overloaded functions.

The disadvantage of having the 'void*' opaque value is that it makes
ownership harder: how does this void* get deallocated, serialized, or
de-serialized?
  
I agree, the opaque value definitely complicates things. How about a OverloadExpr that contains a DeclContext* and an IdentifierInfo* (for the moment, not sure what exact structure would be needed for templates).

To summarize, my point is that currently OverloadedFunctionDecl is an unnecessary overhead since IdentifierResolver already can be used to get the set of overloads. If later on (e.g. for templates), we find out that IdentifierResolver is not sufficient, we can always add OverloadedFunctionDecl too. At the moment it makes little sense to have them since the only expressions that are supposed to reference them (and not get immediately resolved) are expressions in templates, and there's a long way ahead before reaching the "template hill" :slight_smile:

A bit off-topic:
Currently the iterator abstraction of IdentifierResolver is stretched thin and will probably snap when more name lookup rules are added. I lean towards removing the iterator and using a single function that returns the set of decls based on an identifier.

-Argiris

Hi Argiris,

Doug Gregor wrote:

Templates will need to store sets of overloaded functions.

Even now, with no trace of a template implementation in clang, the use of
OverloadedFunctionDecl is questionable:

-First, consider this example:

void f(char); #1

template<class T> void g(T t)
{
f(t); #2 //dependent
}

void f(int); #3

void h()
{
g(2); #4 //should cause a call of f(int)
}

At #2 'f' does not have any overloads and no OverloadedFunctionDecl exists,
how should this be represented ?

It's just a FunctionDecl, because #1 is the only f. That's what we're doing now.

If #2 is regarded as a call to #1, then we miss #3 at instantiation time
(#4) (as specified in C++ 14.6p9).
In order to pickup #3, should we turn 'f' into a OverloadedFunctionDecl,
even if 'f' nevers gets an overload, just because it gets called in a
templated expression ?

We aren't supposed to pick up #3. GCC (and most other compilers) gets
this wrong; EDG gets it right. You can try it by tweaking your example
code a bit:

char& f(char); // #1

template<class T> void g(T t)
{
  int& x = f(t); // #2 //dependent
}

int& f(int); // #3

void h()
{
  g(2); // #4 //should cause a call of f(char), so it's ill-formed
}

The relevant paragraph is C++ 14.6.4p1, which specifies which names
are considered when resolving dependent names. The set of names
contains the names we found via name lookup at template definition
time (#1) plus any names we find via argument dependent lookup at
template definition time or at template instantiation time. This
latter rule means that if you tweak the example to use a user-defined
type rather than 'int' for #1 and for the call to g(), we will find
#3.

-Second, I see no reason why IdentifierResolver can't be made to eventually
work with templates and be used to get the overload set for 'f' at #4 (if it
turns out that there are overloads)

I can think of two kinds of clients that might want or need to deal
with OverloadedFunctionDecls.

First of all, any clients that are focused on parsing and don't need
to do much semantic analysis could certainly skip overload resolution,
and would therefore leave the OverloadedFunctionDecls in place since
they don't need to be resolved.

Second, clients that want to do some kind of speculative overload
resolution would need to query the overloads that show up in the
overload set. For example, an IDE that tries to help with overload
resolution by showing options while you type. Type "f(17, " and it
shows you all of the f's that could still be called. This client needs
to be able to ask the AST or Sema (it's not clear which) which "f"'s
are available, and then ask Sema which ones are still callable.

None of these require the exact OverloadedFunctionDecl formulation in
my patch, but they are AST clients to consider.

There was a discussion here:
http://lists.cs.uiuc.edu/pipermail/cfe-dev/2008-September/002841.html
about how clients should reason about symbols/identifiers. The consensus was
that Sema builds up a lot of useful semantic information that later
discards. It would be useful to somehow expose semantic information from
Sema and avoid bloating the AST with semantic information (as would be the
case with OverloadedFunctionDecl).
For example, I think that eventually IdentifierResolver will either be
exposed by Sema to clients or a similar construct will be used. By your
example, an IDE usually wants to inquire "I'm in this scope, what can this
identifier mean in this context ? Give me a set of decls." which is exactly
the functionality and purpose of IdentifierResolver.

Sure, although it's a slightly different form that what you state. We
need: "give me the set of decls that I would have seen at this
particular point in the translation."

The disadvantage of having the 'void*' opaque value is that it makes
ownership harder: how does this void* get deallocated, serialized, or
de-serialized?

I agree, the opaque value definitely complicates things. How about a
OverloadExpr that contains a DeclContext* and an IdentifierInfo*

I like DeclContext* + IdentifierInfo*; it's small and efficient. I'm
not thrilled about OverloadExpr, because I think it says the wrong
thing: we really are talking about a set of overloaded function
declarations, and that should be described by a Decl node.

Now, to deal with the template case you provided above, we will have
to extend this DeclContext* + IdentifierInfo* scheme to restrict name
lookup to only those declarations we would have seen at the point in
the translation where the OverloadedFunctionDecl was created. That's
not terribly hard---we could just use some kind of token or sequence
number and filter based on that---but it's the kind of thing that
would be hard to do across translation units, because the
token/sequence number in one translation unit cannot be mapped to a
token/sequence number in another translation unit. So, while
DeclContext* + IdentifierInfo* is serializable, DeclContext* +
IdentifierInfo* + sequence number isn't serializable.

(for the
moment, not sure what exact structure would be needed for templates).

There will be (optional) template arguments (e.g., in a call like
"f<int, float>(blah)"), but in this case I think we can ignore those
in our initial design.

To summarize, my point is that currently OverloadedFunctionDecl is an
unnecessary overhead since IdentifierResolver already can be used to get the
set of overloads. If later on (e.g. for templates), we find out that
IdentifierResolver is not sufficient, we can always add
OverloadedFunctionDecl too. At the moment it makes little sense to have them
since the only expressions that are supposed to reference them (and not get
immediately resolved) are expressions in templates, and there's a long way
ahead before reaching the "template hill" :slight_smile:

As you know, a lot of my design principles involve making sure that
the thing we design now is still the right abstraction for the time
when we implement templates. I certainly don't mind doing refactoring
now or down the road, as having a clean, well-documented AST makes
that easy, but I also don't want to make a trade-off now that we know
will do the wrong thing when templates come along.

A bit off-topic:
Currently the iterator abstraction of IdentifierResolver is stretched thin
and will probably snap when more name lookup rules are added. I lean towards
removing the iterator and using a single function that returns the set of
decls based on an identifier.

Iterators can be arbitrarily complex, so it's not that the iterator
abstraction *couldn't* handle this. The thing I like best about the
iterator abstraction is that it doesn't require us to allocate any
memory like returning a set of decls does, and doesn't expose any one
particular data structure. That said, the iterator itself will need to
get some more configurability---follow using directives or don't
follow using directives?, to name one---that might cost us too much in
iteration overhead.

  - Doug

Doug Gregor wrote:

  

Doug Gregor wrote:
    

Templates will need to store sets of overloaded functions.

Even now, with no trace of a template implementation in clang, the use of
OverloadedFunctionDecl is questionable:

-First, consider this example:

void f(char); #1

template<class T> void g(T t)
{
f(t); #2 //dependent
}

void f(int); #3

void h()
{
g(2); #4 //should cause a call of f(int)
}

At #2 'f' does not have any overloads and no OverloadedFunctionDecl exists,
how should this be represented ?
    
It's just a FunctionDecl, because #1 is the only f. That's what we're doing now.
  
Hmm.. this is taken verbatim from C++ 03 14.6p9:

void f(char);

template<class T> void g(T t)
{
f(1); // f(char)
f(T(1)); //dependent #1
f(t); //dependent #2
dd++; //not dependent // error: declaration for dd not found
}

void f(int);

double dd;
void h()
{
g(2); //will cause one call of f(char) followed
// by two calls of f(int)
g(’a’); //will cause three calls of f(char)
}

Note that it says "g(2)" will cause one call of f(char) followed by two calls of f(int). By regarding the 'f', inside the template, just a FunctionDecl (the "f(char)" one), won't this result to 3 calls of f(char) ?

By 14.6.2p1, the 'f' inside the template (at #1 and #2) is a dependent name and "Such names are unbound and are looked up at the point of the template instantiation (14.6.4.1) in both the context of the template definition and the context of the point of instantiation". Doesn't that mean that 'f' is not a simple FunctionDecl (since it's actually unbound) ?

Iterators can be arbitrarily complex, so it's not that the iterator
abstraction *couldn't* handle this. The thing I like best about the
iterator abstraction is that it doesn't require us to allocate any
memory like returning a set of decls does, and doesn't expose any one
particular data structure. That said, the iterator itself will need to
get some more configurability---follow using directives or don't
follow using directives?, to name one---that might cost us too much in
iteration overhead.
  
Yes, my concern is about the overhead, we could get away with it with not much of an impact though, not too sure.

-Argiris

Hmm.. this is taken verbatim from C++ 03 14.6p9:

void f(char);

template<class T> void g(T t)
{
f(1); // f(char)
f(T(1)); //dependent #1
f(t); //dependent #2
dd++; //not dependent // error: declaration for dd not found
}

void f(int);

double dd;
void h()
{
g(2); //will cause one call of f(char) followed
// by two calls of f(int)
g('a'); //will cause three calls of f(char)
}

Note that it says "g(2)" will cause one call of f(char) followed by two
calls of f(int). By regarding the 'f', inside the template, just a
FunctionDecl (the "f(char)" one), won't this result to 3 calls of f(char) ?

Yeah, the example is correct. This was noted in DR 197:

  http://www.open-std.org/jtc1/sc22/wg21/docs/cwg_defects.html#197

Doesn't that mean that 'f' is
not a simple FunctionDecl (since it's actually unbound) ?

It'll be a FunctionDecl in the AST. Then, when we actually go to
resolve the call, the set of candidate functions will include "f" and
anything found via ADL (which, in this case, is nothing).

Granted, this does mean that one can have a call in the AST for a
template where the function to be called is an identifier that can't
be resolved until instantiation time, e.g.,

template<typename T> void f(T x, T y) {
  swap(x, y); // there are no declarations of swap(), but that's okay
since it's a dependent name
}

BTW, I've committed my overloading patches basically as-is, because I
have some other changes coming that depend on them. However, I don't
consider the OverloadedFunctionDecl issue to be resolved; we should be
able to tweak its representation or how it's used without affecting
much other code, and it should be easier to experiment with it now
that it's in the tree.

  - Doug

Doug Gregor wrote:

Yeah, the example is correct. This was noted in DR 197:

  http://www.open-std.org/jtc1/sc22/wg21/docs/cwg_defects.html#197
  
Oh, incorrect examples in the standard, the fun never ends... :wink:

Then OverloadedFunctionDecl has a different issue, consider this:

void f(char); #1
void f(int); #2

template<class T> void g(T t)
{
f(t); #3 //dependent
}

void f(float); #4

void h()
{
g(2.3); #5
f(2.3); #6
}

At #3 there's an OverloadedFunctionDecl for the template to reference. But what about #4 ?
If it gets added to the same OverloadedFunctionDecl then #5 will incorrectly pick it up, while we do want #6 to pick it up.
Will there be two OverloadedFunctionDecls, one for the template and another for normal lookup in #6 ? That seems to defeat its purpose.

How about the 'f' in the template at #3 gets represented as a DependentNameExpr which contains:
-either a list of FunctionDecls found with normal lookup in the template context
-or a DeclContext* and IdentifierInfo* and in order to get the FunctionDecls visible in template context we do not consider declarations that appeared after the DependentNameExpr (using SourceLocation to check)

-Argiris

Doug Gregor wrote:

Yeah, the example is correct. This was noted in DR 197:

http://www.open-std.org/jtc1/sc22/wg21/docs/cwg_defects.html#197

Oh, incorrect examples in the standard, the fun never ends... :wink:

:slight_smile:

Then OverloadedFunctionDecl has a different issue, consider this:

void f(char); #1
void f(int); #2

template<class T> void g(T t)
{
f(t); #3 //dependent
}

void f(float); #4

void h()
{
g(2.3); #5
f(2.3); #6
}

At #3 there's an OverloadedFunctionDecl for the template to reference. But
what about #4 ?
If it gets added to the same OverloadedFunctionDecl then #5 will incorrectly
pick it up, while we do want #6 to pick it up.
Will there be two OverloadedFunctionDecls, one for the template and another
for normal lookup in #6 ? That seems to defeat its purpose.

I figured we would have two OverloadedFunctionDecls. I don't think it
defeats the purpose, because the AST is storing the set of overloaded
functions that it saw.

How about the 'f' in the template at #3 gets represented as a
DependentNameExpr which contains:
-either a list of FunctionDecls found with normal lookup in the template
context
-or a DeclContext* and IdentifierInfo* and in order to get the FunctionDecls
visible in template context we do not consider declarations that appeared
after the DependentNameExpr (using SourceLocation to check)

Both of these are reasonable representations for DependentNameExpr or
OverloadedFunctionDecl, because in many ways they are expressing the
same notion (although the latter also has its place, at least
temporarily, in non-template code).

I'm guessing that we probably won't need DependentNameExpr, because
dependent names don't really exist on their own in the AST. Rather,
they come about as a property of the arguments used with that name,
e.g., "f" in "f(blah)" is a dependent name if blah is a type-dependent
expression, "+" in "a + b" is a dependent name is "a" or "b" is a
type-dependent expression, etc. "f" or "+" could just as well be
represented by a (possibly empty!) OverloadedFunctionDecl that is
either the callee of a CallExpr or the initial overloaded operator set
of some hypothetical, overloaded version of BinaryOperator.

  - Doug

Doug Gregor wrote:

  

Then OverloadedFunctionDecl has a different issue, consider this:

void f(char); #1
void f(int); #2

template<class T> void g(T t)
{
f(t); #3 //dependent
}

void f(float); #4

void h()
{
g(2.3); #5
f(2.3); #6
}

At #3 there's an OverloadedFunctionDecl for the template to reference. But
what about #4 ?
If it gets added to the same OverloadedFunctionDecl then #5 will incorrectly
pick it up, while we do want #6 to pick it up.
Will there be two OverloadedFunctionDecls, one for the template and another
for normal lookup in #6 ? That seems to defeat its purpose.
    
I figured we would have two OverloadedFunctionDecls. I don't think it
defeats the purpose, because the AST is storing the set of overloaded
functions that it saw.
  
With "defeats the purpose" I mean that OverloadedFunctionDecl would be most useful if it was shared ('f' gets overloaded so 'f' gets replaced and the AST refers to this OverloadedFunctionDecl from now on), but, instead, each expression gets its own OverloadedFunctionDecl object.

So what is the advantage of using
- CallExpr -> DeclRefExpr -> OverloadedFunctionDecl (unique for this expression)
instead of
- CallExpr -> OverloadedFunctionExpr
?

The second avoids the ownership issues of the first (plus it's a bit more compact).

Another related question:
I assume that all FunctionDecls that would be found via ADL in the template definition context will also be found via ADL in the template instantiation context, is this correct ?

And consider another way that having OverloadedFunctionDecl complicates things:

namespace A {
  void f(int); #1
}

void f(char); #2

void g() {
  using namespace A; #3
  f(0);
}

#4

At #3, 'f' becomes an overloaded function (#1 and #2)
At #4, 'f' is not an overloaded function.

If an OverloadedFunctionDecl is created at #3, then it should be reverted for #4.
Much simpler is to just inquire the IdentifierResolver about the set of overloaded functions at each context, and avoid any OverloadedFunctionDecl manipulation.

-Argiris

Hi Argiris,

With "defeats the purpose" I mean that OverloadedFunctionDecl would be most
useful if it was shared ('f' gets overloaded so 'f' gets replaced and the
AST refers to this OverloadedFunctionDecl from now on), but, instead, each
expression gets its own OverloadedFunctionDecl object.

Ah, understood.

So what is the advantage of using
- CallExpr -> DeclRefExpr -> OverloadedFunctionDecl (unique for this
expression)
instead of
- CallExpr -> OverloadedFunctionExpr
?

The second avoids the ownership issues of the first (plus it's a bit more
compact).

The advantage of the first is that name lookup has something it can
return when it finds a set of overloaded functions. Name lookup is
going to find a Decl of some sort, and when that Decl is really a set
of overloaded function declarations, we need a Decl to represent it.
Hence, we have OverloadedFunctionDecl to refer to those declarations.

OverloadedFunctionExpr can serve as an expression that refers to a set
of overloaded functions, but it can't help tell Sema that the lookup
of a name returned a set of overloaded functions.

That issue is separable from the representation of
OverloadedFunctionDecl. In that case, I completely agree that it would
be better to have a DeclContext* + IdentifierInfo* representation, but
IIUC we don't yet have the ability to do that. That's probably because
I have yet to review your qualified-id parsing patch (sorry).

Another related question:
I assume that all FunctionDecls that would be found via ADL in the template
definition context will also be found via ADL in the template instantiation
context, is this correct ?

Yes, for a non-exported template. With an exported template, the
template definition is in a different translation unit, so it could
see different FunctionDecls.

And consider another way that having OverloadedFunctionDecl complicates
things:

namespace A {
void f(int); #1
}

void f(char); #2

void g() {
using namespace A; #3
f(0);
}

#4

At #3, 'f' becomes an overloaded function (#1 and #2)
At #4, 'f' is not an overloaded function.

If an OverloadedFunctionDecl is created at #3, then it should be reverted
for #4.
Much simpler is to just inquire the IdentifierResolver about the set of
overloaded functions at each context, and avoid any OverloadedFunctionDecl
manipulation.

As it stands, the OverloadedFunctionDecls are managed by the
Sema::PushOnScopeChains, which creates or updates the
OverloadedFunctionDecl as functions are added into the scope. That
would likely mean that the lookup of "f" after #3 would create a new
OverloadedFunctionDecl... but there's no sense in arguing this point,
because I think we agree on the representation issue already. It's the
OverloadedFunctionExpr vs. OverloadedFunctionDecl question that
remains.

  - Doug

Doug Gregor wrote:

  

So what is the advantage of using
- CallExpr -> DeclRefExpr -> OverloadedFunctionDecl (unique for this
expression)
instead of
- CallExpr -> OverloadedFunctionExpr
?

The second avoids the ownership issues of the first (plus it's a bit more
compact).
    
The advantage of the first is that name lookup has something it can
return when it finds a set of overloaded functions. Name lookup is
going to find a Decl of some sort, and when that Decl is really a set
of overloaded function declarations, we need a Decl to represent it.
Hence, we have OverloadedFunctionDecl to refer to those declarations.
  
Ok, I think I see the main usefulness of OverloadedFunctionDecl.
LookupDecl returns a Decl; the reasoning is to keep it that way and deal with multiple overloaded FunctionDecls by having LookupDecl return a single OverloadedFunctionDecl. The callers of LookupDecl can still reason of name lookup as returning a single Decl.
While there's a bit of a cost involved, it certainly makes things cleaner.

About the current implementation:
OverloadedFunctionDecl is created in "anticipation" of a possible name lookup and not as the result of "executing" a name lookup (and it doesn't seem to have an owner). Isn't it better to have OverloadedFunctionDecl be created by LookupDecl and have it represent the result of a particular name lookup search ?

e.g: as in the using directive example:

namespace A {
void f(int); #1
}

void f(char); #2

void g() {
using namespace A; #3
}

With the current implementation, a OverloadedFunctionDecl should be created when #3 is encountered, just in case there's a lookup for 'f' later on. Then when 'g()' is exited, the OverloadedFunctionDecl should be discarded and the FunctionDecl be made to take its place.
If the OverloadedFunctionDecl is created only when needed (when there's a name lookup of 'f' involved) no OverloadedFunctionDecl will be created in this case (which is the optimal thing to do).

Another advantage is that using this way, the ownership issue of OverloadedFunctionDecl can be handled in a clean way, like this:
-OverloadedFunctionDecls returned by LookupDecl are newly created objects not owned by anyone. The caller creates a OverloadedFunctionDeclRefExpr expression that holds and owns this specific OverloadedFunctionDecl object.
-OverloadedFunctionDeclRefExpr is a subclass of DeclRefExpr and the main difference is that it owns the referenced Decl.
-When the overloads get resolved (in non-template code), OverloadedFunctionDeclRefExpr is removed and destroyed, which in turn destroys the OverloadedFunctionDecl that it references.
-For templates, OverloadedFunctionDeclRefExpr remain in the AST.

What do you think ?

-Argiris

Doug Gregor wrote:

So what is the advantage of using
- CallExpr -> DeclRefExpr -> OverloadedFunctionDecl (unique for this
expression)
instead of
- CallExpr -> OverloadedFunctionExpr
?

The second avoids the ownership issues of the first (plus it's a bit more
compact).

The advantage of the first is that name lookup has something it can
return when it finds a set of overloaded functions. Name lookup is
going to find a Decl of some sort, and when that Decl is really a set
of overloaded function declarations, we need a Decl to represent it.
Hence, we have OverloadedFunctionDecl to refer to those declarations.

Ok, I think I see the main usefulness of OverloadedFunctionDecl.
LookupDecl returns a Decl; the reasoning is to keep it that way and deal
with multiple overloaded FunctionDecls by having LookupDecl return a single
OverloadedFunctionDecl. The callers of LookupDecl can still reason of name
lookup as returning a single Decl.
While there's a bit of a cost involved, it certainly makes things cleaner.

Yep. And I think we can eliminate most of the cost.

About the current implementation:
OverloadedFunctionDecl is created in "anticipation" of a possible name
lookup and not as the result of "executing" a name lookup (and it doesn't
seem to have an owner). Isn't it better to have OverloadedFunctionDecl be
created by LookupDecl and have it represent the result of a particular name
lookup search ?

e.g: as in the using directive example:

namespace A {
void f(int); #1
}

void f(char); #2

void g() {
using namespace A; #3
}

With the current implementation, a OverloadedFunctionDecl should be created
when #3 is encountered, just in case there's a lookup for 'f' later on. Then
when 'g()' is exited, the OverloadedFunctionDecl should be discarded and the
FunctionDecl be made to take its place.
If the OverloadedFunctionDecl is created only when needed (when there's a
name lookup of 'f' involved) no OverloadedFunctionDecl will be created in
this case (which is the optimal thing to do).

Right. Although, since we don't actually support using directives
right now, we can't really say how it handles them :slight_smile:

Another advantage is that using this way, the ownership issue of
OverloadedFunctionDecl can be handled in a clean way, like this:
-OverloadedFunctionDecls returned by LookupDecl are newly created objects
not owned by anyone. The caller creates a OverloadedFunctionDeclRefExpr
expression that holds and owns this specific OverloadedFunctionDecl object.
-OverloadedFunctionDeclRefExpr is a subclass of DeclRefExpr and the main
difference is that it owns the referenced Decl.
-When the overloads get resolved (in non-template code),
OverloadedFunctionDeclRefExpr is removed and destroyed, which in turn
destroys the OverloadedFunctionDecl that it references.
-For templates, OverloadedFunctionDeclRefExpr remain in the AST.

What do you think ?

Generally, I like it, although I think we can do things a bit more
efficiently. I'd like the DeclContext to own the
OverloadedFunctionDecls, and cache them so that we aren't constantly
re-creating them. This means that, for example, the first lookup of
"f" for an overloaded function would find a number of FunctionDecl
nodes, for which we'd create an OverloadedFunctionDecl. At that point,
we should push the OverloadedFunctionDecl into the scope, so that it
can be found and reused. This would be a lazy form of what the code is
doing now.

In the case of a template, where we need to store the
OverloadedFunctionDecl for later use, I suggest that we have the code
that needs to save the OverloadedFunctionDecl make a copy of it that
will be owned by the DeclRefExpr or OverloadedFunctionDeclRefExpr, so
that we're not wasting time creating and destroying
OverloadedFunctionDecls that will only be used once before being
destroyed.

Another minor comment: we could probably eliminate
OverloadedFunctionDeclRefExpr and instead use the lowest bit of the
NamedDecl* in DeclRefExpr to record whether the DeclRefExpr owns the
declaration or not.

  - Doug

Doug Gregor wrote:

Generally, I like it, although I think we can do things a bit more
efficiently. I'd like the DeclContext to own the
OverloadedFunctionDecls, and cache them so that we aren't constantly
re-creating them. This means that, for example, the first lookup of
"f" for an overloaded function would find a number of FunctionDecl
nodes, for which we'd create an OverloadedFunctionDecl. At that point,
we should push the OverloadedFunctionDecl into the scope, so that it
can be found and reused. This would be a lazy form of what the code is
doing now.

In the case of a template, where we need to store the
OverloadedFunctionDecl for later use, I suggest that we have the code
that needs to save the OverloadedFunctionDecl make a copy of it that
will be owned by the DeclRefExpr or OverloadedFunctionDeclRefExpr, so
that we're not wasting time creating and destroying
OverloadedFunctionDecls that will only be used once before being
destroyed.

Another minor comment: we could probably eliminate
OverloadedFunctionDeclRefExpr and instead use the lowest bit of the
NamedDecl* in DeclRefExpr to record whether the DeclRefExpr owns the
declaration or not.
  
Sounds great!

Thank you for your patience!
I may pester you later on about the possibility of using just the IdentifierInfo instead of a list of FunctionDecls for OverloadedFunctionDecl (which will save space for templates), but not until more name lookup rules (like the 'using' directive) are supported and its clear whether an IdentifierInfo is sufficient or not.

-Argiris