PATCH: Overloaded function declarations in C++

This patch implements support for declaring overloaded functions. For
example, one can now write

  void f();
  void f(int);
  void f(int, float);

and have these functions overloaded appropriately. Name lookup for "f"
will refer to the overload set, although at this point you can't actually do
anything with the overload set (that patch is coming later).

All of the semantic analysis required to distinguish between
redeclarations and overloads is implemented. Most of the interesting
logic is in the new Sema::IsOverload, which determines whether two
function declarations are overloaded and in Sema::MergeFunctionDecl,
which produces diagnostics when two function declarations with the
same signature (meaning: they aren't overloads) aren't declaring the
same function.

The new OverloadedFunctionDecl class stores a set of overloaded
functions. It's generated by Sema when one declares an overload of an
existing function. Name bindings for functions can refer to
OverloadedFunctionDecl nodes, which will store all of the functions in
that scope with that name.

Patch notes:

- An OverloadedFunctionDecl never owns the FunctionDecls it
  stores. The scope where those FunctionDecls were declared owns
  them. However, who owns the OverloadedFunctionDecl? It's required
  for semantic analysis, and will need to be retained for use in
  qualified name lookup later, but it doesn't belong with all of the
  decls written by users.

- The type of an OverloadedFunctionDecl is a special builtin
  "Overload" type, used as a placeholder for OverloadedFunctionDecls.

- When we redeclare a function, we now replace the existing name
  binding for that function with the new declaration. This way, there
  will only be one name binding for a given function in a given scope
  at a time, rather than one for each (re)declaration of that
  function. (This is important for function overloading, because we
  don't want redeclarations to look like overloads). In this patch,
  all of the function declarations and redeclarations show up as they
  always have, the AST will contain references to whichever function
  declaration was active that the time that part of the AST was
  generated [*], but we don't end up with multiple bindings to the
  same function in the same scope.

[*] http://lists.cs.uiuc.edu/pipermail/cfe-dev/2008-May/001801.html

- When we overload a function, we replace the existing name binding
  (which will point to a FunctionDecl) with an OverloadedFunctionDecl
  that contains the original FunctionDecl and the new FunctionDecl.

- In C++ mode, we now create a FunctionTypeProto for "void f()" rather
  than a FunctionTypeNoProto.

Comments greatly appreciated.

  Cheers,
  Doug

clang-overloaded-function-decls.patch (29.6 KB)

Hi Doug,

I don't quite see why it's necessary to have an "Overload" type and an OverloadedFunctionDecl AST node.
It seems to me that these are mainly used for name lookup, and IdentifierResolver can be used for such a thing without introducing OverloadedFunctionDecl.
I've attached a patch where I replace the use of OverloadedFunctionDecl by using IdentifierResolver (passes the test case), to show you what I mean.

AST clients may need a set of overloaded functions but I think that this can be provided by utility functions (or by chaining the FunctionDecls).

Let me know what you think.

-Argiris

Doug Gregor wrote:

func-overload.patch (13.7 KB)

Hi Argiris,

I don't quite see why it's necessary to have an "Overload" type and an
OverloadedFunctionDecl AST node.

It is extremely convenient, because it allows the semantic analysis to
see the result of name lookup as "a set of overloaded functions",
without requiring every client of name lookup to walk through the list
of declarations bound to that name with this boilerplate:

+ DeclContext *DC = New->getDeclContext();
+ IdentifierResolver::iterator
+ I = IdResolver.begin(New->getIdentifier(), DC, false/*LookInParentCtx*/);
+ for (; I != IdResolver.end(); ++I) {
+ if (!IdResolver.isDeclInScope(*I, DC, S))
+ return true;

It seems to me that these are mainly used for name lookup, and
IdentifierResolver can be used for such a thing without introducing
OverloadedFunctionDecl.
I've attached a patch where I replace the use of OverloadedFunctionDecl by
using IdentifierResolver (passes the test case), to show you what I mean.

Yes, OverloadedFunctionDecl is mainly used for name lookup, but the
name lookup is often far from the place where the result of name
lookup is consumed. For example, when parsing

  void f(int, float);
  void f(int, int);

  f(1, 2.0);

The identifier "f" refers to both functions "f" declared above, and
will become a DeclRefExpr in the AST. Later, once we've parsed the
call arguments, ActOnCallExpr will need to see that the function "f"
we're referring to is actually a set of overloaded functions. Name
lookup occurred long before, so we're not going to be walking the
IdentifierResolver's results. Something in the DeclRefExpr needs to
have all of the information about all of the functions named "f" in
this scope.

AST clients may need a set of overloaded functions but I think that this can
be provided by utility functions (or by chaining the FunctionDecls).

I don't think chaining the FunctionDecls is the way to go, for two reasons:

  1) It's very error-prone: If the DeclRefExpr points to a
FunctionDecl, a naive client could accidentally look at the
FunctionDecl while forgetting to check whether it has any overloads.
I'm guessing that we'll end up with a lot of places where our
accidental overload-resolution strategy is "grab the most recent
FunctionDecl", because we forgot to walk that chain. Having an
OverloadedFunctionDecl node inside that DeclRefExpr forces clients to
pay attention to overloading, and most of the time they'll just go
perform overload resolution to get it down to a single FunctionDecl.

  2) It's going to interact with function redeclarations.

       void f(int); // #1
       void f(); // #2; overload chain points to #1
       void f(int); // #3; redeclaration of #1. Does the overload
chain in #2 need to be updated? Does this have an overload chain?

    This issue can probably be solved, but it's made redeclarations
that much hairier.

With this patch in isolation, it's a bit hard to see the uses of
OverloadedFunctionDecl. I'll finish up the initial overload-resolution
patch this afternoon/evening and send it out to the list. I hope that
will make the design clearer.

  - Doug

Hi Doug,

I now see much more clearly the design purpose of "Overload" type/OverloadedFunctionDecl.
If you have something like this:

void f(int); // #1
void f(float); // #2
void (*g)(int) = f; // #3

you'd want the "f" expression at #3 to have "Overload" type so that you can do "overload conversion".

What would the expression at #3 look like after the overload resolution ? Will the "DeclRefExpr(OverloadedFunctionDecl)" be replaced by a "DeclRefExpr(FunctionDecl #1)" ?

-Argiris

Doug Gregor wrote:

I agree with Argiris that having an OverloadType and Decl is counterintuitive,
although I think I understand the reason.

It sounds to me like a cleaner solution would be to change the interface with
the parser to just be returning some richer (non-AST) type during name lookup
which can encapsulate the information about overloads. Adding "fake" AST nodes
to accomodate this seems outside the current design spirit and I think may
spread complexity into places that don't deserve it.

- Daniel

Hi Argiris,

I now see much more clearly the design purpose of "Overload"
type/OverloadedFunctionDecl.
If you have something like this:

void f(int); // #1
void f(float); // #2
void (*g)(int) = f; // #3

you'd want the "f" expression at #3 to have "Overload" type so that you can
do "overload conversion".

Exactly.

What would the expression at #3 look like after the overload resolution ?
Will the "DeclRefExpr(OverloadedFunctionDecl)" be replaced by a
"DeclRefExpr(FunctionDecl #1)" ?

Yes, exactly. When we see a DeclRefExpr(OverloadedFunctionDecl) in
Sema, the first thing we'll do is perform overload resolution (in one
of its seven forms) and replace it with a DeclRefExpr(FunctionDecl)
that corresponds to the function selected by overload resolution. The
rest of the Sema routine can then continue is if it had seen the
FunctionDecl.

  - Doug

I agree with Argiris that having an OverloadType and Decl is counterintuitive,
although I think I understand the reason.

Does Argiris still agree with this? I think I may have converted him :slight_smile:

It sounds to me like a cleaner solution would be to change the interface with
the parser to just be returning some richer (non-AST) type during name lookup
which can encapsulate the information about overloads.

The problem is that OverloadedFunctionDecls can show up as children of
other AST nodes. For example, in

  void f(int); // #1
  void f(float); // #2
  (f)(1.5);

we'll end up building a
ParenExpr(DeclRefExpr(OverloadedFunctionDecl)). Keeping track of the
(f) without using AST nodes seems like it could get messy.

Adding "fake" AST nodes
to accomodate this seems outside the current design spirit and I think may
spread complexity into places that don't deserve it.

I don't agree that it's a "fake" AST node. An overloaded function name
is a full-fledged concept in C++ semantic analysis. And even though
semantic analysis can remove these OverloadedFunctionDecl nodes now,
they'll need to be stored in the AST for templates (which must keep
track of overload sets explicitly, to be used again at instantiation
time).

  - Doug

The problem is that OverloadedFunctionDecls can show up as children of
other AST nodes. For example, in

void f(int); // #1
void f(float); // #2
(f)(1.5);

we'll end up building a
ParenExpr(DeclRefExpr(OverloadedFunctionDecl)). Keeping track of the
(f) without using AST nodes seems like it could get messy.

Ok. I agree this might get messy.

Adding "fake" AST nodes
to accomodate this seems outside the current design spirit and I think may
spread complexity into places that don't deserve it.

I don't agree that it's a "fake" AST node. An overloaded function name
is a full-fledged concept in C++ semantic analysis. And even though
semantic analysis can remove these OverloadedFunctionDecl nodes now,
they'll need to be stored in the AST for templates (which must keep
track of overload sets explicitly, to be used again at instantiation
time).

My impression was that these nodes were only going to exist during Parsing/Sema
and not actually ever show up in the tree. If they are actually going
to need to exist
in places in the AST then it seems more reasonable to actually introduce them
(since downstream clients will have to be able to handle them anyway).

The other part that seemed fake was that they are not a "decl" in the sense of
actually defining anything or existing per se in the source. Are they
a "Decl" for
any reason other than shoehorning into the types the Parser expects? And if so,
do they need to be a ValueDecl?

Aside from its use in the decl, I don't really see why OverloadType is needed...

- Daniel

The other part that seemed fake was that they are not a "decl" in the sense of
actually defining anything or existing per se in the source. Are they
a "Decl" for
any reason other than shoehorning into the types the Parser expects? And if so,
do they need to be a ValueDecl?

I think they are declarations in the same sense that using
declarations are declarations: while they don't necessarily introduce
a new entity with a name, they do declare a binding for a particular
name to one or more entities.

OverloadedFunctionDecls are ValueDecls because DeclRefExpr needs to be
able to refer to them. Plus, OverloadedFunctionDecls are "values" just
like other functions... the only difference being that we don't know
which value we're referring to just yet.

Aside from its use in the decl, I don't really see why OverloadType is needed...

Well, because OverloadedFunctionDecl is a ValueDecl, it needs to have
a type. One might consider giving it a "null" type (QualType()), but
the introduction of OverloadType is mainly defensive: if we end up
getting an OverloadedFunctionDecl somewhere we don't expect and the
result is a type error, we'll get a friendly <overloaded function

type name rather than a crash.

I expect that, later down the road, we'll see other placeholder types
like this, such as a DependentType for dependent types with templates.

  - Doug

Doug Gregor wrote:

  

I agree with Argiris that having an OverloadType and Decl is counterintuitive,
although I think I understand the reason.
    
Does Argiris still agree with this? I think I may have converted him :slight_smile:
  
Yes, praise the lord, I am a "believer" now :slight_smile:

I can't think of anywhere that we would need an OverloadedFunctionDecl
to be a ValueDecl, except to satisfy DeclRefExpr.

I like the idea of making DeclRefExpr only require a NamedDecl. It
allows OverloadedFunctionDecl to shrink considerably, and shouldn't
affect DeclRefExpr at all. Sema::ActOnIdentifierExpr will need a
slight tweak to fill in the OverloadType for OverloadedFunctionDecls,
but that's trivial. Good idea!

  - Doug

This worked out, although there were quite a few places that depended
on DeclRefExpr requiring a ValueDecl. I've patched all those up;
updated patch attached. It's a bit too big to commit without approval
from one of the Apple guys.

  - Doug

clang-overloaded-function-decls-2.patch (39.5 KB)