Nested functions

Was there already some reflection about how to lower the concept of nested
functions (and the corresponding static links) into llvm?

Dirk Muysers.

I have not seen a discussion of this and none of our current front-ends need it. A straightforward way to add support for this would be:

(a) Lower nested procedures into ordinary top-level functions in LLVM, with name-mangling to enforce scopes for each name.

(b) I assume you don't want to use a display (a global static link array) because of threads. For explicit static links:

  (i) Add an intrinsic function to get/set the access link out of the current stack frame. It should take the current stack frame as an argument (so you can walk the access link chain on the stack). Also, add an intrinsic to get a pointer to the current stack frame. I don't know a portable way to implement this in C, but the other back-ends should not have a problem.
  (ii) Add the access link as the first parameter to each nested function and use the intrinsic to set the link in the stack frame on entry. This argument is not traditionally needed but it makes things easier in LLVM, and may even make the common case (accessing a variable in the immediate parent function) faster.
  (iii) Before a call, walk the link chain to find the right access link (or the current stack frame) and pass it to the callee.

As an optimization for shallow nested functions (e.g., 3 levels or less), it seems to me you could just avoid the stack walking entirely and add $k-1$ arguments to each function at level $k$, i.e., at most 2 arguments in all. This may even be an easy first implementation.

--Vikram
http://www.cs.uiuc.edu/~vadve
http://llvm.cs.uiuc.edu/

This was discussed back in August, which discusses it, provides a solution
and an example:
http://mail.cs.uiuc.edu/pipermail/llvmdev/2004-August/001828.html
http://mail.cs.uiuc.edu/pipermail/llvmdev/2004-August/001829.html
http://mail.cs.uiuc.edu/pipermail/llvmdev/2004-August/001834.html
http://mail.cs.uiuc.edu/pipermail/llvmdev/2004-August/001836.html

I don't think that we need anything added to LLVM to support nested
functions.

-Chris

That's a rather simplistic example. I wouldn't recommend doing it this way -- I see at least 2 problems with it:

(1) With recursion, the inlining being assumed here would not happen. This means the struct initialization won't be optimized away so nicely and it can get really expensive. Even without recursion, inlining would be impractical if you have many nested procedures and deep nesting.

(2) If you don't rely on inlining, initializing the structs you have to pass around will get horrendously expensive with large numbers of local variables. (Note that without inlining, the structs will *not* be scalar-replaced because they have to passed as arguments.)

There's a reason why code generators use static links.

--Vikram
http://www.cs.uiuc.edu/~vadve
http://llvm.cs.uiuc.edu/

As an optimization for shallow nested functions (e.g., 3 levels or less), it seems to me you could just avoid the stack walking entirely and add $k-1$ arguments to each function at level $k$, i.e., at most 2 arguments in all. This may even be an easy first implementation.

I use this method with filtering unused in nested function args and local vars
in my YAFL frontend (not finished :(( - at this moment generate LLVM bytecode but doesn't have all runtime suport code writed)

Vladimir

So do you mean you pass static links for parent function stack frames as arguments? How do you generate them?

--Vikram
http://www.cs.uiuc.edu/~vadve
http://llvm.cs.uiuc.edu/

I use this method with filtering unused in nested function args and local vars
in my YAFL frontend (not finished :(( - at this moment generate LLVM bytecode but doesn't have all runtime suport code writed)

So do you mean you pass static links for parent function stack frames as arguments? How do you generate them?

Sorry, I must more careful read text.
I use diff. method.
All definitions in my YAFL parsed program AST have set of references to expressions where it used.
All expressions know what function body contain it.
All functions know what function contain it (if nested).

I iterate by nested functions in deep and cache information about used declarations from parent function:
args, locals, used grantparent declarations.
Declaration used iff declaration used in code current processed nested function or in it nested function bodies
or in it parent(grantparents) nested functions called from (it or it nested functions).

Collect this information for non-nested function and all it nested functions
i genereted LLVM code for function and all nested functions adding additional args to nested function for used declarations from parent function.

Sorry for maybe not clear algorithm but it work for me.

Vladimir

This makes sense and sounds like it should work pretty well (assuming that only a few non-local variables have to be passed). I think it requires that all nested functions are available, which should usually be true.

--Vikram
http://www.cs.uiuc.edu/~vadve
http://llvm.cs.uiuc.edu/