Nested functions in Clang?

Does Clang have an option that allows C nested functions, as in gcc?
So far, I can't find it. I just want to look at generated llvm IR, for
ideas on the best way to produce it from another front end.

> Does Clang have an option that allows C nested functions, as in gcc?

No.

> So far, I can't find it. I just want to look at generated llvm IR, for
> ideas on the best way to produce it from another front end.

Consider using the lambda approach instead, e.g. provide a hidden
argument for the context.

Joerg
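
For illustration, here is a minimal C sketch of the hidden-argument idea, roughly what a front end might emit for a nested function. The names (outer_ctx, add_base, outer) are made up for this example and are not taken from any compiler:

```c
#include <assert.h>
#include <stddef.h>

/* Variables of the outer function that the inner function needs. */
struct outer_ctx {
    int base;    /* captured local of outer() */
    int calls;   /* captured local that the inner function updates */
};

/* The "nested" function, hoisted to file scope. The ctx parameter is
   the hidden argument a front end would add (hypothetical name). */
static int add_base(struct outer_ctx *ctx, int x)
{
    assert(ctx != NULL);
    ctx->calls++;
    return ctx->base + x;
}

/* The outer function keeps its shared locals in the context struct and
   passes the struct's address on every call to the inner function. */
int outer(int base)
{
    struct outer_ctx ctx = { base, 0 };
    int sum = add_base(&ctx, 1) + add_base(&ctx, 2);
    return sum + ctx.calls;
}
```

Since the context is an ordinary argument, nothing here needs executable memory; the cost is that add_base no longer has the signature the source-level nested function appeared to have.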

From http://clang.llvm.org/docs/UsersManual.html

clang does not support nested functions; this is a complex feature which is infrequently used, so it is unlikely to be implemented anytime soon. In C++11 it can be emulated by assigning lambda functions to local variables, e.g:

auto const local_function = [&](int parameter) {
    // Do something
};

local_function(1);

Clang doesn’t support it, but llvm-gcc did (and DragonEgg, possibly?), and this is what the trampoline intrinsics are for. The way that GCC implements it is very dangerous - it requires having an executable stack, which makes various attacks significantly easier.

The nested function approach is *only* required if you need the ability for unmodified C code (or code using a C FFI) to be able to call the generated functions. If you do not have this requirement, then using an explicit argument in the LLVM IR to contain the state for the closure (as [Objective-]C blocks and C++ lambdas do) is probably a better option.

David

My Pascal compiler (for obvious reasons) supports nested functions.

I originally added extra hidden arguments for the variables used from the outer functions, but had to add support for function pointers for functions inside a nested structure, and ended up using the LLVM support for trampolines.

For (probably not ideal) reference:
https://github.com/Leporacanthicus/lacsap/commit/d04411fb37f584970210dc0f6e9d567f91a14443

I want to avoid trampolines for two reasons. They have to be built in memory that is
both writable and executable, which has already been a problem on some targets. Also,
a debugger needs to both construct and interpret them, which creates difficult target
dependencies and compiler code-generation dependencies.

I intend to pass some kind of environment pointer, like a static link. The real question
is, what does it point to? As I understand, llvm locals can not be used outside the
function they are created in, so using a static link from outside can't access the
alloca values. I am thinking the best way is to wrap at least up-level referenceable
formals and locals in a struct, and have the static link point to that. This would
include the next outer static link.
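
A rough C sketch of that layout, assuming two levels of nesting: each function wraps its up-level referenceable variables in a frame struct, and each inner frame carries the next outer static link. All names (level0_frame, level1_frame, level0/1/2) are invented for illustration:

```c
#include <assert.h>
#include <stddef.h>

/* Frame of the outermost function: its up-level referenceable variables. */
struct level0_frame {
    int a;
};

/* Frame of the middle function: the static link plus its own variables. */
struct level1_frame {
    struct level0_frame *link;  /* next outer static link */
    int b;
};

/* Innermost function: reaches b through one hop and a through two. */
static int level2(struct level1_frame *env, int c)
{
    assert(env != NULL && env->link != NULL);
    env->link->a += 1;          /* write to a variable two levels out */
    return env->link->a + env->b + c;
}

/* Middle function: builds its frame and passes it as the static link. */
static int level1(struct level0_frame *env, int b)
{
    struct level1_frame frame = { env, b };
    return level2(&frame, 100);
}

/* Outermost function: its frame has no outer link. */
int level0(int a)
{
    struct level0_frame frame = { a };
    int r = level1(&frame, 10);
    return r + frame.a;         /* frame.a reflects the inner update */
}
```

In LLVM terms each frame would be a single alloca of struct type, so the inner function only needs pointer arithmetic on its environment argument and never refers to another function's values directly.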

So, my code builds a closure structure, which contains pointers to the variables shared between the nested function and the outer function.

As long as you don’t need function pointers that are compatible with function pointers without local variable usage, you don’t need trampolines, just the closure structure to pass the arguments. Or you can just pass extra arguments with those local variables as references.

But if you want something that in C would look something like this:

#include <stdio.h>

void func1(void (*fptr)(int), int x)
{
    fptr(x);
}

void func2(int a)
{
    int y = a;

    /* Nested function (a GNU C extension); it reads y from func2. */
    void foo(int x)
    {
        printf("x=%d, y=%d\n", x, y);
    }

    func1(foo, 42);
}

void bar(int x);

int main(void)
{
    func1(bar, 1);
    func2(17);
    return 0;
}

For the foo function to be compatible with bar, you need a trampoline. If you don’t have this use-case, then you just need some way to pass y into foo.

[This is almost identical to the function in the Pascal conformance test that I had to implement trampolines for]
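
For contrast, a sketch of the same shape without a trampoline, under the assumption that func1 itself can be changed to take an environment pointer. The callback typedef, foo_env, and call_bar are hypothetical names, and the functions return values instead of printing so the result is easy to check:

```c
#include <assert.h>
#include <stddef.h>

/* Callback type with an explicit environment parameter. */
typedef int (*callback)(void *env, int x);

int func1(callback fptr, void *env, int x)
{
    return fptr(env, x);
}

/* Environment for foo: holds the address of func2's local y. */
struct foo_env {
    int *y;
};

/* The former nested function foo, hoisted to file scope. */
static int foo(void *env, int x)
{
    struct foo_env *e = env;
    assert(e != NULL);
    return x + *e->y;
}

/* A top-level function needs no environment and ignores the argument. */
static int bar(void *env, int x)
{
    (void)env;
    return x * 2;
}

int func2(int a)
{
    int y = a;
    struct foo_env env = { &y };
    return func1(foo, &env, 42);
}

int call_bar(int x)
{
    return func1(bar, NULL, x);
}
```

This only works because every callee and caller agrees on the extra parameter. Unmodified C code that expects a plain void (*)(int) cannot supply env, which is the case where a trampoline is needed.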

The pointer will be to a struct that contains either the variables or the addresses of the variables, depending on the binding rules of your language. Apple’s blocks ABI promotes all local accesses to use an indirection pointer and puts pointers in the (on-stack) context structure, and updates all of the pointers to be heap values if the block is copied. If your blocks are only downward funargs, then allocating all relevant locals in the struct on the stack and passing a pointer to it down can have the same effect. If your blocks are expected to persist after the function returns then you will need some form of memory management.

The MysoreScript language that I use for teaching implements a simple form of closure and has heavily commented code, which you might want to use as an example:

https://github.com/CompilerTeaching/MysoreScript

David
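
To make the "variables or addresses of the variables" distinction concrete, here is a small C sketch with two context layouts. The struct and function names are invented for this example:

```c
#include <assert.h>
#include <stddef.h>

struct by_value   { int n;  };   /* context holds a copy of the variable */
struct by_address { int *n; };   /* context holds the variable's address */

static int read_copy(const struct by_value *ctx)
{
    assert(ctx != NULL);
    return ctx->n;
}

static int read_ref(const struct by_address *ctx)
{
    assert(ctx != NULL);
    return *ctx->n;
}

/* Returns 1 if a by-value context observes a later update, else 0. */
int copy_sees_update(void)
{
    int n = 1;
    struct by_value ctx = { n };
    n = 2;                       /* update after the context was built */
    return read_copy(&ctx) == 2;
}

/* Returns 1 if a by-address context observes a later update, else 0. */
int ref_sees_update(void)
{
    int n = 1;
    struct by_address ctx = { &n };
    n = 2;                       /* update after the context was built */
    return read_ref(&ctx) == 2;
}
```

The by-address form matches Pascal-style nested functions, where inner and outer code share one variable. It is only safe while the outer frame is live, that is, for downward funargs.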

> So, my code builds a closure structure, which contains pointers to the variables shared between the nested function and the outer function.
>
> As long as you don't need function pointers that are compatible with function pointers without local variable usage, you don't need trampolines, just the closure structure to pass the arguments. Or you can just pass extra arguments with those local variables as references.

Actually, we do already use something named a "closure", but it's not the usual closure. I find
it rather strange and will resist boring people with the details (it starts to get rather OT),
but it does handle mixtures of nested functions, top-level functions, and functions from other
languages, e.g., C. My issue is not how to get an environment into a nested procedure; it's how
it can be used from there when the outer variables to be accessed are independent alloca values
in their own scope.

> I intend to pass some kind of environment pointer, like a static link. The real question
> is, what does it point to? As I understand, llvm locals can not be used outside the
> function they are created in, so using a static link from outside can't access the
> alloca values. I am thinking the best way is to wrap at least up-level referenceable
> formals and locals in a struct, and have the static link point to that. This would
> include the next outer static link.

> The pointer will be to a struct that contains either the variables or the addresses of the variables, depending on the binding rules of your language. Apple’s blocks ABI promotes all local accesses to use an indirection pointer and puts pointers in the (on-stack) context structure, and updates all of the pointers to be heap values if the block is copied. If your blocks are only downward funargs, then allocating all relevant locals in the struct on the stack and passing a pointer to it down can have the same effect. If your blocks are expected to persist after the function returns then you will need some form of memory management.

We don't need persistence of activation records, so a single pointer to one of them will work, if
it can access all the variables. It looks like wrapping everything explicitly declared in a scope
into a struct is the way to do it.

In essence:
Store the value of the alloca (in other words, the local address) into a
structure (what I call a closure).
Then re-introduce those variables as references in the inner function
context.

There is no real magic to it.
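
The two steps can be sketched in C as follows. This is an illustration of the idea only, not a transcription of the lacsap code, and the names (closure, accumulate, sum_and_count) are hypothetical:

```c
#include <assert.h>
#include <stddef.h>

/* Closure record: one pointer per variable shared with the inner function. */
struct closure {
    int *count;
    int *total;
};

/* Inner function: unpack the closure once, then use the variables
   through the unpacked pointers as if they were local names. */
static void accumulate(struct closure *c, int x)
{
    assert(c != NULL);
    int *count = c->count;   /* re-introduce 'count' */
    int *total = c->total;   /* re-introduce 'total' */
    *count += 1;
    *total += x;
}

/* Outer function: count and total stay ordinary locals (separate
   allocas in LLVM terms); only their addresses go into the closure. */
int sum_and_count(const int *values, int n, int *count_out)
{
    int count = 0;
    int total = 0;
    struct closure c = { &count, &total };
    for (int i = 0; i < n; i++)
        accumulate(&c, values[i]);
    *count_out = count;
    return total;
}
```

Because the closure stores addresses, the outer function's variables do not have to be moved into a struct; the trade-off is one extra indirection per access in the inner function.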

Some of my code:

Create a closure structure, and store the address of the variable:
https://github.com/Leporacanthicus/lacsap/blob/master/expr.cpp#L3294

Unpack (re-introduce) the closure argument in the new context:
https://github.com/Leporacanthicus/lacsap/blob/master/expr.cpp#L1358

Create data structure for the closure:
https://github.com/Leporacanthicus/lacsap/blob/master/expr.cpp#L1665

I hope this is answering your question, and of at least some help...

You could take a look at captured statements. I think they already have almost everything for nested functions.

Best regards,
Alexey Bataev