Hi All,
Steve has started working on an implementation of a language feature named 'Blocks'. The back story is that it was prototyped in a private Clang fork (because it is much easier to experiment with Clang than with GCC), then implemented in GCC (where it evolved a lot), and now we're re-implementing it in Clang. The language feature is already supported by mainline llvm-gcc, but we don't have up-to-date documentation for it. When that documentation is updated, it will definitely be checked into the main clang repo (in clang/docs). Note that llvm-gcc supports a bunch of deprecated syntax from the evolution of Blocks, but we don't plan to support that old stuff in Clang.
Until there is more real documentation, here is the basic idea of Blocks: they are closures for C. They let you pass around units of computation that can be executed later. For example:
void call_a_block(void (^blockptr)(int)) {
  blockptr(4);
}
void test() {
  int X = ...;
  call_a_block(^(int y){ print(X+y); });  // references stack var snapshot
  call_a_block(^(int y){ print(y*y); });
}
In this example, when the first block is formed, it snapshots the value of X into the block and builds a small structure on the stack. Passing the block pointer down to call_a_block passes a pointer to this stack object. Invoking a block (with function call syntax) loads the code to run out of the struct and calls it, giving it access to the captured values. call_a_block can be passed any number of different blocks as long as they have the same type.
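To make the "small structure on the stack" a bit more concrete, here is a rough emulation of the first block in plain C, with no Blocks support needed. This is only a sketch: the struct layout, the hidden self parameter, and the names (block_literal, block_body_1, test_snapshot, last_printed) are made up for illustration and are not the layout the compiler actually emits. print is replaced by printf plus a variable that records the last value.

```c
#include <stdio.h>

/* Hypothetical stand-in for a block of type void (^)(int). */
struct block_literal {
    void (*invoke)(struct block_literal *self, int y); /* code to run */
    int captured_X;                                     /* snapshot of X */
};

static int last_printed; /* records what "print" saw, for illustration */

/* Body of ^(int y){ print(X+y); }, with X read from the snapshot. */
static void block_body_1(struct block_literal *self, int y) {
    last_printed = self->captured_X + y;
    printf("%d\n", last_printed);
}

/* Plays the role of call_a_block: invoke through the struct. */
static void call_a_block(struct block_literal *blockptr) {
    blockptr->invoke(blockptr, 4);
}

static int test_snapshot(void) {
    int X = 10;
    /* Forming the block: build the struct on the stack and copy X in. */
    struct block_literal b = { block_body_1, X };
    X = 99; /* later changes to X are not seen by the snapshot */
    call_a_block(&b);
    return last_printed;
}
```

Calling test_snapshot() invokes the emulated block with 4 and yields 14, since the snapshot of X was taken when X was still 10.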
From a technical perspective, blocks fit into C in a few places: 1) a new kind of declarator (the caret), which works very much like a magic kind of pointer that can only point to function types; 2) block literals, which capture the computation; 3) a new storage class, __block; and 4) a really tiny runtime library.
The new storage class comes into play when you want to get mutable access to variables on the stack. Basically you can mark an otherwise-auto variable with __block (which is currently a macro that expands to an attribute), for example:
void test() {
  int X = ...;
  __block int Y = ...;
  ^{ X = 4; };  // error, can't modify a const snapshot.
  ^{ Y = 4; };  // ok!
}
From the implementation standpoint, roughly speaking, the address of a __block object is captured by the block instead of its value.
This is tricky though, because blocks are on the stack, and you may want to refer to some computation (and its __block captured variables) after the function returns. To do this, we have a simple form of reference counting to manage the lifetimes of these. For example, in this case:
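The difference between the two kinds of capture can be sketched in plain C as well. Again, this is a hand-written emulation under assumptions, not what the compiler generates: capture_demo, set_y_body and test_block_storage are hypothetical names, and the "roughly" above matters, because a real __block variable can also be moved to the heap, which a raw pointer to a local does not model.

```c
/* Hypothetical emulation of a block that captures X by value
   and Y (imagined as a __block variable) by address. */
struct capture_demo {
    void (*invoke)(struct capture_demo *self);
    const int captured_X; /* const snapshot: cannot be assigned to */
    int *captured_Y;      /* address of the __block-style variable */
};

/* Body of ^{ Y = 4; } */
static void set_y_body(struct capture_demo *self) {
    /* self->captured_X = 4; would not compile: the snapshot is const */
    *self->captured_Y = 4; /* writes through to the original Y */
}

static int test_block_storage(void) {
    int X = 1;
    int Y = 2; /* imagine this declared as: __block int Y = 2; */
    struct capture_demo b = { set_y_body, X, &Y };
    b.invoke(&b);
    return Y; /* the variable in the enclosing function was modified */
}
```

Here test_block_storage() returns 4, because the write inside the emulated block goes through the captured address to Y itself.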
void (^P)(int);  // global var
void gets_a_block(void (^blockptr)(int)) {
  P = blockptr;
}
void called_sometime_later() {
  P(4);
}
if gets_a_block is called with a block on the stack, and called_sometime_later is called after that stack frame is popped, badness happens (yay for C!). Instead, we use:
void (^P)(int);  // global var
void gets_a_block(void (^blockptr)(int)) {
  P = _Block_copy(blockptr);  // copies to heap if on the stack with refcount +1, otherwise increments refcount.
}
void called_sometime_later() {
  P(4);
  _Block_release(P);  // decrements refcount.
  P = 0;
}
The semantics of _Block_copy are that it copies the block off the stack *as well as any __block variables it references*, and the shared __block variables are themselves freed when all referencing blocks go away. The really tiny runtime library implements things like _Block_copy and friends.
Another interesting thing is that block literals do limited/optional type inference of the result type:
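The copy-on-first-retain behaviour can be illustrated with a toy version in plain C. Everything here is an assumption made for the sake of the example: rc_block, rc_block_copy and rc_block_release are hypothetical analogues of a block, _Block_copy and _Block_release, not the real runtime, and the sketch leaves out the copying of shared __block variables entirely.

```c
#include <stdlib.h>

/* Hypothetical refcounted block; not the real runtime layout. */
struct rc_block {
    int on_heap;   /* 0 while the block still lives on the stack */
    int refcount;  /* only meaningful for heap copies */
    int (*invoke)(struct rc_block *self, int y);
    int captured_X;
};

/* Toy analogue of _Block_copy: the first copy moves to the heap,
   later copies just bump the count. */
static struct rc_block *rc_block_copy(struct rc_block *b) {
    if (b->on_heap) { b->refcount++; return b; }
    struct rc_block *h = malloc(sizeof *h);
    if (!h) return NULL; /* callers of this sketch do not handle this */
    *h = *b;
    h->on_heap = 1;
    h->refcount = 1;
    return h;
}

/* Toy analogue of _Block_release: free the heap copy at zero. */
static void rc_block_release(struct rc_block *b) {
    if (b->on_heap && --b->refcount == 0) free(b);
}

static int add_body(struct rc_block *self, int y) {
    return self->captured_X + y;
}

static struct rc_block *P; /* global var */

static void gets_a_block(struct rc_block *blockptr) {
    P = rc_block_copy(blockptr);
}

static void make_and_register(void) {
    struct rc_block b = { 0, 0, add_body, 38 }; /* lives on this stack frame */
    gets_a_block(&b);
} /* b is gone after this returns, but P points at the heap copy */

static int called_sometime_later(void) {
    int r = P->invoke(P, 4);
    rc_block_release(P);
    P = 0;
    return r;
}
```

Calling make_and_register() and then called_sometime_later() gives 42, even though the stack frame that created the block has already been popped.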
foo(^int(){ return 4; }); // takes nothing, returns int.
foo(^(){ return 4; }); // same thing, inferred to return int.
If you're interested in some more low-level details, it looks like gcc/testsuite/gcc.apple/block-blocks-test-8.c in the llvm-gcc testsuite has some of the underlying layout info, though I have no idea if it is up-to-date.
To head off the obvious question: this syntax and implementation have nothing to do with C++ lambdas. Blocks are designed to work well with C and Objective-C, and unfortunately C++ lambdas really require a language with templates to be very useful. The syntax of blocks and the syntax of C++ lambdas are completely different, so we expect to eventually support both in the same compiler.
In any case, more detailed documentation will be forthcoming, but I would be happy to answer specific questions (before Friday, at which point I disappear for two weeks on vacation, woo!)
-Chris