Question about arrays in blocks

I am now testing blocks usage in OpenCL.

I noticed that Clang forbids using arrays within blocks (Clang forbids array capturing).

For example, the following code fails to compile because there is a reference to an array within a block.

But if the line referring to the array is commented out and the line referring to the int pointer ('j') is uncommented, the code compiles.

    void block_arr()
    {
        int res = 0;
        int i[4] = { 3, 4, 4, 1 };
        int* j = i;
        int (^test_block)(int) = ^(int num)
        {
            return num + i[1]; // This is an error: "error: cannot refer to declaration with an array type inside block"
            // return num + j[1]; // This would work
        };
        res = test_block(7);
    }

Do you know the reason behind this limitation?

And why is it possible to use regular pointers while arrays are forbidden?

Thanks,

Arik

IIRC, it's just so that people don't accidentally make implicit copies of
large arrays.

-Eli

It is an efficiency issue. It requires copying the entire array into the block descriptor.
- Fariborz

Is Clang supposed to disallow such a feature for the sake of efficiency?

Wouldn't a warning message be more suitable in this case?

Another alternative is to copy the pointer and not the whole array (as is done in C).

This is problematic, because it will work fine if the block is only passed down the stack but will cause stack corruption when the block is captured, which is a clear violation of the principle of least astonishment (POLA). If the pointer (not the array) is used in the block, then it's safe to assume that the user knows what she is doing in terms of memory management (or, at least, doesn't get to complain if she doesn't).
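To make the lifetime point concrete, here is a rough sketch in plain C rather than with real blocks (which need -fblocks and a blocks runtime). The struct and function names are made up for illustration and do not reflect the actual blocks ABI. The context carries its own copy of the array, so it is still usable after the frame that built it has returned; a context holding only a pointer to the local array would dangle at that point.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for the state a block would carry; not the real blocks ABI. */
struct ctx_by_value {
    int i[4];                       /* the array contents live inside the context */
};

/* Builds the context from a local array and hands it to the caller. */
struct ctx_by_value make_ctx(void)
{
    int i[4] = { 3, 4, 4, 1 };      /* exists only while make_ctx runs */
    struct ctx_by_value c;
    memcpy(c.i, i, sizeof i);       /* short copy: four ints */
    return c;                       /* the copy outlives this frame */
}

/* Plays the role of the block body: num + i[1]. */
int invoke(const struct ctx_by_value *c, int num)
{
    return num + c->i[1];
}

/* Usage: the context is read after make_ctx has already returned. */
int demo_lifetime(void)
{
    struct ctx_by_value c = make_ctx();
    int res = invoke(&c, 7);
    assert(res == 11);              /* 7 + i[1], where i[1] == 4 */
    return res;
}
```

Had the context stored `int *` pointing at the local `i` instead, `invoke` would read a dead stack slot once `make_ctx` returned, which is the silent failure described above.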

Turning this into a warning, perhaps with a configurable size for the smallest thing to complain about, might be interesting. It seems odd to me that we can put a 64-element array inside a stack-allocated C++ object and have that transparently moved to the heap if required, but we can't have a 4-element array by itself.

It's also worth noting that each time you move a block from the stack to the heap you are doing at least two heap allocations[1], which are likely to be significantly more expensive than a short memcpy to move this array to the new allocation. The extra overhead when on the stack of indirecting via the block pointer is constant, not proportional to the size of the allocation.

I'm a bit confused by the original premise of using blocks within OpenCL, however, as much of the hardware that OpenCL is intended to target does not permit dynamic allocations. Many of these constraints would go away if this were the case. We'd also be able to eliminate the second level of indirection in the blocks ABI and much of the metadata, as this is only required to permit copying blocks to the heap. There are other issues related to the ABI that require dereferencing arbitrary pointers to stack memory, which are not permitted by some GPU architectures and are definitely something that you'd want to disallow for WebCL (because they make validation of the IR a nightmare - not an impossible task, but certainly a task of sufficient complexity that I wouldn't want security to be dependent on a correct implementation of it).

David

[1] The blocks ABI was carefully designed to allow multiple captured objects to be stored in a single byref structure, but I believe that we currently only store a single one. If we have multiple objects that are captured by the same set of blocks, then we can put them in the same byref structure and save some memory allocations (and fragmentation), but I don't believe that we do this.

Another alternative is to copy the pointer and not the whole array (as is done in C).

But this is not the same as capturing the array.
You can get the effect you mentioned by declaring it as __block. Or you can capture the array by putting it in a struct and
capturing the struct instead.
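
A small sketch of why the struct route differs from the pointer route, written in plain C so it builds without -fblocks. The capture structs are hand-written stand-ins with hypothetical names, not what Clang actually generates. With Clang and -fblocks, the block itself would presumably read `^(int num) { return num + i.v[1]; }` once `i` is declared as `struct int4`.

```c
#include <assert.h>

/* Hypothetical wrapper: a struct is copied by value, array member included. */
struct int4 { int v[4]; };

/* Hand-written stand-ins for what each kind of capture stores. */
struct capture_ptr    { const int *j; };     /* pointer capture: shares storage */
struct capture_struct { struct int4 i; };    /* struct capture: private snapshot */

int body_ptr(const struct capture_ptr *c, int num)
{
    return num + c->j[1];
}

int body_struct(const struct capture_struct *c, int num)
{
    return num + c->i.v[1];
}

int demo_capture(void)
{
    struct int4 i = { { 3, 4, 4, 1 } };
    struct capture_ptr by_ptr;
    struct capture_struct by_struct;

    by_ptr.j = i.v;          /* like capturing `int *j = i;` */
    by_struct.i = i;         /* the whole struct, array included, is copied here */

    i.v[1] = 100;            /* modify the original after the "capture" */

    assert(body_ptr(&by_ptr, 7) == 107);       /* sees the later write */
    assert(body_struct(&by_struct, 7) == 11);  /* still sees the value at capture time */
    return body_struct(&by_struct, 7);
}
```

The pointer version keeps reading the caller's storage, while the struct version owns a snapshot, which is the sense in which copying the pointer is not the same as capturing the array.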

- Fariborz