allocating an array

HyperQuantum · July 16, 2008, 6:07pm

How can I allocate an array with a size that is not known at compile time?

The language reference says that the array size must be a constant
integer value. It also says that variable sized arrays are represented
by using zero as the number of elements. Obviously I cannot use zero
in the array type when it is allocated. llvm-gcc seems to use a
pointer type instead of an array type, so compiling a simple C program
that uses an array with a runtime size and looking at the resulting
bitcode didn't provide an answer to my question.

Regards,
Kevin André

resistor · July 16, 2008, 6:17pm

Call malloc()?

--Owen

void · July 16, 2008, 6:19pm

Do you mean a variable length array (as in C99) or something else?
LLVM will convert VLAs into an "alloca":

int foo(int X) {
int a;
...
}

define i32 @foo(i32 %X) nounwind {
entry:
%tmp23 = alloca i32, i32 %X ; <i32*> [#uses=2]
...
}

If you are doing the zero-sized array hack, then you'll have to call
malloc on it anyway. Something like:

struct a {
char c;
int d[0];
};

And then in your program do something like this:

void bar(int z) {
struct a *fnord = malloc(sizeof(struct a) + z * sizeof(int));
...
}

LLVM will generate the obvious code for that.

-bw

HyperQuantum · July 16, 2008, 8:13pm

I guess I should have explained what I'm trying to do. I'm working on
a front end for my programming language and using the LLVM libraries
as back end. The arrays in my language are currently represented by an
LLVM structure type that contains integer fields to hold the size for
each array dimension, and then an array type with the actual data for
the array.

A simple example, an array with 3 integers:

var x : [3] int

For this example the following code is generated:

%x = alloca { i32, [3 x i32] } ; <{ i32, [3 x i32] }*> [#uses=4]
getelementptr { i32, [3 x i32] }* %x, i32 0, i32 0 ; <i32*>:1 [#uses=1]
store i32 3, i32* %1

(and followed by a loop that initializes the three elements)

But I only tested this with arrays that have an integer constant as
size. Now if I try it with a size that comes from an expression like
this:

var size : nat = 9
var x : [size] int

Then my compiler does not know the size of the array and uses a zero
for the type:

%size = alloca i32 ; <i32*> [#uses=2]
store i32 9, i32* %size
%x = alloca { i32, [0 x i32] } ; <{ i32, [0 x i32] }*> [#uses=4]
load i32* %size ; <i32>:1 [#uses=1]
getelementptr { i32, [0 x i32] }* %x, i32 0, i32 0 ;
<i32*>:2 [#uses=1]
store i32 %1, i32* %2

Obviously this is wrong, but I don't see how I can use the value from
'size' to allocate the necessary space for this array with alloca. I
could use a pointer to an array in the struct type, instead of a
direct array, and use two allocas (one for the struct and one for the
array data). But then there is no real reason to use the array type
anymore, a pointer to the first element (like llvm-gcc generates for
C/C++ arrays) would be much simpler. Or am I missing something?

Regards,
Kevin André

void · July 16, 2008, 8:44pm

Okay, so it's more like the second example I gave. The equivalent C
program would be something like this:

int x = 37;

struct a {
int f1;
int a1[0];
};

int foo(int b, int c) {
struct a *s = malloc(sizeof(struct a) + x * sizeof(10));
int i;

for (i = 0; i < x; ++i)
s->a1[i] = i;

return s->a1[b] * c;
}

int bar(int b, int c, int d) {
struct a *s = malloc(sizeof(struct a) + b * sizeof(10));
int i;

for (i = 0; i < b; ++i)
s->a1[i] = i;

return s->a1[c] * d;
}

Compile this program at -O0 to see the initial code. Then compile it
at -O3 to see what it is after optimizations. At -O3, there's still
the calls to malloc, but you might not be able to get around that. One
thing that gets rid of the malloc in "foo" is to mark x as static. The
compile then knows it's size and that it's not modified in the current
file, so it can use the value it's been assigned. If your programming
language has this capacity, it would improve your code generation.

Hope this helps.

-bw

Duncan_Sands · July 16, 2008, 8:59pm

Hi,

Obviously this is wrong, but I don't see how I can use the value from
'size' to allocate the necessary space for this array with alloca. I
could use a pointer to an array in the struct type, instead of a
direct array, and use two allocas (one for the struct and one for the
array data). But then there is no real reason to use the array type
anymore, a pointer to the first element (like llvm-gcc generates for
C/C++ arrays) would be much simpler. Or am I missing something?

if the length is %n, you can alloca %n i8's:
%x = alloca i8, i32 %n
[Well, I guess you need to allocate a bit more to include
the overhead of the length field in your struct.]
Then bitcast %x (which is an i8*) to a pointer to
your struct type.

Ciao,

Duncan.

Duncan_Sands · July 17, 2008, 1:01am

if the length is %n, you can alloca %n i8's:
%x = alloca i8, i32 %n

If the array element type is i32 (like the size
field), then you may as well use an alloca i32;
if you use an alloca i8 then you would have to
use 4*%n rather than %n in order to get the right
amount of memory.

Ciao,

Duncan.

HyperQuantum · July 22, 2008, 11:25am

Thanks, Bill and Duncan, for your suggestions. I've changed my
approach, and use a pointer instead of an array now.

So this code:

var size : nat = 9
var x : [size] int

now becomes:

%size = alloca i32 ; <i32*> [#uses=2]
store i32 9, i32* %size
%x = alloca { i32, i32* } ; <{ i32, i32* }*> [#uses=6]
load i32* %size ; <i32>:1 [#uses=1]
getelementptr { i32, i32* }* %x, i32 0, i32 0 ; <i32*>:2 [#uses=1]
store i32 %1, i32* %2
%arr_storage = alloca i32, i32 %1 ; <i32*> [#uses=1]
%arr_dataptr_addr = getelementptr { i32, i32* }* %x, i32 0, i32 1 ;
<i32**> [#uses=1]
store i32* %arr_storage, i32** %arr_dataptr_addr

(did some manual optimizations on this code, as my front end generates
really dumb code)

So I need two allocas now, one for the structure and one for the
storage of the elements.

Now I'm wondering: do you think it's a bad idea to use alloca for
allocating an unknown number of elements? In that case, should I just
always use the malloc instruction instead, or do a runtime decision to
choose between alloca and malloc?

Regards,
Kevin André

Duncan_Sands · July 22, 2008, 12:34pm

Hi,

Now I'm wondering: do you think it's a bad idea to use alloca for
allocating an unknown number of elements? In that case, should I just
always use the malloc instruction instead, or do a runtime decision to
choose between alloca and malloc?

on some operating systems there is a limit to how much you can allocate
on the stack. This is 16k on windows and something bigger but not huge
on Sun OS (IIRC). On the other hand, allocating things on the stack is
usually more efficient than using malloc.

Ciao,

Duncan.

Duncan_Sands · July 22, 2008, 12:34pm

PS: Not to mention that if you use malloc then
you will need to explicitly reclaim memory using
free when the variable goes out of scope, while
this happens automatically if you use alloca.

Topic		Replies	Views
AllocaInst issue LLVM Dev List Archives	1	81	August 16, 2012
How to codegen an LLVM-IR that has dynamic arrays in it? LLVM Dev List Archives	3	113	February 19, 2014
Question about variable sized arrays LLVM Dev List Archives	1	75	February 26, 2009
LLVM-C LLDB	3	62	June 6, 2018
variable sized structs in LLVM LLVM Dev List Archives	23	107	June 25, 2005

allocating an array

Related Topics