Runtime Array-Length


I am building my own language with LLVM as the base.

I am working on string concatenation, where a string is just an array of characters cast to a pointer to a character (i8*). Given two strings, the length of the new string can be determined by counting the characters before each null terminator, summing the two counts, and adding one for the new terminator.

Unfortunately, I have no idea how to use the C API to store this. Because the length of the new string is not a compile-time constant (it only exists as a Value*), I cannot determine at compile time what length the LLVM array type will be. Therefore, I cannot create the GlobalVariable, since I do not know its type.

One possible solution I thought of was linking to the malloc function and calling that, but I’m sure there’s a better way. If any of you have implemented a similar sort of string concatenation, I would much appreciate any advice that you could give.


The toy language I’ve been playing around with represents all strings as a struct in LLVM:

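struct string {
    char *ptr;       /* character data */
    int str_len;     /* number of characters currently in use */
    int buffer_len;  /* allocated size of the buffer behind ptr */
};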


My AST nodes have an interface like:


int measure();
void copy(char *dest);
struct string get_value();

A constant string can be measured at compile time; for a string variable, measure() just extracts str_len. Strings passed in from external sources are measured immediately, but LLVM optimisations will eliminate the call if the return value isn’t used.

The implementation of get_value() for a concatenation AST node can generate code to evaluate each substring, measure them, allocate a buffer of the final length, and only then copy each substring directly into the final buffer.
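To make the measure / allocate / copy order concrete, here is a minimal sketch in plain C of what the generated code for a concatenation node might do at runtime. The names string_from_cstr and string_concat are hypothetical, and the use of malloc as the allocator is an assumption, not something taken from the language described above.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Length-prefixed string, mirroring the struct described above. */
struct string {
    char *ptr;
    int str_len;
    int buffer_len;
};

/* Hypothetical helper: wrap a C string, measuring it exactly once. */
static struct string string_from_cstr(const char *s) {
    struct string r;
    r.str_len = (int)strlen(s);
    r.buffer_len = r.str_len;
    /* Always request at least one byte so the result is never ambiguous. */
    r.ptr = malloc(r.buffer_len > 0 ? (size_t)r.buffer_len : 1);
    assert(r.ptr != NULL);
    memcpy(r.ptr, s, (size_t)r.str_len);
    return r;
}

/* Measure both operands, allocate once, then copy each into place. */
static struct string string_concat(struct string a, struct string b) {
    struct string r;
    r.str_len = a.str_len + b.str_len;   /* measure */
    r.buffer_len = r.str_len;
    r.ptr = malloc(r.buffer_len > 0 ? (size_t)r.buffer_len : 1); /* allocate */
    assert(r.ptr != NULL);
    memcpy(r.ptr, a.ptr, (size_t)a.str_len);             /* copy left  */
    memcpy(r.ptr + a.str_len, b.ptr, (size_t)b.str_len); /* copy right */
    return r;
}
```

Because the lengths are known before the allocation, no intermediate buffer is needed even for a chain of concatenations: each operand is written once, directly at its final offset.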

I also support a string append operation that will reallocate the buffer only if the existing one is too small.
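An append along these lines could look roughly like the following C sketch. The function name string_append, the doubling growth policy, and the initial capacity of 16 are all assumptions made for illustration; the only behaviour taken from the description above is that the buffer is reallocated only when it is too small. Integer overflow of the lengths is not handled.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct string {
    char *ptr;
    int str_len;
    int buffer_len;
};

/* Append n bytes from src to s, growing the buffer only when needed.
   Returns 0 on success, -1 if reallocation fails (s is left unchanged). */
static int string_append(struct string *s, const char *src, int n) {
    assert(s != NULL && src != NULL && n >= 0);
    int needed = s->str_len + n;
    if (needed > s->buffer_len) {
        /* Assumed policy: double the capacity, starting from 16 bytes. */
        int new_cap = s->buffer_len > 0 ? s->buffer_len * 2 : 16;
        if (new_cap < needed) new_cap = needed;
        char *p = realloc(s->ptr, (size_t)new_cap);
        if (p == NULL) return -1;
        s->ptr = p;
        s->buffer_len = new_cap;
    }
    if (n > 0) memcpy(s->ptr + s->str_len, src, (size_t)n);
    s->str_len = needed;
    return 0;
}
```

Starting from an empty string ({ NULL, 0, 0 }), the first append allocates and later appends reuse the same buffer until str_len would exceed buffer_len.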

Ultimately you will need to work out whether you want Pascal/Java-style strings like mine or C-style NUL-terminated strings, and how the memory for these strings will be managed.

However, how would one allocate the buffer for a string if you did not know the length of the string at compile time?

For instance, using the API, how would one reproduce the code for the following C++ function?

std::string add(std::string a, std::string b) {
    return a + b;
}

When allocating the buffer required for the new string, one can determine the length at runtime. However, I do not know how to allocate a global array whose size is determined by a Value*.

> However, how would one allocate the buffer for a string if you did not
> know the length of the string at compile time?

FYI, LLVM doesn't provide a "platform" or "VM" or "runtime". You will need
to be familiar with your target platform's APIs; on most platforms you can
probably get away with just calling malloc for the case at hand, though.

-- Sean Silva
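For the C-style, null-terminated case from the original question, one way to follow the malloc suggestion is to write the concatenation as a small runtime helper in C, link it into the final program, and have the front end declare it as an external function and emit a call to it. The sketch below assumes that approach; the name rt_str_concat is hypothetical, and the caller is assumed to own (and eventually free) the result.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical runtime helper. Generated code would call it like any
   other external function. Returns a malloc'd buffer that the caller
   must free, or NULL if the allocation fails. */
char *rt_str_concat(const char *a, const char *b) {
    assert(a != NULL && b != NULL);
    size_t len_a = strlen(a);               /* measured at runtime */
    size_t len_b = strlen(b);
    char *out = malloc(len_a + len_b + 1);  /* +1 for the terminator */
    if (out == NULL) return NULL;
    memcpy(out, a, len_a);
    memcpy(out + len_a, b, len_b + 1);      /* also copies b's terminator */
    return out;
}
```

The buffer lives on the heap, so its size never has to appear in an LLVM array type; the generated code only ever sees the resulting i8*.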