Taking pointer of array parameter

Is it possible to take the address of an array parameter using GEP, or must you store the parameter to a local variable first? The code in the comments works (if %x is provided to the GEP instead of %0), but it doesn’t seem right that I need an alloca + store just to get the address of a parameter which already exists in memory.

declare i32 @printf(ptr %0, ...)

define void @PrintMessage([255 x i8] %0) {
entry:
  ; %x = alloca [255 x i8], align 1
  ; store [255 x i8] %0, ptr %x, align 1
  %first = getelementptr inbounds [255 x i8], [255 x i8] %0, i8 0
  %1 = call i32 (ptr, ...) @printf(ptr %first)
  ret void
}

You’re passing the array by value, so LLVM will treat each i8 as a separate argument and pass it according to the normal ABI rules. This means the array usually doesn’t exist in memory. For example, on AArch64 the first 8 bytes will be passed in x0-x7 and only then will the rest be stored on the stack.

So with that scalar argument, an alloca+store is the best way to get a pointer.

If you wanted the whole array to go on the stack (so there would be a pointer to use directly), then you could convert the argument to something like ptr byval([255 x i8]) %0, where the byval tells LLVM to copy the argument into new argument stack space in the caller.

The byval solution really just moves the store from the callee to the caller though. Often language ABIs will drop the byval entirely for types bigger than a certain size and say it’s the caller’s responsibility in actual LLVM IR to allocate a buffer it’s happy with and then pass a pointer to the callee (so it can reuse an existing buffer containing the array if it has one handy).

Since the caller needs a pointer to the array in either case (ptr with byval or just plain ptr), the basic difference is that byval will make LLVM memcpy the argument into a new array so the callee can’t change any values in the array. Dropping the byval means you have to worry about that instead, but it can eliminate the copy.
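The same trade-off can be seen from the C side. This is only a rough sketch: C cannot pass a bare array by value, so it assumes a hypothetical wrapper struct named Message, and all function names here are made up for illustration. How a compiler lowers the by-value call (byval, a hidden pointer to a caller-made copy, or registers) depends on the target ABI; the sketch only shows the observable difference between "callee gets a copy" and "callee gets the caller's buffer".

```c
#include <string.h>

/* Hypothetical wrapper type: C cannot pass a bare array by value, but it
   can pass a struct that contains one. */
typedef struct { char bytes[255]; } Message;

/* By value: the callee receives its own copy of all 255 bytes. */
void capitalize_copy(Message m) {
    m.bytes[0] = 'H';            /* changes only the callee's copy */
}

/* By pointer: nothing is copied, so the write lands in the caller's buffer. */
void capitalize_in_place(Message *m) {
    m->bytes[0] = 'H';
}

/* Returns the first byte the caller sees after a by-value call. */
char first_after_copy(void) {
    Message msg = {{0}};
    strcpy(msg.bytes, "hello");
    capitalize_copy(msg);
    return msg.bytes[0];         /* caller's array is untouched */
}

/* Returns the first byte the caller sees after a by-pointer call. */
char first_after_in_place(void) {
    Message msg = {{0}};
    strcpy(msg.bytes, "hello");
    capitalize_in_place(&msg);
    return msg.bytes[0];         /* caller's array was modified */
}
```

Calling first_after_copy() gives back the original lowercase byte, while first_after_in_place() gives back the byte written by the callee. Passing a plain pointer saves the copy, but the callee then has to make its own copy if it wants to write without the caller noticing.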


Thanks for your detailed response, I appreciate it. Lots of this goes over my head though. So the argument is not really an array in memory and thus the GEP doesn’t work, right? That means alloca+store is copying the entire block (i8 * 255) to a local (which is an array) and thus I can use the GEP. I think that makes sense, if I got that right.

It sounds like that’s not even best practice, though, and I should be using byval so I actually have memory to reference in the first place. That’s better than making a needless copy to a local, right?

Hmm, so byval has a side effect that the memory is constant/read-only? In that case, unless the caller specifies the parameter as constant, you DO want to copy the whole array to a local, which to me means a normal “pass by value”. Correct?

That begs the question, how does a “pass by reference” work? I thought that’s what byval did though so I’m confused now.

I’ve also used Godbolt to do some investigation. The C program:

#include <stdio.h>
int * index_first(int nums[10]) {
    return NULL;
}

Produces:

define dso_local i32* @index_first(i32* %nums) {
entry:
  %nums.addr = alloca i32*, align 8
  store i32* %nums, i32** %nums.addr, align 8
  ret i32* null
}

So Clang seems to have a policy of always passing the array by reference (pointer) and then storing the parameter in a local. Is that maybe the best policy?

C and C++ semantics say that an array “decays” to a pointer, so what Clang is doing here is correct for C/C++. Note that the IR is not copying the array, it is copying the pointer parameter. This is normal for -O0 code. Optimizations may be able to eliminate that.
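A small C sketch of what the decay means in practice, in case it helps. The function names are hypothetical and only for illustration. The parameter nums is a local copy of the pointer (that is what %nums.addr holds in the IR above), so writing through it reaches the caller's array, while reassigning it only changes the callee's local pointer.

```c
/* Despite the [10], the parameter has type int *: only a pointer is passed. */
void set_first(int nums[10], int value) {
    nums[0] = value;             /* writes through to the caller's array */
}

/* Reassigning the parameter changes only the callee's local pointer copy. */
void reseat(int nums[10]) {
    static int other[10];
    nums = other;                /* the caller's pointer/array is unaffected */
    nums[0] = 7;                 /* this write goes to 'other' */
}

/* Returns what the caller sees after a write through the parameter. */
int caller_sees_write(void) {
    int nums[10] = {0};
    set_first(nums, 42);
    return nums[0];
}

/* Returns what the caller sees after the callee reassigns its parameter. */
int caller_sees_reseat(void) {
    int nums[10] = {0};
    nums[0] = 1;
    reseat(nums);
    return nums[0];
}
```

Here caller_sees_write() returns the value stored by the callee, and caller_sees_reseat() returns the caller's original value, since only the local pointer was redirected.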

Hmm, that suggests writing to the parameter “nums” would change the memory which was passed to the function? Maybe there’s another level of indirection in the caller that prevents that though.

EDIT: OK, I don’t understand how arrays in C work, but I read they are always passed by pointer, so my example doesn’t apply to C.

Thanks!

Following up here: so pass by reference is just a pointer, that makes sense. I posted another question on how to use byval in the C API since I didn’t see any answers here and I didn’t want it to get buried in this thread. I think I need to just play with that and see how it works compared to copying to a local, as I didn’t quite understand the “so the callee can’t change any values in the array” remark.

But I think in general, if you have parameters that can be written to, you need to copy to a local anyways, so that seems like the best route. Thanks for your help.