Why is passing a structure by value compiled into passing by reference without the byval attribute?

Hi all,

Could somebody please explain why, if I pass a structure by value to a function, it’s sometimes passed by reference without the byval attribute?

Consider an example (Compiler Explorer). Say we have C code:

struct S {
    int a[100];
    int b;
};

void func1(struct S s) {
    s.b = 111;
}

void func2(struct S *s) {
    s->b = 222;
}

If I compile with clang 15 for arm64:

-O0 -Xclang -no-opaque-pointers -emit-llvm -S

I get something like:

define dso_local void @func1(%struct.S* noundef %0) #0 !dbg !10 {
  %2 = getelementptr inbounds %struct.S, %struct.S* %0, i32 0, i32 1, !dbg !25
  store i32 111, i32* %2, align 4, !dbg !26
  ret void, !dbg !27
}

define dso_local void @func2(%struct.S* noundef %0) #0 !dbg !28 {
  %2 = alloca %struct.S*, align 8
  store %struct.S* %0, %struct.S** %2, align 8
  %3 = load %struct.S*, %struct.S** %2, align 8, !dbg !34
  %4 = getelementptr inbounds %struct.S, %struct.S* %3, i32 0, i32 1, !dbg !35
  store i32 222, i32* %4, align 4, !dbg !36
  ret void, !dbg !37
}

From this, I cannot understand:
1) why the func1 IR works as passing by value, while func2 provides passing by reference,
2) why store i32 111, i32* %2, align 4 in func1 does not modify the caller’s structure field,
3) and why we even need the extra store and load in func2:

  %2 = alloca %struct.S*, align 8
  store %struct.S* %0, %struct.S** %2, align 8
  %3 = load %struct.S*, %struct.S** %2, align 8, !dbg !34

While I don’t have a definitive answer for you (I can only speculate without diving into Clang’s codegen source), I can hopefully provide some insight.

There is more happening with func1 than what is seen here. If we take a look at the source and IR for main:

int main() {
    struct S s;
    func1(s);
    func2(&s);
    return 0;
}

We see this IR (with debug intrinsics removed):

define dso_local i32 @main() #0 !dbg !38 {
  %1 = alloca i32, align 4
  %2 = alloca %struct.S, align 4
  %3 = alloca %struct.S, align 4
  store i32 0, i32* %1, align 4
  %4 = bitcast %struct.S* %3 to i8*, !dbg !43
  %5 = bitcast %struct.S* %2 to i8*, !dbg !43
  call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 4 %4, i8* align 4 %5, i64 404, i1 false), !dbg !43
  call void @func1(%struct.S* noundef %3), !dbg !43
  call void @func2(%struct.S* noundef %2), !dbg !44
  ret i32 0, !dbg !45
}

As we can see, a %struct.S is allocated twice despite only being defined once in the source. The first allocation (%2) is memcpy’d to the second one (%3), and %3 is passed by reference in the call to func1. This explains why the original struct values don’t change: func1 receives a pointer to a second, hidden copy rather than to the original.
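
To make the copy-then-pass-a-pointer pattern concrete, here is a rough hand-written C equivalent of what the IR above does. This is only a sketch of the lowering, not something Clang emits as C; the names func1_lowered and call_func1_like_clang are made up for illustration.

```c
#include <assert.h>
#include <string.h>

struct S {
    int a[100];
    int b;
};

/* Hypothetical stand-in for the lowered func1: like the IR version,
   it takes a pointer and writes through it. */
void func1_lowered(struct S *copy) {
    copy->b = 111;
}

/* Hypothetical helper mirroring what main's IR does for func1(s). */
int call_func1_like_clang(const struct S *original) {
    int b_before = original->b;

    struct S tmp;                        /* plays the role of the second alloca (%3) */
    memcpy(&tmp, original, sizeof tmp);  /* plays the role of the llvm.memcpy call */
    func1_lowered(&tmp);                 /* the pointer to the copy is what gets passed */

    assert(original->b == b_before);     /* the caller's struct is untouched */
    return tmp.b;                        /* only the hidden copy was modified */
}
```

The store in func1 does modify memory, but that memory is the temporary copy, which is why the caller never observes the change.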

I agree this is not literally passing by value, and as I stated earlier, I can’t say for sure why this specific logic is invoked. It is effectively passing by value, though: I assume it is done this way because it fulfills the standard C rules for the effects of passing by value. Again, this is pure speculation; I hope someone else can jump in and correct me if I’m wrong.

As for the extra store and load in func2, I believe this is default behavior: all arguments get an allocation, even if that argument is just a pointer. Further optimization would remove this. I’m also speculating that the reason it’s done in func2 and not func1 is that func2 is intended to be pass-by-reference, whereas the logic for func1 is generated specifically for certain circumstances of structs being passed by value.

For func2, the argument is a pointer; we pass the pointer directly. Clang creates an “alloca” for it in the callee in case the callee reassigns the variable or takes its address. (Optimization will usually eliminate the variable.)
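
As an illustration of why the parameter gets its own stack slot, here are two small callees where the pointer parameter really is used as an ordinary variable. These functions (sum_b, redirect, func2_variant) are hypothetical and not from the original example.

```c
#include <assert.h>
#include <stddef.h>

struct S {
    int a[100];
    int b;
};

/* Reassigning the parameter: s is advanced like any local variable,
   so it needs storage that can be updated. */
int sum_b(struct S *s, size_t n) {
    int total = 0;
    struct S *end = s + n;
    while (s != end) {
        total += s->b;
        s++;              /* the parameter itself is modified */
    }
    return total;
}

/* Helper that overwrites a pointer through its address. */
void redirect(struct S **slot, struct S *target) {
    *slot = target;
}

/* Taking the parameter's address: &s only makes sense if s lives in memory. */
int func2_variant(struct S *s, struct S *other) {
    redirect(&s, other);
    assert(s == other);   /* s now points at the other struct */
    s->b = 222;
    return s->b;
}
```

At -O0 Clang does not check whether a given callee does either of these things; it gives the parameter a slot unconditionally and leaves the cleanup to later optimization passes.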

For func1, the ABI rules for many targets say that large structs are not passed directly on the stack; they are passed indirectly, as a pointer to a copy made by the caller. This tends to be more efficient than byval/inalloca because the compiler can choose where/when to allocate space for the variable. (Note this is target-dependent. Each target documents its own rules; for example, abi-aa/aapcs64.rst at main · ARM-software/abi-aa · GitHub.) There is no alloca in func1 because the caller already allocated the stack space.
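
As a rough sketch of the size rule for arm64: to my understanding, AAPCS64 has the caller copy a composite larger than 16 bytes to memory and pass a pointer to that copy. The helper below is a simplification built on that assumption; it ignores special cases such as homogeneous floating-point aggregates, and the names passed_indirectly_aapcs64 and struct Small are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct S {
    int a[100];
    int b;
};

/* Hypothetical small struct for comparison. */
struct Small {
    int x;
    int y;
};

/* Assumption: 16 bytes is the AAPCS64 cutoff for ordinary composites. */
enum { AAPCS64_INDIRECT_THRESHOLD = 16 };

/* Simplified check: would an ordinary composite of this size be passed
   as a pointer to a caller-made copy? */
bool passed_indirectly_aapcs64(size_t size) {
    return size > AAPCS64_INDIRECT_THRESHOLD;
}

/* struct S holds 101 ints, so it is far above the threshold. */
static_assert(sizeof(struct S) > AAPCS64_INDIRECT_THRESHOLD,
              "struct S is expected to be passed indirectly");
```

Under this rule, struct S from the question would be passed as a pointer, while something like struct Small would not, which matches the "sometimes" in the original question.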
