Why is Clang unpacking my struct-type function arguments?

We are currently building an internal tool based on LLVM/Clang which involves generating LLVM IR from bitcode produced by Clang from C code.
We have a C function whose prototype looks like the following:

typedef struct {
    int (*allocFunction)(void *x, void *y);
    void *YY;
} Type4;

int bar(Type1 *x, Type2 *y, Type3 *z, Type4 bar);
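For reference, here is a minimal self-contained version of the reproducer. The definitions of Type1 through Type3 are not shown in the original, so the ones below are placeholders; only Type4 and the by-value parameter matter for the question:

```c
#include <stddef.h>

/* Placeholder definitions: the real Type1-Type3 are not shown above. */
typedef struct { int a; } Type1;
typedef struct { int b; } Type2;
typedef struct { int c; } Type3;

typedef struct {
    int (*allocFunction)(void *x, void *y);
    void *YY;
} Type4;

/* Type4 is passed by value, which is what triggers the ABI-specific
 * lowering of the struct into separate (or coerced) IR arguments. */
int bar(Type1 *x, Type2 *y, Type3 *z, Type4 w) {
    (void)y; (void)z;
    return w.allocFunction ? w.allocFunction(x, w.YY) : 0;
}
```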

When we compile this code with Clang 6 targeting x86_64-apple-macos, the generated function in the IR has the type:

; Function Attrs: noinline nounwind optnone ssp uwtable
define i32 @bar(%struct.Type1*, %struct.Type2*, %struct.Type3*, i32 (i8*, i8*)*, i8*)

This basically splits the two members of Type4 into two separate arguments: strange, but still understandable.

However, when we compile the exact same piece of code for thumbv7-none-linux-android, things get even stranger:

define i32 @bar(%struct.Type1*, %struct.Type2*, %struct.Type3*, [2 x i32])

and when a caller uses this function, an alloca of type %struct.Type4 is bitcast to [2 x i32]*, loaded, and then passed as the argument.

It would be great if someone could tell me how to stop Clang from unpacking my arguments, and, if that's not possible, the correct way to handle this difference.


Hi Zhang,

> It would be great if someone could tell me how to stop Clang from unpacking my arguments, and, if that's not possible, the correct way to handle this difference.

There's no way to control this, I'm afraid. The issue is that each
platform has an ABI document that specifies in detail where arguments
have to be passed, in registers and on the stack. LLVM IR isn't really
detailed enough to represent all of these nuances properly.
In particular, you can't always just use the struct type itself, or you'd
be incompatible with other compilers and libraries.

So there is unfortunately a hidden contract between Clang and each
backend: Clang (in lib/CodeGen/TargetInfo.cpp) knows how each backend
will treat simple parameters and does things like the splitting up
you've seen, and adding unused padding arguments to make sure the
arguments go where they should. And because this kind of effort is
sometimes needed, it's viewed as a pretty low priority to preserve the
type where it would be sufficient. Correctness is the most important
metric, after all.

It's not an ideal situation, and there's been talk in the past of
moving some of that logic into a utility library in LLVM itself, or
even of enhancing LLVM IR to handle it more elegantly, but nothing has
come of it yet, I'm afraid. So at the moment you have to either
duplicate that kind of logic yourself or use Clang as a
library to do it for you (I'm not sure of the details, but I believe
Swift takes this approach).



Hi Tim,
Thanks for the input. I guess we'll have to handle each arch individually using the Module's target triple.
But how exactly do those differ? By vendor? By arch? By OS? I'm planning to write a giant switch on the target triple, but I'm not entirely sure which element I should switch on :(.


Both OS and architecture, certainly. I'm not aware of any cases where
the vendor matters; possibly Cygwin vs. MSVC, but I don't think that's
represented as a vendor in the LLVM triple.
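To make the "switch on architecture and OS" idea concrete, here is a rough sketch in plain C++. It parses the triple by hand purely for illustration; in a real tool you would use llvm::Triple (llvm/ADT/Triple.h in LLVM 6) and its getArch()/getOS() accessors instead. The two lowering cases named in the comments are only the ones from this thread, not a complete ABI classification:

```cpp
#include <string>
#include <vector>
#include <sstream>

// How a small two-pointer struct like Type4 ends up in the IR signature.
enum class StructLowering {
    SplitIntoScalars,  // x86_64 SysV: one IR argument per eightbyte
    CoerceToIntArray   // 32-bit ARM AAPCS: coerced to [N x i32]
};

// Illustrative stand-in for llvm::Triple: split "arch-vendor-os-env"
// on dashes. Real code should use llvm::Triple instead.
static std::vector<std::string> splitTriple(const std::string &triple) {
    std::vector<std::string> parts;
    std::stringstream ss(triple);
    for (std::string item; std::getline(ss, item, '-');)
        parts.push_back(item);
    return parts;
}

// Decide the lowering from the arch component; the vendor is ignored,
// matching the advice above. (The OS would matter in other cases,
// e.g. Windows vs. SysV conventions on x86_64.)
static StructLowering classify(const std::string &triple) {
    const std::string arch = splitTriple(triple).at(0);
    if (arch.rfind("thumb", 0) == 0 || arch.rfind("arm", 0) == 0)
        return StructLowering::CoerceToIntArray;
    return StructLowering::SplitIntoScalars;
}
```

With this, classify("thumbv7-none-linux-android") selects the [2 x i32] coercion case while classify("x86_64-apple-macos") selects the split-into-scalars case, matching the two IR signatures seen earlier in the thread.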