C returning struct by value

I'm new to using LLVM and I've started work on a compiler for a language that can interface with C. One thing that caught me off guard was returning a struct from a function by value. It seems that when calling a C function, I need to emit LLVM IR in which the caller emits an alloca for the returned structure, and the C function's signature returns void and takes that pointer as its first argument, annotated with the "sret" attribute.

Can someone help me understand why the frontend needs to know this detail? Why isn't it handled by, say, annotating the C function with an attribute that says it should conform to a certain calling convention, so that the signature can naturally return the struct by value and the backend transforms it into what it should be?

Cheers
-Mike

Unfortunately, the understanding of the C calling convention is split between LLVM and Clang in a rather difficult to understand (and poorly documented) manner. Needing to put the struct return value in an ‘sret’ argument is but one of the many ways in which you need to adjust your LLVM function definition from the “obvious” lowering of a C function.

I don’t think anyone feels it ideal, but it’s what we have. There are a number of reasons why this is not completely trivial to fix.

Many of the backends can do automatic demotion to sret, but the
front-end still needs to be aware of the issues (particularly around
unions, since whether demotion is necessary can depend on more than
just the size of the type).
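
To illustrate why size alone is not enough, here is a small C sketch. The helper naive_needs_sret and its 16-byte threshold are hypothetical, invented for illustration. As far as I understand the AArch64 procedure call standard, a 24-byte struct of integers is returned through memory, while a 32-byte struct of four doubles (a homogeneous floating-point aggregate) is returned in registers, so a size-only rule gets the second case wrong on that target.

```c
#include <assert.h>
#include <stddef.h>

/* 24 bytes of integers. */
struct ThreeLongs { long long a, b, c; };

/* 32 bytes, but every member is a double. */
struct FourDoubles { double a, b, c, d; };

/* Hypothetical, deliberately naive rule: "anything over 16 bytes is
 * returned through a hidden pointer". This is an assumption made for
 * illustration, not a rule taken from any ABI document. */
static int naive_needs_sret(size_t size) {
    return size > 16;
}

/* Shows that the naive rule treats both structs the same way, even
 * though a real ABI may treat them differently. */
static void check_naive_rule(void) {
    assert(sizeof(struct ThreeLongs) == 24);
    assert(sizeof(struct FourDoubles) == 32);
    assert(naive_needs_sret(sizeof(struct ThreeLongs)));
    /* The naive rule also says "sret" here, which I believe is wrong
     * on AArch64 because of the homogeneous-aggregate special case. */
    assert(naive_needs_sret(sizeof(struct FourDoubles)));
}
```

The point is only that a frontend has to look at member types as well as the total size; the exact rules differ per target.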

I'd also expect marginally better code in some cases by using sret
explicitly: the demotion occurs pretty late on and a "%type *sret"
parameter better represents what will actually be happening later on.
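
To make the sret shape concrete, here is a minimal C sketch of the same function written both ways. The names (Big, make_big, make_big_lowered, sum_via_lowered) are made up for this example. The second form mirrors what an explicit sret signature expresses: a void return, with the result written through a pointer passed as the first argument.

```c
#include <assert.h>

/* Hypothetical 24-byte struct, large enough that common targets
 * return it through memory rather than in registers. */
struct Big { long long x, y, z; };

/* What the C programmer writes: return by value. */
struct Big make_big(long long seed) {
    struct Big b = { seed, seed + 1, seed + 2 };
    return b;
}

/* Hand-lowered equivalent: the caller supplies the storage and the
 * callee fills it in. In LLVM IR the first parameter would be the
 * one carrying the sret attribute. */
void make_big_lowered(struct Big *result, long long seed) {
    result->x = seed;
    result->y = seed + 1;
    result->z = seed + 2;
}

/* Caller side: the local variable plays the role of the alloca. */
long long sum_via_lowered(long long seed) {
    struct Big tmp;
    make_big_lowered(&tmp, seed);
    return tmp.x + tmp.y + tmp.z;
}

/* Both forms should produce the same values. */
static void check_lowering(void) {
    struct Big b = make_big(10);
    assert(b.x + b.y + b.z == sum_via_lowered(10));
}
```

This only shows the source-level shape. A real frontend would still need the target's rules to decide which structs get this treatment.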

Cheers.

Tim.

Thanks for the explanation. It’s good to hear the situation isn’t felt to be ideal.

The details here are going to be sensitive to the OS + target that I’m compiling for, right? So the effort here will be to understand and get right the calling convention details for each supported target, yes?

Is there any current plan to change the way this works, or is it more of a dreamy cleanup item that maybe will get addressed some day?

Appreciate the tip.

Yes, it is target specific. You should check
clang/lib/CodeGen/TargetInfo.cpp for details of how calls are lowered from
C for various targets. For some targets (e.g. x86-64), it's decidedly
non-trivial. You can also run clang -emit-llvm to see what actually gets
emitted for particular functions. If you stay away from passing
structs/unions by value, it becomes a *lot* simpler, though...
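
As a starting point for that kind of inspection, here is a small probe file. The file name abi_probe.c and the struct and function names are hypothetical. Running something like `clang -S -emit-llvm -O0 abi_probe.c -o -` prints the IR clang generates for your default target. On x86-64 Linux I would expect make_small to come back as a plain integer and make_large to gain an sret pointer parameter, but the generated IR is the thing to trust, not this description.

```c
/* abi_probe.c (hypothetical name)
 * Inspect with: clang -S -emit-llvm -O0 abi_probe.c -o -
 */
#include <assert.h>

struct Small { int a, b; };            /* 8 bytes with 4-byte int */
struct Large { long long a, b, c; };   /* 24 bytes */

/* Small enough that many targets return it in a register. */
struct Small make_small(int v) {
    struct Small s = { v, v + 1 };
    return s;
}

/* Large enough that many targets return it through memory. */
struct Large make_large(long long v) {
    struct Large l = { v, v + 1, v + 2 };
    return l;
}

/* Sanity check that the functions behave as written, independent
 * of how the compiler lowers the returns. */
static void check_probe(void) {
    assert(make_small(1).b == 2);
    assert(make_large(5).c == 7);
}
```

Adding `--target=<triple>` to the clang command lets you compare how the same two functions are lowered for other targets.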

Some people have tried to make libraries for doing the ABI lowering
available in a way that's not tied to clang. Here's one:
<https://github.com/scross99/llvm-abi> (I have no idea how well, or if, it
works).

I think I've also seen mention of someone constructing the proper classes
to pass to clang to have it emit the C ABI calls from their non-C language,
although I'm not sure where I saw that.

Is there any current plan to change the way this works, or is it more of a
dreamy cleanup item that maybe will get addressed some day?

I don't know of anybody working on changing the way this works. I'd
personally love to work on cleaning it up, someday...but that's a wish, not
a plan.

Appreciate the tip.

I think this (having clang itself emit the C ABI calls for a non-C
language) is what Swift does, so a poke around their repository
could be interesting.

Tim.

There was an interesting talk a couple years ago about how Swift embeds clang:
  http://llvm.org/devmtg/2014-10/Slides/Skip%20the%20FFI.pdf
Now that it's open-source the code might be of interest.

-Ahmed

Appreciate all of the suggestions. I mostly like the idea of using clang to parse C and extract the info from there. I'm not ready to add clang as a dependency, so I'll take care of the ABI details myself for now.

Cheers