Calling Conventions Cont'd

What is the correct procedure for translating a function signature from a high-order language to LLVM? It looks like I replace each struct/array parameter with a 'byval' pointer parameter, and I replace a result struct/array with an 'sret' pointer parameter. The reason I ask is that each calling convention has subtle variations for each architecture and platform. For example, cdecl in Win32 returns small (8 bytes or less) structures in EAX or EAX:EDX; whereas, cdecl in Linux always returns structures "in memory" (Source: http://www.programmersheaven.com/2/Calling-conventions). Does LLVM handle all of these nuances for me?

Thanks,
Jon

Hi Jon,

> What is the correct procedure for translating a function signature from
> a high-order language to LLVM?

You have to do it carefully so that LLVM ends up producing object code
that conforms to the platform ABI. For example, suppose you are using cdecl
with a small struct (a pair of ints, say). Then your function should take two
integer parameters, which LLVM will assign to registers. For a large
struct, use byval instead, which passes a copy of it on the stack.
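
A minimal sketch of the split-or-byval decision described above, written in
Python as a front end might do it while building declarations. Everything
here is an assumption made for illustration: the Struct class, lower_param,
declare, and above all the single 8-byte size cutoff, which is not the rule
of any particular ABI. The byval spelling follows recent LLVM releases with
opaque pointers; older releases wrote it as a typed pointer.

```python
from dataclasses import dataclass

# Sizes in bytes of the few LLVM scalar types this sketch knows about.
SCALAR_SIZES = {"i8": 1, "i16": 2, "i32": 4, "i64": 8, "float": 4, "double": 8}

# Hypothetical cutoff: structs up to this size are split into scalars.
# A real ABI's rule is more involved than one size threshold.
SMALL_STRUCT_LIMIT = 8


@dataclass
class Struct:
    name: str     # e.g. "Pair", rendered as %struct.Pair
    fields: list  # LLVM scalar type names, e.g. ["i32", "i32"]

    def size(self):
        # Ignores padding and alignment, which a real layout must respect.
        return sum(SCALAR_SIZES[f] for f in self.fields)


def lower_param(ty):
    """Return the list of LLVM parameter types used to pass `ty`."""
    if isinstance(ty, str):
        return [ty]                           # scalars pass through unchanged
    if ty.size() <= SMALL_STRUCT_LIMIT:
        return list(ty.fields)                # small struct: one param per field
    return [f"ptr byval(%struct.{ty.name})"]  # large struct: copy on the stack


def declare(ret, name, params):
    """Build an LLVM 'declare' line with every parameter lowered."""
    lowered = [p for ty in params for p in lower_param(ty)]
    return f"declare {ret} @{name}({', '.join(lowered)})"
```

Under these assumptions, a struct of two i32 fields is declared as two i32
parameters, while a struct of three i64 fields becomes a single byval
pointer parameter.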

I know this is painful; hopefully LLVM will get some helpers for this one
day. The reason is that some ABIs make distinctions that don't exist at
the LLVM level. For example, at least one ABI says that "complex"
( { double, double } ) should be passed differently from a struct containing
a pair of doubles.

> It looks like I replace each
> struct/array parameter with a 'byval' pointer parameter, and I replace a
> result struct/array with an 'sret' pointer parameter.

No, it's not so easy, sorry.

> The reason I ask
> is that each calling convention has subtle variations for each
> architecture and platform. For example, cdecl in Win32 returns small (8
> bytes or less) structures in EAX or EAX:EDX; whereas, cdecl in Linux
> always returns structures "in memory" (Source:
> http://www.programmersheaven.com/2/Calling-conventions). Does LLVM
> handle all of these nuances for me?

No, it doesn't.

Best wishes,

Duncan.

Duncan Sands wrote:

> Hi Jon,

>> What is the correct procedure for translating a function signature from a high-order language to LLVM?

> you have to do it carefully so that LLVM will end up producing object code
> that conforms to the platform ABI. For example, suppose you are using cdecl
> with a small struct (a pair of ints, say). Then your function should get two
> integer parameters, which LLVM will assign to registers. If using a large
> struct then use byval which will pass it on the stack.

> I know this is painful, hopefully LLVM will get some helpers for this one
> day. The reason is that some ABIs make distinctions that don't exist at
> the LLVM level. For example at least one ABI says that "complex"
> ( { double, double } ) should be passed differently to a struct containing
> a pair of doubles.

Ugh, this isn't what I wanted to hear. Passing "complex" differently than a structure containing two doubles is a bad design, but alas, calling conventions are beyond our control. How many special cases like this are there? If "complex" is the only special case, LLVM could provide a complex type, which behaves like {double,double} in all respects except for calls. I recommend handling calling conventions entirely in the back end. Apart from calling conventions, the front end doesn't need to know the specific target, only the data layout and which intrinsics the target supports. This approach makes a clean division between the front end and back end.

Best Regards,
Jon

> Ugh, this isn't what I wanted to hear. Passing "complex" differently
> than a structure containing two doubles is a bad design, but alas,
> calling conventions are beyond our control. How many special cases like
> this are there? If "complex" is the only special case, LLVM could

There are a huge number of special cases. Take a look at the craziness in the x86-64 ABI structure passing rules for an example.

> entirely in the back end. Apart from calling conventions, the front end
> doesn't need to know the specific target, only the data layout and which
> intrinsics the target supports. This approach makes a clean division
> between the front end and back end.

If you want to map from a C type/calling convention to LLVM IR, clang is a good way to go. It isn't fully up to snuff with all the ABIs out there, but will be doing much better over the next few months.

-Chris

> entirely in the back end. Apart from calling conventions, the front end
> doesn't need to know the specific target, only the data layout and which
> intrinsics the target supports. This approach makes a clean division
> between the front end and back end.

> If you want to map from a C type/calling convention to LLVM IR, clang is a
> good way to go. It isn't fully up to snuff with all the ABIs out there,
> but will be doing much better over the next few months.

How about extracting the clang code and turning it into a set of helper
routines for people who want to be able to generate C compatible function
signatures?

Ciao,

Duncan.

> Duncan Sands wrote:

>>> What is the correct procedure for translating a function signature from a high-order language to LLVM?

>> you have to do it carefully so that LLVM will end up producing object code that conforms to the platform ABI. For example, suppose you are using cdecl with a small struct (a pair of ints, say). Then your function should get two integer parameters, which LLVM will assign to registers. If using a large struct then use byval which will pass it on the stack.

>> I know this is painful, hopefully LLVM will get some helpers for this one day. The reason is that some ABIs make distinctions that don’t exist at the LLVM level. For example at least one ABI says that “complex” ( { double, double } ) should be passed differently to a struct containing a pair of doubles.

> Ugh, this isn’t what I wanted to hear. Passing “complex” differently than a structure containing two doubles is a bad design, but alas, calling conventions are beyond our control. How many special cases like this are there?

llvm-gcc is probably your best reference on this matter.

> If “complex” is the only special case, LLVM could provide a complex type, which behaves like {double,double} in all respects except for calls. I recommend handling calling conventions entirely in the back end.

This would be convenient, but is unfortunately not realistic. Consider that platform calling conventions are generally defined in terms of C data types. Since LLVM’s data types are by design lower-level than C, they are insufficient to specify platform calling conventions. Therefore, the LLVM IR needs to be annotated. This is done through a variety of mechanisms:

  • ‘cc’
  • ‘byval’
  • ‘sret’
  • aggregate return
  • breaking up structs into multiple parameters
  • merging struct fields into single parameters
  • probably more.
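
Two of the mechanisms in this list, 'sret' and returning the value directly, can be sketched with the Win32 versus Linux cdecl example raised earlier in the thread. This is only an illustration: the target keys, the table, and lower_struct_return are hypothetical names, and the limits are transcribed from the claim quoted in the thread rather than from an ABI document. The sret spelling follows recent LLVM releases with opaque pointers.

```python
# Hypothetical per-target table, taken from the claim quoted in this thread.
# Check the platform ABI document before relying on either entry.
MAX_REGISTER_RETURN_BYTES = {
    "win32-cdecl": 8,  # small structs come back in EAX or EAX:EDX
    "linux-cdecl": 0,  # structs always come back "in memory"
}


def lower_struct_return(target, name, struct_name, size_bytes):
    """Build a declaration for a function that returns a struct by value."""
    limit = MAX_REGISTER_RETURN_BYTES[target]
    if 0 < size_bytes <= limit:
        # Return the struct's bytes packed into one integer so the result
        # travels in registers: i32 for up to 4 bytes, i64 for up to 8.
        bits = 32 if size_bytes <= 4 else 64
        return f"declare i{bits} @{name}()"
    # Otherwise the caller passes a hidden pointer to the result slot.
    return f"declare void @{name}(ptr sret(%struct.{struct_name}))"
```

With this table, an 8-byte struct is declared as returning i64 for the "win32-cdecl" key and as a void function with an sret pointer for the "linux-cdecl" key, which is exactly the kind of per-target divergence the front end has to encode.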

On the other hand, you need only consider this complexity when interoperating with C/C++; your language’s own functions need only be self-consistent.

— Gordon

While other people have pointed out some of the intricacies involved in ABI compliance, I'd just like to point out that you only need to be concerned with this if you're trying to interoperate with C/C++. If you're making the function signatures for use entirely with your own language, then none of this matters. You're free to invent your own calling convention and, as long as you apply it consistently, everything will work.
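
As a rough sketch of what "apply it consistently" can look like, a front end might route every internal definition and every call site through one shared constant. The helper names and the i32-only signatures below are assumptions for illustration; fastcc is one of LLVM's existing calling conventions, chosen here only as an example.

```python
# One place decides the convention for the language's own functions.
INTERNAL_CC = "fastcc"


def define_internal(name, params, body_lines):
    """Emit a definition for an internal function taking i32 parameters."""
    args = ", ".join(f"i32 %{p}" for p in params)
    body = "\n".join(f"  {line}" for line in body_lines)
    return f"define internal {INTERNAL_CC} i32 @{name}({args}) {{\n{body}\n}}"


def call_internal(result, name, args):
    """Emit a call that uses the same convention as the definition."""
    operands = ", ".join(f"i32 {a}" for a in args)
    return f"%{result} = call {INTERNAL_CC} i32 @{name}({operands})"
```

Because both helpers read the same constant, the definition and its callers cannot drift apart, which matters because LLVM treats a call whose convention differs from the callee's as undefined behavior.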

--Owen

Fortunately, clang is built as a set of libraries. This would just be one more interface built on top of the ASTs + LLVM code generator.

-Chris