On passing structures in registers

There are several ways to pass structures in and out of functions, and the fastest is presumably, in most cases, to use registers.

Return:
System V lets us use two 64-bit registers for structure returns, where figuring this out seems to be the frontend’s problem. If the packed struct is <= 128 bits, return it by value as an LLVM aggregate (and use extractvalue and insertvalue to manipulate it); otherwise, write to a pointer with the “sret” attribute and return void. It seems that once I commit to the sret style, LLVM can no longer return the struct in registers, but I am also suspicious of always returning structs by value, and I even recall reading somewhere that LLVM is not good at handling large aggregates (maybe that applies mainly to loads and stores).

Struct arguments:
Here too we have a choice between passing an LLVM aggregate and passing down a pointer argument with the “byval” attribute. As before, it is unclear which should be preferred.

I’m also curious about how best to represent bitvectors, and have noticed clang sometimes casting structures to arrays; I currently use raw arrays of i1 with extractvalue and insertvalue.

My final concern is that the convention needs to be predictable, both for ABI compatibility reasons and to inform appropriate bitcasting of function pointers. I would also like to understand what exactly informs the threshold for switching strategy.

Thanks for any insights,
James

There are several ways to pass structures in and out of functions, and the fastest is presumably, in most cases, to use registers.

Return:
System V lets us use two 64-bit registers for structure returns

SysV does not. A psABI supplement to SysV does. I guess that you're talking about the x86-64 psABI here?

, where figuring this out seems to be the frontend's problem. If the packed struct is <= 128 bits, return it by value as an LLVM aggregate (and use extractvalue and insertvalue to manipulate it); otherwise, write to a pointer with the "sret" attribute and return void. It seems that once I commit to the sret style, LLVM can no longer return the struct in registers, but I am also suspicious of always returning structs by value, and I even recall reading somewhere that LLVM is not good at handling large aggregates (maybe that applies mainly to loads and stores).

Struct arguments:
Here too we have a choice between passing an LLVM aggregate and passing down a pointer argument with the "byval" attribute. As before, it is unclear which should be preferred.

It is unclear. This is, unfortunately, a known problem with LLVM. There is an implicit contract between the back end and front end on how ABI-specific information is lowered, and this often impacts mid-level optimisers. For example, suppose you want to return a structure of two 32-bit values (say, a pair of pointers) in registers on x86, as the BSD / macOS 32-bit ABIs do but the Linux 32-bit ABI does not (though I think Linux does this for _Complex int). This contract says that you should return them in an i64. This then causes problems for alias analysis because the ptrtoint / inttoptr pair is treated as escaping.
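To make the i64 coercion concrete, here is a rough C sketch of what packing two 32-bit fields into one 64-bit integer looks like. Everything in it (struct Pair, pair_to_u64, pair_from_u64) is a hypothetical name invented for illustration, the fields are plain integers rather than pointers, and it assumes the first field lands in the low half, as it would on a little-endian target. With real pointers the same round trip would go through ptrtoint and inttoptr, which is where the alias analysis trouble comes from.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a pair of 32-bit pointers. */
struct Pair { uint32_t first, second; };

static_assert(sizeof(struct Pair) == 8, "Pair should occupy exactly 64 bits");

/* Pack both fields into one 64-bit integer; assumes `first` goes in the
 * low half (little-endian layout). */
uint64_t pair_to_u64(struct Pair p) {
    return (uint64_t)p.first | ((uint64_t)p.second << 32);
}

/* Inverse of pair_to_u64: split the 64-bit integer back into fields. */
struct Pair pair_from_u64(uint64_t bits) {
    struct Pair p = { (uint32_t)bits, (uint32_t)(bits >> 32) };
    return p;
}
```

A caller would receive the uint64_t and immediately call pair_from_u64 on it, so the struct never exists as a first-class value across the call boundary.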

The best advice (which, I admit, is very bad) is to copy whatever clang does.
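One way to act on that advice is to give clang a small C file and read the IR it emits. The sketch below is illustrative only: the struct and function names are made up, and the exact IR depends on the target and the clang version. Compiled with something like `clang -O1 -S -emit-llvm` for an x86-64 System V target, make_small would be expected to return a two-element aggregate of i64, while make_big would be expected to return void and take an extra pointer parameter marked sret.

```c
#include <assert.h>
#include <stdint.h>

/* 16 bytes: fits in two 64-bit integer registers on x86-64. */
struct Small { int64_t a, b; };

/* 32 bytes: over the 128-bit limit, so expected to be returned via memory. */
struct Big { int64_t a, b, c, d; };

static_assert(sizeof(struct Small) == 16, "Small should be 128 bits");
static_assert(sizeof(struct Big) == 32, "Big should be 256 bits");

/* Expected to come back in registers on x86-64 System V. */
struct Small make_small(int64_t x) {
    struct Small s = { x, x + 1 };
    return s;
}

/* Expected to be written through a hidden sret pointer. */
struct Big make_big(int64_t x) {
    struct Big b = { x, x + 1, x + 2, x + 3 };
    return b;
}

/* Argument side of the same question: small vs. large struct by value. */
int64_t sum_small(struct Small s) { return s.a + s.b; }
int64_t sum_big(struct Big b) { return b.a + b.b + b.c + b.d; }
```

Comparing the signatures of sum_small and sum_big in the emitted IR shows the same split on the argument side, and growing struct Small one field at a time shows where the switch happens for a given target.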

Various people have discussed adding an ABI library or a better way of expressing ABI constraints (e.g. function / parameter attributes requiring specific registers / stack locations) into the IR. If you wanted to work on this, I personally would be incredibly happy, but it's a fairly large amount of work.

I'm also curious about how best to represent bitvectors, and have noticed clang sometimes casting structures to arrays; I currently use raw arrays of i1 with extractvalue and insertvalue.

In general, avoid using anything other than i{N*8} as an in-memory representation. i1 is typically legalised to i8 at some point in the back-end lowering, so if you want to guarantee that something is a single bit in memory then you should use an array of i8s and masking operations. The back end will infer bitfield insert / extract instructions where available.
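As a minimal sketch of the "array of i8s and masking operations" layout, written in C rather than IR: the names (struct BitVec, bitvec_get, bitvec_set, NBITS) are invented for this example, and the fixed 20-bit size is arbitrary. Each bit lives at position i % 8 of byte i / 8, so the in-memory size is exactly the number of bytes needed and nothing depends on how i1 is legalised.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fixed-size bit vector: 20 bits stored in 3 bytes. */
enum { NBITS = 20, NBYTES = (NBITS + 7) / 8 };

struct BitVec { uint8_t bytes[NBYTES]; };

/* Read bit i: shift the containing byte down and mask off the low bit. */
static int bitvec_get(const struct BitVec *v, size_t i) {
    assert(i < NBITS);
    return (v->bytes[i / 8] >> (i % 8)) & 1;
}

/* Write bit i: OR the mask in to set it, AND with the inverse to clear it. */
static void bitvec_set(struct BitVec *v, size_t i, int on) {
    assert(i < NBITS);
    uint8_t mask = (uint8_t)(1u << (i % 8));
    if (on)
        v->bytes[i / 8] |= mask;
    else
        v->bytes[i / 8] &= (uint8_t)~mask;
}
```

The equivalent IR would be a load of an i8, a shift and an and (or an or for the store path), which is the pattern the back end can match to bitfield instructions where the target has them.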

My final concern is that the convention needs to be predictable, both for ABI compatibility reasons and to inform appropriate bitcasting of function pointers. I would also like to understand what exactly informs the threshold for switching strategy.

I completely agree that the convention should be predictable. Unfortunately, there is currently little consistency in LLVM over how these implicit contracts between back end and front end are expressed.

The root cause of a lot of these problems is that LLVM occasionally pretends that the contents of memory are typed, but does not do so very well. The opaque pointer work is slowly fixing the most obvious problems here.

David

I failed to send this to the right address, so forwarding to llvm-dev