Pass and return of large objects

Suppose you have something like 'struct foo { char data[10000000]; }',
and you want to pass such objects as function parameters, and return
them as function results, by value, does this currently work correctly
on all supported target platforms?

Hi Russell,

Suppose you have something like 'struct foo { char data[10000000]; }',
and you want to pass such objects as function parameters, and return
them as function results, by value, does this currently work correctly
on all supported target platforms?

as far as I know this only works on x86, and only with LLVM from svn.

Ciao,

Duncan.

Oh... :(

Are there any plans to change this? It's needed for a correct
implementation of C, after all.

Suppose you have something like 'struct foo { char data[10000000]; }',
and you want to pass such objects as function parameters, and return
them as function results, by value, does this currently work correctly
on all supported target platforms?

Yes, it should. But on many (almost all) platforms this will surely
overflow the stack.
Also, these objects are not supported on 16-bit targets :)

Hi Russell,

Are there any plans to change this? It's needed for a correct
implementation of C, after all.

I guess it will happen when someone needs it enough. My understanding is that
it wouldn't be hard to implement for most processors. By the way, you are wrong
to say that it is needed for a correct implementation of C. Here's an analogy:
consider the x86 instruction set. It does not have an instruction for returning
arbitrarily large arrays. In fact it doesn't have any instructions for
manipulating arrays or structs at all. Even worse, it doesn't even have a
notion of a function! So clearly C programs cannot run on x86 processors! Yet
still they do, how can this be? The same answer applies to LLVM.

Ciao,

Duncan.

Yes, it should.

Hmm, Duncan Sands says otherwise?

But on many (almost all) platforms this will surely
overflow the stack.

Well, the stack size can usually be tweaked at runtime. But suppose we
have the default stack size, and replace the 10 MB object with 100 kB,
the question stands.

Also, these objects are not supported on 16-bit targets :)

True :) Well, on a 16-bit system, can you pass and return a 1 kB object?

Well yes, LLVM is Turing complete :) but I take your point, one could
in a pinch hack the same functionality in the front-end. Would that
likely be easier or harder than doing it properly in the code
generator? Other things equal, of course the latter would be
preferable.

How do LLVM-GCC and Clang handle it?

Oh... :(

Are there any plans to change this? It's needed for a correct
implementation of C, after all.

No, C only requires support for objects up to 65535 bytes in size. C99 5.2.4.1.

IIRC they lower it themselves, doing whatever the ABI says they
should, which is usually adding a hidden sret parameter to the
function once you get beyond small structs. It'd be nice to move some
of that logic back into LLVM, but it's tricky because C99 says things
about complex numbers which requires special frontend type knowledge
that LLVM doesn't have.

I'm just echoing previous discussion, and you can probably get a more
reliable answer by finding the original discussion in the archives.

Reid
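
To make the sret idea concrete, here is a minimal C sketch of the same function written two ways: once returning the struct by value, and once hand-lowered so that the caller passes a pointer to the result storage, which is roughly what a hidden sret parameter amounts to. The names make_by_value, make_lowered and forms_agree are made up for illustration, the 100 kB size is the one mentioned earlier in the thread, and this is not a claim about what any particular front-end emits.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* 100 kB payload, the smaller size suggested in the thread. */
struct foo { char data[100000]; };

/* Source-level form: the struct is returned by value. */
struct foo make_by_value(char fill) {
    struct foo f;
    memset(f.data, fill, sizeof f.data);
    return f;
}

/* Hand-lowered form (illustrative): the caller owns the result storage
   and passes its address, much like a hidden sret parameter. */
void make_lowered(struct foo *result, char fill) {
    memset(result->data, fill, sizeof result->data);
}

/* Returns 1 if both forms produce identical bytes. */
int forms_agree(char fill) {
    static struct foo a, b;  /* static keeps the two big results off the stack */
    a = make_by_value(fill);
    make_lowered(&b, fill);
    return memcmp(&a, &b, sizeof a) == 0;
}
```

Calling forms_agree('x') should report that the two forms fill the object identically; only the way the result travels differs.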

Anton assumed you were talking about compiling C using llvm-gcc or clang,
I assumed you were talking about LLVM IR. The C front-ends support this
construct on all platforms, but the analogous LLVM IR construct is not
supported on some platforms.

Ciao,

Duncan.

65535 bytes would be reasonably sufficient; what's the largest
supported by the LLVM code generator?

The common code shouldn't have problems with it AFAIK. How well it works on a particular target/ABI combination depends on whether anybody has been interested enough to implement it; free software is like that, I'm afraid. But you are not likely to run into a limitation based on size; if it works for an object of size 50 or so it should work for a bigger one.

On any target/ABI combination you may run into runtime stack overflows if you try to do too much, particularly if recursion is involved (as Anton mentioned). This is not a compiler issue, but a limitation of the target environment.

IIRC they lower it themselves, doing whatever the ABI says they
should, which is usually adding a hidden sret parameter to the
function once you get beyond small structs.

Okay, so we seem to be saying sret or suchlike is how you pass and
return large objects by value in LLVM.

What exactly counts as large? As I understand it, the largest integers
correctly handled as first-class values are the size of two pointers,
i.e. usually either 64 or 128 bits. Is the same true of structs?

It'd be nice to move some
of that logic back into LLVM, but it's tricky because C99 says things
about complex numbers which requires special frontend type knowledge
that LLVM doesn't have.

I hadn't realized that, I would've expected complex numbers to be
doable as just a pair of scalar values. What's the fly in the ointment
with C99 complex numbers?

I'm just echoing previous discussion, and you can probably get a more
reliable answer by finding the original discussion in the archives.

I don't suppose you've any idea what search keywords might work?

It'd be nice to move some
of that logic back into LLVM, but it's tricky because C99 says things
about complex numbers which requires special frontend type knowledge
that LLVM doesn't have.

I hadn't realized that, I would've expected complex numbers to be
doable as just a pair of scalar values. What's the fly in the ointment
with C99 complex numbers?

I don't really know, I just remember that it's been brought up
before as one of the difficulties of fully supporting passing and
returning large structs by value. Hopefully someone more
knowledgeable can answer this question?

I'm just echoing previous discussion, and you can probably get a more
reliable answer by finding the original discussion in the archives.

I don't suppose you've any idea what search keywords might work?

I searched "sret" which found most of the discussion. Kenneth
Uildriks wrote the support for lowering large struct returns on x86
to a hidden sret parameter in accordance with the x86 ABI, but no one
has stepped forward to add support for other targets.

Reid

If you're wondering what this limit is, you're probably heading down
the wrong path, unless your ultimate interest here is to work on
optimizer techniques for transforming this kind of code into
something usable.

Dan

No, I'm just trying for correctness -- most of the time, only small
objects will be passed by value, for obvious efficiency reasons. If it
turns out the limit is the same as for integers, for example, i.e. the
size of two pointers, then I can say, OK, that's the threshold, and
anything bigger than that gets explicitly implemented as sret.
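
A minimal sketch of that threshold rule, assuming the limit really is the size of two pointers (which nobody in the thread has confirmed). The enum pass_mode and the function classify_by_size are hypothetical names, and a size-only rule like this is a private convention for a front-end, not an implementation of any platform ABI.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical front-end policy: how a by-value object gets passed. */
enum pass_mode { PASS_DIRECT, PASS_INDIRECT };

/* Assumed threshold: objects up to two pointers wide are passed as
   first-class values, anything larger goes through a pointer (sret style). */
enum pass_mode classify_by_size(size_t size_in_bytes) {
    const size_t threshold = 2 * sizeof(void *);
    return size_in_bytes <= threshold ? PASS_DIRECT : PASS_INDIRECT;
}

/* Small self-check of the boundary behaviour. */
void check_threshold_policy(void) {
    assert(classify_by_size(2 * sizeof(void *)) == PASS_DIRECT);
    assert(classify_by_size(2 * sizeof(void *) + 1) == PASS_INDIRECT);
}
```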

Hi Reid,

I hadn't realized that, I would've expected complex numbers to be
doable as just a pair of scalar values. What's the fly in the ointment
with C99 complex numbers?

the x86-64 ABI requires complex numbers to be passed *differently* to
a pair of scalar values. So if the LLVM code generators were to do the
lowering, then there would need to be a way to say: this pair of scalars
is just a pair of scalars - pass it normally; but this other pair of scalars
is really a complex number - pass it using a different method. If you take a
pessimistic view, then in order for LLVM to do the lowering then the entire C
type system would somehow have to be injected into the LLVM IR.

Ciao,

Duncan.
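
As a small illustration of why type knowledge is the sticking point: in C99 a complex value has the same representation as an array of two values of the corresponding real type, so a plain struct of two doubles can carry the same data. The sketch below converts between the two; struct pair and both function names are made up for the example. It shows only that the data is interchangeable, and says nothing about which registers or memory a given ABI uses for either type.

```c
#include <assert.h>
#include <string.h>

/* A plain pair of scalars holding the same data as a complex number. */
struct pair { double re, im; };

/* Copy a complex value out through its two-element array representation
   (real part first, then imaginary part, per C99). */
struct pair pair_from_complex(double _Complex z) {
    double parts[2];
    struct pair p;
    memcpy(parts, &z, sizeof parts);
    p.re = parts[0];
    p.im = parts[1];
    return p;
}

/* The reverse direction: build a complex value from the two scalars. */
double _Complex complex_from_pair(struct pair p) {
    double parts[2] = { p.re, p.im };
    double _Complex z;
    memcpy(&z, parts, sizeof z);
    return z;
}
```

Once both types are reduced to "two doubles" the distinction is gone, so a code generator that sees only the pair cannot know which convention was wanted.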

Oh!

Hmm. I've already decided to implement complex numbers basically the
classic C++ way, as a template. On the bright side, hopefully that
means the LLVM code generators will end up putting them in pairs of
registers. On the downside, I suppose that means I'll be honoring the
x86-64 ABI for complex numbers in the breach rather than the
observance. Is that going to have any horribly bad consequences?

I would be inclined to guess no, the ABI mostly matters when you're
calling libraries to do various kinds of IO/protocol translation, and
those wouldn't normally use complex numbers.

Graphics libraries do use things most other kinds of libraries don't.
Are complex numbers among them? Not that I can remember hearing; I'm
not an expert in that domain.

If I'm painting myself into a corner here - or if it's the case that
this is fine for what I'm doing, but a big potential pitfall for other
people generating x86-64 code in other contexts - someone please point
it out?

Hi Russell,

No, I'm just trying for correctness -- most of the time, only small
objects will be passed by value, for obvious efficiency reasons. If it
turns out the limit is the same as for integers, for example, i.e. the
size of two pointers, then I can say, OK, that's the threshold, and
anything bigger than that gets explicitly implemented as sret.

if you want correct ABI conformance, then (sadly) the front-end has to do
the lowering itself. For example, even on platforms for which the code
generators can handle returning arbitrarily large values from a function,
the way in which it returns them may not be how the ABI says they should
be returned. If you don't want correct ABI conformance, then you can do
whatever you like. For example, there is no need to use the sret attribute,
you can just pass the pointer - the sret attribute only exists to get ABI
conformance on some platforms.

Ciao,

Duncan.

Hi Russell,

Hmm. I've already decided to implement complex numbers basically the
classic C++ way, as a template. On the bright side, hopefully that
means the LLVM code generators will end up putting them in pairs of
registers. On the downside, I suppose that means I'll be honoring the
x86-64 ABI for complex numbers in the breach rather than the
observance. Is that going to have any horribly bad consequences?

it only matters if you will be linking with complex number code compiled
with a different compiler, e.g. BLAS/LAPACK or the complex math functions.

I would be inclined to guess no, the ABI mostly matters when you're
calling libraries to do various kinds of IO/protocol translation, and
those wouldn't normally use complex numbers.

The ABI specifies how parameters are passed when calling a function,
and how values are returned from function calls. If your code calls
e.g. "cabs" then you are probably dead.

Ciao,

Duncan.