struct returns

In the latest snapshot from SVN on X86, llc refuses to compile
functions returning structs larger than two i32 members.

According to the docs, such limitations can be expected to exist on
other platforms.

This leads to a number of questions and observations:

1. Is there a good way to retrieve the current target limitations on
struct return sizes?

2. The sretpromotion pass does not take struct size limitations into
account; it will happily convert an sret parameter with five members
into a return value that llc chokes on.

3. There is no sretdemotion pass.

4. If the answer to #1 is "no", perhaps we need platform-specific
sretpromotion and sretdemotion passes to allow small struct returns to
happen efficiently while large struct returns can be successfully
codegen'd, all without having to build such platform-specific
knowledge into all front-ends.
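To make the two forms concrete, here is a minimal C++ analogy, not LLVM IR and not anything taken from the passes themselves. The names `Five`, `makeFiveByValue`, `makeFiveSret`, and `sum` are hypothetical and exist only for this illustration. The first function returns a five-member aggregate by value (the "promoted" form); the second writes it through a caller-supplied pointer (the sret-style "demoted" form).

```cpp
#include <cassert>
#include <cstdint>

// A five-member aggregate, like the struct mentioned in observation 2.
struct Five {
    std::int32_t a, b, c, d, e;
};

// "Promoted" form: the aggregate is the function's return value.
Five makeFiveByValue(std::int32_t base) {
    return Five{base, base + 1, base + 2, base + 3, base + 4};
}

// "Demoted" (sret-style) form: the caller owns the storage and passes its
// address as a leading out-parameter; the function itself returns nothing.
void makeFiveSret(Five *result, std::int32_t base) {
    assert(result != nullptr && "caller must supply storage for the result");
    *result = Five{base, base + 1, base + 2, base + 3, base + 4};
}

// Sums the members; used only to compare the two forms.
std::int32_t sum(const Five &f) { return f.a + f.b + f.c + f.d + f.e; }
```

Both forms compute the same value; they differ only in who provides the storage, which is the choice the front-end currently has to make per target.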

1. Is there a good way to retrieve the current target limitations on
struct return sizes?

No. The information can be inferred from what's in the *CallingConv.td
files, though there are currently no utilities specifically for this.

4. If the answer to #1 is "no", perhaps we need platform-specific
sretpromotion and sretdemotion passes to allow small struct returns to
happen efficiently while large struct returns can be successfully
codegen'd, all without having to build such platform-specific
knowledge into all front-ends.

Sure. Alternatively, we could fix codegen to do this itself. I'd be
happy to help anyone interested in working on this.

I recently made a major reorganization of the calling-convention
lowering code which cleared away one of the major obstacles to
doing this within codegen.

Dan

That would be even better. I suppose codegen would have to do
something like the sret parameter demotion when the number of members
exceeds the number of available registers? Or did you have something
else in mind?

Fixing this would make my front-end noticeably simpler, and would
probably benefit other front-ends as well, so I'd be willing to spend
some time on it.

Kenneth Uildriks wrote:

So what was the obstacle, and how was it cleared? And how do you see
the large struct return working in codegen?

Anything you care to tell me would be welcome. I will be starting on
this today or tomorrow.

So what was the obstacle, and how was it cleared?

The biggest obstacle is that there used to be two different methods
for lowering call arguments; some of the targets used one and some
used the other. There wasn't a good reason for having two, but no one
had taken the time to update all the targets. Now, all targets are
using the same basic set of hooks. And, the hooks are more
straightforward than the mechanisms they replaced.

And how do you see
the large struct return working in codegen?

One part of the action will be in
lib/CodeGen/SelectionDAG/SelectionDAGBuild.cpp. This is where LLVM IR
is translated into the special-purpose instruction-selection IR, which
is lower-level. Calls are split up into multiple parts which are
eventually lowered into the actual instructions for the calling
sequence. The main areas of attention will be

SelectionDAGISel::LowerArguments
SelectionDAGLowering::LowerCallTo
SelectionDAGLowering::visitRet

These functions are responsible for breaking up LLVM IR values into
register-sized pieces and handing them off to target-specific code
through these virtual functions:

TLI.LowerFormalArguments
TLI.LowerCall
TLI.LowerReturn

(Actually, SelectionDAGLowering::LowerCallTo calls
TargetLowering::LowerCallTo, which calls TargetLowering::LowerCall,
for historical reasons.)

Basically, the task here is to interpose code which will recognize
when an automatic sret is needed, set up a static alloca to hold the
value (see the StaticAllocaMap), and adjust the argument list and
return code accordingly.
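As a rough illustration of the "adjust the argument list and return code" step, here is a toy C++ model. It does not use any LLVM classes: `ToySignature`, `demoteReturnIfNeeded`, the `"ptr-to-result"` marker, and the `maxReturnRegs` parameter are all hypothetical names invented for this sketch. The idea is only that a return value needing more register-sized pieces than the target offers is replaced by a hidden leading pointer parameter and a void return.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-in for a function signature; not an LLVM class.
struct ToySignature {
    std::vector<std::string> returnPieces;  // register-sized pieces of the return value
    std::vector<std::string> params;        // parameter types, in order
    bool hasHiddenSret;                     // true once the return has been demoted
};

// If the return value needs more pieces than there are return registers,
// rewrite the signature to take a hidden leading pointer and return nothing.
ToySignature demoteReturnIfNeeded(const ToySignature &sig,
                                  std::size_t maxReturnRegs) {
    assert(!sig.hasHiddenSret && "signature was already demoted");
    if (sig.returnPieces.size() <= maxReturnRegs)
        return sig;  // fits in registers: leave the signature unchanged

    ToySignature lowered;
    lowered.hasHiddenSret = true;
    lowered.params.push_back("ptr-to-result");  // hidden sret-style parameter
    lowered.params.insert(lowered.params.end(),
                          sig.params.begin(), sig.params.end());
    // lowered.returnPieces stays empty: the function now returns void.
    return lowered;
}
```

With an assumed limit of two return registers, a two-piece return is left alone and a five-piece return gains the hidden parameter. In the real code the caller side would also need the stack slot for the result, which this sketch leaves out.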

For recognizing when an sret is needed, it'll be necessary to know
what the target supports. This is described in the targets'
*CallingConv.td files. Currently the consumer of this information
is the CallingConvLowering code in

include/llvm/CodeGen/CallingConvLower.h
lib/CodeGen/SelectionDAG/CallingConvLower.cpp

This code is currently used from within the target-specific code
inside LowerFormalArguments and friends. However, it could also
be called from the SelectionDAGBuild directly to determine
if there are sufficient registers. It'll need to be extended
some, because it calls llvm_unreachable() when it runs
out of registers, which is the behavior we're trying to avoid
here :-).
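The extension described above amounts to turning "abort when out of registers" into "report failure so the caller can fall back to sret". A minimal sketch of that shape in plain C++ follows. It is not the CallingConvLower interface: `ToyReturnRegs`, `tryAssignReturnRegs`, and `needsAutomaticSret` are hypothetical names, and the register ids are arbitrary integers.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model of the registers a target offers for return values.
struct ToyReturnRegs {
    std::vector<int> regs;  // hypothetical register ids, in assignment order
};

// Try to assign one register per piece. Returns false (and leaves `assigned`
// empty) when the registers run out, instead of aborting.
bool tryAssignReturnRegs(const ToyReturnRegs &target, std::size_t numPieces,
                         std::vector<int> &assigned) {
    assigned.clear();
    if (numPieces > target.regs.size())
        return false;
    assigned.assign(target.regs.begin(),
                    target.regs.begin() + static_cast<std::ptrdiff_t>(numPieces));
    assert(assigned.size() == numPieces);
    return true;
}

// A return value needs automatic sret exactly when the assignment fails.
bool needsAutomaticSret(const ToyReturnRegs &target, std::size_t numPieces) {
    std::vector<int> scratch;
    return !tryAssignReturnRegs(target, numPieces, scratch);
}
```

The point of the boolean result is that the query can be made early, before any target-specific lowering has committed to a calling sequence.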

If you're not familiar with the SelectionDAG IR, feel free to
ask questions. I recommend using the -view-dag-combine1-dags
option, which provides a visualization of the SelectionDAG for
each basic block immediately after it has been constructed, to
get an idea of what's being built.

Anything you care to tell me would be welcome. I will be starting on
this today or tomorrow.

Ok, let me know if I can answer any questions.

Dan

I wish to assure you that I have not forgotten this task, nor failed
to start on it, but I cannot give even a rough estimate on when it
will be completed.

It occurs to me that all declarations of a function pointer, and all
bitcasts to a function pointer, could possibly refer to a function
whose signature must be altered by this fix. Is the function
signature relevant to the SelectionDAG representation of said function
pointers, or can it be safely ignored when lowering loads, stores, and
bitcasts involving such pointers?

Also, I cannot build the test suite: the option "-disable-llvm-optzns"
passed to llvm-gcc produces several warnings (cc1 seems to think every
letter after 'd' is an individual option), and the option "-m32"
passed to llvm-gcc produces an "unknown command line argument" error
from "cc1". I have been using llvm-gcc extensively to build my own
front-end project, and have not had a problem with it. I am reluctant
to make further changes to the source without being able to run the
test suite and satisfy myself that I have not broken something. I am
running version 4.2.1 of llvm-gcc from the 2.5 release... should I
take a later development snapshot of llvm-gcc?

I wish to assure you that I have not forgotten this task, nor failed
to start on it, but I cannot give even a rough estimate on when it
will be completed.

Ok, that's fine. Thanks for keeping me up to date.

It occurs to me that all declarations of a function pointer, and all
bitcasts to a function pointer, could possibly refer to a function
whose signature must be altered by this fix. Is the function
signature relevant to the SelectionDAG representation of said function
pointers, or can it be safely ignored when lowering loads, stores, and
bitcasts involving such pointers?

No. Fortunately, you don't have to worry about complicated bitcast
situations here. There are only two constructs which are affected:
function definitions and function calls. And in the case of calls,
the only thing that matters is the type of the call operand itself,
not what the operand might have been bitcast from.

LLVM can't always see what the operand may have been bitcast from,
so it just has to trust the user. If the dynamic callee's type doesn't
match the static operand type on the call, it's undefined behavior.
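A C++ analogy of this rule, offered only as an illustration and not as LLVM IR: a function pointer may be stored under some other function-pointer type, but it has to be converted back to the callee's real type before the call. The names `Pair`, `makePair`, `GenericFn`, `PairFn`, `erase`, and `callAsPairFn` are hypothetical.

```cpp
#include <cassert>

struct Pair { int first, second; };

// The actual callee.
Pair makePair(int x) { return Pair{x, x + 1}; }

// A type-erased storage type for callbacks, and the callee's real type.
using GenericFn = void (*)();
using PairFn = Pair (*)(int);

// Store the pointer under a different function-pointer type.
GenericFn erase(PairFn fn) { return reinterpret_cast<GenericFn>(fn); }

// Correct use: convert back to the callee's real type before calling, so the
// static type at the call matches the function that actually runs.
Pair callAsPairFn(GenericFn stored, int x) {
    assert(stored != nullptr);
    PairFn fn = reinterpret_cast<PairFn>(stored);
    return fn(x);
}
```

Calling `stored` directly through `GenericFn` would be the mismatch case described above; only the type used at the call site matters, not the type the pointer was stored under.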

Also, I cannot build the test suite: the option "-disable-llvm-optzns"
passed to llvm-gcc produces several warnings (cc1 seems to think every
letter after 'd' is an individual option), and the option "-m32"
passed to llvm-gcc produces an "unknown command line argument" error
from "cc1". I have been using llvm-gcc extensively to build my own
front-end project, and have not had a problem with it. I am reluctant
to make further changes to the source without being able to run the
test suite and satisfy myself that I have not broken something. I am
running version 4.2.1 of llvm-gcc from the 2.5 release... should I
take a later development snapshot of llvm-gcc?

The -disable-llvm-optzns is preceded by a -mllvm, but it's likely
that that didn't work in 2.5 llvm-gcc. If you don't want to live on
the latest snapshot, the 2.6 pre-release (and the 2.6 release, once
it exists) should work here. As a temporary workaround, you might
also be able to replace "-mllvm -disable-llvm-optzns" with "-O0",
which isn't exactly the same, but basically works.

I'm not as familiar with what might be going on with the -m32 option.
What host are you on, and what targets is your llvm-gcc configured
for? Does it include 64-bit support? It may be that an llvm-gcc
configured for 32-bit only doesn't recognize -m32. I'm not sure
what to suggest there. Perhaps the Makefile needs to be smarter.

Dan

What about the type of the ptr-to-function-ptr that the call operand
was *loaded* from? This will come up whenever a function pointer is
stored in callback situations. If I change the call operand, it won't
match the element type of the pointer it was loaded from. Does this
matter in a SelectionDAG?

I am on 32-bit x86 Linux, with no 64-bit support at all. llvm-gcc is
configured for C and C++... I didn't add any other languages or
targets onto the defaults for LLVM or llvm-gcc. I was hoping not to
install another version of llvm-gcc, since it is quite a beast and I
don't want to break the one I already have running. (Giving it a new
prefix should be safe, right?) I pretty much have to live on the
snapshot for the folder that I'm working on LLVM code in, so I'll just
have to bite the bullet and make it work.

No, and it doesn't matter in LLVM IR either. It's a front-end's
responsibility to ensure that the (static) type of the call operand
is compatible with the type of all actual callees that it can call
at runtime.

Dan

And mine!

Does it not handle two double-precision floats for the C99 complex type? Or
did you mean "larger" as in more fields rather than larger fields?

I have not tested it with two double-precision floats. I tested it
with two i32's and with three i32's. My overall point is that there
are built-in passes which turn code that llc likes into code that llc
chokes on, and that handling large struct returns is better handled in
LLVM than in each individual front-end.

Rather than installing and running the dev version of llvm-gcc, I will
wait until the 2.6 release and upgrade LLVM and llvm-gcc at that time,
and once that is done, all LLVM tests pass, and my own (unrelated)
front-end project builds and runs successfully with the upgrade, I
will continue working on struct return.

If anyone has a better LLVM development environment than mine and
wishes to take over this task in order to hurry its completion, please
let me know.