Status of first-class aggregate types

What’s the current status on support for first-class structs? The last I heard was:

  • Structs no larger than two pointers can be passed / returned / loaded / stored by value.
  • There are plans to expand this in the future to support arbitrary-sized structs as first-class values. (Probably via some transformation pass that converts the return value into a hidden parameter.)

First question - are those statements still true?

Another thing I noticed recently was that although we have insertvalue/extractvalue, there’s no ‘constructvalue’ instruction. I’m thinking of something that would take a structure type plus a list of individual field values and return a first-class aggregate. Right now, if you want to construct a new aggregate, it seems that you either have to alloca it, use GEP to fill in the fields, and then load it, or you have to alloca it, load the uninitialized struct, and then use successive insertvalue instructions to initialize it. At least, I can’t think of another way to do it. (Of course, you can get SSA struct values from other places, but I’m talking about creating a fresh value.)

The reason I am asking all of this is that I am starting to work on supporting tuple types in my language, and I’m trying to decide on the best way to implement them. Since these are value types, not reference types, they need to be copied when passing to or returning from a function. The two approaches I am considering are (a) always pass them by reference internally, using a hidden parameter for the return value, and then have the receiver of the value copy it into an SSA value before use, and (b) for types <= 2 pointers, use SSA values only, and pass larger types as in case (a). Of course, if support for large, first-class SSA values were finished, then I wouldn’t have to care about any of these issues :)
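To make approach (b) concrete, here is a minimal C sketch of the lowering the frontend would emit. It is only an analogy at the C level, not LLVM IR. The type and function names (`SmallTuple`, `LargeTuple`, `make_small`, `make_large`, `sum_large`) are hypothetical and do not come from the thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tuple layouts, chosen only for illustration. */
typedef struct { int tag; void *ptr; } SmallTuple;   /* no larger than two pointers on common targets */
typedef struct { double a, b, c, d; } LargeTuple;    /* four doubles: larger than two pointers */

/* Small tuple: passed and returned directly by value. */
SmallTuple make_small(int tag, void *ptr) {
    SmallTuple t = { tag, ptr };
    return t;
}

/* Large tuple: the caller provides storage and passes its address,
   which plays the role of the hidden return parameter. */
void make_large(LargeTuple *out, double a, double b, double c, double d) {
    assert(out != NULL);
    out->a = a; out->b = b; out->c = c; out->d = d;
}

/* Large tuple argument: received by reference, then copied into a local
   before use so the callee works on its own value. */
double sum_large(const LargeTuple *arg) {
    assert(arg != NULL);
    LargeTuple local = *arg;
    return local.a + local.b + local.c + local.d;
}
```

Approach (a) would simply use the `make_large` / `sum_large` shape for every tuple, regardless of size.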

What’s the current status on support for first-class structs? The last I heard was:

  • Structs no larger than two pointers can be passed / returned / loaded / stored by value.
  • There are plans to expand this in the future to support arbitrary-sized structs as first-class values. (Probably via some transformation pass that converts the return value into a hidden parameter.)

First question - are those statements still true?

I can’t answer this or most of your other questions, but:

Another thing I noticed recently was that although we have insertvalue/extractvalue, there’s no ‘constructvalue’ instruction. I’m thinking of something that would take a structure type plus a list of individual field values and return a first-class aggregate. Right now, if you want to construct a new aggregate, it seems that you either have to alloca it, use GEP to fill in the fields, and then load it, or you have to alloca it, load the uninitialized struct, and then use successive insertvalue instructions to initialize it. At least, I can’t think of another way to do it. (Of course, you can get SSA struct values from other places, but I’m talking about creating a fresh value.)

The way to do this is to start with undef and repeatedly insert into it. For example:
%0 = insertvalue {i1, i8*} undef, i1 false, 0
%1 = insertvalue {i1, i8*} %0, i8* %pointer, 1

Arguably there should be a helper method for this on IRBuilder.

John.

What’s the current status on support for first-class structs? The last I heard was:

  • Structs no larger than two pointers can be passed / returned / loaded / stored by value.
  • There are plans to expand this in the future to support arbitrary-sized structs as first-class values. (Probably via some transformation pass that converts the return value into a hidden parameter.)

First question - are those statements still true?

Yes, I think so. Support is actually better than that; the sticking point is passing and returning large aggregates. I think that improved recently, but I am not sure of the status.

The reason I am asking all of this is that I am starting to work on supporting tuple types in my language, and I’m trying to decide on the best way to implement them. Since these are value types, not reference types, they need to be copied when passing to or returning from a function. The two approaches I am considering are (a) always pass them by reference internally, using a hidden parameter for the return value, and then have the receiver of the value copy it into an SSA value before use, and (b) for types <= 2 pointers, use SSA values only, and pass larger types as in case (a). Of course, if support for large, first-class SSA values were finished, then I wouldn’t have to care about any of these issues :)

I’d pass them by value if they are small but by reference if they are large. Passing large tuples by value isn’t going to provide a win.

-Chris

OK, thanks for that confirmation. Now I can proceed with less trepidation. :)

For large aggregates (well, not huge, but the size of a typical structure or class), do you recommend representing them as SSA values internally within a function, or using allocas? That is, even if they are going to be passed by reference, the question is whether they should also be internally represented that way.

So, for example, if I have a tuple of (double, double, double) representing a 3D coordinate, which is a value type, and I pass that to a function, the first thing I want to do is copy it so that mutations don’t affect the caller. I can either load it into an SSA value and keep it there, or I can create an alloca, load the value, and then store it to the alloca. What I’d like to know is whether either approach is likely to produce more efficient code. (It’s about the same amount of work for me to implement either way.)
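As a rough illustration of the copy-on-entry requirement, the following C sketch receives the coordinate by reference and copies it into a local before mutating it. The names `Vec3` and `scaled_length_squared` are hypothetical. C does not expose the SSA-value versus alloca choice: the local `v` stands in for either one, and the compiler decides how to represent it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical 3D coordinate tuple: (double, double, double). */
typedef struct { double x, y, z; } Vec3;

/* The tuple arrives by reference. The callee copies it on entry so that
   the mutations below never reach the caller's storage. */
double scaled_length_squared(const Vec3 *arg, double k) {
    assert(arg != NULL);
    Vec3 v = *arg;          /* private copy of the value type */
    v.x *= k;               /* mutate the copy only */
    v.y *= k;
    v.z *= k;
    return v.x * v.x + v.y * v.y + v.z * v.z;
}
```

After a call such as `scaled_length_squared(&p, 2.0)`, the caller's `p` still holds its original fields.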

The other question I have is whether there’s a good target-independent test (it can be a conservative test) to know whether or not something can be passed / returned as an SSA value. At the moment I am using a really dodgy heuristic. Of course, I realize that this would be a much better decision to make after I’ve selected a target, but that requires my linker to rewrite function signatures, which frankly I am not sure how to do :)
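For reference, the two-pointer rule of thumb from earlier in the thread can be written as a size check. This is a sketch of a heuristic only, not an ABI guarantee: real calling conventions also look at field types and alignment. The macro `FITS_IN_TWO_POINTERS` and the example types are hypothetical.

```c
#include <stddef.h>

/* Heuristic: treat a type as "small" when it is no larger than two pointers.
   This is an assumption for illustration, not a rule any target promises. */
#define FITS_IN_TWO_POINTERS(T) (sizeof(T) <= 2 * sizeof(void *))

/* Hypothetical example types. */
typedef struct { int flag; void *ptr; } FlagAndPointer;  /* one int plus one pointer */
typedef struct { double x, y, z; } Coord3;               /* three doubles */

/* The same check on a size known only at run time, e.g. inside a frontend
   that has already computed a type's size. */
int size_fits_in_two_pointers(size_t size) {
    return size <= 2 * sizeof(void *);
}
```

Note that `sizeof(void *)` here is the host's pointer size. A cross-compiling frontend would need the target's pointer size instead, which is exactly why the decision is easier once a target has been selected.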

For large aggregates (well, not huge, but the size of a typical structure or class), do you recommend representing them as SSA values internally within a function, or using allocas? That is, even if they are going to be passed by reference, the question is whether they should also be internally represented that way.

Either way should work fine; do whatever works best in your frontend. Manipulating it as a first-class value will probably generate slightly better code because the optimizer will scalarize it completely. Treating it as an alloca will depend more on alias analysis to get good code.

The other question I have is whether there’s a good target-independent test (it can be a conservative test) to know whether or not something can be passed / returned as an SSA value. At the moment I am using a really dodgy heuristic. Of course, I realize that this would be a much better decision to make after I’ve selected a target, but that requires my linker to rewrite function signatures, which frankly I am not sure how to do :)

I don’t think so.

-Chris

In case anyone on this list is interested, I wrote up my solution to the treatment of small aggregates on my blog here: http://machinewords.blogspot.com/2009/12/more-progress-on-tuples.html