[RFC] Attribute that can be used to instruct clang to pass and return non-trivial structs directly

I’d like to propose an attribute that can be used to instruct clang to pass and return non-trivial C++ structs directly when it’s possible to do so. Any feedback would be greatly appreciated.

Motivation for the attribute

We have a user who wants to adopt smart pointers to manage lifetime of objects. The current code looks like this:

struct Object {
  ...
  void release();
};

Object *getObject();

void test() {
  Object *t = getObject();
  ...
  // Users have to call 'release' manually.
  t->release();
}

The smart pointer the user plans to use is a thin C++ template wrapper whose only data member is the raw pointer that points to the object being managed:

template<class T>
struct SmartPtr {
  T *ptr;
  SmartPtr(T *p);
  SmartPtr(const SmartPtr &);
  ~SmartPtr() {
    ptr->release();
  }
};

SmartPtr<Object> getObject();

void test() {
  SmartPtr<Object> t = getObject();
  ...
  // The object 't' points to is automatically released when ~SmartPtr is called.
}

The problem with the code above is that, since SmartPtr is considered to be non-trivial for the purpose of calls according to the C++ ABI, function getObject now returns its return value indirectly via an implicit pointer parameter passed to the function even though the struct has the same in-memory representation as a raw pointer. This breaks ABI compatibility for users who use getObject in their code but cannot rewrite their code to use the new API.

The proposed attribute

The attribute (tentatively named “trivial_abi”) will be used to annotate C++ structs and instruct clang to pass and return the struct directly when it’s possible to do so. This means that the presence of a non-trivial destructor or copy constructor will not force the struct to be passed or returned indirectly, but it doesn’t guarantee the struct will always be passed or returned directly (the struct has to be passed indirectly if its size is too large or there is another data member that is non-trivial, for example).

Besides avoiding ABI breakage, the attribute can potentially improve performance since the extra load instruction to get the value of the pointer isn’t needed when the struct is passed or returned directly.
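
As a rough sketch of what annotating the wrapper might look like: the snippet below assumes the attribute is reachable under its tentative name through Clang's scoped spelling, [[clang::trivial_abi]], and falls back to no attribute on compilers that do not report it. The macro SKETCH_TRIVIAL_ABI, the refs counter, retain(), adopt() and refs_after_scope() are hypothetical names added only for illustration; they are not part of the proposal.

```cpp
#include <cassert>

// Assumption: the attribute is spelled [[clang::trivial_abi]]. Compilers that
// do not report it get an empty macro, so the code still builds (with the
// ordinary indirect passing convention).
#ifdef __has_cpp_attribute
#  if __has_cpp_attribute(clang::trivial_abi)
#    define SKETCH_TRIVIAL_ABI [[clang::trivial_abi]]
#  endif
#endif
#ifndef SKETCH_TRIVIAL_ABI
#  define SKETCH_TRIVIAL_ABI
#endif

// Hypothetical managed object: a bare reference count stands in for whatever
// 'release' does in the real code.
struct Object {
  int refs = 1;
  void retain() { ++refs; }
  void release() { --refs; }
};

template <class T>
struct SKETCH_TRIVIAL_ABI SmartPtr {
  T *ptr;
  explicit SmartPtr(T *p) : ptr(p) {}  // adopts the existing reference
  SmartPtr(const SmartPtr &other) : ptr(other.ptr) { ptr->retain(); }
  SmartPtr &operator=(const SmartPtr &) = delete;
  ~SmartPtr() { ptr->release(); }
};

// The wrapper has the same size as the raw pointer it holds, which is what
// makes passing it directly plausible in the first place.
static_assert(sizeof(SmartPtr<Object>) == sizeof(Object *),
              "SmartPtr is expected to be pointer-sized");

// Returned by value; with the attribute honoured this can come back directly
// instead of through a hidden pointer parameter.
SmartPtr<Object> adopt(Object *o) { return SmartPtr<Object>(o); }

int refs_after_scope() {
  Object obj;                          // refs == 1
  {
    SmartPtr<Object> t = adopt(&obj);  // adopts: refs == 1
    SmartPtr<Object> u = t;            // copy retains: refs == 2
  }                                    // both destructors run: refs == 0
  return obj.refs;
}
```

The observable reference counting is the same with or without the attribute; only the calling convention used for adopt() and for any by-value SmartPtr parameter would differ.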

I'd like to propose an attribute that can be used to instruct clang to
pass and return non-trivial C++ structs directly when it's possible to do
so. Any feedback would be greatly appreciated.

### Motivation for the attribute
We have a user who wants to adopt smart pointers to manage lifetime of
objects. The current code looks like this:

struct Object {
  ...
  void release();
};

Object *getObject();

void test() {
  Object *t = getObject();
  ...
  // Users have to call 'release' manually.
  t->release();
}

The smart pointer the user plans to use is a thin C++ template wrapper
whose only data member is the raw pointer that points to the object being
managed:

template<class T>
struct SmartPtr {
  T *ptr;
  SmartPtr(T *p);
  SmartPtr(const SmartPtr &);
  ~SmartPtr() {
    ptr->release();
  }
};

SmartPtr<Object> getObject();

void test() {
  SmartPtr<Object> t = getObject();
  ...
  // The object 't' points to is automatically released when ~SmartPtr is
  // called.
}

The problem with the code above is that, since SmartPtr is considered to
be non-trivial for the purpose of calls according to the C++ ABI, function
getObject now returns its return value indirectly via an implicit pointer
parameter passed to the function even though the struct has the same
in-memory representation as a raw pointer. This breaks ABI compatibility
for users who use getObject in their code but cannot rewrite their code to
use the new API.

### The proposed attribute
The attribute (tentatively named "trivial_abi") will be used to annotate
C++ structs and instruct clang to pass and return the struct indirectly
when it's possible to do so.

I meant to say "instruct clang to pass and return the struct directly" here.

### The proposed attribute
The attribute (tentatively named "trivial_abi") will be used to annotate C++ structs and instruct clang to pass and return the struct directly when it's possible to do so. This means that the presence of a non-trivial destructor or copy constructor will not force the struct to be passed or returned indirectly, but it doesn't guarantee the struct will always be passed or returned directly (the struct has to be passed indirectly if its size is too large or there is another data member that is non-trivial, for example).

To clarify, the expected behavior here is that the type would get passed using the C ABI for the underlying struct type instead of being forced to use an indirect ABI. There isn't a direct guarantee that the ABI will match that of a pointer if the struct happens to only contain a pointer. Our internal client is satisfied with that.

I guess this attribute would be illegal on a class with virtual methods or virtual bases? That's probably simplest for now.

In semantic terms, this attribute is a guarantee that the type can be destructively moved with memcpy, combined with an assertion that there is no code in the program that isn't aware of that (and therefore that it's acceptable to rely on that in the ABI). By "destructively move" I mean, effectively, a move-initialization immediately followed by the destruction of the source.
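
To make "destructively move" concrete, here is a minimal sketch that spells the operation out in ordinary C++ and compares the result with a plain memcpy of the source bytes. The Handle type, destructive_move() and destructive_move_matches_memcpy() are hypothetical names invented for this illustration; the equivalence shown holds for this particular type and is exactly the property the attribute would ask the author to promise (a type that stored a pointer to itself, for example, would not qualify).

```cpp
#include <cassert>
#include <cstring>
#include <new>
#include <utility>

// Hypothetical handle type: the destructor "releases" by bumping a counter,
// and a moved-from handle releases nothing.
struct Handle {
  int *release_count;
  explicit Handle(int *counter) : release_count(counter) {}
  Handle(Handle &&other) noexcept : release_count(other.release_count) {
    other.release_count = nullptr;
  }
  ~Handle() {
    if (release_count) ++*release_count;
  }
};

// Destructive move as described in the text: move-initialize the destination,
// then immediately destroy the source.
Handle *destructive_move(void *dst, Handle *src) {
  Handle *moved = ::new (dst) Handle(std::move(*src));
  src->~Handle();
  return moved;
}

bool destructive_move_matches_memcpy() {
  int releases = 0;
  alignas(Handle) unsigned char src_buf[sizeof(Handle)];
  alignas(Handle) unsigned char dst_buf[sizeof(Handle)];
  unsigned char copied[sizeof(Handle)];

  Handle *src = ::new (static_cast<void *>(src_buf)) Handle(&releases);
  // Snapshot of the bytes a memcpy-based relocation would have produced.
  std::memcpy(copied, src_buf, sizeof(Handle));

  Handle *dst = destructive_move(dst_buf, src);
  bool same_bytes = std::memcmp(dst_buf, copied, sizeof(Handle)) == 0;
  bool nothing_released_yet = (releases == 0);

  dst->~Handle();  // the one real release happens here
  return same_bytes && nothing_released_yet && releases == 1;
}
```

For this type the move-then-destroy sequence leaves the destination holding the same bytes as the memcpy snapshot and performs no release, so replacing it with a memcpy would be unobservable.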

The destruction convention for arguments of trivial_abi types would be that the callee is responsible for destroying the object (regardless of how it is passed under the C ABI). This allows the callee to just initialize the parameter variable with the passed representation and then destroy that on exit. If the caller were responsible for destroying the object, we would need to implicitly copy in order to protect against the callee changing the value of its parameter variable before returning. Destroying a parameter at callee exit is not the Itanium rule, but it is the MSVC rule, and it is permitted by [expr.call]p4. We will arguably be required to violate the rule that the parameter's destruction "occurs within the context of the calling function", but I believe we can implement this to at least not violate the rule that the parameter is destroyed outside of the scope of a function-try-block in the callee.
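
The difference between caller-side and callee-side destruction of a by-value parameter can be observed with a small logging sketch. Everything here is an assumption made for illustration: the SKETCH_TRIVIAL_ABI macro (which assumes the [[clang::trivial_abi]] spelling and expands to nothing when the compiler does not report it), the Token type, and the run_call() helper. The order of the last two log entries depends on which convention the compiler uses, so no particular order is claimed.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Assumption: the attribute is spelled [[clang::trivial_abi]]; elsewhere the
// macro is empty and the platform's normal convention applies.
#ifdef __has_cpp_attribute
#  if __has_cpp_attribute(clang::trivial_abi)
#    define SKETCH_TRIVIAL_ABI [[clang::trivial_abi]]
#  endif
#endif
#ifndef SKETCH_TRIVIAL_ABI
#  define SKETCH_TRIVIAL_ABI
#endif

static std::vector<std::string> g_log;

// Hypothetical non-trivial type whose destructor records when it runs.
struct SKETCH_TRIVIAL_ABI Token {
  int id;
  explicit Token(int i) : id(i) {}
  Token(const Token &other) : id(other.id) {}
  ~Token() { g_log.push_back("~Token"); }
};

void callee(Token t) {
  (void)t;
  g_log.push_back("callee body");
}

std::vector<std::string> run_call() {
  g_log.clear();
  // The comma operator keeps both calls inside one full-expression.
  // Caller-destroys (the Itanium rule): the parameter outlives the callee and
  // is destroyed at the end of this full-expression, after the second entry.
  // Callee-destroys (the MSVC rule, and the convention described above): the
  // parameter is destroyed as callee() exits, before the second entry.
  callee(Token(1)), g_log.push_back("rest of full-expression");
  return g_log;
}
```

Under either convention the destructor runs exactly once; only its position relative to "rest of full-expression" moves.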

John.

Yes, this idea has been kicking around for a long time. We should definitely have an attribute for this.

We talked for a long time about the idea that this attribute marks a type as not having address identity, kind of like unnamed_addr works for globals. However, I think some identity-related name would probably be confusing to users. Most people don’t understand the vagaries of why std::unique_ptr<T> isn’t passed the same as T*, and they want a fix that says something like “pass_in_registers” or “shut_up_compiler_do_what_I_want”, even though pass_in_registers is kind of meaningless when you’re out of or never had any parameter registers.

Anyway, trivial_abi sounds fine. At least, as someone who has worked in clang call lowering, I understand what it means.

+1 to the idea. I think trivial_abi is a reasonable name for the
attribute as well.

~Aaron

Even when not initializing a function parameter / return object, I assume? (The attribute name seems a little too specific for such cases, but it’s probably fine for it to describe only the primary intended use.)

I think you also need to extend this rule to cover the attributed types: http://eel.is/c++draft/special#class.temporary-3

Would an aggregate containing only such types and trivial types inherit the same property? (More accurately, a type whose copy or move ctor and dtor would be trivial if not for calling corresponding members on such a type.) Or is that a bridge too far for what you’re trying to achieve? Once we implement this, will it be considered part of our immutable ABI surface, or will we have time to iterate on such design questions?

Seems very reasonable to me. I’d love for libc++ to start using this for unique_ptr in its unstable ABI mode.

Even when not initializing a function parameter / return object, I assume? (The attribute name seems a little too specific for such cases, but it’s probably fine for it to describe only the primary intended use.)

Yes, I think it would enable such an optimization. Of course, if we were pursuing that optimization in general, I think we’d want to also add a non-ABI-breaking attribute that could be applied to basically everything in the STL.

I think you also need to extend this rule to cover the attributed types: http://eel.is/c++draft/special#class.temporary-3

Agreed. We may need to insert an implicit copy of the value to perform the call, and since that can be detectable (by ill-advised code), we need a call-out that says that’s legal.

I don’t think we’d thought about documenting this (additionally) in terms of precise edits to the standard, but that’s not a bad idea.

Would an aggregate containing only such types and trivial types inherit the same property? (More accurately, a type whose copy or move ctor and dtor would be trivial if not for calling corresponding members on such a type.)

Yes. I would phrase it the other way: being required to be passed indirectly is a property that’s inherited by containing aggregates.

John.

I don’t think we’d thought about documenting this (additionally) in terms of precise edits to the standard, but that’s not a bad idea.

Wasn’t there a Clang policy about language extensions that required they be at least proposed for standardization? (I can’t seem to find that anymore, but I think Doug proposed it/wrote it up at some point)

Maybe attributes don’t fall under this policy? Not sure.

That’s a very good question. I remember us talking about that, but I don’t think it ever turned into a firm policy. I think the important points about language features are:

  1. We don’t want to take a feature that we don’t like the design of unless we’re forced to by a language standard. We’re allowed to be opinionated about language design! Required, even.

  2. We don’t want to take a feature that’s poorly-specified (again, unless we’re forced to by a language standard :)). The specification doesn’t have to be expressed in terms of precise edits to a standard — among other things, this would often be really annoying, since a lot of features are intended to apply in both C and C++, and they may have implications for other extensions like ObjC/OpenMP/whatever — but it should be at a point where such edits are reasonably extrapolable. I wouldn’t say that it needs to be something that we can imagine an actual standards committee taking, since there are a lot of reasons a committee might reject a feature that don’t necessarily imply a lack of quality; also, this would be rather inconsistent of us, since we’ve certainly taken features in the past that I’m not sure have much chance of standardization.

  3. We want to be very cautious about accepting new language syntax because it could infringe on future language evolution. This is one place where attribute-only features have a substantial advantage.

  4. We want major language features to be maintained. The concern here grows with the amount of code contributed and how tightly it needs to be integrated with the rest of the compiler. This is one of those areas where life is not really fair, because we can’t realistically assume that any single contributor is going to be able to commit to maintaining a feature the same way that an organization can. For example, I personally have a long history of contributing to Clang, and I think the language designs I’ve contributed have been relatively good, but if I proposed a language feature on my own behalf, without any commitment from Apple or anyone else to continue maintaining that contribution if e.g. I got hit by a bus, I’m not sure it would be reasonable for the project to accept my proposal.

And implicit in all of these is that the feature ought to be “open-source” — if you’re going to propose a novel, non-standard feature, you need to be willing to accept feedback about both the specification and its basic design, and it really shouldn’t depend on anything proprietary like a closed-source runtime library. We’re allowed as a project to be opinionated about this sort of thing, too.

But I think if we like the feature, and we like its specification, and we don’t think it infringes on language evolution, and we have strong reason to think it’s going to be maintained, we don’t need to hew tightly to a “no new features” mandate.

John.

That's a very good question. I remember us talking about that, but I
don't think it ever turned into a firm policy.

It's in a somewhat non-obvious place on the website:
http://clang.llvm.org/get_involved.html

That seems essentially reasonable to me. We also need to be cognizant of
the possibility of fracturing the developer community if our extensions
fundamentally change the way that code is written. (Which is not the same
as saying we can't have such extensions, just that they need to be
especially well-considered, and we should have a very good reason if we're
not attempting to standardize them. Clang's header modules support falls
somewhat into this category.)

Attribute-based features don't get an automatic pass, but by their nature
they're much more likely to meet these criteria.

Aha.

Yes, this is an excellent point. A feature that can be #ifdef’ed out is definitely more reasonable to provide as a vendor extension.

John.

Ah, thanks! I kept getting search results listing that page & figured it was a false positive… I should’ve looked more closely.

Sounds like (4) is the only sticky one - and I guess this would be partly the C++ standards committee and the Itanium ABI group. (perhaps this is worth a discussion/proposal on the latter - to avoid anything like the abi_tag difficulties (which I only have a vague sense of)?)

Mostly I just want to make sure we hold ourselves to the same standard that we expect from everyone else - rather than using the rules as a way to keep people out while not necessarily meeting the same bar ourselves. & this seems like a good chance to reflect on the rules, see if they do/still fit, etc. Sounds like they mostly do.

Don’t mean to bog anything down in bureaucracy or anything.

  • Dave

Ah, thanks! I kept getting search results listing that page & figured it was a false positive… I should’ve looked more closely.

Sounds like (4) is the only sticky one - and I guess this would be partly the C++ standards committee and the Itanium ABI group. (perhaps this is worth a discussion/proposal on the latter - to avoid anything like the abi_tag difficulties (which I only have a vague sense of)?)

I’m not quite sure what connection you’re drawing here. In this case, I think (4) is satisfied because Apple is promising to maintain the feature, as we will have internal clients that will rely on it — although it sounds like enough other people are interested in it that that really won’t be a problem.

I think what you may be suggesting is that, if we’re indeed going to revise Clang’s policy about features, we should include a codicil that encourages people to seek standardization of any new Clang vendor extensions when that’s reasonably possible. In practice, the committees will not standardize the Clang feature, they’ll standardize their own hopefully-similar feature, but it’s still useful to seek standardization. This is closely related to Richard’s point about trying to ensure that new features don’t drive fragmentation.

Mostly I just want to make sure we hold ourselves to the same standard that we expect from everyone else - rather than using the rules as a way to keep people out while not necessarily meeting the same bar ourselves. & this seems like a good chance to reflect on the rules, see if they do/still fit, etc. Sounds like they mostly do.

Don’t mean to bog anything down in bureaucracy or anything.

No, you’re absolutely right to bring it up.

Should I draft a revision to the policy? Any other initial commentary before I do?

John.

I’m not quite sure what connection you’re drawing here. In this case, I think (4) is satisfied because Apple is promising to maintain the feature, as we will have internal clients that will rely on it — although it sounds like enough other people are interested in it that that really won’t be a problem.

Ah, sorry, I didn’t mean to refer to (4) in your list, but (4) in the list on “Contributing Extensions to Clang” here: http://clang.llvm.org/get_involved.html

Which says fairly unambiguously: “the extension itself must have an active proposal and proponent within that committee and have a reasonable chance of acceptance. … This criterion does not apply to all extensions, since some extensions fall outside of the realm of the standards bodies.”

It sounds like this bullet point could just be softened a little somehow to accommodate this situation?

Or maybe it does apply & we should make a point of bringing it up to the Itanium ABI group?

(I think since the only surface area in C++ is an attribute, and C++ has syntax entirely designed for extensibility here, there’s no immediate need to bring it up there - it’d barely be meaningful to standardize it since it’s such an implementation detail, I’d guess… - though I suppose the contract “this type doesn’t care about where its bits are in memory” is probably a generically useful property to discuss in the C++ standard, and it would be easy to move from a custom attribute to a standard attribute, etc)

Or maybe it does apply & we should make a point of bringing it up to the
Itanium ABI group?

I think it would make a lot of sense to talk about this extension in the
Itanium C++ ABI, in the same way we talk about the abi_tag attribute. (And
I would consider such documentation to be a prerequisite for using the
attribute in libc++, having learned from our experiences with abi_tag,
where we were on the other side of a vendor extension necessary for ABI
compatibility with a target's standard library.)

(I think since the only surface area in C++ is an attribute, and C++ has
syntax entirely designed for extensibility here, there's no immediate need
to bring it up there - it'd barely be meaningful to standardize it since
it's such an implementation detail, I'd guess... - though I suppose the
contract "this type doesn't care about where its bits are in memory" is
probably a generically useful property to discuss in the C++ standard, and
it would be easy to move from a custom attribute to a standard attribute,
etc)

I actually think it would make some sense to standardize this. It's really
unfortunate that unique_ptr<T> imposes an unnecessary abstraction penalty,
and providing a way to avoid that seems very much in scope for
standardization to me. (It's also not unprecedented; at the previous
meeting EWG approved a [[no_unique_address]] attribute for providing the
EBO layout rules for data members).
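
For readers unfamiliar with the precedent mentioned above, here is a small sketch of what [[no_unique_address]] is meant to do for an empty data member. The macro SKETCH_NO_UNIQUE_ADDRESS and the types EmptyDeleter, PlainMember and AttributedMember are hypothetical names for illustration; the macro expands to nothing on compilers that do not report the attribute, and some ABIs may ignore it, so the sketch only checks that the attributed layout is never larger.

```cpp
#include <cassert>
#include <cstddef>

// Use the attribute only where the compiler reports support for it.
#ifdef __has_cpp_attribute
#  if __has_cpp_attribute(no_unique_address)
#    define SKETCH_NO_UNIQUE_ADDRESS [[no_unique_address]]
#  endif
#endif
#ifndef SKETCH_NO_UNIQUE_ADDRESS
#  define SKETCH_NO_UNIQUE_ADDRESS
#endif

// An empty, stateless deleter, similar in spirit to a default deleter.
struct EmptyDeleter {
  void operator()(int *p) const { delete p; }
};

// Without the attribute the empty member still needs its own address, which
// typically costs a byte plus padding.
struct PlainMember {
  int *ptr;
  EmptyDeleter del;
};

// With the attribute the empty member may share storage with 'ptr', giving
// the layout that the empty base optimization would give a base class.
struct AttributedMember {
  int *ptr;
  SKETCH_NO_UNIQUE_ADDRESS EmptyDeleter del;
};

bool attributed_is_no_larger() {
  return sizeof(AttributedMember) <= sizeof(PlainMember);
}
```

Like the attribute proposed in this thread, it changes layout or calling convention without changing what the program means at the source level.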