Detecting (N)RVO

Hi all,

I'm experimenting with a patch to let users mark functions with an attribute

[[clang::requires_rvo]]
std::string f() {
    return std::string("Hello world");
}

and in the presence of this attribute, warn if (N)RVO didn't happen.
This would be useful to prevent small code changes from disabling RVO
in critical paths.

I have the attribute in place, and I've seen that I can check if it
exists with FunctionDecl::hasAttr. But finding where NRVO is actually
decided for a given function turns out to be more difficult...

SemaInit appears to do something for NRVO/copy elision more generally
when constructors are called. But it only marks the CXXConstructExpr
as elidable, as far as I can see, and I'm not sure if this is used to
effect RVO later. I found something in
CodeGenFunction::EmitCXXConstructExpr, but there seems to be more
playing into the decision of whether to construct at the caller or not.

Also, I'm not sure if my terminology is right; I haven't found any
references to RVO-sans-the-N; is that referred to as copy elision? NRVO
only seems to address VarDecls and their lifetime.

Thanks for any ideas,
- Kim

I guess the more direct question is: is there a single place where I
can detect if copy elision/NRVO/RVO has happened or will definitely
happen for the return value of a function?

If not, I guess I'm thinking about the problem the wrong way; is there
a more correct way to ask this question?

Thanks,
- Kim

>
> I'm experimenting with a patch to let users mark functions with an
> attribute
>
> [[clang::requires_rvo]]
> std::string f() {
>     return std::string("Hello world");
> }
>
> and in the presence of this attribute, warn if (N)RVO didn't happen.
> This would be useful to prevent small code changes from disabling RVO
> in critical paths.
>
> I have the attribute in place, and I've seen that I can check if it
> exists with FunctionDecl::hasAttr. But finding where NRVO is actually
> decided for a given function turns out to be more difficult...
>
> SemaInit appears to do something for NRVO/copy elision more generally
> when constructors are called. But it only marks the CXXConstructExpr
> as elidable, as far as I can see, and I'm not sure if this is used to
> effect RVO later. I found something in
> CodeGenFunction::EmitCXXConstructExpr, but there seems to be more
> playing into the decision of whether to construct at the caller or not.
>
> Also, I'm not sure if my terminology is right; I haven't found any
> references to RVO-sans-the-N; is that referred to as copy elision? NRVO
> only seems to address VarDecls and their lifetime.

RVO is a special case of copy elision, where the elided copy is
initializing the returned object.

> I guess the more direct question is: is there a single place where I
> can detect if copy elision/NRVO/RVO has happened or will definitely
> happen for the return value of a function?

Not really, no. Our current model marks the places where elision is
permissible, then leaves the details up to IR generation.
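
One way to observe from the outside whether elision happened, without
looking at Sema or IR generation, is to return an instrumented type and
count its copy/move constructor calls. This is only a rough sketch:
`Tracked`, `make_temporary`, `make_named` and `construction_overhead` are
made-up names for illustration, not anything that exists in clang.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical instrumented type: counts how often it is copied or moved.
struct Tracked {
    static int copies;
    static int moves;
    std::string text;

    explicit Tracked(const char* s) : text(s) {}
    Tracked(const Tracked& other) : text(other.text) { ++copies; }
    Tracked(Tracked&& other) noexcept : text(std::move(other.text)) { ++moves; }
};
int Tracked::copies = 0;
int Tracked::moves = 0;

// Returns a temporary: the RVO case.
Tracked make_temporary() {
    return Tracked("Hello world");
}

// Returns a named local: the NRVO case.
Tracked make_named() {
    Tracked x("Hello world");
    return x;
}

// Calls `make` once and reports how many copy/move constructions it cost.
template <typename Factory>
int construction_overhead(Factory make) {
    Tracked::copies = 0;
    Tracked::moves = 0;
    Tracked result = make();
    assert(result.text == "Hello world");
    return Tracked::copies + Tracked::moves;
}
```

Under C++17 rules `construction_overhead(make_temporary)` should be 0. For
`make_named` the result depends on the compiler: 0 if NRVO was applied, 1
(a move) if it was not.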

> If not, I guess I'm thinking about the problem the wrong way; is there
> a more correct way to ask this question?

P0135R0 was accepted by C++'s evolution working group in Kona, so RVO is
likely to be guaranteed by the language semantics in C++17 onwards.
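
As a sketch of what that guarantee means in practice, assuming a compiler
in C++17 mode or later: a type that can be neither copied nor moved can
still be returned as a temporary, because no constructor call is left to
elide. `Pinned`, `make_pinned` and `pinned_length` are hypothetical names
used only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical type that can be neither copied nor moved.
struct Pinned {
    std::string text;
    explicit Pinned(const char* s) : text(s) {}
    Pinned(const Pinned&) = delete;
    Pinned(Pinned&&) = delete;
};

// Well-formed in C++17 and later: the returned temporary initializes the
// caller's object directly, so no copy or move constructor is required.
Pinned make_pinned() {
    return Pinned("Hello world");
}

// Initializing a local from the call is likewise a direct construction.
std::size_t pinned_length() {
    Pinned p = make_pinned();
    assert(!p.text.empty());
    return p.text.size();
}
```

Before C++17 this is ill-formed, since the deleted constructors still had
to be accessible even when the copy was elided. Returning a named `Pinned`
local would remain an error either way, since NRVO is still optional.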

For NRVO, I think the best model would be an attribute on the variable:

  std::string f() {
    [[clang::returned]] std::string x = "Hello world";
    return x;
  }

... which would give an error if we can't apply NRVO to it.

(Also of note: clang's NRVO heuristic is currently quite weak, and
improving it would seem like a good idea. In principle, we should be able
to put a variable into the return slot in all cases where all
return-statements within its scope return it.)
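
To make that condition concrete, here is a small sketch of a function
where the variable lives in a nested scope and the only return-statement
inside that scope returns it. Whether a given compiler actually applies
NRVO here is not guaranteed, so the counter is only an observation aid.
`Tracked`, `pick` and `overhead_of_pick` are hypothetical names.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical instrumented type: counts copy and move constructions.
struct Tracked {
    static int copies;
    static int moves;
    std::string text;

    explicit Tracked(const char* s) : text(s) {}
    Tracked(const Tracked& other) : text(other.text) { ++copies; }
    Tracked(Tracked&& other) noexcept : text(std::move(other.text)) { ++moves; }
};
int Tracked::copies = 0;
int Tracked::moves = 0;

Tracked pick(bool use_inner) {
    if (use_inner) {
        Tracked x("inner");
        return x;              // the only return within x's scope
    }
    return Tracked("outer");   // x is not in scope here
}

// Calls pick() once and reports the number of copy/move constructions.
int overhead_of_pick(bool use_inner) {
    Tracked::copies = 0;
    Tracked::moves = 0;
    Tracked result = pick(use_inner);
    assert(result.text == std::string(use_inner ? "inner" : "outer"));
    return Tracked::copies + Tracked::moves;
}
```

If `x` is placed in the return slot, `overhead_of_pick(true)` is 0;
otherwise it is 1, from the implicit move on `return x`.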

Hi Richard,

Thanks, good info!

>> Also, I'm not sure if my terminology is right; I haven't found any
>> references to RVO-sans-the-N; is that referred to as copy elision? NRVO
>> only seems to address VarDecls and their lifetime.
>
> RVO is a special case of copy elision, where the elided copy is initializing
> the returned object.

I figured as much, makes sense, thanks!

>> I guess the more direct question is: is there a single place where I
>> can detect if copy elision/NRVO/RVO has happened or will definitely
>> happen for the return value of a function?
>
> Not really, no. Our current model marks the places where elision is
> permissible, then leaves the details up to IR generation.

OK, so IR generation can basically choose to make a copy even if
Sema/CodeGen has marked something as eligible for copy elision? Does
this interact with other optimizations?

> P0135R0 was accepted by C++'s evolution working group in Kona, so RVO is
> likely to be guaranteed by the language semantics in C++17 onwards.

Cool, that's a nice development! This makes it possible to reason more
formally about when elision will happen.

My attribute-guided warning was intended more for user guidance. A
function marked as `requires_rvo` would keep that property over time,
because the compiler would warn as soon as someone changes the code in
a way that prevents RVO, e.g. from

   [[clang::requires_rvo]]
   std::string f() {
      // copy elision
      return std::string("Hello world");
   }

to

   [[clang::requires_rvo]]
   std::string f(int v) {
      std::string a = "Hello world", b = "Farewell to arms";

      if (v < 20)
        return b;

      return a;
   }

(assuming the latter disables NRVO, maybe it doesn't?)
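
A quick way to check that assumption would be to swap std::string for an
instrumented type and count constructions. A sketch, with `Tracked`,
`choose` and `overhead_of_choose` as hypothetical stand-ins:

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical instrumented type: counts copy and move constructions.
struct Tracked {
    static int copies;
    static int moves;
    std::string text;

    explicit Tracked(const char* s) : text(s) {}
    Tracked(const Tracked& other) : text(other.text) { ++copies; }
    Tracked(Tracked&& other) noexcept : text(std::move(other.text)) { ++moves; }
};
int Tracked::copies = 0;
int Tracked::moves = 0;

// Same shape as the second f() above: two locals alive at the same time,
// each returned on a different path.
Tracked choose(int v) {
    Tracked a("Hello world"), b("Farewell to arms");

    if (v < 20)
        return b;

    return a;
}

// Calls choose() once and reports the number of copy/move constructions.
int overhead_of_choose(int v) {
    Tracked::copies = 0;
    Tracked::moves = 0;
    Tracked result = choose(v);
    assert(!result.text.empty());
    return Tracked::copies + Tracked::moves;
}
```

As far as I can tell, both paths should report 1 here: `a` and `b` are
alive at the same time, so neither can occupy the return slot, and the
return value is move-constructed from whichever one is chosen.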

> For NRVO, I think the best model would be an attribute on the variable:
>
>   std::string f() {
>     [[clang::returned]] std::string x = "Hello world";
>     return x;
>   }
>
> ... which would give an error if we can't apply NRVO to it.

This looks easier to implement, but less useful for the case I
presented above. I'm worried about code that assumes RVO is working,
but where user changes suddenly render the compiler unable to deliver.
I generally trust the compiler to do the right thing.

A per-variable attribute would be more of a guide to the compiler to
make the NRVO analysis user-directed, right?

> (Also of note: clang's NRVO heuristic is currently quite weak, and improving
> it would seem like a good idea. In principle, we should be able to put a
> variable into the return slot in all cases where all return-statements
> within its scope return it.)

I think that's the current behavior; you and Nick Lewycky seem to have
fixed that in:
https://github.com/llvm-mirror/clang/commit/130d63a029bc588cb61a6aaea2db6b9ed4c5cc56

Thanks,
- Kim