[RFC] Lifetime bound check for parameters of coroutines

Motivation

The problem starts with a coroutine which accepts a reference parameter. It is the responsibility of the caller of the coroutine to ensure that the reference argument lives until the coroutine completes (and not just until it returns after the first suspension point).

Reference parameters can unintentionally bind temporaries and local variables of the caller.

For example, a wrapper function, using plain return for coroutine types, makes it easier to blunder and introduce dangling references to locals. The local variable lives only as long as the wrapper function and is destroyed after the return. The coroutine returned by the wrapper would now have dangling references (after its first suspension point).

task<int> coro(const int& a) { co_return a + 1; }

// godbolt.org/z/5Ed5hsETM : stack-use-after-return.
task<int> wrapper(int a) { return coro(a); }

// Ok. 'a' is part of coroutine frame of 'safe_wrapper'.
task<int> safe_wrapper(int a) { co_return co_await coro(a); }

This gets more problematic with template libraries which were previously perfectly safe for synchronous code execution. Now with coroutines, such libraries would become unsafe with the potential introduction of dangling references. std::function and std::bind are such examples.

task<int> coro(const int& a) { co_return a + 1; }

int main() {
    std::function<task<int>(int)> unsafe = coro;
    sync_wait(unsafe(1)); // godbolt.org/z/q557hb35G: stack-use-after-return.

    std::function<task<int>(const int&)> safe = coro;
    sync_wait(safe(1));  // Ok.
    std::function<task<int>(const int&)> unsafe_again = unsafe;
    sync_wait(unsafe_again(1));  // godbolt.org/z/znd8Pqn9d: stack-use-after-return.
}

This is because the implementation of std::function has multiple such wrappers using plain return. See example.


Proposal: Lifetime bound check for parameters of coroutines

Clang already supports sophisticated lifetime bound checks for function parameters annotated with [[clang::lifetimebound]]. This annotation could be used to annotate function parameters to indicate that the entity referred to by that parameter may also be referred to by the return value of the function.

The reference parameters of a coroutine are, basically, lifetime bound to the coroutine return object.

The key proposals of this document are:

  1. Extend lifetime-bound analysis to also find lifetime issues in calls to a coroutine.
  2. Perform this analysis implicitly for calls to a coroutine without needing to explicitly mark coroutine parameters as [[clang::lifetimebound]].
  3. Perform this analysis not just for coroutines but also for plain functions returning a coroutine type.
  4. Opt-in: Allow coroutine implementations to opt-in for such analysis (should not be default as it can have false positives). An implementation can opt-in by annotating the coroutine result type with [[clang::coroutine_lifetimebound]] (new annotation).
  5. Opt-out: It should be possible to disable this analysis for parameters marked explicitly with [[clang::not_lifetimebound]] (new annotation).

Details

Firstly, we need to implicitly perform these checks for coroutines instead of relying on explicitly annotated coroutines. Explicit annotations for every parameter in every coroutine declaration take away readability and are also error-prone if we miss applying them.

Secondly, we need to perform these checks not just for coroutines but also for function wrappers. This is because:

  1. Functions wrapping a coroutine have (mostly) the same lifetime requirements as the wrapped coroutine. In principle, it is possible for a function to not pass its reference parameter to the wrapped coroutine. But it should be fine to be conservative here and consider such params as lifetime bound as well.
  2. More importantly, it is not possible to distinguish between a function wrapper (returning a coroutine object) and a coroutine by merely looking at the function declaration. This information is only available in a function definition (and a definition might not be available at the callsite).

Thirdly, we would want to allow a coroutine library to opt-in for such lifetime checks. We do not want to enroll all coroutine types since these could give false positives due to function wrappers.

False positives

It is possible for this analysis to produce false warnings in certain scenarios. These include

  1. A wrapping function accepting a reference parameter but not passing it to the wrapped coroutine.
  2. A coroutine uses a reference parameter only before its first suspension point.

Proposal: This should be fine to accept in most cases. In order to opt-out, params annotated with [[clang::not_lifetimebound]] should be skipped from this lifetime-bound analysis.

task<int> coro1() { co_return 1; }
task<int> coro2() { co_return 2; }
// False warning for calls to this wrapper.
task<int> foo(const int& val) { return val > 0 ? coro1() : coro2(); }
task<int> fine([[clang::not_lifetimebound]] const int& val) { return val > 0 ? coro1() : coro2(); } 

Future work and improvements

  1. Reference wrapper: Lifetime bound issues with value types behaving like references are not currently handled. Handling such types properly would be beneficial to memory safety in general and not just to coroutines.
  2. Function pointers: Current lifetime analysis only works with call expressions associated to a function declaration. This could be extended to perform the proposed implicit analysis for calls involving function pointers.
task<int> coro(const int&);
using FP = task<int>(const int&);
task<int> foo() {
    FP* fp = &coro;
    return fp(1);
}

MVP of this approach is proposed in this PR.

Thanks for bringing this!

It is indeed an interesting topic. Here are some comments:

It is the responsibility of the caller of the coroutine to ensure that the reference argument lives until the coroutine completes (and not just until it returns after the first suspension point).

I like the assumption. But the problem is that the implementation ([RFC] Perform lifetime bound checks for arguments to coroutine by usx95 · Pull Request #69360 · llvm/llvm-project · GitHub) doesn’t handle this actually. IIRC, the implementation reuses the logic from [[clang::lifetimebound]]. But it is problematic since the lifetime of coroutines are much different than normal functions. Then we can’t reuse such analysis simply. For example:

another_coro_type b() { ... } // this is a coroutine

coro_type a(const int &arg) {
      b(arg);

      co_await something();
}

In this example, we can’t assume the coroutine a would complete after coroutine b got completed since the behavior of coroutines highly depends on the library implementations. Then we may have a lot of false positives and a lot of false negatives.

Perform this analysis implicitly for calls to a coroutine without needing to explicitly mark coroutine parameters as [[clang::lifetimebound]].

This is my biggest concern. We can’t perform implicit things which may bring a lot false positives. If we want such things, I think we must need to make it explicit.

Explicit annotations for every parameter in every coroutine declaration take away readability and are also error-prone if we miss applying them.

It may not be erorr-prone if we forgot to mark something since it won’t break the behaviors of the program. Also we can mark such annotations to the definitions of the coroutine types directly.

Perform this analysis not just for coroutines but also for plain functions returning a coroutine type.

Definitely no.

Secondly , we need to perform these checks not just for coroutines but also for function wrappers.

A function returning a “coroutine type” is not equal to a wrapper. For example, we can define a future class which can be used as the return type of normal functions or we can make it as the return type of coroutines.

But it should be fine to be conservative here and consider such params as lifetime bound as well.

But it may bring more false postivies?

it is not possible to distinguish between a function wrapper (returning a coroutine object) and a coroutine by merely looking at the function declaration. This information is only available in a function definition (and a definition might not be available at the callsite).

Yes. This is a key problem of analysing coroutines. But we can’t/shouldn’t workaround it by pretending other non-coroutine functions as coroutine too.


My point is that we can’t make such things implicit. We should introduce a new attribute (maybe [[clang::lifetimebound_coro]]) and define its semantics clearly (instead of reusing the semantics from [[clang::lifetimebound]]). Then the library writers or programming practice designer can introduce such attributes to their code bases after they understand what it actually means.

I mean, the C++20 coroutines is actually a catagory of coroutines types. Different coroutines may have very different semantics. Personally, it is generally not a thing to talk about the concrete properties for overall C++20 coroutines. It is much more meaningful to talk about the semantics for concrete coroutine types, e.g., std::generator, folly::Task, …

In another word, if we want to bring new semantics to coroutines, it must be an opt-in choice for library writers. And we shouldn’t make decisions for them.


Another similar case I met more frequently is about the lifetime issues of lambda captures in coroutine lambdas. I am wondering if we can reuse such things to help that.

My point is that we can’t make such things implicit. We should introduce a new attribute (maybe [[clang::lifetimebound_coro]]) and define its semantics clearly (instead of reusing the semantics from [[clang::lifetimebound]]). Then the library writers or programming practice designer can introduce such attributes to their code bases after they understand what it actually means.

As discussed privately, we both agree here. The proposal is to introduce a clang annotation and allow library writers to opt-in their implementation for such lifetime checks.

I would suggest to allow use the attribute to coroutine return types. (Or better, the promise_type) .

Could you please give an example where annotating the promise type would be more helpful ? AFAICT annotating coroutine return type should be enough to flag such lifetime issues (even inside coroutines).

Coroutine vs functions
We can’t know if a function is a coroutine by its signature.
One quick solution may be to treat all functions as potential coroutines if their return type is marked with [[clang::lifetimebound_coro]]. But I don’t like the idea since it looks pretty dirty.
I think the potential solution may be to add new attribute [[clang::coro_return_type]] or similar things [[clang::function_must_be_coro]] to let the compiler know a function declaration must be the declaration for coroutines.

In conjunction with the lifetime checks, I think this looks like a nice workaround. Here is a slightly modified proposal:

  • A coroutine implementation could use [[clang::function_must_be_coro]] to enforce that a function returning the marked coroutine return type should be a coroutine.
  • We should also allow users to define wrapping functions if required. Such functions should be explicitly marked with, say, [[clang::coroutine_wrapper]]. This would be primarily helpful for library writers.

Another decision, which is open to debate, could be that explicitly marked coroutine wrappers should be considered coroutines, and lifetimes of arguments to reference parameters of such functions should be checked as well.

One of the results of disabling unmarked coroutine wrappers is that function wrappers would not work with coroutines. These include the likes of std::bind, std::function. We could argue that this is not very different from the approach of not differentiating between coroutines and functions (as it has false positives for such wrappers in many cases). Moreover, banning std::function for coroutines could even be argued to be beneficial as it is extremely prone to lifetime bugs.

the implementation reuses the logic from [[clang::lifetimebound]] . But it is problematic since the lifetime of coroutines are much different than normal functions. Then we can’t reuse such analysis simply.

Is it still a problem even when this is an opt-in check. The motivation behind reusing existing lifetimebound analysis is that reference parameters of a coroutine are, by definition, “lifetimebound” to the coroutine return object (except if they are only used before first suspension).

Another similar case I met more frequently is about the lifetime issues of lambda captures in coroutine lambdas. I am wondering if we can reuse such things to help that.

Could you give an example? It definitely makes sense to accommodate coroutine lambdas in these checks. To an extent, lambda captures are already part of this analysis in general (eg).

For example, if we have a class type as:

class task { ... };
class promise_type_for_task { ... };

And we specialize std::coroutine_traits as:

class coroutine_traits<task, int> {
    using promise_type = promise_type_for_task;
};

Then only if the function returning a task type and it has exactly one parameter with type int can be coroutines. For example, the following code is invalid since the compiler can’t find the promise_type.

task f(double, double) {
    co_await anything();
}

Another similar case I met more frequently is about the lifetime issues of lambda captures in coroutine lambdas. I am wondering if we can reuse such things to help that.

Could you give an example? It definitely makes sense to accommodate coroutine lambdas in these checks. To an extent, lambda captures are already part of this analysis in general (eg).

For example, in a non-coroutine context we’re issuing a coroutine task asynchonously:

   int a = ...;
   auto task = [&a]() -> Task {
       ....
   };
   task.via(an_executor).start();
}

This is a common practice to start a coroutine task asynchonously. The execution of task will be scheduled later by the executors. But as we can see in the example, the lifetime of the referenced local variable a is going to end just after we put the task to the executor.


I skipped the discussion about coroutine wrappers. Since I feel it is not a thing from the language perspective. In the top of the post , it says a function wrapper (returning a coroutine object). But what is a coroutine object? It is not defined in the language already and we don’t define it here neither.

I want to say that we’re designing something as a language extension. So we need to revisit things from the language’s perspective. And make that compatible with the actual use cases. In fact, I can understand what you want to say. But I just want to make it more formal.

I mean, I’d like to see a proposal looking like:

  • Define what is a coroutine object formally.
  • Define what is a coroutine type formally.
  • Define what is a coroutine wrapper formally.
  • Then finally we can define the semantics of [[clang::lifetimebound_coroutine]] formally.

There are cases we can simply achieve. For example, how can we treat a function as a coroutine from its signature purely? We can’t solve the problem from the language side purely. In such case, we can propose a new language extension to solve the issue, like the things that we already mentioned.

For example, if we can introduce a new attribute like [[clang::coro_class]] to the class definition and all the functions that returning the class are treated as coroutine. Then we can find the definition of coroutine wrapper is automatically covered.

My key point is that we should make things clear, explicit and self contained. If we made that, I think I’ll be pretty open to that.

The problem you are trying to solve is indeed a good one. There’s definitely something in the real of annotations that can help around. To get more accurate, soon enough you need some level of idiomatic checks and control-flow information, we catch some of the same issues using clangir by leveraging on both when possible. Overall, my feedback is similar to the points mentioned by @ChuanqiXu

+1

(Thanks for your comments and apologies for the late reply as I was on vacation.)

I have put the key points in a Google Doc for better iterations on feedback. Please take a look and let me know which details you would like to be added.

(Please let me know if you prefer any other channel of discussion.)

Coroutine Return Type (CRT)

A function R func(P1, .., PN) has a coroutine return type R if R has a promise type associated to it, i.e., std​::​coroutine_traits<R, P1, .., PN>​::​promise_type is a valid promise type.

class promise_type_for_task { … };
class task { using promise_type = promise_type_for_future; };

task anything(P1, P2); // task is a CRT always.
class future { ... };
class promise_type_for_future { ... };
class coroutine_traits<future, int> {
using promise_type = promise_type_for_future;
};

future foo(int); // future is a CRT here.
future foo(double, double); // future is not a CRT here.

Coroutine Wrapper

A function which returns a CRT but is itself not a coroutine.

task coro() { co_return 1; } // A coroutine.

task coro_wrapper() { return coro(); } // A coroutine wrapper.

future foo(int) { co_return; } // A coroutine.

future bar(int i) { return foo(i); } // A coroutine wrapper.

future bar(double, double) { return future{}; } // Neither coroutine nor coroutine wrapper.

Note that, from a language perspective, it is not possible to differentiate between a coroutine and a coroutine wrapper by merely looking at their signature.

Lifetime checks: clang::coro_lifetimebound

All arguments to a function are considered to be lifetime bound if the function returns a CRT which is annotated with clang::coro_lifetimebound.

Both coroutines and coroutine wrappers are in scope of such lifetime bound analysis.

Explicit coroutine wrappers: clang::only_explicit_coroutine_wrappers and clang::coroutine_wrapper

A CRT annotated with clang::only_explicit_coroutine_wrappers can only be used as a return type of a coroutine or a coroutine wrapper annotated with clang::coroutine_wrapper.

Such a CRT cannot be used with coroutine wrappers which are not explicitly annotated with clang::coroutine_wrapper.

Thanks. It looks much better now.

A concern here is that the CRT and coroutine wrapper look like language concept but it is not the case. I suggest to add an attribute [[clang::coro_return_type]] to make it explicit. So that the definitions of CRT becomes:

A function R func(P1, .., PN) has a coroutine return type R if **R is marked by [[clang::coro_return_type]] ** and R has a promise type associated to it, i.e., std​::​coroutine_traits<R, P1, .., PN>​::​promise_type is a valid promise type.

Then people can avoid introducing concepts to their code surprisingly. And we can avoid [[clang::only_explicit_coroutine_wrappers]] and [[clang::coroutine_wrapper]] in this way. How do you think about it?

I am fine with adding [[clang::coro_return_type]] to clarify that CRT and coroutine wrappers are clang concepts.

I think it is important to distinguish coroutine wrappers from coroutines and even allow libraries to ban unexplicit or unintentional coroutine wrappers.

If you are suggesting that we also enforce that CRT should only be returned by a coroutine and not by coroutine wrapper then I partially agree here. I would still want [[clang::coroutine_wrapper]] to have some form of explicit allowlist strategy to enable library writers and users to define coroutine wrappers consciously.

Then how about:

  • If the return type of a function is marked with [[clang::coro_return_type]] only, the function must be a coroutine. Otherwise it is invalid.
  • If the return type of a function is marked with [[clang::coro_return_type]] and [[clang::coroutine_wrapper]], the function must be a coroutine or a coroutine wrapper.

And I want to make the concept of coroutines wrapper more formally liike:

  • The coroutine wrapper should create a coroutine frame, and the lifetime of the coroutine frame should be the same with the return object of the coroutine wrapper. Otherwise the program is invalid. No diagnostic is required.

I did not intend to mark the CRT itself with [[clang::coroutine_wrapper]]. In my head, this is supposed to be a function annotation.

Formally,

  • If the return type of a function is a CRT then the function must be a coroutine. Otherwise it is invalid.
  • It is allowed for a non-coroutine to return a CRT if the function is marked with [[clang::coroutine_wrapper]].

Coroutine wrapper

A coroutine wrapper is a function which returns a CRT, is not a coroutine itself and is marked with [[clang::coroutine_wrapper]].
The coroutine wrapper should create a coroutine frame, and the lifetime of the coroutine frame should be the same with the return object of the coroutine wrapper. Otherwise the program is invalid. No diagnostic is required.

WDYT ?

Sounds good to me : )

Great. I have made the changes to doc to reflect the latest agreed proposal.

I will make changes to the PR to implement this.

I have a question about [[clang::coro_return_type]].

In my use case, the return type is an existing type, that is made available as a coroutine return type by specializing std::coroutine_traits, similar to the example given in std::coroutine_traits - cppreference.com, in which coroutine_traits is specialized for std::future .

Where should I in such case put the [[clang::coro_return_type]] marker?

You have to mark the definition of the existing return type as [[clang::coro_return_type]].
If you do not control the definition of the existing return type, I am afraid there is no way to do this analysis in your use case.

Thank you for prompt response.

Hmm, although I could have control over the existing return type, the problem is that this return type is sometimes returned from a coroutine, but sometimes from another, non-coroutine function (this function could actually be a ‘coroutine wrapper’ but more often it is not).

My use case is the seastar::future type from Seastar, which is an async library. Seastar’s development started in the C++14 era, and used continuation passing style (CPS) with lambdas (with seastar::future as the return type). Only later, when C++20 came out, they added coroutine support by specializing std::coroutine_traits for seastar::future.

If I understand correctly, if I would mark the seastar::future type with [[clang::coro_return_type]], all uses of seastar::future as a return type in non-coroutine and non-coroutine-wrapper functions would become invalid/illegal, right…? (And would break all existing continuation-passing-style seastar::future-returning code).

Could your lifetime-check feature still work in my setting? Or would you perhaps see a way how your check can be extended/generalized to support this use case?

For reference:

That is correct. This is an unfortunate downside of C++20 coroutines. It is not possible to tell whether a function is a coroutine or a coroutine wrapper by merely looking at the function signature.

After looking at the function body, we can differentiate between coroutine and coroutine wrappers (a plain function returning a coroutine return type). By definition, there are no other “non-coroutine & non-coroutine-wrapper” functions. All non-coroutines fall under the category of coroutine-wrappers as they are indistinguishable even after looking at the function body (using CPS or otherwise).

Some not-so-great suggestions:

  • This error would not trigger in C++17. Codebases having CPS-style functions would build fine in C++17. To upgrade to C++20, they would need to either mark such functions as coroutine wrappers or migrate themselves to use coroutines instead.
  • Introduce new type seastar::co_future which is same as future but has this annotation.

Could your lifetime-check feature still work in my setting? Or would you perhaps see a way how your check can be extended/generalized to support this use case?

One idea could be to downgrade the error of coroutine wrappers to warnings that you can turn off.
Lifetime analysis would still trigger for all functions and coroutines that return seastar::future.
(But this idea goes against the core proposal of doing such analysis explicitly and opting out explicitly.)

By definition, there are no other “non-coroutine & non-coroutine-wrapper” functions.

Ah, I misunderstood that, and seems to contradict your example in [RFC] Lifetime bound check for parameters of coroutines - #7 by usx95 :

future bar(double, double) { return future{}; } // Neither coroutine nor coroutine wrapper.

Anyway, I can make it work with your not-so-great solution by introducing a thin co_future type around seastar::future . Thanks a lot for your help!

Link to the example in Compiler Explorer:

Hi Utkarsh,
I also tested another footgun, namely lambda-captures with a coroutine lambda. Your analysis does not yet detect the buggy pattern below:

        int z = 10;

        auto k = [z]() -> co_future<int> {
            using namespace std::chrono_literals;
            co_await seastar::sleep(1s);
            co_return z;
        }();

        auto z_is_garbage = co_await std::move(k);

Here, the lambda object (including its captures) is already dead after the first co_await, so the capture z will be garbage when it is returned.

One possible fix is to store the lambda itself in the coroutine frame (before invoking)

        int z = 10;

        auto lamb = [z]() -> co_future<int> {
            using namespace std::chrono_literals;
            co_await seastar::sleep(1s);
            co_return z;
        };

        auto k = lamb();

        auto equal_to_z = co_await std::move(k);

Would you see any way to extend your lifetime analysis so that it can detect this footgun pattern with coroutine-lambdas ?

Oh you are right. A function R func(P1, .., PN) is neither a coroutine nor coro-wrapper if std​::​coroutine_traits<R,P1, .. PN>​::​promise_type is invalid promise type. (Although this is not yet enforced by clang).
But I guess in your case std​::​coroutine_traits<seastar::future,Args...>​::​promise_type is a valid promise type for all Args.... So this case might not be relevant for you.

You are right. This approach definitely has limitations. Lambda captures are quite problematic.
We have a Clangtidy check enforcing core guideline to not use capturing lambda coroutines.
I think that is the right strategy since auto k = [value]() -> co_future<int> {/**/}(); is invalid for almost all interesting cases (if not called directly under a co_await).

If it is valuable, I can try to surface this guideline as a clang warning instead of a tidy check.