TLDR: Reviving the discussion "[RFC] Debug info for coroutine suspension locations" from 2022; proposing to emit artificial DW_TAG_label entries from the CoroSplit pass to map suspension point ids to the corresponding source code lines / code addresses. Draft implementation in #141937.
Motivation and use case
When inspecting a generator
generator<int> foo() {
co_yield 1;
co_yield 2;
co_yield 1;
}
auto gen = foo();
it is not currently possible to write a debugger script to figure out at which line the generator gen is currently suspended.
A similar problem also arises when using coroutines for asynchronous programming. It is possible to get an asynchronous stack trace using debugger scripts (see, e.g., "Debugging C++ Coroutines" in the Clang 21.0.0git documentation for an example). However, those scripts currently only show the function names, but not the exact line numbers where a coroutine was suspended.
E.g., when having a coroutine
task foo() {
co_await bar();
co_await baz();
co_await bar();
}
the debugger script can only give us the backtrace "bar" was called from "foo", but it can't tell us whether we are suspended at the 1st or the 2nd call to bar.
Both issues can be solved if we have a way to get the exact source location where a coroutine is suspended by inspecting a std::coroutine_handle in the debugger.
Current state of debugging coroutines
With the DWARF debug info generated by LLVM, one can already get the current suspension id __coro_index by inspecting the coroutine frame (this currently requires manual reinterpret-casting in the debugger; with #141516, the pretty printer for coroutine_handle will also show the __coro_index out-of-the-box).
However, the suspension point id __coro_index is an opaque integer. There is currently no good way to map this compiler-generated, internal id back to a source location.
Proposed solution
Create artificial DW_TAG_label debug information. The labels would have well-known names, following the pattern __coro_resume_<N>.
The labels can then be looked up in gdb using either the command "info line -function my_coroutine -label __coro_resume_2" or gdb's Python API gdb.lookup_symbol with domain=gdb.SYMBOL_LABEL_DOMAIN.
All the necessary infrastructure to emit debuginfo for labels is already in place, such that the corresponding code change is rather straightforward.
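As a rough illustration of the gdb Python lookup mentioned above, the following sketch resolves a suspension point id to a file and line. The helper names coro_resume_label and suspension_location are made up for this example, and it assumes that resume_pc is an address inside the coroutine's resume function (for instance, the resume pointer read from the coroutine frame). The gdb module is passed in as a parameter so the function can also be tried outside of a gdb session.

```python
def coro_resume_label(coro_index):
    """Name of the artificial label for a suspension point id (proposed pattern)."""
    return f"__coro_resume_{coro_index}"


def suspension_location(gdb, resume_pc, coro_index):
    """Return (filename, line) for a suspension point, or None if no label is found.

    `gdb` is the module that is available inside a gdb Python session.
    `resume_pc` is assumed to be an address inside the resume function
    (an assumption of this sketch, not something the proposal prescribes).
    """
    # Find the lexical block that contains the resume function's code.
    block = gdb.block_for_pc(resume_pc)
    if block is None:
        return None
    # lookup_symbol returns a (symbol, is_field_of_this) tuple.
    sym, _ = gdb.lookup_symbol(
        coro_resume_label(coro_index), block, gdb.SYMBOL_LABEL_DOMAIN
    )
    if sym is None:
        return None
    return sym.symtab.filename, sym.line
```

Inside gdb, this could be called as suspension_location(gdb, resume_pc, coro_index) once both values have been read from the coroutine frame.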
The generated DWARF looks like
0x00000c53: DW_TAG_subprogram
DW_AT_low_pc (0x00000000000018f0)
DW_AT_high_pc (0x0000000000001de1)
DW_AT_frame_base (DW_OP_reg6 RBP)
DW_AT_linkage_name ("_ZL9coro_taski")
DW_AT_name ("coro_task")
DW_AT_decl_file ("/home/avogelsgesang/Documents/corotest/llvm-example.cpp")
DW_AT_decl_line (36)
DW_AT_type (0x0000005c "task")
[...]
0x00000c8e: DW_TAG_label
DW_AT_name ("__coro_resume_0")
DW_AT_decl_file ("/home/avogelsgesang/Documents/corotest/llvm-example.cpp")
DW_AT_decl_line (36)
DW_AT_low_pc (0x0000000000001952)
0x00000c94: DW_TAG_label
DW_AT_name ("__coro_resume_1")
DW_AT_decl_file ("/home/avogelsgesang/Documents/corotest/llvm-example.cpp")
DW_AT_decl_line (38)
DW_AT_low_pc (0x00000000000019db)
0x00000c9a: DW_TAG_label
DW_AT_name ("__coro_resume_2")
[...]
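To make the mapping concrete, here is a small sketch that extracts the suspension point ids from a textual dump in the format shown above and maps each id to its line and address. The function name parse_resume_labels and the regular expressions are assumptions of this example, not part of the proposal or of any tool.

```python
import re

# Patterns for the textual dump format shown above (assumed, not a stable interface).
DIE_HEADER = re.compile(r'^\s*0x[0-9a-fA-F]+:\s+(DW_TAG_\w+)')
LABEL_NAME = re.compile(r'DW_AT_name\s+\("__coro_resume_(\d+)"\)')
DECL_LINE = re.compile(r'DW_AT_decl_line\s+\((\d+)\)')
LOW_PC = re.compile(r'DW_AT_low_pc\s+\((0x[0-9a-fA-F]+)\)')


def parse_resume_labels(dump_text):
    """Map suspension point id -> {"line": ..., "low_pc": ...} (None if missing)."""
    labels = {}
    die = None  # attributes of the DW_TAG_label currently being read, if any

    def finish(entry):
        # Only keep labels whose name follows the __coro_resume_<N> pattern.
        if entry is not None and "index" in entry:
            labels[entry["index"]] = {
                "line": entry.get("line"),
                "low_pc": entry.get("low_pc"),
            }

    for line in dump_text.splitlines():
        header = DIE_HEADER.match(line)
        if header:
            finish(die)
            die = {} if header.group(1) == "DW_TAG_label" else None
            continue
        if die is None:
            continue  # attribute of some other kind of entry
        if m := LABEL_NAME.search(line):
            die["index"] = int(m.group(1))
        elif m := DECL_LINE.search(line):
            die["line"] = int(m.group(1))
        elif m := LOW_PC.search(line):
            die["low_pc"] = int(m.group(1), 16)
    finish(die)
    return labels
```

A debugger script could then use the resulting dictionary to translate the __coro_index read from a coroutine frame into a source line.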
GCC's behavior
Notably, gcc-14 also emits labels for the individual suspend points. However, gcc does not attach a line number or DW_AT_low_pc to its labels, so they currently cannot be mapped back to a source location or code address.
0x000027ec: DW_TAG_subprogram
DW_AT_name ("coro_task")
DW_AT_artificial (true)
DW_AT_low_pc (0x0000000000001391)
DW_AT_high_pc (0x000000000000180b)
DW_AT_frame_base (DW_OP_call_frame_cfa)
DW_AT_call_all_tail_calls (true)
DW_AT_sibling (0x0000294c)
0x0000297c: DW_TAG_label
DW_AT_name ("resume.8")
0x00002981: DW_TAG_label
DW_AT_name ("destroy.8")
0x00002986: DW_TAG_label
DW_AT_name ("resume.6")
0x0000298b: DW_TAG_label
DW_AT_name ("destroy.6")
0x00002990: DW_TAG_label
DW_AT_name ("resume.4")
0x00002995: DW_TAG_label
DW_AT_name ("destroy.4")
0x0000299a: DW_TAG_label
DW_AT_name ("resume.2")
0x0000299f: DW_TAG_label
DW_AT_name ("destroy.2")
0x000029a4: DW_TAG_label
DW_AT_name ("coro.delete.frame")
0x000029a9: DW_TAG_label
DW_AT_name ("coro.delete.promise")
0x000029ae: DW_TAG_label
DW_AT_name ("actor.continue.ret")
0x000029b3: DW_TAG_label
DW_AT_name ("actor.suspend.ret")
0x000029b8: DW_TAG_label
DW_AT_name ("actor.begin")
0x000029bd: DW_TAG_label
DW_AT_name ("final.suspend")
Considered alternatives
It is possible to adapt the C++ code to keep track of the suspension point explicitly. However, this approach requires changes to the coroutine types and will be cumbersome for library-defined coroutine types like std::generator. It also adds runtime overhead.
In "[RFC] Debug info for coroutine suspension locations" I previously proposed to represent __coro_index as an enum instead of an integer. The suspension points would be represented as enumerators, whose names would reveal the line / column numbers. @dblaikie expanded this proposal to attach DW_AT_low_pc annotations to the individual enumerators. One downside is that we would need to introduce some way to track the position of an enumerator within the IR code, and I am not sure how to do so. For DW_TAG_label, all the required position tracking is already in place.
Open questions
- Do we think that using DW_TAG_label for this purpose makes sense?
- Should we coordinate with gcc, such that we write the same / similar debug information for coroutines between gcc and clang? If so, who would be good people from gcc to involve here? (I don't know any gcc contributors.)
- LLDB currently simply ignores DW_TAG_label. As such, this information is currently only useful for gdb. I wonder how much work it would be to also handle DW_TAG_label in lldb. I will be writing a separate RFC on that shortly.
- The DW_TAG_label entries for coroutines are compiler-generated, artificial labels. Intuitively, I would have attached DW_AT_artificial to them. However, the DWARFv5 standard only mentions that types can be annotated as artificial, but not labels (see, e.g., "Appendix A. Attributes by Tag").
  - Should we propose a change to the DWARF standard to also allow DW_AT_artificial on DW_TAG_label?
  - Should we emit DW_AT_artificial, as a custom extension to the standard, even before this is officially blessed by the DWARF standard? (Not sure how debuggers deal with unexpected attributes. Do they ignore them? Do they choke on them?)
CC @jyknight @adrian.prantl @dblaikie @dmitryduka @ChuanqiXu since you participated in the earlier thread on this topic. Also CC @hokein, since you were also looking into debuggability of coroutines in lldb