[Coroutines] How to handle arguments passed by value semantically?

Hi all,

Recently I am trying to work on coroutine bug here: https://reviews.llvm.org/D101980.
This patch is trying to copy the arguments who had been marked with byval into the coroutine frame.
There summary in the link tells some background.
Simply,
(1) By the language standard, the arguments of coroutine should be copied into the coroutine state.
(2) The implementation now in LLVM would copy the arguments of coroutine into the coroutine frame if needed.
(3) However, the arguments in IR are not equal to the arguments in the source. For example, the following code in C++

void foo() {
coro_func(A());
}

would be translated into something like:


void foo() {

%a = alloca A

; initializing for %a

coro_func(%a);
}

So the actual argument type in IR becomes A* instead of A. And the current implementation in LLVM coroutine
module would only copy the pointer to A into the coroutine frame instead of A, which is the problem I want to solve.

I had thought we could handle these case by handling the arguments marked with byval. But as @rjmccall said, there are
cases where arguments are passed by value semantically and passed by reference actually.

So here is the decision we need to make. Should we mark all they arguments passed by value semantically in the frontend?
Or should we just emit the copy in the callee site for coroutines in the frontend?

It looks like the both solution would take many efforts. Then there are drawbacks for each of them.
First, for the solution based on attribute marks. The drawbacks are:
(1) What’s the attribute should we use? The byval is deprecated actually. So we need to design a new attribute to mark the arguments
of coroutine. I am not sure if it is a good idea to use a new attribute when we meet a new problem. It looks like there are too many
attributes and metadata in Clang/LLVM now.
(2) The extra copy. There would be an extra copy in the coroutine to copy the arguments to the frame. Maybe we can do some peephole
optimization to mitigate the problem.
(3) Multiple destruction problem. I am not sure if this the same problem @rjmccall mentioned about move semantics. My understanding is

class A{
void* data_ptr;
A(A&&);
~A() { if(data_ptr) delete data_ptr; }
};
void foo() {
A a;
coro_func(std::move(a));
}

The problem here is that the data_ptr of A copied in the coroutine frame may be invalided by the destruction in the temporary produced
in the caller site if we choose to copy A into the frame instead of move. But we can’t get the semantics about move in the middle end.
The final problem about destruction looks the hardest to handle for the attribute based solutions.

And for the solution based on emitting the copy in the callee site for coroutines,
(1) It would violate the principle of LLVM to handle the copy of arguments. LLVM would emit the copy in the caller site instead of callee site.
I believe there must be some reasons that LLVM decides to emit the copy in the caller site. I hope someone could tell me if any one knows.
And it would make the styles to handle arguments broken in Clang/LLVM.
(2) It looks not easy to refactor this even only for coroutine. This is not a strong argument. I just say that we need a long time to do this refactoring.
And there may be some bugs remained during this process.
(3) We need to teach other passes. If we choose to emit the copy in the callee site, the IR for coroutine now would look like:

void coro_func(A* %a) {
%a.copy = alloca A
copy %a to %a.copy
; use %a.copy instead of %a
}

This pattern should be eliminated by LLVM passes. So we need to teach LLVM passes to remain this structure, which means we need to insear guard codes
to many other passes.
BTW, if we compile C++ code in debug mode, we would see a similar pattern:


void coro_func(A* %a) {
%a.addr = alloca A*
store %a, %a.addr

%a.val = load %a.addr

; use %a.copy instead of %a.val
}

This pattern only copies the address of %a into the new allocated variables instead of copying the contents. I don’t know the meaning of doing so now.

So it is clearly that both solutions had some drawbacks and not easy to implement. Do you guys have any thoughts?

Thanks,
Chuanqi

Hi all,

Recently I am trying to work on coroutine bug here: https://reviews.llvm.org/D101980.
This patch is trying to copy the arguments who had been marked with `byval` into the coroutine frame.
There summary in the link tells some background.
Simply,
(1) By the language standard, the arguments of coroutine should be copied into the `coroutine state`.
(2) The implementation now in LLVM would copy the arguments of coroutine into the coroutine frame if needed.
(3) However, the arguments in IR are not equal to the arguments in the source. For example, the following code in C++

I don't see how you can do this solely in the middle end -- the by-value parms have to be (move|copy)-constructed from their incoming location to the coro frame. Only the FE knows how to do that, just as it is the FE that determines whether these by-value parms are actually passed by invisible reference (as you describe).

so, perhaps the FE should do the initial break out of the ramp function, and have the middle end perform the frame layout? To make things simpler, maybe the FE just breaks out the minimal ramp -- copy parameters, build the promis object and init coro state to 'starting', and then call the actor fn, which will proceed to the first suspend.

The description of the problem doesn't sound right to me.
Even when parameters are in the form of a byval pointer, their content
will still be copied into the local stack for coroutines correctly,
not just copying pointers. This is currently handled by the front-end.
For example: https://godbolt.org/z/EGoaMEje4
If you look at the IR of foo, there is a memcpy that copies the
parameter to local stack.

I see. It looks you are right. Sorry to disturbing. I would check if the frontend would emit the copies for other arguments passed by value implicitly.