[coro] Using lifetime markers to allocate values on the stack vs the frame

ayzhao · April 18, 2024, 11:08pm

In [coroutines] LLVM incorrectly allocates coroutines on a variable stack · Issue #86580 · llvm/llvm-project · GitHub we discovered that CoroSplit incorrectly puts an object on the coroutine stack instead of the frame object on the heap if it has a llvm.lifetime.start(...) without a corresponding llvm.lifetime.end(...) even if the object’s address is escaped. We can end up with this input if we perform a SimplifyCFG pass on a presplit coroutine.

I have a PR ([coroutines] LLVM incorrectly allocates coroutines on a variable stack · Issue #86580 · llvm/llvm-project · GitHub) that naively fixes the immediate problem we’re seeing, but there’s been some discussion on whether it’s the right way forward, and I’d like the community’s input on how to proceed. Here are the options I see.

Option 1: Land [coroutines] LLVM incorrectly allocates coroutines on a variable stack · Issue #86580 · llvm/llvm-project · GitHub to fix the immediate issue.

My PR fixes the immediate issue by placing any alloca on the frame if the alloca’s address gets escaped and there’s no llvm.lifetime.end(...) for the alloca. This would fix the miscompile in the bug, but @hansw2000 mentioned that the PR would incorrectly put the object on the stack if there was a llvm.lifetime.end(...) for the alloca and there’s an execution path that skips it. The question is then whether we should land an incomplete solution first before landing a complete solution later or if we should come up with a complete solution now.

Option 2: Abandon the PR and come up with a complete and correct solution to determine whether or not allocas should live on the frame.

The current implementation is likely broken, and the existing logic to determine whether or not an alloca could live on the coroutine stack would need to be rewritten. This would be a large undertaking though - I’m not sure what the algorithm would look like or how to start implementing this feature, and lifetime intrinsics are known to be fickle - see incorrect transformation around icmp: unclear semantics of "lifetime" intrinsics leads to miscompilation · Issue #45725 · llvm/llvm-project · GitHub. Suggestions are welcome in the thread though.

Option 3: Do not use lifetime intrinsics to determine whether or not an object should go on the stack.

This option would guarantee correct codegen since we wouldn’t use lifetimes to optimize allocas to put them on the stack. @ChuanqiXu commented in [coroutines][CoroSplit] Store allocas on the frame if there is no explicit lifetime.end by alanzhao1 · Pull Request #88806 · llvm/llvm-project · GitHub that this might result in decreased performance, but @apolloww commented in [coroutines][CoroSplit] Store allocas on the frame if there is no explicit lifetime.end by alanzhao1 · Pull Request #88806 · llvm/llvm-project · GitHub that performance issues would be limited to memory pressure rather than runtime performance.

cc:@rnk

Thanks,
Alan

ChuanqiXu · April 19, 2024, 1:29am

I don’t think we can’t assume the change of memory pressure may not impact the runtime performance. If we prefer this approach, I think we also need to understand the corresponding C++ pattern that makes this broken to make a decision.

But now it is 4.19. And the next release is in Sep. I feel we have sufficient time to not be hurry to make a workaround.

ChuanqiXu · April 19, 2024, 1:35am

This would be a large undertaking though - I’m not sure what the algorithm would look like or how to start implementing this feature, and lifetime intrinsics are known to be fickle

Maybe not so large in my mind. We already have such local data flow analysis in coroutines. See Lowerer::shouldElide in CoroElide.cpp and SuspendCrossingInfo in CoroFrame.cpp.

ayzhao · April 19, 2024, 5:49pm

I’m starting to think about an algorithm and here’s what I have so far - let me know what you think:

An alloca a must go on the coroutine frame if and only if there exists a path from any llvm.lifetime.start for the given alloca to any llvm.coro.suspend such that:

the address of a is escaped anywhere along the path
there is no llvm.lifetime.end for that alloca along the path

Otherwise the alloca may go on the coroutine stack.

Am I missing anything?

efriedma-quic · April 19, 2024, 6:30pm

You need to check whether the address escapes anywhere, not just along the path; lifetime.start/end don’t block hoisting of references to the address, only accesses to the underlying memory. Otherwise, that’s the right idea.

ChuanqiXu · April 22, 2024, 2:41am

I think we can say:

An alloca need to be on the frame if one of the following conditions are met:

If there exists a path from any llvm.lifetime.start for the given alloca to any llvm.coro.suspend without touching llvm.lifetime.end
The address of the alloca escapes, and, there exists a path from any llvm.lifetime.start to a llvm.lifetime.start (including the llvm.lifetime.start self) with touching a llvm.coro.suspend.

The differences may be, even if the alloca address escapes, if we can be sure all of the lifetime ranges won’t be broken by llvm.coro.suspend, it may be safe to put it on the stack.

hansw2000 · April 22, 2024, 1:35pm

The address of the alloca escapes, and, there exists a path from any llvm.lifetime.start to a llvm.lifetime.start (including the llvm.lifetime.start self) with touching a llvm.coro.suspend.

Do you mean a path from lifetime.start to lifetime.end?

Can we formulate a rule in the opposite direction instead? That is, under what conditions is it safe to not put an alloca in the coro frame?

Intuitively it seems like we want something like:

An object may be omitted from the coro frame if all coro suspend points occur outside the object’s lifetime (the lifetime hasn’t yet started, or has previously ended, at each suspend point).

With such a rule, would it even matter if the object’s address has escaped or not?

ChuanqiXu · April 23, 2024, 2:08am

Do you mean a path from lifetime.start to lifetime.end?

No, I mean a path from llvm.lifetime.start to any llvm.lifetime.start. The rationale is that the address of a variable should be the same with different lifetimes. e.g.,

%a = alloca ...
llvm.lifetime.start(%a)
call @leak_address(%a);
... # without llvm.coro.suspend
llvm.lifetime.end(%a)

llvm.coro.suspend(...)

llvm.lifetime.start(%a)
call @leak_address(%a);
... # without llvm.coro.suspend
llvm.lifetime.end(%a)

In the example, both the two lifetimes of %a doesn’t across suspend point, but we still need to put the variable on coro frame. Otherwise the leaked address may be different.

An object may be omitted from the coro frame if all coro suspend points occur outside the object’s lifetime (the lifetime hasn’t yet started, or has previously ended, at each suspend point).

The definition looks fine. However, it is slightly problematic to define what is a lifetime precisely. I think, in LLVM IR, a lifetime is region consisted by multiple llvm.lifetime.starts and llvm.lifetime.ends. (Range Analysis) But formal range analysis may be too heavy for us to do. So I’d like to mimic it by tranditional reachable dataflow analysis.

Can we formulate a rule in the opposite direction instead? That is, under what conditions is it safe to not put an alloca in the coro frame?

To answer the question, the description may be the following conditions are met:

For all lifetime.start, there doesn’t exist a path from the lifetime.start to a lifetime.end touching a llvm.coro.suspend. (we shouldn’t see llvm.lifetime.end for the same lifetime.start along the path)
If the corresponding variable escapes, there shouldn’t be a path from a lifetime.start to a lifetime.start touching a llvm.coro.suspend.

The description to describe when we need to variables on the frame may be simpler if we don’t want to care if the variable escapes:

- If there exists a path from any llvm.lifetime.start for the given alloca to any llvm.lifetime.end with touching llvm.coro.suspend

ayzhao · April 26, 2024, 8:32pm

I created the PR [coro][CoroSplit] Use `llvm.lifetime.end` to compute putting objects on the frame vs the stack by alanzhao1 · Pull Request #90265 · llvm/llvm-project · GitHub which implements the logic where if there’s a path from an llvm.lifetime.start to a coro.suspend without an llvm.lifetime.end then we put the object on the frame. The logic for a path from llvm.lifetime.start to any llvm.lifetime.start already exists.

Topic		Replies	Views
[RFC] LLVM Coroutines LLVM Dev List Archives	2	89	June 12, 2016
Lifetime markers and coroutine intrinsics IR & Optimizations	0	220	February 28, 2023
Avoiding alloca elision LLVM Dev List Archives	4	77	July 13, 2019
lifetime.start/end LLVM Dev List Archives	37	184	February 23, 2021
Optimize placement of llvm.lifetime.start/end LLVM Dev List Archives	1	108	October 29, 2018

[coro] Using lifetime markers to allocate values on the stack vs the frame

Related topics