Coroutine elision not applied

Hi,

I’m experimenting with LLVM coroutines, and am wondering about a particular case where a seemingly irrelevant IR change prevents elision optimization. Any insight into why this happens would be greatly appreciated. I’m using LLVM 7.0.1.

The code I’m working with is basically equivalent to the following Python example:

def my_coro(n: int):
    yield n

my_var =
if my_var > 0:
    for a in my_coro(my_var):
        print(a)

Here’s my_coro in LLVM IR (note that there is an initial suspend, then a suspend to yield the value, then the final suspend):

define private i8* @my_coro(i64) {
entry:
  %promise = alloca i64, i64 1
  %1 = bitcast i64* %promise to i8*
  %id = call token @llvm.coro.id(i32 0, i8* %1, i8* null, i8* null)
  %2 = alloca i64, i64 1
  store i64 %0, i64* %2
  %3 = call i1 @llvm.coro.alloc(token %id)
  br i1 %3, label %alloc, label %begin

alloc: ; preds = %entry
  %4 = call i64 @llvm.coro.size.i64()
  %5 = call i8* @my_alloc(i64 %4)
  br label %begin

begin: ; preds = %entry, %alloc
  %6 = phi i8* [ null, %entry ], [ %5, %alloc ]
  %hdl = call i8* @llvm.coro.begin(token %id, i8* %6)
  %7 = call i8 @llvm.coro.suspend(token none, i1 false)
  switch i8 %7, label %suspend [
    i8 0, label %9
    i8 1, label %cleanup
  ]

final: ; preds = %12
  %8 = call i8 @llvm.coro.suspend(token none, i1 true)
  switch i8 %8, label %suspend [
    i8 0, label %13
    i8 1, label %cleanup
  ]

; <label>:9: ; preds = %begin
  %10 = load i64, i64* %2
  store i64 %10, i64* %promise
  %11 = call i8 @llvm.coro.suspend(token none, i1 false)
  switch i8 %11, label %suspend [
    i8 0, label %12
    i8 1, label %cleanup
  ]

; <label>:12: ; preds = %9
  br label %final

; <label>:13: ; preds = %final
  unreachable

cleanup: ; preds = %final, %9, %begin
  %14 = call i8* @llvm.coro.free(token %id, i8* %hdl)
  br label %suspend

suspend: ; preds = %final, %9, %begin, %cleanup
  %15 = call i1 @llvm.coro.end(i8* %hdl, i1 false)
  ret i8* %hdl
}

And how it’s called (i.e. the for-loop above):

define external void @main() {
entry:
  %0 = load i64, i64* @my.var
  %1 = icmp sgt i64 %0, 0
  br i1 %1, label %if, label %exit

if: ; preds = %entry
  %2 = load i64, i64* @my.var
  %3 = call i8* @my_coro(i64 %2)
  br label %for

for: ; preds = %body, %if
  call void @llvm.coro.resume(i8* %3)
  %4 = call i1 @llvm.coro.done(i8* %3)
  br i1 %4, label %cleanup, label %body

body: ; preds = %for
  %5 = call i8* @llvm.coro.promise(i8* %3, i32 8, i1 false)
  %6 = bitcast i8* %5 to i64*
  %7 = load i64, i64* %6
  call void @my_print(i64 %7)
  br label %for

cleanup: ; preds = %for
  call void @llvm.coro.destroy(i8* %3)
  br label %exit

exit: ; preds = %entry, %cleanup
  ret void
}
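
To make the mapping between the Python loop and the IR above explicit, here is a rough Python model of the resume/done/promise protocol that main() follows. The CoroHandle class and the run function are purely illustrative names of mine, not anything from LLVM, and the model says nothing about how the coroutine frame is laid out or allocated.

```python
class CoroHandle:
    """Toy stand-in for the i8* handle returned by @my_coro (illustrative only)."""

    def __init__(self, gen):
        self._gen = gen
        self.promise = None   # plays the role of the %promise slot
        self.done = False     # what llvm.coro.done would report

    def resume(self):
        # Advance to the next suspend point, like llvm.coro.resume.
        try:
            self.promise = next(self._gen)
        except StopIteration:
            # The body has run to completion (final suspend reached).
            self.done = True


def my_coro(n):
    yield n


def run(my_var, out):
    """Mirror main(): the guard, then the resume/done/promise loop."""
    if my_var > 0:                       # the icmp sgt + br in %entry
        hdl = CoroHandle(my_coro(my_var))  # call @my_coro; stops at the initial suspend
        while True:
            hdl.resume()                 # block %for
            if hdl.done:
                break                    # branch to %cleanup (llvm.coro.destroy)
            out.append(hdl.promise)      # block %body: read the promise, "print" it
    return out
```

With this model, run(5, []) collects the single yielded value and run(0, []) skips the coroutine entirely, which is the behaviour I expect from the IR in both cases.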

Now if I optimize this with “opt -S -enable-coroutines -O3”, the coroutine allocation is not elided. But if I remove the if-statement (e.g. change the condition to 1 > 0, giving the first branch in main() an i1 true condition), then elision does take place.
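
For reference, this is roughly how I check whether elision happened. It is a minimal sketch resting on one assumption: once @my_coro has been inlined into @main, a successful elision shows up as @main no longer containing a direct call to @my_alloc. The helper name count_calls is made up for this post, and the scan is a plain text match over the output of opt -S, not a real IR parser (it only looks at call instructions, not invoke).

```python
import re


def count_calls(ir_text: str, func_name: str, callee: str) -> int:
    """Count direct `call` instructions to @callee inside the body of @func_name.

    Works on textual IR as printed by `opt -S`; returns 0 if the function
    is not defined in the text.
    """
    pattern = re.compile(rf"\bcall\b.*@{re.escape(callee)}\(")
    count = 0
    inside = False
    for line in ir_text.splitlines():
        stripped = line.strip()
        if stripped.startswith(";"):
            continue  # skip comment-only lines such as block labels
        if stripped.startswith("define ") and f"@{func_name}(" in stripped:
            inside = True  # entered the function we care about
            continue
        if inside and stripped == "}":
            break  # end of that function's body
        if inside and pattern.search(stripped):
            count += 1
    return count
```

I feed it the text of the optimized module and call count_calls(text, "main", "my_alloc"). A result of 0 is what I read as "elided", and anything larger as "not elided".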

However, I see no reason why elision can’t be applied to the first version – why does the presence of a branch outside the blocks where the coroutine is used change anything? Am I perhaps using opt incorrectly? Any insight would be greatly appreciated here, and thanks in advance.

Ariya