(I did a quick search on Discourse and the GitHub issues.)
Reproducer: Compiler Explorer
For the IR:
; Function Attrs: nounwind
define fastcc void @_Z3Barv.destroy(ptr noundef nonnull align 8 dereferenceable(120) %0) #0 {
resume.entry:
  %index.addr = getelementptr inbounds %_Z3Barv.Frame, ptr %0, i64 0, i32 5
  %index = load i2, ptr %index.addr, align 8
  %switch = icmp eq i2 %index, 1
  br i1 %switch, label %cleanup21, label %CoroEnd

cleanup21:                                        ; preds = %resume.entry
  %.reload.addr = getelementptr inbounds %_Z3Barv.Frame, ptr %0, i64 0, i32 3
  %1 = load ptr, ptr %.reload.addr, align 8
  %2 = icmp eq ptr %1, null
  tail call void @llvm.assume(i1 %2)
  %index.addr.i2 = getelementptr inbounds %_Z3Barv.Frame, ptr %0, i64 0, i32 3, i64 72
  %index.i3 = load i2, ptr %index.addr.i2, align 8
  %switch.i4 = icmp eq i2 %index.i3, 1
  br i1 %switch.i4, label %cleanup21.i, label %CoroEnd

cleanup21.i:                                      ; preds = %cleanup21
  %.reload.addr.i5 = getelementptr inbounds %_Z3Barv.Frame, ptr %0, i64 0, i32 3, i64 24
  %3 = load ptr, ptr %.reload.addr.i5, align 8
  %4 = icmp eq ptr %3, null
  tail call void @llvm.assume(i1 %4)
  br label %CoroEnd

CoroEnd:                                          ; preds = %resume.entry, %cleanup21, %cleanup21.i
  tail call void @_ZdlPv(ptr noundef nonnull %0) #1
  ret void
}
It looks like the optimizer can't optimize this any further. But that looks odd to a human, since the only meaningful instruction in the function is tail call void @_ZdlPv(ptr noundef nonnull %0).
And if I remove the two assume instructions, "opt" optimizes it as expected: Compiler Explorer. That looks bad, since it shows that @llvm.assume is what blocks the optimization.
Maybe the intent is to leave the job of cleaning up @llvm.assume to the code generator. But it doesn't do a good job in this case at least: Compiler Explorer
(the generated code)
Bar() [clone .destroy]:                 # @Bar() [clone .destroy]
        cmp     byte ptr [rdi + 112], 1
        jmp     operator delete(void*)@PLT      # TAILCALL
There is a meaningless cmp instruction left over, and that is the motivation for this post.
I understand that there are trade-offs here, e.g., it is hard to decide in which pass the assume intrinsics should be removed. But it still looks odd to me that we push work from the middle end to the backend. I am wondering whether there is any chance to improve this.
Note that assume recently became a standard C++ feature (the [[assume]] attribute in C++23). So I am concerned that this may get worse once users start to use assume widely in their source code.
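For illustration, here is a minimal sketch of how the same shape could show up in ordinary user code: a branch whose body contains nothing but an assumption, followed by the only real work. The Frame struct, its field names, destroy_like, and the ASSUME macro are all hypothetical and not taken from the reproducer; the macro is just one assumed way to fall back when the C++23 attribute is not available.

```cpp
#include <cassert>

// Hypothetical portability macro (not from the reproducer): prefer the
// C++23 attribute, fall back to Clang's builtin, otherwise emit nothing.
#if defined(__has_cpp_attribute)
#  if __has_cpp_attribute(assume)
#    define ASSUME(cond) [[assume(cond)]]
#  endif
#endif
#if !defined(ASSUME) && defined(__clang__)
#  define ASSUME(cond) __builtin_assume(cond)
#endif
#if !defined(ASSUME)
#  define ASSUME(cond) ((void)0)
#endif

// Hypothetical stand-in for a coroutine frame; the field names are made up.
struct Frame {
    void* child;          // pointer the cleanup path claims is already null
    unsigned char index;  // suspend-point index, like %index in the IR
};

// Mirrors the shape of the .destroy function: the branch body holds only an
// assumption, so the only real work is the final "free".
// Returns the number of frames released (always 1).
int destroy_like(Frame* f) {
    assert(f != nullptr);
    if (f->index == 1) {
        ASSUME(f->child == nullptr);  // undefined behavior if this is false
    }
    return 1;  // stands in for the call to operator delete
}
```

Compiling something like this at -O2 and checking whether the load and compare of index survive would be one way to watch for the issue from the source side.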
Any thoughts?