Allow CallSlot optimization for throwing functions for sret arguments

Hi all,

in Rust we have a bug report about a missed optimization which
one would expect CallSlot optimization to handle:
https://github.com/rust-lang/rust/issues/48533

The IR looks like this:

define void @bar(%S* noalias nocapture sret dereferenceable(16), void (%S*)* nocapture nonnull) unnamed_addr #0 {
  %3 = alloca %S, align 8
  %4 = bitcast %S* %3 to i8*
  call void @llvm.lifetime.start.p0i8(i64 16, i8* nonnull %4)
  call void %1(%S* noalias nocapture nonnull sret dereferenceable(16) %3)
  %5 = bitcast %S* %0 to i8*
  call void @llvm.memcpy.p0i8.p0i8.i64(i8* nonnull %5, i8* nonnull %4, i64 16, i32 8, i1 false)
  call void @llvm.lifetime.end.p0i8(i64 16, i8* nonnull %4)
  ret void
}
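
For readers who want the source-level picture: judging only from the
signature of @bar above, a Rust function of roughly the following shape
would be expected to lower to IR like this. This is a sketch
reconstructed from the IR, not code taken from the bug report; the
field layout of S and the helper make_s are assumptions.

```rust
/// Assumed layout: 16 bytes, 8-byte aligned, matching dereferenceable(16)
/// and align 8 in the IR. The real field types are not given in the IR.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct S {
    pub a: u64,
    pub b: u64,
}

/// S is too large to return in registers on common targets, so the
/// result is returned through a hidden sret pointer (%0 in the IR) and
/// `f` itself is called with an sret pointer to a temporary (%3).
pub fn bar(f: fn() -> S) -> S {
    f()
}

/// Hypothetical callee used only to exercise `bar`.
pub fn make_s() -> S {
    S { a: 1, b: 2 }
}
```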

CallSlot optimization does not handle this case, because the call to
%1 might throw, and such calls are no longer subject to CallSlot
optimization since the fix for PR27849 --
https://bugs.llvm.org/show_bug.cgi?id=27849

The difference between that PR and this case is that the code in the
PR took a reference through which the value was modified, while in
this case we deal with a return value through an sret argument. I'm
wondering whether the "must not throw" restriction could be relaxed in
this case, to only apply when the destination is not an sret argument.
Given that the return value is only valid when the function does not
throw, it seems that the caller is always responsible for ensuring that
the value behind the sret pointer is only used in that case,
introducing a temporary as necessary. Does that seem right?
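
To make the distinction concrete, here is a small Rust sketch of why
forwarding the destination is observable when the destination is an
ordinary reference. It models the out-pointer by hand with &mut S; all
names are hypothetical and it only illustrates the argument, it is not
the test case from the PR.

```rust
use std::panic::{self, AssertUnwindSafe};

#[derive(Debug, Clone, Copy, PartialEq)]
pub struct S {
    pub a: u64,
    pub b: u64,
}

/// Hypothetical callee: writes part of its result, then unwinds.
pub fn callee_that_throws(out: &mut S) {
    out.a = 1;
    panic!("callee unwinds after a partial write");
}

/// Unoptimized shape: the callee writes into a temporary, which is
/// copied to `dest` only if the call returns normally.
pub fn with_temporary(dest: &mut S, callee: fn(&mut S)) {
    let mut tmp = S { a: 0, b: 0 };
    callee(&mut tmp);
    *dest = tmp;
}

/// Shape after forwarding the destination into the call (what the
/// optimization would produce): the callee writes directly to `dest`.
pub fn with_forwarding(dest: &mut S, callee: fn(&mut S)) {
    callee(dest);
}

/// Runs `caller` with the throwing callee, catches the unwind, and
/// returns `dest` as the outer code sees it afterwards.
pub fn observe_after_unwind(caller: fn(&mut S, fn(&mut S))) -> S {
    let mut dest = S { a: 7, b: 7 };
    let result = panic::catch_unwind(AssertUnwindSafe(|| {
        caller(&mut dest, callee_that_throws)
    }));
    assert!(result.is_err());
    dest
}
```

With the temporary, dest is untouched after the unwind; with
forwarding, the partial write is visible. When dest is instead the
caller's own return slot, nothing can legitimately read it after the
unwind, which is the case I am asking about.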

Cheers,
Björn

I'm not sure we want to continue to overload the meaning of "sret" like this... the sret attribute affects the ABI on some targets, and we probably want to separate the ABI effect from the optimization hint. So we should introduce a new attribute to support this sort of optimization, I think.

But yes, that's the right idea: you can optimize more aggressively if you know the temporary will be discarded if the call throws an exception. (You can actually optimize in multiple places; see, for example, isKnownNonEscaping in LICM.)

-Eli