Optimizing sret on caller side

chfast · March 3, 2022, 2:30pm

In C/C++, when a big struct is returned from a function, the calling convention requires the caller to allocate space for the object and pass a pointer to that space as the function's hidden first parameter. In LLVM IR this is represented with the sret attribute.

struct Big
{
    long v[100];
};


Big f() noexcept;

void caller(Big* out) noexcept
{
    *out = f();
}

In the example, the caller first allocates space on the stack and passes that to f(). Then it has to copy the Big object to *out. Can this be optimized so that the caller passes out to f() directly?
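
To make the question concrete, here is a rough hand-written sketch of what the two versions look like once the hidden pointer is spelled out. The names f_lowered, caller_lowered and caller_direct are hypothetical, and the body of f_lowered is an arbitrary stand-in; the real lowering is done by the compiler and ABI, not in source.

```cpp
#include <cstring>

struct Big
{
    long v[100];
};

// Hypothetical lowered form of `Big f() noexcept`: the hidden first
// parameter points at caller-provided storage for the result.
void f_lowered(Big* ret) noexcept
{
    std::memset(ret, 0, sizeof(Big));
    ret->v[0] = 7;  // arbitrary value, for illustration only
}

// Roughly what the caller does today: temporary slot, call, then copy.
void caller_lowered(Big* out) noexcept
{
    Big tmp;                              // stack slot for the return value
    f_lowered(&tmp);                      // &tmp plays the role of the sret argument
    std::memcpy(out, &tmp, sizeof(Big));  // the copy the question wants to avoid
}

// What the question asks for: pass out straight through.
void caller_direct(Big* out) noexcept
{
    f_lowered(out);
}
```

Both callers produce the same result here only because this f_lowered never looks at anything that out might point to.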

TNorthover · March 3, 2022, 2:46pm

Not in general. If f has independent access to out (e.g. if it’s a pointer to a global) then passing it as the sret parameter would corrupt the value before it should have been changed (i.e. when f returns).
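
A minimal sketch of that failure mode, assuming a hypothetical global (global_big) and a made-up body for f that writes its result before reading the global:

```cpp
struct Big
{
    long v[100];
};

Big global_big{};  // hypothetical global that f() can reach on its own

// Hypothetical f(): writes part of its result, then reads the global.
Big f() noexcept
{
    Big r{};
    r.v[0] = 1;
    r.v[1] = global_big.v[0];  // must see the value from before the call
    return r;
}

void caller(Big* out) noexcept
{
    *out = f();
}
```

If global_big.v[0] is 42 and the call is caller(&global_big), the language requires global_big.v[1] to end up as 42. If the compiler passed out directly as the sret pointer, r would share storage with global_big and the read would see the 1 that f had just written.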

LLVM doesn’t do the optimization even in cases where it could though (for example if it can see f doesn’t do anything like that).

jankorous · March 3, 2022, 6:50pm

@TNorthover Could you please elaborate on the cases LLVM doesn’t optimize?

TNorthover · March 3, 2022, 6:57pm

Probably most or all of them. I replaced f with

__attribute__((noinline)) Big f() noexcept {
  return {0};
}

which to me seems about the most obviously safe function possible and LLVM didn’t do the optimization (even if I hacked argmemonly into the IR for f).

It’d need some special knowledge of sret semantics, which probably just hasn’t been implemented.

nikic · March 3, 2022, 9:03pm

LLVM does support this optimization in general (this is the “call slot optimization” performed by the MemCpyOpt pass), but it does have some pretty steep preconditions. The bit missing in your case is that the pointer needs to be dereferenceable, aligned and noalias. Here is a working variant with a restrict reference: Compiler Explorer
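
For readers without the link, a sketch of what such a variant could look like. This is a guess at the shape of the linked example, not a copy of it. It assumes Clang or GCC, where __restrict is accepted on references as an extension. The reference supplies dereferenceable and aligned, and __restrict supplies noalias. Whether the copy actually disappears depends on the compiler version and optimization level.

```cpp
struct Big
{
    long v[100];
};

__attribute__((noinline)) Big f() noexcept
{
    Big r{};
    r.v[0] = 7;  // arbitrary value, for illustration only
    return r;
}

// `out` is a reference (non-null, properly aligned) and is marked
// __restrict, so the caller promises nothing else aliases it.
void caller(Big& __restrict out) noexcept
{
    out = f();
}
```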

In C++ this would work for forwarding between sret parameters (which are dereferenceable, aligned and noalias), but that particular case is also covered by NRVO on the language level, so it’s not particularly relevant.
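
A small sketch of the forwarding case, using a hypothetical wrapper function and a made-up body for f. Initializing result from f() constructs it in place (guaranteed since C++17), and returning the named local lets the compiler apply NRVO, which is permitted but not required.

```cpp
struct Big
{
    long v[100];
};

Big f() noexcept
{
    Big r{};
    r.v[0] = 1;  // arbitrary value, for illustration only
    return r;
}

// Hypothetical wrapper: with NRVO, `result` can live directly in the
// storage that wrapper's own caller supplied for the return value.
Big wrapper() noexcept
{
    Big result = f();
    result.v[1] = 2;
    return result;
}
```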
