[RFC] Desugar variadics. Codegen for new targets, optimisation for existing

The first patch, "[transforms] Inline simple variadic functions" (Pull Request #81058 in llvm/llvm-project on GitHub, by JonChesterfield), is a refined version of the above.

A variadic function that happens to be a trivial wrapper around a va_list-taking function can be ‘inlined’ at a call site by building a target-specific va_list value from the call's arguments and emitting a call to the va_list-taking function instead. This occurs occasionally in practice, e.g.

int vfprintf(FILE *restrict f, const char *restrict fmt, va_list ap);
int fprintf(FILE *restrict f, const char *restrict fmt, ...)
{
  va_list ap;
  va_start(ap, fmt);
  int ret = vfprintf(f, fmt, ap);
  va_end(ap);
  return ret;
}

The vast majority of the target-specific logic can be confined to creating a va_list value which contains the values passed to the variadic function.

Most variadic functions don’t match this form. Those can be split into two functions: an internal one that does the same work as the original but takes a va_list as a new trailing argument, and one that replaces the original and does nothing except call va_start, forward to the internal function, and call va_end. In the internal function, the original va_start is replaced with a va_copy of the new trailing va_list argument. This is more complicated than anticipated but is almost target independent.
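To make the split concrete, here is a minimal source-level sketch of what it might look like. The names sum and sum_valist are hypothetical and chosen only for illustration. The actual transform works on LLVM IR rather than C, so this shows the shape of the result, not the output of the pass.

```c
#include <stdarg.h>

/* Original function, before the split (shown as a comment for comparison):
 *
 *   int sum(int count, ...) {
 *     va_list ap;
 *     va_start(ap, count);
 *     int total = 0;
 *     for (int i = 0; i < count; i++) total += va_arg(ap, int);
 *     va_end(ap);
 *     return total;
 *   }
 */

/* Hypothetical internal function: same body as the original, but the
 * variadic tail is replaced by a trailing va_list parameter. */
static int sum_valist(int count, va_list args)
{
  va_list ap;
  va_copy(ap, args); /* was: va_start(ap, count) */
  int total = 0;
  for (int i = 0; i < count; i++)
    total += va_arg(ap, int);
  va_end(ap);
  return total;
}

/* Replacement for the original symbol. It is now a trivial wrapper of the
 * same form as the fprintf example, so the inlining step can apply to it. */
int sum(int count, ...)
{
  va_list ap;
  va_start(ap, count);
  int ret = sum_valist(count, ap);
  va_end(ap);
  return ret;
}
```

Callers are unchanged: sum(3, 1, 2, 3) still reaches the same loop, only now through the wrapper. Using va_copy rather than consuming args directly keeps the original va_start/va_end pairing intact inside the body, which is one reason the rewrite can stay mostly mechanical.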

Finally, the codegen equivalent additionally involves rewriting calls to unknown variadic functions and being careful with external symbols. That can be run on x64/aarch64 under tests, or as a whole-program optimisation from lld in the statically linked case.

There’s a tail of edge cases but I think the outlined strategy is right:

  1. convert arbitrary variadic into a trivial one
  2. target specific inlining of trivial variadic
  3. use the same transform for codegen