Looking at the performance of Linux kernel benchmarks with stack
initialization enabled, I've noticed the following pattern is quite
common in the kernel:

  struct whatever input;
  if (copy_from_user(&input, user_ptr, sizeof(input)))
          return -EFAULT;

(For those not familiar with copy_from_user(), it's a function that
copies data from userspace into the kernel and returns the number of
bytes that could not be copied, so 0 means success.)
When building with -ftrivial-auto-var-init=pattern, Clang generates
the 0xAA pattern initialization even though |input| is either fully
initialized by copy_from_user() or never used.
I'm wondering if we could do better in this case and make the compiler
optimize away the initialization.
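
To make the redundancy concrete, here is a userspace sketch of roughly
what the instrumented code amounts to. mock_copy_from_user() and its
fail_after parameter are hypothetical stand-ins (the real routine is
arch-specific and has no such parameter), and the explicit memset() only
approximates the pattern store that Clang emits.

```c
#include <assert.h>
#include <string.h>

struct whatever {
    int a;
    long b;
};

/*
 * Hypothetical userspace stand-in for copy_from_user(): copies at most
 * fail_after bytes and returns the number of bytes NOT copied, so 0
 * means full success.
 */
static unsigned long mock_copy_from_user(void *to, const void *from,
                                         unsigned long n,
                                         unsigned long fail_after)
{
    unsigned long copied = n < fail_after ? n : fail_after;

    assert(to != NULL && from != NULL);
    memcpy(to, from, copied);
    return n - copied;
}

/* Returns 0 on success, -1 on a short copy (the kernel would use -EFAULT). */
static int handle_request(const struct whatever *user_ptr,
                          unsigned long fail_after, struct whatever *out)
{
    struct whatever input;

    /* Approximation of the store added by pattern initialization. */
    memset(&input, 0xAA, sizeof(input));

    if (mock_copy_from_user(&input, user_ptr, sizeof(input), fail_after))
        return -1;      /* input is never read on this path */

    *out = input;       /* every byte was overwritten by the copy */
    return 0;
}
```

On both paths the 0xAA store is dead: either the copy overwrites all of
|input|, or |input| is never read.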
As copy_from_user() itself is written in assembly, we can't simply
rely on DSE. Instead, we probably need some annotation that prevents
Clang from instrumenting |input|.
Simply using __attribute__((uninitialized)) on |input| doesn't work
well, because a further change to this code may introduce another path
on which |input| isn't initialized.
For the same reason it would be incorrect to have Clang implicitly
apply this attribute to every destination argument of copy_from_user(),
especially given that the copying may fail.
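
For reference, the per-variable opt-out discussed above looks roughly
like this. The NO_AUTO_INIT macro, read_config(), and the memcpy()
stand-in for copy_from_user() are made up for illustration; only the
uninitialized attribute itself is real, and the __has_attribute guard
lets the sketch build on compilers that lack it.

```c
#include <assert.h>
#include <string.h>

#ifndef __has_attribute
#define __has_attribute(x) 0
#endif

#if __has_attribute(uninitialized)
#define NO_AUTO_INIT __attribute__((uninitialized))
#else
#define NO_AUTO_INIT
#endif

struct whatever {
    int a;
    long b;
};

/* Hypothetical caller; returns 0 on success, -1 if there is no source. */
static int read_config(const struct whatever *src, struct whatever *out)
{
    /* Opt this one variable out of -ftrivial-auto-var-init. */
    struct whatever input NO_AUTO_INIT;

    assert(out != NULL);

    if (!src)
        return -1;      /* input is never read on this path */

    /* Stand-in for a copy_from_user() call that fully succeeds. */
    memcpy(&input, src, sizeof(input));

    /*
     * A path added later that reaches this read without the copy above
     * would expose stale stack contents, and the attribute would hide
     * that from the auto-init instrumentation.
     */
    *out = input;
    return 0;
}
```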
It would help to mark copy_from_user() as a function that initializes
one of its arguments, but only if it does not fail. We also need to
tell the compiler about the size of that argument, so the function
prototype would look something like:

  unsigned long copy_from_user(void *to, const void __user *from,
                               unsigned long n)
          __attribute__((param_sizeof(1, 3)))
          __attribute__((param_init_if_ret0(1)));

Here param_sizeof(1, 3) means that parameter #3 holds the size of the
buffer that parameter #1 points to, and param_init_if_ret0(1) means that
parameter #1 is initialized iff the function returns 0.
This looks quite clumsy, but maybe it can be made more elegant and
general enough to handle other cases.
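
Since no compiler implements these attributes, the contract can only be
sketched for now. Below, PARAM_SIZEOF and PARAM_INIT_IF_RET0 are
hypothetical macros that expand to nothing, and annotated_copy() is a
userspace stand-in whose max_ok parameter exists only to force a short
copy.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical annotations from the proposal. No compiler implements
 * them, so they expand to nothing and only document the contract.
 */
#define PARAM_SIZEOF(ptr_idx, size_idx)  /* arg size_idx = size of *arg ptr_idx */
#define PARAM_INIT_IF_RET0(ptr_idx)      /* *arg ptr_idx fully written iff ret == 0 */

/* Userspace stand-in; max_ok is not part of the real interface. */
unsigned long annotated_copy(void *to, const void *from, unsigned long n,
                             unsigned long max_ok)
    PARAM_SIZEOF(1, 3) PARAM_INIT_IF_RET0(1);

unsigned long annotated_copy(void *to, const void *from, unsigned long n,
                             unsigned long max_ok)
{
    unsigned long copied = n < max_ok ? n : max_ok;

    assert(to != NULL && from != NULL);
    memcpy(to, from, copied);
    return n - copied;  /* bytes NOT copied; 0 means *to is fully written */
}
```

With annotations like these, the compiler could in principle treat a
preceding pattern store to the destination as dead whenever the result
is checked against 0 and the failure path never reads the buffer.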
Does the idea make sense to you? Maybe you've seen other cases in
which a similar optimization can be applied?
To people working on DSE right now: do you think it's possible to
perform such an optimization assuming that we know about the behavior