Spurious register spill with volatile function argument

Seems I had misused volatile. I removed ‘volatile’ from the function argument on test_0 and it prevented the spill through the stack.

I added volatile because I was trying to avoid the compiler optimising away the call to test_0 (as it has no side effects) but it appeared that volatile was unnecessary and was a misuse of volatile (intended to indicate storage may change outside of the control of the compiler). However it is an interesting case… as a register arguments don’t have storage.

GCC, Clang folk, any ideas on why there is a stack spill for a volatile register argument passed in esi? Does volatile force the argument to have storage allocated on the stack? Is this a corner case in the C standard? This argument in the x86_64 calling convention only has a register, so technically it can’t change outside the control of the C "virtual machine” so volatile has a vague meaning here. This seems to be a case of interpreting the C standard in such a was as to make sure that a volatile argument “can be changed” outside the control of the C "virtual machine” by explicitly giving it a storage location on the stack. I think volatile scalar arguments are a special case and that the volatile type label shouldn’t widen the scope beyond the register unless it actually *needs* storage to spill. This is not a volatile stack scoped variable unless the C standard interprets ABI register parameters as actually having ‘storage’ so this is questionable… Maybe I should have gotten a warning… or the volatile type qualifier on a scalar register argument should have been ignored…

volatile for scalar function arguments seems to mean: “make this volatile and subject to change outside of the compiler” rather than being a qualifier for its storage (which is a register).

# gcc
test_0:
     mov DWORD PTR [rsp-4], esi
     mov ecx, DWORD PTR [rsp-4]
     mov eax, edi
     cdq
     idiv ecx
     mov eax, edx
     ret

# clang
test_0:
  mov dword ptr [rsp - 4], esi
  xor edx, edx
  mov eax, edi
  div dword ptr [rsp - 4]
  mov eax, edx
  ret

/* Test program compiled on x86_64 with: cc -O3 -fomit-frame-pointer -masm=intel -S test.c -o test.S */

#include <stdio.h>
#include <limits.h>

static const int p = 8191;
static const int s = 13;

int __attribute__ ((noinline)) test_0(unsigned int k, volatile int p)
{
     return k % p;
}

int __attribute__ ((noinline)) test_1(unsigned int k)
{
     return k % p;
}

int __attribute__ ((noinline)) test_2(unsigned int k)
{
     int i = (k&p) + (k>>s);
     i = (i&p) + (i>>s);
     if (i>=p) i -= p;
     return i;
}

int main()
{
     test_0(1, 8191); /* control */
     for (int i = INT_MIN; i < INT_MAX; i++) {
             int r1 = test_1(i), r2 = test_2(i);
             if (r1 != r2) printf("%d %d %d\n", i, r1, r2);
     }
}

GCC, Clang folk, any ideas on why there is a stack spill for a
volatile register argument passed in esi? Does volatile force the
argument to have storage allocated on the stack? Is this a corner
case in the C standard? This argument in the x86_64 calling
convention only has a register, so technically it can’t change
outside the control of the C "virtual machine” so volatile has a
vague meaning here.

"volatile" doesn't really mean very much, formally speaking. Sure, the
standard says "accesses to volatile objects are evaluated
strictly according to the rules of the abstract machine," but nowhere
is it specified exactly what constitutes an access. (To be precise,
"what constitutes an access to an object that has volatile-qualified
type is implementation-defined.")

So, we have to fall back to tradition. Traditionally, all volatile
objects are allocated stack slots and all accesses to them are memory
accesses. This is consistent behaviour, and has been for a long time.
It is also extremely useful when debugging optimized code.

volatile for scalar function arguments seems to mean: “make this
volatile and subject to change outside of the compiler” rather than
being a qualifier for its storage (which is a register).

No, arguments are not necessarily stored in registers: they're passed
in registers, but after function entry function they're just auto
variables and are stored wherever the compiler likes.

Andrew.

* Andrew Haley:

"volatile" doesn't really mean very much, formally speaking. Sure, the
standard says "accesses to volatile objects are evaluated
strictly according to the rules of the abstract machine," but nowhere
is it specified exactly what constitutes an access.

Reading or modifying an object is defined as “access”.

The problem is that “reading” is either not defined, or the existing
flatly contradicts existing practice.

For example, if p is a pointer to a struct, will the expression &p->m
read *p?

Previously, this wasn't very interesting. But under the model memory,
it's suddenly quite relevant. If reading p->m implies a read of the
entire object *p, you cannot use a member to synchronize access to
other members of the struct. For example, if m is a mutex, and
carefully acquire the mutex before you read or write other members,
you still have data race between a write to some other member and the
acquisition of the mutex because the mutex acquisition reads the
entire struct (including the member written to ).

One possible cure is to take the address of the mutex and keep track
of it separately. Or you could construct a pointer using offsetof.
But no one is doing that, obviously.

This is not entirely hypothetical. Even today, GCC's aliasing
analysis requires that those implicit whole-object reads take place,
to make certain forms of type-punning invalid which would otherwise be
well-defined (and for which GCC would generate invalid code).

Presumably the offset of m is substantially larger than 0? If so, my answer would be "it had better not". Does any compiler treat that statement as an access to *p ?

  paul

* Paul Koning: