rdrand intrinsic under -O0

Hi,

I wrote a small test program that uses the _rdrand64_step intrinsic,
and I think I may have hit a bug in clang.

The core portion of the program is:

static inline int
ivy_rng_store(long *buf)
{
  unsigned long long tmp;
  int retry;

  retry = RETRY_COUNT;
  do {
    if (_rdrand64_step(&tmp) == 1) {
      *buf = (long)tmp;
      break;
    }
    retry--;
  } while (retry > 0);

  return (retry);
}

When compiled with -O0, I got:

00000000004009b0 <ivy_rng_store>:
  4009b0: 55 push %rbp
  4009b1: 48 89 e5 mov %rsp,%rbp
  4009b4: 48 89 7d f0 mov %rdi,-0x10(%rbp)
  4009b8: c7 45 e4 0a 00 00 00 movl $0xa,-0x1c(%rbp)
  4009bf: 48 8d 45 e8 lea -0x18(%rbp),%rax
  4009c3: 48 89 45 f8 mov %rax,-0x8(%rbp)
  4009c7: 48 8b 45 f8 mov -0x8(%rbp),%rax
  4009cb: 48 0f c7 f1 rdrand %rcx
  4009cf: 89 ca mov %ecx,%edx # ???
  4009d1: be 01 00 00 00 mov $0x1,%esi
  4009d6: 0f 42 d6 cmovb %esi,%edx
  4009d9: 48 89 08 mov %rcx,(%rax)
  4009dc: 81 fa 01 00 00 00 cmp $0x1,%edx
  4009e2: 0f 85 10 00 00 00 jne 4009f8 <ivy_rng_store+0x48>
  4009e8: 48 8b 45 e8 mov -0x18(%rbp),%rax
  4009ec: 48 8b 4d f0 mov -0x10(%rbp),%rcx
  4009f0: 48 89 01 mov %rax,(%rcx)
  4009f3: e9 18 00 00 00 jmpq 400a10 <ivy_rng_store+0x60>
  4009f8: 8b 45 e4 mov -0x1c(%rbp),%eax
  4009fb: 05 ff ff ff ff add $0xffffffff,%eax
  400a00: 89 45 e4 mov %eax,-0x1c(%rbp)
  400a03: 81 7d e4 00 00 00 00 cmpl $0x0,-0x1c(%rbp)
  400a0a: 0f 8f af ff ff ff jg 4009bf <ivy_rng_store+0xf>
  400a10: 8b 45 e4 mov -0x1c(%rbp),%eax
  400a13: 5d pop %rbp
  400a14: c3 retq

As the line marked # ??? suggests, the compiler seems to assume that
%ecx can never be 1 after rdrand %rcx. Is that right?

Version:

FreeBSD clang version 3.4.1 (tags/RELEASE_34/dot1-final 208032) 20140512
Target: x86_64-unknown-freebsd11.0
Thread model: posix

Options used:

-O2 -pipe -mrdrnd -mmmx -O0 -std=gnu99 -fstack-protector
-Qunused-arguments

Thanks in advance!

Cheers,

This appears to be intentional. The comment on the code in
X86ISelLowering that generates this sequence says:

// If the value returned by RDRAND/RDSEED was valid (CF=1), return 1.
// Otherwise return the value from Rand, which is always 0, casted to i32.
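
To spell out why the emitted sequence still gives the right answer, here
is a rough C model of what the -O0 code computes. It is only a sketch:
model_rdrand64_step, check_model, hw_value and carry are names made up
for illustration, with hw_value and carry standing in for what RDRAND
leaves in %rcx and in CF. It assumes the behaviour the comment relies on,
namely that the destination register reads as 0 whenever CF=0.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical model of the -O0 sequence from the disassembly; this is
 * not the real intrinsic. Assumption: on failure (CF=0) the hardware has
 * already zeroed the destination register, so hw_value == 0 whenever
 * carry == 0.
 */
int
model_rdrand64_step(uint64_t hw_value, int carry, uint64_t *out)
{
  uint32_t edx = (uint32_t)hw_value;  /* mov %ecx,%edx: low 32 bits */

  if (carry)                          /* cmovb %esi,%edx: only when CF=1 */
    edx = 1;
  *out = hw_value;                    /* mov %rcx,(%rax): always stored */

  return (int)edx;                    /* 1 on success, 0 on failure */
}

/* Walks through the cases relevant to the question. */
void
check_model(void)
{
  uint64_t v;

  /* Success: CF=1, so the return value is forced to 1. */
  assert(model_rdrand64_step(0x123456789abcdef0ULL, 1, &v) == 1);
  assert(v == 0x123456789abcdef0ULL);

  /* Success where the low 32 bits happen to be 1: still 1. */
  assert(model_rdrand64_step(1, 1, &v) == 1);

  /* Failure: CF=0 and the register reads as 0, so the copy yields 0. */
  assert(model_rdrand64_step(0, 0, &v) == 0);
  assert(v == 0);
}
```

So the mov %ecx,%edx only decides the result on the failure path, where
%ecx is 0 under the assumption above. If %ecx happens to be 1, that can
only be after a successful read, and the cmovb overwrites %edx with 1
anyway.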