Bug in intrinsic functions

I just found a serious bug in the Windows prebuilt binary.

This implementation looks wrong to me.

First, `rep movsb` modifies RDI, RSI and RCX, so all three parameters must be in-out operands. Second, the memory addressed through RDI is written to and the memory addressed through RSI is read from, but this `__asm__` statement fails to declare either access.

AFAICT the correct implementation should be:

static __inline__ void __attribute__((__always_inline__, __nodebug__))
__movsb(unsigned char *__dst, unsigned char const *__src, size_t __n)
{
   __asm__ ("rep movsb"
     : "+D"(__dst), "+S"(__src), "+c"(__n), "=m"(*(char (*)[])__dst)
     : "m"(*(char (*)[])__src)
   );
}

Can you file a bug on this? I suspect it was broken in r290539, when the clobber list was removed because clobbering an input or output was made an error.
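
For anyone who wants to check the in-out register constraints in isolation, here is a minimal standalone sketch. It is not the header's actual code: `movsb_copy` and `movsb_copy_demo` are hypothetical names, and instead of the array-typed "m" operands proposed above it uses a blanket "memory" clobber, which is coarser but tells the compiler the same thing (memory is read and written). On targets other than x86-64 it falls back to a plain byte loop so the file still compiles.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper for illustration; not the real __movsb from the header. */
static inline void movsb_copy(unsigned char *dst, const unsigned char *src, size_t n)
{
#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
    /* "+D", "+S", "+c": RDI, RSI and RCX are both read and modified.
       "memory": assumption made for this sketch, standing in for the
       precise "=m"/"m" operands; it forces the compiler to assume
       arbitrary memory is read and written by the instruction. */
    __asm__ __volatile__("rep movsb"
                         : "+D"(dst), "+S"(src), "+c"(n)
                         :
                         : "memory");
#else
    /* Portable fallback so the sketch builds on non-x86-64 targets. */
    while (n--)
        *dst++ = *src++;
#endif
}

/* Copies a small buffer and checks that the destination matches the source. */
static void movsb_copy_demo(void)
{
    unsigned char src[16] = "rep movsb test";
    unsigned char dst[16] = {0};

    movsb_copy(dst, src, sizeof src);
    assert(memcmp(dst, src, sizeof src) == 0);
}
```

Calling `movsb_copy_demo()` from a test program built with optimization enabled is a quick way to see whether the compiler keeps the copy and reloads the destination afterwards. The precise memory operands in the proposal above should allow better code than the blanket clobber used here, since they only invalidate the two buffers involved.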