suggested x86 peephole optimization: use bit rotate to save one instruction during mask creation

This kind of pattern is very common when dealing with bit arrays:

unsigned long clear_mask(int bit)
{
    return ~(1UL << bit);
}

clang 3.6 at -O3 compiles this in the straightforward way, moving 1 into a register, shifting it left, then inverting all the bits:

movl $1, %eax
movb %dil, %cl
shlq %cl, %rax
notq %rax

gcc 4.8.2, however, is cleverer. It moves -2, i.e. ~1 (all ones except bit 0), into a register and then rotates it left by the bit index, which carries the single zero bit into place and saves one instruction:

movl %edi, %ecx
movq $-2, %rax
rolq %cl, %rax
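
The equivalence the gcc output relies on can be checked in C. The following is only a sketch: rotl_ul, clear_mask_shift and clear_mask_rotate are hypothetical names introduced here for illustration, and the rotate is written in portable form rather than as any particular compiler intrinsic. It assumes the width of unsigned long is a power of two and that bit is within that width.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper: rotate x left by n bits.
 * Assumes the bit width of unsigned long is a power of two,
 * so masking with (bits - 1) keeps both shift counts in range. */
static unsigned long rotl_ul(unsigned long x, unsigned n)
{
    const unsigned bits = sizeof(unsigned long) * CHAR_BIT;
    n &= bits - 1;
    return (x << n) | (x >> ((bits - n) & (bits - 1)));
}

/* The original form: shift a 1 into place, then invert. */
unsigned long clear_mask_shift(int bit)
{
    return ~(1UL << bit);
}

/* The rotate form: start from ~1 (only bit 0 clear) and rotate
 * the zero bit into position. */
unsigned long clear_mask_rotate(int bit)
{
    return rotl_ul(~1UL, (unsigned)bit);
}
```

Both functions should return the same value for every valid bit index, since rotating ~1 left by bit moves its only zero bit from position 0 to position bit.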

A fix is out for review here: http://reviews.llvm.org/D8350