X86 peephole optimization

So I've been working on what amounts to peephole optimization because
my application is absolutely blowing up from poor instruction selection.

For example, this is just wrong:

0000000000005df3 btq $0x16, %rax

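I thought that maybe there was a peephole optimizer which 'relaxed' this BTQ
into a normal BT $0x16,EAX, but that line is the actual output from otool.
The bit index 0x16 is below 32, so the 32-bit form tests the same bit
without the REX.W prefix and is one byte shorter. This should be handled in
a peephole pass for X86, which doesn't currently exist.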
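
To make the size difference concrete, here is a minimal C++ sketch that
assembles the register/immediate form of BT by hand. The helper names
encodeBTri and canNarrowBT are hypothetical and are not LLVM API, and the
sketch only covers the low eight registers (no REX.B). It assumes the
documented encoding 0F BA /4 ib, with a REX.W prefix (0x48) for the 64-bit
form.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper, not part of LLVM: encode "bt $imm, reg" for
// registers 0-7 (rax/eax = 0). The opcode is 0F BA /4 ib; the 64-bit
// form needs a REX.W prefix (0x48) in front of it.
std::vector<uint8_t> encodeBTri(unsigned reg, uint8_t imm, bool is64) {
  assert(reg < 8 && "sketch only handles registers without REX.B");
  std::vector<uint8_t> bytes;
  if (is64)
    bytes.push_back(0x48); // REX.W
  bytes.push_back(0x0F);
  bytes.push_back(0xBA);
  // ModRM: mod=11 (register direct), reg field=/4, rm=register number.
  bytes.push_back(static_cast<uint8_t>(0xC0 | (4u << 3) | reg));
  bytes.push_back(imm);
  return bytes;
}

// Hypothetical check: for a register operand the 32-bit form tests the
// same bit whenever the immediate bit index is below 32.
bool canNarrowBT(uint8_t imm) { return imm < 32; }
```

Under these assumptions, encodeBTri(0, 0x16, true) gives 48 0F BA E0 16
(five bytes) for btq $0x16, %rax, while encodeBTri(0, 0x16, false) gives
0F BA E0 16 (four bytes) for btl $0x16, %eax.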

So I'm writing it.

This pass obviates the patch I submitted for LowerToBT(), which can be
dropped. I also think there's a lot of instruction selection gymnastics
which would be better postponed to this pass as well. Better to do that all
in one pass and in one place. But that's for later.

I'm currently handling this BTQ REX problem as well as substituting in the
smallest and/or fastest BT or TEST instruction. I'll be adding support for
accumulator and short instruction forms. This could be merged with
X86FixupLEAs and perhaps X86CallFrameOptimization, as they perform similar
tasks. At least they could be put in the same file if not the same pass.
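
As a rough illustration of the substitution described above, the following
C++ sketch picks a short encoding for testing a single bit of RAX. It is
not the pass itself: BitTestChoice and pickAccumulatorBitTest are
hypothetical names, only code size is considered (not speed), and only
three candidate forms are compared. It also assumes the consumer of the
flags is rewritten when the result moves from CF (BT) to ZF (TEST).

```cpp
#include <cassert>
#include <string>

// Hypothetical result type for this sketch; not LLVM API.
struct BitTestChoice {
  std::string Asm; // AT&T-syntax form that was chosen
  unsigned Size;   // encoded length in bytes
  bool UsesZF;     // TEST reports the bit via ZF (inverted), BT via CF
};

// Pick the shortest of three candidate encodings for testing one bit of
// RAX. All three only write flags, so the register value is unaffected.
BitTestChoice pickAccumulatorBitTest(unsigned Bit) {
  assert(Bit < 64 && "bit index out of range for a 64-bit register");
  if (Bit < 8)
    return {"testb $imm8, %al", 2, true};  // A8 ib (accumulator form)
  if (Bit < 32)
    return {"btl $imm8, %eax", 4, false};  // 0F BA E0 ib
  return {"btq $imm8, %rax", 5, false};    // 48 0F BA E0 ib (REX.W)
}
```

For the bit index 0x16 from the otool output, this sketch selects the
four-byte btl form; the REX.W form is only kept for bit indices 32 and up.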

BTW, I still believe my Const Hoisting patch is very necessary.
This peephole optimizer doesn't change that at all.

Chris

And now I see X86MCInstLower, which doesn't handle BT.
I'll see if I can just add the missing cases to that.
Indeed, that should be a LOT easier.