X86 missed optimization (PR24052)

Hi all,

In the context of PR24052 [1] I'm looking at how to generate LEA
instructions to handle i8 multiplications by 3, 5, and 9.
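
To make the target concrete, here is a small standalone C++ check (plain
C++, not LLVM code). It models "lea (%reg,%reg,scale)" on a 32-bit
register followed by a read of the low byte, and compares that against a
wrapping 8-bit multiply. The names mul_via_lea and lea_matches_i8_mul
are made up for illustration, and the model assumes the input is
zero-extended.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of `lea (%reg,%reg,scale), %dst` on a 32-bit
// register, followed by reading the low byte of the result.
// scale is 2, 4, or 8, which corresponds to multiplying by 3, 5, or 9.
std::uint8_t mul_via_lea(std::uint8_t x, std::uint32_t scale) {
    std::uint32_t wide = x;                      // zero-extended input
    std::uint32_t result = wide + wide * scale;  // base + index * scale
    return static_cast<std::uint8_t>(result);    // keep only the low 8 bits
}

// Returns true if the model agrees with a wrapping i8 multiply for
// every 8-bit input and every scale of interest.
bool lea_matches_i8_mul() {
    const std::uint32_t scales[] = {2, 4, 8};
    for (std::uint32_t scale : scales) {
        for (unsigned x = 0; x < 256; ++x) {
            std::uint8_t expected =
                static_cast<std::uint8_t>(x * (scale + 1));
            if (mul_via_lea(static_cast<std::uint8_t>(x), scale) != expected)
                return false;
        }
    }
    return true;
}
```

So the low byte of the 32-bit LEA result is the value the i8 multiply
should produce.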

I tried adding a pattern to match one of the interesting
multiplications (e.g. def : Pat<(mul GR8:$src1, 9), ...>), and this
works on the input side (the pattern is matched). The problem with
this approach is that I don't know how to produce the correct output.
I have experimented with combinations of LEA32r, EXTRACT_SUBREG, and
SUBREG_TO_REG, but I have not found a form that TableGen accepts.

For i32 this is handled by a ComplexPattern with matcher function
"X86DAGToDAGISel::selectLEAAddr". i16 arithmetic operations are
promoted to i32, so the same pattern is used for i16 as well. The
difference for i8 is that I don't think promoting i8 arithmetic in
general is wanted.

I see two possible approaches:
1. Implement a "peephole promotion" that promotes only these specific
i8 multiplications to i32 before ISel. If this is a viable strategy,
are there any existing examples I could look at?
2. Continue with the ISel pattern-matching approach.
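
As a sanity check on option 1, here is a minimal standalone sketch
(again plain C++, not LLVM code) of "promote, multiply, truncate". The
names promoted_mul, upper_bits, and upper_bits_do_not_matter are
hypothetical. upper_bits stands in for whatever happens to sit in bits
8..31 of the wider register, on the assumption that the promotion does
not need to clear them.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of promoting an i8 multiply to i32 and truncating
// the result. Bits 8..31 of the widened value are arbitrary.
std::uint8_t promoted_mul(std::uint8_t x, std::uint32_t upper_bits,
                          std::uint32_t factor) {
    std::uint32_t wide = (upper_bits << 8) | x;  // low byte is x
    std::uint32_t product = wide * factor;       // 32-bit multiply
    return static_cast<std::uint8_t>(product);   // truncate back to 8 bits
}

// Returns true if the truncated result equals the wrapping i8 product
// regardless of the sampled upper bits.
bool upper_bits_do_not_matter() {
    const std::uint32_t factors[] = {3, 5, 9};
    const std::uint32_t junk[] = {0x000000, 0x000001, 0xABCDEF, 0xFFFFFF};
    for (std::uint32_t factor : factors) {
        for (std::uint32_t upper : junk) {
            for (unsigned x = 0; x < 256; ++x) {
                std::uint8_t expected =
                    static_cast<std::uint8_t>(x * factor);
                if (promoted_mul(static_cast<std::uint8_t>(x), upper,
                                 factor) != expected)
                    return false;
            }
        }
    }
    return true;
}
```

If this model is right, the upper bits never reach the low byte, so the
promotion itself looks arithmetically safe; what I am missing is how to
express it in the backend.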

At this point I don't know how to proceed and would appreciate any
tips and pointers to get me moving in the right direction.

[1] https://llvm.org/bugs/show_bug.cgi?id=24052

Best regards