Adding legal integer sizes to TargetData

Now that 2.5 is about to branch, I'd like to bring up one of Scott's favorite topics: certain optimizers widen or narrow arithmetic, without regard for whether the type is legal for the target. In his specific case, instcombine is turning an i32 multiply into an i64 multiply in order to eliminate a cast. This does simplify/reduce the number of IR operations, but an i64 multiply is dramatically more expensive than an i32 multiply on CellSPU.

There are a couple of different ways to look at this. On the one hand, I still strongly believe that codegen should be able to re-narrow operations (and it does on his testcase on i386). However, codegen is currently doing these optimizations on a per-basic-block basis, and we're not likely to have whole-function DAGs in the near future, so there is an inherent limit to its power.

An earlier place to handle this is in codegen prepare, which is global. However, the bad thing about this is that it would effectively require duplicating all the type legalization code in CGP, which is a pass we want to shrink, not grow. OTOH, the whole CGP pass is really a hack around selection dags not being whole-function.

A third way to handle this is to add to target data a notion of "native types". Instcombine could then be constrained to not do the widening/narrowing transformations when the original type (i32 in this case) was native but the destination type (i64) is non-native.
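
To make the proposed rule concrete, here is a minimal sketch in plain C++. It is not LLVM's TargetData API: the names `NativeIntInfo`, `nativeWidths`, `isNative`, and `mayChangeIntWidth` are hypothetical and exist only for illustration. It assumes the target simply lists the integer bit widths it treats as native, and it vetoes only the one case described above (native source type, non-native destination type).

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for the "native types" information a target would
// provide. This is an illustration only, not an existing LLVM interface.
struct NativeIntInfo {
  // Integer bit widths the target handles natively (assumed representation).
  std::vector<unsigned> nativeWidths;

  // True if an integer of the given width is listed as native.
  bool isNative(unsigned bits) const {
    return std::find(nativeWidths.begin(), nativeWidths.end(), bits) !=
           nativeWidths.end();
  }
};

// Decide whether an optimizer may rewrite integer arithmetic from fromBits
// to toBits. The only rejected case is native -> non-native. If the target
// lists no native widths, nothing counts as native, so nothing is rejected
// and the optimizer behaves as it would without target information.
inline bool mayChangeIntWidth(const NativeIntInfo &target, unsigned fromBits,
                              unsigned toBits) {
  assert(fromBits != 0 && toBits != 0 && "integer widths must be nonzero");
  return !(target.isNative(fromBits) && !target.isNative(toBits));
}
```

With an assumed native list of {8, 16, 32}, `mayChangeIntWidth(target, 32, 64)` rejects the i32-to-i64 widening, while narrowing from 64 to 32 and widening from 16 to 32 are still allowed. The native widths used here are an assumption for the example, not a statement about any real target.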

On the one hand, adding this to targetdata is simple and straightforward, with well-defined semantics. OTOH, it is somewhat ugly that IR canonicalization gets a bit more target-specific. On the third hand, instcombine already promotes indices of GEPs to match the pointer size, etc., so it wouldn't be too crazy for it to do this.

What do others think about this?

-Chris

I basically agree with Scott on this: we shouldn’t reintroduce types that
are illegal for the target after Legalize.

IR after Legalize is target-specific (indeed that’s Legalizer’s job), so I don’t see
why you should expect to treat it in a target-independent way. This seems
like the right fix to me. (I don’t offhand see why the separation into legal and
illegal types that we already have isn’t enough, but no doubt you’re right.)

I'm sorry, to be clear, this is mostly talking about an instcombine change. Obviously anything in codegen should respect current restrictions. The question is whether the mid-level optimizer should try to avoid introducing illegal types.

-Chris

I understand; I was stating a general principle, which I believe to be a good one.

I have always found this issue a little thorny in compilers. On one hand, we generally want to remove unnecessary IR instructions: a lower instruction count can speed up compilation, and it can also simplify target-independent optimization passes, because they can assume certain code patterns will not occur after a cleanup phase has run. On the other hand, I agree with Dale that it's generally not a good idea for a transformation to introduce an illegal type that would need to be undone in the code generation phase.

For this particular case, it sounds like undoing this transformation in CodeGen is not easy given our current framework, and doing the optimization doesn't help make other transformations simpler (e.g., loop bound calculations). If that is the case, the fix that you are suggesting seems to be the right approach.

-- Mon Ping