Are SCEV normal form?

Hello,

I assumed SCEV purpose was to be a normal form, but then I wondered which one of those is the normal form:

(zext i16 (trunc i32 %a to i16) to i32)

vs

(-((%a /u 65536) *u 65536) + %a)

The first one is nice for interval analysis, and known bit analysis.
The second one is nice when plugged into gep of 2d arrays.

Hello again,

On that subject, I have the following SCEV:

(zext i16 (trunc i32 %a to i16) to i32) + ((%a /u 65536) *u 65536)

That I would like to normalize into:

%a

I am not sure where is the most natural spot in which to do that?
Also, this pattern is much more general. The idea is that I want to coalesce bit ranges that have been extracted and put back together.

Should I detect patterns, or look for a more generic strategy? Do you have input on that?

Note that there is a slight difficulty due to the fact that we “sink” the trunc:

(zext i16 {0,+,1}<%bb> to i32) + (65536 * ({0,+,1}<%bb> /u 65536)

Here the recurrence lost it’s and got reduced to a i16 (on the left), but not on the right.

But we can prove:

  • that (zext i16 {0,+,1}<%bb> to i32) has the same 16 LSB than (i32 {0,+,1}<%bb>)
  • that (65536 * ({0,+,1}<%bb> /u 65536) has the same 16 MSB than (i32 {0,+,1}<%bb>)
    And therefore conclude that we can simplify it to (i32 {0,+,1}<%bb>).

The difficulty is to come up with the (i32 {0,+,1}<%bb>) in the first place.

Hi Alexandre,

Right now there is no phase where we simplify or canonicalize SCEV
expressions after their creation -- the shape of a SCEV expression is
expected to be canonical when you create it. For this reason the
right place to do the simplification you suggest is during the
creation of the SCEV expression itself (createSCEV).

If there is a general strategy here that isn't too complex, I'd
suggest going with that; otherwise detecting some simple patterns that
you care about is fine too.

As far as your initial question is concerned (which I am sorry for not
replying too -- it looks like it slipped through the cracks somehow):
if there isn't a universal good reason to pick one over the other, I
think we should just pick one as the normal form as per our
convenience and teach downstream users to how to effectively use it.
For instance, if we choose the zext-trunc version as the canonical
form then we should teach the 2D array GEP generation logic (not sure
if there is such a thing in LLVM today?) that is equivalent to the
udiv-mul form, perhaps by helpers on ScalarEvolution itself.

Thanks!
-- Sanjoy

Hi Sanjoy,