[RFC] SCEV Enhancements

I have a few enhancements to SCEV to help it identify more types of
recurrences. I want to send these upstream but I need a little
feedback.

I've added ptrtoint and inttoptr SCEV expressions so that recurrences
involving these constructs can be recognized. However, this scary
comment gives me pause:

  // It's tempting to handle inttoptr and ptrtoint as no-ops, however this can
  // lead to pointer expressions which cannot safely be expanded to GEPs,
  // because ScalarEvolution doesn't respect the GEP aliasing rules when
  // simplifying integer expressions.

Now, we don't handle them as no-ops because we create explicit
expression for them, similar to how ZExt and SExt work. So any pass
using the analysis has to explicitly look through a ptrtoint or inttoptr
while examining a SCEV expression.

These enhancements are critical for us because of the way our frontends
work (less than ideal but we have to deal with it), due to some language
quirks (casting in C, odd Fortran constructs, etc.) and because users
sometimes stretch the boundaries of good taste :).

Is this a reasonable approach? Is this acceptable to upstream?

Thanks!

                           -Dave

Right now, the main strategy is:

- If front-ends are using ptrtoint+add+inttoptr to do addressing,
   convert them to use bitcast+getelementptr+bitcast instead. You
   can do the getelementptr with i8* so there's no scaling [0]
   if all your offsets are raw byte offsets.

- If users are casting pointers to integers, do one or more of:
    (a) Tell them to fix their code.
    (b) Try to convert the code to use getelementptrs in instcombine.
    (c) Give up on higher-level optimizations, because such low-level
        code is can be considered to have already been hand-optimized.

This is a compromise, but there are several reasons why it makes
sense -- other passes like getelementptrs too, and it reduces complexity
to avoid having everything understand inttoptr when most people
don't need it. If you want to change this, it'd be helpful to show
a code example to show motivation.

[0] or i1* if you feel like being a purist ;-).

Dan

Dan Gohman <gohman@apple.com> writes:

These enhancements are critical for us because of the way our frontends
work (less than ideal but we have to deal with it), due to some language
quirks (casting in C, odd Fortran constructs, etc.) and because users
sometimes stretch the boundaries of good taste :).

Is this a reasonable approach? Is this acceptable to upstream?

Right now, the main strategy is:

- If front-ends are using ptrtoint+add+inttoptr to do addressing,
   convert them to use bitcast+getelementptr+bitcast instead. You
   can do the getelementptr with i8* so there's no scaling [0]
   if all your offsets are raw byte offsets.

We already do that as much as possible but it doesn't cover every case
currently. I will see if we can cover more cases.

- If users are casting pointers to integers, do one or more of:
    (a) Tell them to fix their code.

Not possible, unfortunately.

    (b) Try to convert the code to use getelementptrs in instcombine.

We added a pass to do this but it doesn't get everything. I'll see if
we can enhance it.

    (c) Give up on higher-level optimizations, because such low-level
        code is can be considered to have already been hand-optimized.

We can't do that either.

This is a compromise, but there are several reasons why it makes
sense -- other passes like getelementptrs too, and it reduces complexity
to avoid having everything understand inttoptr when most people
don't need it. If you want to change this, it'd be helpful to show
a code example to show motivation.

Sure. I'll try tweaking instcombine first and see where that gets us.
I have some pretty good testcases. I'd be thrilled if we could just
dump these SCEV changes. That's why I did the original GEP instcombine
stuff. I actually ripped out the SCEV changes after that and
performance tanked. :frowning:

[0] or i1* if you feel like being a purist ;-).

Err? What are the scaling semantics for i1? I don't immediately see
that it should be the same as i8.

                               -Dave

Dan Gohman <gohman@apple.com> writes:

These enhancements are critical for us because of the way our frontends
work (less than ideal but we have to deal with it), due to some language
quirks (casting in C, odd Fortran constructs, etc.) and because users
sometimes stretch the boundaries of good taste :).

Is this a reasonable approach? Is this acceptable to upstream?

Right now, the main strategy is:

- If front-ends are using ptrtoint+add+inttoptr to do addressing,
  convert them to use bitcast+getelementptr+bitcast instead. You
  can do the getelementptr with i8* so there's no scaling [0]
  if all your offsets are raw byte offsets.

We already do that as much as possible but it doesn't cover every case
currently. I will see if we can cover more cases.

Ok. This is generally worthwhile for other reasons as well.

- If users are casting pointers to integers, do one or more of:
   (a) Tell them to fix their code.

Not possible, unfortunately.

   (b) Try to convert the code to use getelementptrs in instcombine.

We added a pass to do this but it doesn't get everything. I'll see if
we can enhance it.

As above, this generally has other benefits too.

   (c) Give up on higher-level optimizations, because such low-level
       code is can be considered to have already been hand-optimized.

We can't do that either.

This is a compromise, but there are several reasons why it makes
sense -- other passes like getelementptrs too, and it reduces complexity
to avoid having everything understand inttoptr when most people
don't need it. If you want to change this, it'd be helpful to show
a code example to show motivation.

Sure. I'll try tweaking instcombine first and see where that gets us.
I have some pretty good testcases. I'd be thrilled if we could just
dump these SCEV changes. That's why I did the original GEP instcombine
stuff. I actually ripped out the SCEV changes after that and
performance tanked. :frowning:

Ok, this seems the best approach. If you hit something you can't fix,
we can re-evaluate.

[0] or i1* if you feel like being a purist ;-).

Err? What are the scaling semantics for i1? I don't immediately see
that it should be the same as i8.

Array elements are guaranteed to start on address-unit boundaries, so i1
elements get padded to the nearest address-unit size.

Dan

Dan Gohman <gohman@apple.com> writes:

We already do that as much as possible but it doesn't cover every case
currently. I will see if we can cover more cases.

Ok. This is generally worthwhile for other reasons as well.

Certainly.

Sure. I'll try tweaking instcombine first and see where that gets us.
I have some pretty good testcases. I'd be thrilled if we could just
dump these SCEV changes. That's why I did the original GEP instcombine
stuff. I actually ripped out the SCEV changes after that and
performance tanked. :frowning:

Ok, this seems the best approach. If you hit something you can't fix,
we can re-evaluate.

Sounds reasonable.

[0] or i1* if you feel like being a purist ;-).

Err? What are the scaling semantics for i1? I don't immediately see
that it should be the same as i8.

Array elements are guaranteed to start on address-unit boundaries, so i1
elements get padded to the nearest address-unit size.

Still looks strange to me. :slight_smile:

                                   -Dave