Using inttoptr/ptrtoint instead of getelementptr

My compiler front-end generates lots of inttoptr and ptrtoint instructions instead of getelementptr. This is a consequence of the design of my compiler's custom IR, which is translated to LLVM IR. I have read the Performance Tips for Frontend Authors doc, which says:

Use ptrtoint/inttoptr sparingly (they interfere with pointer aliasing analysis), prefer GEPs

I am using the default O3 optimization pipeline. How much does this affect LLVM's ability to optimize? What is the worst-case performance penalty? Does ptrtoint/inttoptr interfere with alias analysis all the time or only in specific cases? Does switching to the new opaque pointers improve this?

Thanks!

Passing a pointer to ptrtoint is considered a capture, which means that any pointer whose identity we cannot conclusively determine will be considered to alias it. In particular, any object passed to ptrtoint aliases any pointer returned by inttoptr (as well as those returned by ptr loads or calls; and of course, calls will also clobber the object).
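As a rough illustration of the capture effect, here is a minimal sketch in opaque-pointer IR. The names @slot and @captured are hypothetical, and a 64-bit target is assumed. Because the address of %x escapes as an integer, the store through %q has to be assumed to possibly alias %x, so the final load would not be expected to fold to the constant 1. Without the ptrtoint and the store to @slot, %x would be a non-escaping alloca and that fold would normally be available.

```llvm
; Hypothetical global, used only to give the integer address a real use.
@slot = global i64 0

define i32 @captured(i64 %n) {
  %x = alloca i32, align 4
  store i32 1, ptr %x, align 4
  ; ptrtoint captures %x: its address now exists as a plain integer.
  %xi = ptrtoint ptr %x to i64
  store i64 %xi, ptr @slot, align 8
  ; %q comes from an integer of unknown origin, so it may point at %x.
  %q = inttoptr i64 %n to ptr
  store i32 2, ptr %q, align 4
  ; This load may observe either 1 or 2 as far as alias analysis can tell.
  %v = load i32, ptr %x, align 4
  ret i32 %v
}
```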

I suspect that in practice, things might not be so bad, because we currently have some known-unsound folds that can eliminate ptrtoint/inttoptr pairs. But these will be going away at some point, so relying on this is a bad idea.

Why does your frontend use inttoptr and ptrtoint? If it’s just a matter of adding byte offsets, you can use getelementptr i8, ptr %p, i64 %offset for that purpose. With opaque pointers, this doesn’t even require inserting bitcasts anymore.
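For concreteness, here is a minimal sketch of the two ways to lower "pointer plus byte offset", written with opaque pointers. The function names @load_via_int and @load_via_gep and the i32 load are hypothetical, chosen only for illustration, and 64-bit pointers are assumed.

```llvm
; Integer round trip: the address arithmetic happens in the integer domain,
; so %q has no visible relationship to %p.
define i32 @load_via_int(ptr %p, i64 %offset) {
  %addr = ptrtoint ptr %p to i64
  %sum = add i64 %addr, %offset
  %q = inttoptr i64 %sum to ptr
  %v = load i32, ptr %q, align 4
  ret i32 %v
}

; GEP form: the same byte offset, but %q is visibly derived from %p.
define i32 @load_via_gep(ptr %p, i64 %offset) {
  %q = getelementptr i8, ptr %p, i64 %offset
  %v = load i32, ptr %q, align 4
  ret i32 %v
}
```

Both functions compute the same address; only the second keeps the derivation from %p visible to alias analysis. If the front-end can also guarantee that the result stays within the same allocated object, adding the inbounds keyword to the GEP gives the optimizer more to work with.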


The basic problem with ptrtoint and inttoptr is that inttoptr (ptrtoint x) cannot safely be optimized to x, for reasons that involve a more thorough discussion of pointer provenance than I can do justice to here. For more detail, see recent threads such as https://discourse.llvm.org/t/a-memory-model-for-llvm-ir-supporting-limited-type-punning or the blog post "Pointers Are Complicated III, or: Pointer-integer casts exposed".

In short, if you are exclusively using ptrtoint and inttoptr instead of getelementptr, you can probably expect a catastrophic loss of optimization opportunities, since no alias queries will be able to look through the inttoptr instructions (and if any do, it is likely a bug in that alias query, given the pointer provenance issues). This logic is not affected by opaque pointers at all, as pointer provenance is completely orthogonal to whether pointers carry a pointee type.


Outside of optimisation, CHERI targets such as Arm's Morello do not allow this kind of round trip. Pointers must be derived from pointers; a pointer derived from an integer is not usable at all and will trap if used with a load or store instruction. If your front end emits inttoptr, you will not be able to support such targets.