So, currently, in LLVM data layouts, if we’ve declared pN:S:[alignment]:[alignment]:O, where S is the pointer size and O is the offset (index) size, you can either have address space N be integral (where, as I understand it, the pointer is assumed to address from [0, iS::umax]), or non-integral, which means that inttoptr and ptrtoint are non-deterministic.
For some cases - such as the buffer descriptors on AMD GPUs, which are, for these purposes, i80 metadata || i48 address = i128, or CHERI’s capability pointers (i64 tag || i64 address = i128) - both of these semantics are not quite right. These types of pointers aren’t indexes into a flat area of memory, but they also aren’t the sort of wild, non-deterministic GC-managed things that non-integral pointers can be.
So, it seems to me (after some discussion with @jrtc27 ) that neither of these semantics is quite right for hardware fat pointers. Non-integral pointers go too far and impose semantics that aren’t needed - like the fact that inttoptr may be non-deterministic (!). Those sorts of restrictions might make sense for things like garbage-collected pointers, but are too strong for a fat pointer.
Fat pointers do have an address component, and, as long as the compiler restricts itself to performing computations on the address component, fat pointers are just regular pointers.
So, I propose that, when an optimization pass inserts a ptrtoint/inttoptr pair, or otherwise starts modifying the bit value of a pointer, that transformation must not modify the high S - O bits of the integer value. That is, if you have p200:128:128:128:64, then given
%y = getelementptr i8, ptr addrspace(200) %x, i64 %idx
you could rewrite this to
%x.int = ptrtoint ptr addrspace(200) %x to i128
%metadata = and i128 %x.int, -18446744073709551616 ; 0xffffffff_ffffffff_00000000_00000000: keep the high 64 metadata bits, clear the address
%address = trunc i128 %x.int to i64
%address.y.trunc = add i64 %address, %idx ; arithmetic happens at the index width only
%address.y = zext i64 %address.y.trunc to i128
%y.int = or i128 %metadata, %address.y
%y = inttoptr i128 %y.int to ptr addrspace(200)
but not to
%x.int = ptrtoint ptr addrspace(200) %x to i128
%idx.ext = zext i64 %idx to i128
%y.int = add i128 %x.int, %idx.ext
%y = inttoptr i128 %y.int to ptr addrspace(200)
because the 128-bit add in the latter can carry out of the low 64 address bits and change the metadata bits.
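To make the difference between the two rewrites concrete, here is a minimal C++ sketch that models a p200:128:128:128:64 pointer as a pair of 64-bit halves. The names FatPtr, gep_masked, and gep_full_width are hypothetical and only exist for this illustration; nothing here is LLVM API.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of a 128-bit fat pointer with a 64-bit index width:
// the high 64 bits are opaque metadata, the low 64 bits are the address.
struct FatPtr {
    std::uint64_t metadata; // high S - O = 64 bits
    std::uint64_t address;  // low O = 64 bits
};

// Models the allowed rewrite: the add happens at index width (i64), so any
// wraparound stays confined to the address and the metadata is copied as-is.
inline FatPtr gep_masked(FatPtr p, std::uint64_t idx) {
    return FatPtr{p.metadata, p.address + idx};
}

// Models the disallowed rewrite: a full 128-bit add of the zero-extended
// index. A carry out of the low half propagates into the metadata.
inline FatPtr gep_full_width(FatPtr p, std::uint64_t idx) {
    std::uint64_t low = p.address + idx;
    std::uint64_t carry = (low < p.address) ? 1 : 0;
    return FatPtr{p.metadata + carry, low};
}

// Small usage example: an address near the top of the 64-bit range.
inline void check_metadata_is_preserved() {
    FatPtr p{0xAB, 0xFFFFFFFFFFFFFFF0ull};
    assert(gep_masked(p, 0x20).metadata == 0xAB);     // metadata untouched
    assert(gep_full_width(p, 0x20).metadata == 0xAC); // carry corrupted it
}
```

Both functions agree whenever the address add does not wrap, which is the situation the full-width form relies on.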
However, if the getelementptr were inbounds, the latter rewrite would be possible, because the inbounds flag (as far as I know) means that adding the offset to the pointer won’t produce a carry.
Note that, for typical pointers - where the offset size and the pointer size are the same - the and produces a 0, the truncations and extensions are no-ops, and so the or is also just the result of the addition, recovering the original transformation at no extra cost.
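The degenerate case can be checked with a small C++ sketch that parameterizes the pointer width S and the index width O. This is only a model on 64-bit integers (so it assumes O <= S <= 64), and the helper names low_mask and add_preserving_metadata are made up for the example.

```cpp
#include <cassert>
#include <cstdint>

// Mask with the low n bits set (n may be 0..64).
inline std::uint64_t low_mask(unsigned n) {
    return n >= 64 ? ~0ull : ((1ull << n) - 1);
}

// Hypothetical helper: add idx to the low O address bits of an S-bit pointer
// value, leaving the high S - O bits untouched. Assumes O <= S <= 64.
inline std::uint64_t add_preserving_metadata(std::uint64_t bits, unsigned S,
                                             unsigned O, std::uint64_t idx) {
    std::uint64_t address_mask = low_mask(O);
    std::uint64_t metadata = bits & low_mask(S) & ~address_mask; // the "and"
    std::uint64_t address = (bits + idx) & address_mask; // trunc + add + zext
    return metadata | address;                           // the "or"
}

// Usage example covering both the ordinary and the fat-pointer case.
inline void check_degenerate_case() {
    // S == O: the metadata mask is empty, so this is just a plain add.
    assert(add_preserving_metadata(0x1000, 64, 64, 8) == 0x1008);
    // S = 32, O = 16: the address wraps, the high 16 bits stay put.
    assert(add_preserving_metadata(0xABCDFFFFull, 32, 16, 1) == 0xABCD0000ull);
}
```

When S == O the masking steps are ones a constant folder would be expected to remove, which matches the "no extra cost" observation above.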
A downside of this approach is that it adds complexity for anyone who wants to do integer arithmetic on pointer values: they’ll need to keep track of the extra rule, and they’re likely to trip over it if they’re not targeting a platform that has fat pointers.
One upside, though, is that many of the optimizations locked behind isNonIntegralAddressSpace() that are, with some care, applicable to fat pointers could be made applicable to such pointers, improving code generation and not saddling fat pointers with semantics that they don’t have.
On top of that, a quick skim of the isNonIntegralAddressSpace() calls lying around shows that most of them are used in contexts where the compiler wants to perform bitcasts and will not be doing any arithmetic on the pointer values, which is a case where fat pointers can be bitcast with no trouble. The more complicated “treat pointers as integers” sections, like the loop optimizer, could probably be gated behind getPointerSizeInBits(AS) != getIndexSizeInBits(AS) instead.
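As a rough sketch of what that gating could look like, here is a self-contained C++ mock. AddrSpaceLayout and both predicates are hypothetical stand-ins for the data layout queries named above, not the real LLVM classes, and the split between the two checks is my reading of the idea rather than an existing rule.

```cpp
#include <cassert>

// Hypothetical stand-in for the per-address-space data layout information.
struct AddrSpaceLayout {
    unsigned pointer_size_bits; // S
    unsigned index_size_bits;   // O
    bool non_integral;          // GC-style pointers with unstable bit values
};

// Bit-preserving reinterpretation is fine for fat pointers; only truly
// non-integral address spaces would need to opt out.
inline bool may_reinterpret_bits(const AddrSpaceLayout &l) {
    return !l.non_integral;
}

// Full-width "treat the pointer as an integer" arithmetic is only safe when
// there are no metadata bits, i.e. when pointer size equals index size.
inline bool may_do_full_width_arithmetic(const AddrSpaceLayout &l) {
    return !l.non_integral && l.pointer_size_bits == l.index_size_bits;
}

// Usage example: a flat 64-bit space versus a p200:128:128:128:64 space.
inline void check_gating() {
    AddrSpaceLayout flat{64, 64, false};
    AddrSpaceLayout fat{128, 64, false};
    assert(may_do_full_width_arithmetic(flat));
    assert(may_reinterpret_bits(fat));
    assert(!may_do_full_width_arithmetic(fat));
}
```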
What do folks think? (also @arsenm since I’m rambling about AMD’s stuff and you might have thoughts)