So, currently, in LLVM data layouts, if we've declared that `S` is the pointer size and `O` is the offset size, you can either have address space `N` be integral (where, as I understand it, the pointer is assumed to address from `[0, iS::umax]`), or non-integral, which means that `ptrtoint` is non-deterministic.
For some cases - such as the buffer descriptors on AMD GPUs, which are, for these purposes, `i80 metadata || i48 address = i128`, or CHERI's capability pointers (`i64 tag || i64 address = i128`) - neither of these semantics is quite right. These types of pointers aren't indexes into a flat area of memory, but they also aren't the sort of wild, non-deterministic GC-managed things that non-integral pointers can be.
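To keep the `metadata || address` notation concrete, here's a quick Python sketch of those two packings (Python ints standing in for `i128` values; the `pack` helper is mine, purely illustrative):

```python
# Illustrative only: pack a fat pointer as metadata || address, using a
# Python int as the i128 container. Widths are the ones from the examples.

def pack(metadata: int, address: int, addr_bits: int) -> int:
    """Place `address` in the low addr_bits and `metadata` above it."""
    assert 0 <= address < (1 << addr_bits)
    return (metadata << addr_bits) | address

# AMD buffer descriptor: i80 metadata || i48 address = i128
amd_fat_ptr = pack(metadata=0x1, address=0x1000, addr_bits=48)

# CHERI capability: i64 tag/metadata || i64 address = i128
cheri_cap = pack(metadata=0x1, address=0x1000, addr_bits=64)

assert amd_fat_ptr == (1 << 48) | 0x1000
assert cheri_cap == (1 << 64) | 0x1000
```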
In particular, it seems to me (after some discussion with @jrtc27 ) that non-integral pointers go too far and impose semantics that aren't needed - like the fact that `inttoptr` may be non-deterministic (!). Those sorts of restrictions might make sense for things like garbage-collected pointers, but are too strong an assumption for a fat pointer.
Fat pointers do have an address component, and, as long as the compiler restricts itself to performing computations on that address component, fat pointers behave just like regular pointers.
So, I propose that, when an optimization pass inserts a `ptrtoint`/`inttoptr` pair, or otherwise starts modifying the bit value of a pointer, that transformation must not modify the high `S - O` bits of the integer value. That is, if you have `p200:128:128:128:64` and

```llvm
%y = getelementptr i8, ptr addrspace(200) %x, i64 %idx
```

you could rewrite this to
```llvm
%x.int = ptrtoint ptr addrspace(200) %x to i128
; mask off the address, keeping the metadata bits
%metadata = and i128 %x.int, u0xffffffffffffffff0000000000000000
%address = trunc i128 %x.int to i64
%address.y.trunc = add i64 %address, %idx
%address.y = zext i64 %address.y.trunc to i128
%y.int = or i128 %metadata, %address.y
%y = inttoptr i128 %y.int to ptr addrspace(200)
```
but not to
```llvm
%x.int = ptrtoint ptr addrspace(200) %x to i128
%idx.ext = zext i64 %idx to i128
%y.int = add i128 %x.int, %idx.ext
%y = inttoptr i128 %y.int to ptr addrspace(200)
```
because the latter could change the metadata bits.
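To make that failure mode concrete, here's a small Python sketch (my own simulation, with Python ints standing in for `i128` values and the 64/64 address/metadata split from the hypothetical `p200:128:128:128:64` layout) contrasting the two rewrites:

```python
# Simulation of the two rewrites above. Low 64 bits = address,
# high 64 bits = metadata (matching p200:128:128:128:64).

IDX_BITS = 64
ADDR_MASK = (1 << IDX_BITS) - 1

def full_width_add(p: int, idx: int) -> int:
    """The second rewrite: zext the index and add across all 128 bits."""
    return (p + idx) & ((1 << 128) - 1)

def masked_add(p: int, idx: int) -> int:
    """The proposed rewrite: add only within the low 64 address bits."""
    return (p & ~ADDR_MASK) | ((p + idx) & ADDR_MASK)

meta = 0xDEAD << IDX_BITS
p = meta | 0xFFFF_FFFF_FFFF_FFFF  # address at i64 max, so the add carries

assert masked_add(p, 1) >> IDX_BITS == 0xDEAD      # metadata preserved
assert full_width_add(p, 1) >> IDX_BITS == 0xDEAE  # carry corrupted metadata

# When the addition stays within the low 64 bits, the two forms agree:
assert full_width_add(meta | 0x1000, 8) == masked_add(meta | 0x1000, 8)
```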
However, if the `getelementptr` were `inbounds`, the latter rewrite would be possible, because the `inbounds` tag (as far as I know) means that adding the offset to the pointer won't produce a carry.
Note that, for typical pointers - where the offset size and the pointer size are the same - the `and` produces a 0, the truncations and extensions are no-ops, and so the `or` just yields the result of the addition, recovering the original transformation at no extra cost.
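As a quick sanity check on that claim, here's a Python sketch (mine, not authoritative) of the degenerate case where the pointer and index sizes match:

```python
# When S == O (ordinary integral pointers, e.g. 64-bit), the metadata
# mask is empty and the masked sequence degenerates to a plain add.

S = O = 64
ADDR_MASK = (1 << O) - 1
META_MASK = ((1 << S) - 1) ^ ADDR_MASK  # == 0 when S == O

def masked_gep_add(p: int, idx: int) -> int:
    metadata = p & META_MASK                         # and with 0 -> 0
    address_y = ((p & ADDR_MASK) + idx) & ADDR_MASK  # trunc/zext are no-ops
    return metadata | address_y                      # or with 0 -> just the add

assert META_MASK == 0
assert masked_gep_add(0x7FFF_0000, 0x10) == 0x7FFF_0010
```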
A downside of this approach is that it introduces a bunch of complexity for anyone wanting to do integer arithmetic on pointer values - complexity they'll need to keep track of, and are likely to trip over, if they're not targeting a platform that has fat pointers.
One upside, though, is that many of the optimizations currently locked behind `isNonIntegralAddressSpace()` are, with some care, applicable to fat pointers, and enabling them for such pointers would improve code generation without saddling fat pointers with semantics that they don't have.
On top of that, a quick skim of the `isNonIntegralAddressSpace()` calls lying around shows that most of them are used in contexts where the compiler wants to perform bitcasts and won't be doing any arithmetic on the pointer values - a case where fat pointers can be bitcast with no trouble. The more complicated “treat pointers as integers” sections, like the loop optimizer, could probably be gated behind `getPointerSizeInBits(AS) != getIndexSizeInBits(AS)` instead.
What do folks think? (also @arsenm since I’m rambling about AMD’s stuff and you might have thoughts)