Hi all,
LLVM has long had a restriction that the sizes of pointer data types in the LLVM backend must be powers of two. While this works for many architectures, in some cases it is too restrictive. Because this is such a long-standing assumption, I’d like to get wider review on a patch here: https://reviews.llvm.org/D114141 that starts to relax this constraint. The patch allows DataLayouts to be parsed correctly when the pointer size is an arbitrary value. Note that this does not change the alignment requirements for pointers, only the types that are permissible and used when InstructionSelection lowers pointers. We have so far used this effectively on out-of-tree targets and feel it is general enough to live upstream.
Steve Neuendorffer
Returning to this topic: there is a question about the existing getPointerSize() API, which returns the size of a pointer in bytes and is used by some architecture-independent code. Previously we implemented this to return the size rounded up to whole bytes: i.e. for a pointer size of 20 bits, getPointerSize() would return ceil(20/8) = 3. However, it seems that some users of the getPointerSize() API (notably code in the AsmPrinter handling DWARF information) assume that getPointerSize() will return a power of two.
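To make the two candidate behaviours concrete, here is a minimal sketch of the arithmetic. The helper names bytesForBits and nextPowerOfTwo are hypothetical and are not LLVM APIs; they only illustrate rounding a bit width up to whole bytes versus rounding that byte count up to a power of two.

```cpp
#include <cstdint>

// Hypothetical helper (not an LLVM API): bytes needed to hold `Bits` bits,
// i.e. ceil(Bits / 8).
constexpr uint32_t bytesForBits(uint32_t Bits) { return (Bits + 7) / 8; }

// Hypothetical helper (not an LLVM API): smallest power of two >= N,
// for N >= 1.
constexpr uint32_t nextPowerOfTwo(uint32_t N) {
  uint32_t P = 1;
  while (P < N)
    P <<= 1;
  return P;
}
```

For a 20-bit pointer, bytesForBits(20) gives 3, while nextPowerOfTwo(bytesForBits(20)) gives 4, which is the kind of value the DWARF-related code appears to expect.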
There’s a patch for review here that ensures this: D122758, “DataLayout::getPointerSize() should always return a power of 2”.
It’s unclear to me what the broader implications of this are, since the ‘size’ of a pointer could have multiple interpretations: The size in a register, the size as represented in memory, the size as represented in DWARF, etc. Input would be appreciated.
Steve
I’m currently leaning towards removing the getPointerSize() API entirely and migrating all the internal uses to call getPointerSizeInBits(). Looking more closely at the users, it seems that some are actually using getPointerSize() in contradictory ways, or in ways that don’t really have anything to do with pointers. For instance, this code in OpenMPOpt.cpp:
    const unsigned int PointerSize = DL.getPointerSize();
    for (Instruction &I : *BB) {
      if (&I == &Before)
        break;
      if (!isa<StoreInst>(&I))
        continue;
      auto *S = cast<StoreInst>(&I);
      int64_t Offset = -1;
      auto *Dst =
          GetPointerBaseWithConstantOffset(S->getPointerOperand(), Offset, DL);
      if (Dst == &Array) {
        int64_t Idx = Offset / PointerSize;
        StoredValues[Idx] = getUnderlyingObject(S->getValueOperand());
        LastAccesses[Idx] = S;
      }
    }
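As a rough illustration of why the division above is sensitive to what “pointer size” means, here is a small sketch. It assumes, purely for illustration, a 20-bit pointer whose array elements occupy 4-byte slots in memory; the thread does not specify that layout, and slotIndex is a hypothetical helper, not code from OpenMPOpt.cpp.

```cpp
#include <cstdint>

// Hypothetical helper (not from OpenMPOpt.cpp): map a byte offset into an
// array to an element index, given the per-element stride in bytes.
constexpr int64_t slotIndex(int64_t OffsetInBytes, int64_t StrideInBytes) {
  return OffsetInBytes / StrideInBytes;
}
```

Under that assumption, the fourth element sits at byte offset 12. Dividing by the 4-byte stride gives index 3, as intended, whereas dividing by a getPointerSize() of 3 gives index 4. What the code seems to want is the stride of the array elements rather than the width of a pointer.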
Any objections to taking this course?