I’m not sure what the usual “ping time” is for llvm-dev, but may I ask if there are any updates on this?
It appears that the following lines are the root cause of the reordering (https://github.com/llvm/llvm-project/blob/fdcb27105537f77c78c4473d4f7c47146ddbab69/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp#L2175):
// Handle gep(bitcast x) and gep(gep x, 0, 0, 0).
Value *StrippedPtr = PtrOp->stripPointerCasts();
PointerType *StrippedPtrTy = cast(StrippedPtr->getType());
Namely, the stripPointerCasts function also strips address spaces.
Would replacing this by stripPointerCastsSameRepresentation be an acceptable solution, or do other parts of LLVM rely on this reordering?
If not, I can send a patch in Phabricator to address this issue.
I missed this thread initially, but we hit this issue in CHERI a year or so ago and have disabled the relevant bit of InstCombine in our downstream fork. In CHERI, AS200 is used to represent memory capabilities (hardware-enforced fat pointers that do bounds and permission checks), AS0 is used to represent 64-bit (or 32-bit) integer virtual addresses that are implicitly offset and bounds checked against a default capability. An address space cast transforms between these two representations. If an AS200 pointer points to an object that is not fully within the default data capability then the AS200->AS0 cast will give null. Any AS0 -> AS200 cast will give the rights to the entire default data capability (which, in a pure-capability ABI program, may be no rights at all!). There are a number of corner cases where this transform is not sound.
There are a number of places in LLVM that assume AS cast and bitcast are equivalent. This is unfortunate because, if they were, there would be no point in AS cast existing). The semantics of AS cast (or ASs in general) are not particularly well defined. Matthew Arsenault has been doing a great job trying to pin them down.
This is probably the right solution. Nothing should be relying on this anyway
It appears that this behaviour of InstCombine is at least somewhat intended, as there are several tests that fail after the change mentioned before: cast.ll, gep-addrspace.ll and getelementptr.ll in test/Transforms/InstCombine.
I feel a little uncomfortable applying the patch after knowing this, and removing those tests doesn’t seem like a great solution.
What are your thoughts?
For now, I think I can work around my original issue since the CUDA driver seems to optimise away the additional ADD in most cases.
These are checking for the current behavior, it’s only natural tests would need updating for this change. You should just update them to show the new result
We hit on this issue internally with GEP and addrspacecast being reordered. Curious if you have a patch out for review?