Hello, I’m trying to figure out what is and isn’t legal to do with the GEP instruction in LLVM, and I have some confusion that you can hopefully help me resolve.
Reading The Often Misunderstood GEP Instruction — LLVM 17.0.0git documentation:
Without the inbounds keyword, there are no restrictions on computing out-of-bounds addresses. Obviously, performing a load or a store requires an address of allocated and sufficiently aligned memory. But the GEP itself is only concerned with computing addresses.
This seems to imply that a non-inbounds GEP can be used to produce an out-of-bounds address, and that it’s perfectly legal to load or store through that address as long as it points to sufficiently aligned “allocated” memory.
However, reading a different section on the same page, The Often Misunderstood GEP Instruction — LLVM 17.0.0git documentation:
It’s invalid to take a GEP from one object, address into a different separately allocated object, and dereference it.
which seemingly contradicts the statement above (unless there is an implicit “inbounds” everywhere).
Also:
Can I compute the distance between two objects, and add that value to one address to compute the other address?
As with arithmetic on null, you can use GEP to compute an address that way, but you can’t use that pointer to actually access the object if you do, unless the object is managed outside of LLVM.
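To make the quoted case concrete, here is a minimal sketch of what I understand it to describe. The globals `@a` and `@b` and the function name `@load_b_via_a` are hypothetical names of mine, and I am assuming 64-bit pointers. The GEP computes the address of `@b` starting from `@a`; per the quoted answer, computing `%q` is allowed, but the load through it is the part that is not.

```llvm
@a = global i32 1
@b = global i32 2

define i32 @load_b_via_a() {
  ; Distance in bytes between the two separately allocated objects.
  %pa = ptrtoint ptr @a to i64
  %pb = ptrtoint ptr @b to i64
  %d = sub i64 %pb, %pa
  ; Non-inbounds GEP based on @a; numerically the result equals the address of @b.
  %q = getelementptr i8, ptr @a, i64 %d
  ; This access is what the quoted FAQ answer rules out.
  %v = load i32, ptr %q, align 4
  ret i32 %v
}
```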
What is meant by “object is managed outside of LLVM”? Does it mean that if some “managed outside” condition is met, then it’s legal to load/store through this pointer inside LLVM? Or was the intent only to allow producing such an address in the IR visible to LLVM (so “managed” in this case should be read as “accessed”)?
Taking all this into account, is the following transformation currently considered legal?
inttoptr ((ptrtoint %p) + 42)
->
gep %p, 42 ; no inbounds
What if this address is immediately used by a load or store?
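Spelled out as full IR, the two forms I am asking about would look roughly like the sketch below. The function names are hypothetical, I am assuming 64-bit pointers, and I am using an i8 element type so that 42 is a byte offset. Whether the first function may be rewritten into the second is exactly the question; the sketch does not assume an answer.

```llvm
; Form A: round trip through an integer.
define i8 @load_via_int(ptr %p) {
  %i = ptrtoint ptr %p to i64
  %j = add i64 %i, 42
  %q = inttoptr i64 %j to ptr
  %v = load i8, ptr %q, align 1
  ret i8 %v
}

; Form B: non-inbounds GEP based directly on %p.
define i8 @load_via_gep(ptr %p) {
  %q = getelementptr i8, ptr %p, i64 42
  %v = load i8, ptr %q, align 1
  ret i8 %v
}
```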