Nobody is going to intentionally tell their frontend to emit “gep x, 0”. The question is about the semantics if a variable offset is zero at runtime.
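For a concrete picture of how a zero offset shows up without anyone writing it, here is a minimal C++ sketch. The names `advance` and `example` are made up for illustration. The assumption is that the frontend lowers `p + n` to an inbounds GEP with a variable index, which is what Clang typically does for pointer arithmetic.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: the frontend only sees ordinary pointer arithmetic and
// (assuming Clang-like lowering) emits something along the lines of
//   %r = getelementptr inbounds i32, ptr %p, i64 %n
// with no idea whether %n will turn out to be zero.
int *advance(int *p, std::size_t n) {
    return p + n;
}

// With n == 0 at runtime, the GEP above behaves like "gep x, 0".
void example() {
    int a[4] = {0, 1, 2, 3};
    assert(advance(a, 0) == a);
}
```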
In general, LLVM has to assume that arbitrary addresses it doesn’t know anything about may correspond to a platform-specific memory allocation. So in practice, you’re fine here.
We might want to come up with a rule that more explicitly blesses this, though.
The set of valid offsets is determined by the provenance.
If we convert an arbitrary integer literal to a pointer, we have to invent a provenance, which describes the set of legal offsets. How exactly that invention process works is not completely clear, but under any set of reasonable rules, the address itself has to point into the described object, so an offset of zero is going to be legal.
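To make the provenance argument concrete, here is a toy C++ model. It is only a sketch: `Provenance`, `Ptr`, `offset_is_legal` and `check_zero_offset_is_legal` are names invented here, not LLVM APIs, and the three candidate provenances just stand in for whatever the real invention rule turns out to be. The one assumption carried over from the text is that an invented provenance has to contain the address itself.

```cpp
#include <cassert>
#include <cstdint>

// Toy model: a provenance describes an object occupying [base, base + size).
struct Provenance {
    std::uint64_t base;
    std::uint64_t size;
};

struct Ptr {
    std::uint64_t addr;
    Provenance prov;
};

// An offset is legal if the resulting address stays inside the object or
// lands exactly one past its end.
bool offset_is_legal(const Ptr &p, std::int64_t off) {
    std::uint64_t target = p.addr + static_cast<std::uint64_t>(off);
    return target >= p.prov.base && target <= p.prov.base + p.prov.size;
}

// Three hypothetical ways to invent a provenance for the literal 0x4000.
// They disagree about nonzero offsets, but all of them contain the address,
// so all of them accept an offset of zero.
void check_zero_offset_is_legal() {
    const std::uint64_t addr = 0x4000;
    const Ptr guesses[] = {
        {addr, {addr, 1}},         // just the addressed byte
        {addr, {addr, 0x1000}},    // a 4 KiB region starting at addr
        {addr, {0x3ff0, 0x20}},    // a small object straddling addr
    };
    for (const Ptr &p : guesses) {
        assert(offset_is_legal(p, 0));
    }
}
```

Which nonzero offsets are legal depends entirely on which provenance gets invented; zero is the one offset that does not depend on that choice.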
This shouldn’t really be that hard to explain, I think? It’s only weird if you try to explain the rules without mentioning provenance.
From an optimizer perspective, adding an exception for zero makes inbounds trickier to reason about: currently, we can assume “inbounds” pointers are actually inbounds relative to a base object, regardless of the exact operations involved. But with an exception for zero, we need to check that the pointer is produced by a series of inbounds operations.
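A toy C++ model of the difference, as a sketch rather than a description of how LLVM implements anything: poison is modelled as an empty `std::optional`, and `gep`, `gep_inbounds`, `gep_inbounds_zero_ok` and `compare_rules` are hypothetical names. The second inbounds variant is my guess at what "an exception for zero" would look like.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

struct Provenance { std::uint64_t base; std::uint64_t size; };
struct Ptr { std::uint64_t addr; Provenance prov; };

// In bounds means inside the object or exactly one past its end.
bool in_bounds(const Ptr &p) {
    return p.addr >= p.prov.base && p.addr <= p.prov.base + p.prov.size;
}

// Plain (non-inbounds) GEP: never poison in this model.
Ptr gep(const Ptr &p, std::int64_t off) {
    return {p.addr + static_cast<std::uint64_t>(off), p.prov};
}

// Rule without the exception: poison unless base and result are in bounds.
std::optional<Ptr> gep_inbounds(const Ptr &p, std::int64_t off) {
    Ptr r = gep(p, off);
    if (!in_bounds(p) || !in_bounds(r)) return std::nullopt;
    return r;
}

// Hypothetical rule with the exception: an offset of zero is always fine.
std::optional<Ptr> gep_inbounds_zero_ok(const Ptr &p, std::int64_t off) {
    if (off == 0) return p;
    return gep_inbounds(p, off);
}

void compare_rules() {
    Ptr base{1000, {1000, 16}};
    Ptr wild = gep(base, 100);  // out of bounds, but not poison

    // Without the exception, a non-poison inbounds result is always in bounds,
    // so this out-of-bounds input yields poison.
    assert(!gep_inbounds(wild, 0).has_value());

    // With the exception, the result is non-poison yet out of bounds, so the
    // final "inbounds" alone no longer tells the optimizer anything.
    std::optional<Ptr> q = gep_inbounds_zero_ok(wild, 0);
    assert(q.has_value() && !in_bounds(*q));
}
```

In the second case the optimizer would have to look back through `wild` and notice the non-inbounds step before concluding anything about `q`.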