Question about getelementptr inbounds with offset 0

I have a question (or rather a series of questions) about the behavior of getelementptr inbounds (GEPi) when the effective offset is 0. In the following, I will assume that it does not matter how that offset is computed – it might be array index 0, or a field at offset 0, or an arbitrary array index when the array element type has size 0. When I say that GEPi is “well-defined”, I mean that it does not return poison.

Also note that I am asking about the dynamic semantics of LLVM IR here, so basically I am asking what an omniscient optimizer would be allowed to do with such code. So, this is not about whether the offset is “known” to be 0 or not.

(I have checked the GEP FAQ and could not see my question answered there.)

Stage 0: null pointers

Based on the LangRef (“The only in bounds address for a null pointer in the default address-space is the null pointer itself.”), I assume the following is well-defined:

%ptr = i8* null
%res = getelementptr inbounds i8, i8* %ptr, i64 0

After all, both the input to GEPi (null) and its output (null) are in bounds of the same allocation (according to what I quoted above).

Stage 1: literal integer pointers

In Rust, we also heavily rely on the following being well-defined (for any integer constant N):

%ptr = inttoptr (i64 N to i8*)
%res = getelementptr inbounds i8, i8* %ptr, i64 0

Folklore has it that someone in the past talked to LLVM people to get “blessing” for this pattern, but the details of that have been lost to time and (as far as I can tell) this never made it into the LangRef. Can someone confirm that this is indeed well-defined – or, if this is considered “ill-defined by omission”, can this be made well-defined by a LangRef patch without having to change optimizations? I would hope so, both to ensure Rust is using LLVM correctly, and because at least to me this seems very similar to the NULL case.

Stage 2: dangling and out-of-bounds pointers

If the first two stages are fine, then the question comes up whether there are any pointers for which GEPi with offset 0 is not well-defined. In particular, is the following well-defined?

// Create a dangling pointer (i.e., used to point to
// an allocation but that allocation is gone)
%ptr = call i8* @malloc(i64 4)
call void @free(i8* %ptr)
// Now index into that
%res = getelementptr inbounds i8, i8* %ptr, i64 0

I am aware that the corresponding C code is definitely UB due to “pointer zapping” (pointer values become indeterminate when the allocation they point to is deallocated), but I am asking about LLVM semantics here, and to my knowledge nothing in the LLVM semantics indicates that LLVM has “pointer zapping”.

And what about the following?

// Create an out-of-bounds pointer
%base = call i8* @malloc(i64 2)
%ptr = getelementptr i8, i8* %ptr, i64 4
// Now offset that
%res = getelementptr inbounds i8, i8* %ptr, i64 0

The latter is fairly clearly not well-defined under the current LangRef. But given in particular that even null pointers can be offset by 0, it seems possible to me that optimizations could be fine with allowing any pointer to be offset by 0. Does that sound like a reasonable option for LLVM?

For Rust, it might be quite useful if this were well-defined, since it would let us remove some awkward special cases from our documentation. Right now, we allow offset(0) on pointers created by “casting any non-zero integer literal to a pointer, even if some memory happens to exist at that address and gets deallocated”, but we cannot allow it on all dangling pointers due to LLVM IR semantics. This regularly leads to confusion because it is hard to explain this rule.

(If LLVM maintains that GEPi by 0 can return poison for non-poison inputs, we could of course alternatively change the Rust compiler to remove inbounds whenever the offset might be 0, but that cannot always be statically predicted and it also seems rather fragile.)

@nlopes have you experimented with things like this in Alive?
I wonder how “always allow 0-offset in GEPi” would fare there.

1 Like

LangRef requires that for %q = gep inbounds %p, i both %q and %p must be inbounds, even if i=0.
So technically, your stage 1 pattern is only ok if %ptr is inbounds.

For stage 2, it’s also ok. We don’t have any limitation on gep movability. It can be hoisted across function calls, so it can operate over freed objects.
The 2nd example of stage 2 is not ok according to LangRef.
I only have 5 mins between classes, so not much time to think. But it might not be possible to special case gep 0 because of loop optimizations. You usually need to prove that none of the gep operations inside the loop will overflow. It seems that you need to know that the base case is within bounds already to prove the inductive +1 case.

The thing is, “inbounds” it not defined for pointers obtained from integers (i.e., pointers without provenance) and for ranges of size 0.

The usual argument is: there might be 0-sized allocations everywhere in the address space (LLVM has no way of knowing they do not exist), so we can pretend that 4 (or any other fixed address) is inbounds of a 0-sized allocation.

Oh, that is interesting. I would have said that that pointer is definitely not inbounds, since the allocated object it refers to no longer exists.

True. There are a couple of proposals in the air, but we didn’t get to standardize one of them yet.
You know the tradeoffs: on one hand we want free movement of geps, icmp, etc, on the other hand we don’t want super complicated semantics. But free movement of gep is painful with pointers obtained from integers.

If there was a inttoptr with an explicitly annotated provenance, then to make the Rust case work we would only require GEPi by 0 in inttoptr annotated with the “empty” provenance that cannot actually do any loads/stores.

So, the case Rust actually cares about should be easy to support with free movement of GEP.

I can see how it can become more complicated with inttoptr that “guesses” a suitable provenance. However, if GEPi by 0 is always defined, then again this should be fine wrt reordering.

I don’t quite follow here, so if you could write some more details once you have a bit more time that would be great. :slight_smile:

Sorry for the delay. Didn’t reply because I didn’t have much to add.

I run LLVM’s unit tests with Alive2 with your proposed semantics, where gep inbounds ptr, 0 is equivalent to gep ptr, 0. I didn’t get any new failure, but Alive2 doesn’t model inttoptr yet, so I don’t want to extrapolate too much from these results.

If this is something that would really benefit Rust, then we need to do a more thorough study.

Hi,

Doesn’t the question change after the move to opaque pointers in LLVM 15 ?
Won’t a gep with offset 0 be optimized out with opaque pointers?

Kind Regards

James

Nobody is going to intentionally tell their frontend to emit “gep x, 0”. The question is about the semantics if a variable offset is zero at runtime.

In general, LLVM has to assume that random addresses it doesn’t know anything about correspond to a platform-specific memory allocation. So in practice, you’re fine here.

We might want to come up with a rule that more explicitly blesses this, though.

The set of valid offsets is determined by the provenance.

If we convert an arbitrary integer literal to a pointer, we have to invent a provenance, which describes the set of legal offsets. How exactly that invention process works is not completely clear, but under any set of reasonable rules, the address itself has to point into the described object, so an offset of zero is going to be legal.

This shouldn’t really be that hard to explain, I think? It’s only weird if you try to explain the rules without mentioning provenance.


From an optimizer perspective, adding an exception for zero makes inbounds more tricky to reason about: currently, we can assume “inbounds” pointers are actually inbounds relative to a base object, regardless of the exact operations involved. But with an exception for zero, we need to check that the pointer is produced by a series of inbounds operations.

FWIW I think there are actually very few optimizations that use inbounds to reason about allocated objects. Almost always, inbounds is only used for its nowrap implications.

There’s only one place that comes to mind where we do care about being inbounds of an allocated object, which is this special case in BasicAA:

In this context, zero offsets are not relevant.

Edit: One more use where inbounds is used to reason about allocated objects is this code in CaptureTracking:

I think for this code, making all zero-offset GEPs inbounds would be problematic and require an implementation change.

2 Likes

Actually, I think the CaptureTracking code is already broken. I believe the intention here is to exclude cases like gep p, -q == null, which are equivalent to p == q and capture p. However, if we rewrite this as gep inbounds (gep p, -q), %zero == null, then it would not be considered as capturing. The GEP is inbounds here, because we explicitly consider a zero-offset GEP to be inbounds of the null pointer.

Uhm, it’s true that gep inbounds (gep p, -q), 0 is null if p == q, but it is poison in a bunch of other cases (e.g. if q > p.idx).
Since programs can’t detect if a value is poison or not, I think a program cannot use this scheme to escape p/q. (a program can freeze the comparison result, but it doesn’t gain any information from that)

So I think the code of CaptureTracking is fine.

Thanks!

Well, it would make the user-facing rules for this case simpler. But we don’t have a decision that we want to simplify the rules like that (some people think the current rules are better) – and right now it’d be somewhat moot to even start this discussion on the Rust side since we don’t know if the simpler rules can even be implemented by our main (LLVM) backend. Looks like we have a cyclic dependency here. :wink:

Right, that is basically how we have been rationalizing this. Good to hear that you agree. :slight_smile:

We might want to come up with a rule that more explicitly blesses this, though.

Yes, it would be great if we could point at something in the documentation, rather than just a forum thread.

This shouldn’t really be that hard to explain, I think? It’s only weird if you try to explain the rules without mentioning provenance.

Indeed, if we start with “all pointers have provenance, and it matters even for zero-sized accesses and offset-by-0”, then this is all quite coherent. The current rules even naturally fall out then, and our reference interpreter does not need to do anything special here. (LLVM has a special exception that NULL can be offset by 0, which is surprising to me given that the NULL pointer is never dereferenceable – so this looks like a special case to me, and we do not have that special case in Rust. But that’s okay of course, it is just a case of Rust having more UB than LLVM.)

Indeed, if there was getelementptr nowrap it could be used to sidestep basically all of the tricky questions.

Is that code a problem for “stage 1”, i.e., non-zero integer constants cast to a pointer and inbounds-offset by 0?

1 Like

Another facet of this is that the GEP semantics as currently specified require us to “unnecessarily” drop inbounds in certain transforms. I ran across Instcombine started loosing inbounds sometimes after 12 · Issue #51157 · llvm/llvm-project · GitHub today (and I remember a few similar cases in the past) where we transform c ? p +inbounds x : p into p + (c ? x : 0). We must drop the inbounds here, because we don’t have a guarantee that the original p was inbounds of an allocation, and thus whether the p + 0 can be marked as inbounds.

I think it may be worthwhile to explicitly make gep with a zero offsets always inbounds, with the following effects:

  1. The inbounds shortcut in the CaptureTracking code referenced above would have to be adjusted. This is the only place I was able to find that uses the current semantics.
  2. We can preserve inbounds during transforms that create a gep p, 0, such as the above select transform.
  3. We currently have a special rule that “gep null, 0” is inbounds. This rule would become that “gep p, 0” is inbounds. As such, I would argue that this does not meaningfully increase the complexity of our GEP semantics, just changes one special case to a slightly broader one.
1 Like

Having a way for frontends to get only the nowrap without worrying about inbounds would be awesome. :slight_smile:

Did you see Nuno’s reply to that above?


@nikic also suggests another perspective on getelementptr inbounds that I had not considered yet – the provenance of the pointer might not matter at all. After all, nothing in the docs says anything about the pointer being “derived from” something. If the requirement is solely that the pointer must have the address of an allocated object, and the resulting pointer must have an address in the same allocated object (but not that the pointer must actually be derived from that allocated object), then “stage 1” and “stage 2” are equivalent. Their only difference is in how the pointer is derived. So if we have consensus that “stage 1” is fine, it follows that offset-by-0 is fine for all pointers.

OTOH @efriedma-quic says

The set of valid offsets is determined by the provenance.

so provenance might be relevant after all?

On the Rust side we discovered that allowing offset-by-0 only for some pointers (directly cast from an integer literal) but not for others (derived from a now-dead allocated object) causes major problems and blocks some pretty fundamental optimizations. LLVM semantics of doing integer-typed loads of provenance-enriched data are not specified precisely enough to be able to say whether LLVM has the same problem, but even if it doesn’t, if LLVM doesn’t allow offset-by-0 on some pointers that will cause major pain for Rust.

IMO the ideal primitive to support Rust would be a getelementptr nowrap.

I’d love to make progress in this direction. What would it take to make this happen?

I’d be happy to write a patch that adjusts the LangRef, and help clarify any questions there, but looks like this would require changing some LLVM analyses/optimizations as well which is a bit too much for me.

I think it’s mostly a matter of addressing the first point, i.e. replacing the CaptureTracking code with something that is compatible with the new semantics, and checking whether that has any meaningful codegen impact.

To follow up on this, I have submitted a number of patches for the “make gep inbounds 0 always legal” change:

1 Like