At the moment, we lower pointer math using int64_t and uint64_t offsets to the same IR (see example below).
This means we lose the information at the IR level that an uint64_t offset was unsigned and the pointer arithmetic shouldn’t overflow in the unsigned sense unless I am missing something. For unsigned offsets with bit widths smaller than the index width this is not a problem, because the GEP will use the offset zero-extended.
Is there anything that would prevent us from encoding this in the IR for the uint64_t example? If not, are there any preferences on how to possibly encode it?
One option could be to add a inbounds_unsigned GEP flag.
My preferred solution to this problem would be to split up inbounds into inbounds, nsw and nuw, where nsw is understood in the current “addition of signed offset to unsigned base” sense, and inbounds implies nsw.
Most transforms / analyses are interested in the nsw or nuw flag only (and quite often, they are interested in nuw only and currently prove it via the chain inbounds + non-negative → nsw + non-negative → nuw). These flags also have different preservation properties, e.g. we can preserve nuw when swapping GEPs, but not nsw (and thus also not inbounds).
Offsets are always sign-extended. So, combining that with nuw might be problematic. For example, InstCombine always creates sext instructions for each of the gep arguments: Compiler Explorer
So, does adding nuw means that offsets are then zero-extended? Or do we need another mechanism? Or do we assume frontends will give offsets in the right bitwdith?
While at it, I think it also makes sense to study whether it makes sense to special case the 0-idxs as it was proposed before. I don’t remember whether we reached a conclusion if it was worth it or not and what the implications on optimizations were.
Pointers themselves are not “signed” or “unsigned”. And pointer arithmetic in C – and “gep inbounds” in IR – is only defined within the bounds of an object, so I don’t see how any of this is actually a problem unless you admit to the possibility of having a single object which is larger than half the address space. And I don’t think we do; all sorts of other things are broken if you try to allow such an object…
That sounds like a good goal for the medium-term to me. I guess we might need to fudge the naming a bit during the transition so the existing inbounds accessors keep the inbounds_only nsw semantics.
It would probably simplify things in general if GEP would require its index operands to have the correct width and have the frontends emit the correct extensions (Which Clang should already do).
I might be missing something, but always using sign-extend for offsets might be sub-optimal for nuw, but shouldn’t cause any correctness issues? It’s probably not worth getting rid of the implicit extension/truncate semantics for GEP but IMO it should be fine to not worry about sub-optimal results for those cases.
I guess it would be easier to add than extending GEP semantics, but it would probably mean that a lot of places would need updating to handle GEP + the new intrinsic, which seems not ideal. IMO this allows encoding an important property for GEPs and it would be desirable to do this with the existing instruction.
I think the key here is that knowing the offset is non-negative allows us to prove additional properties using the fact that GEP arithmetic won’t overflow in the signed sense, which can be used by optimizations.
If the offset is guaranteed to be non-negative, we know that the following holds: GEP inbounds %base, 0 <=(unsigned) GEP inbounds %base, %offset. In LLVM IR, we can uses this property for reasoning in the @index_uint32_t example due to the offset being zero extended.
But at the moment we can’t do the same reasoning for @index_uint64_t, because in the IR there’s nothing indicating that the offset must be non-negative/does not wrap in unsigned sense. But IIUC, the C code the uint64_t offset must be non-negative (if interpreted as signed integer), unless you admit objects that can be larger than half the address space. If the uint64_t offset would be negative (if interpreted as signed integer) then the offset would be larger than half the address space.
Please let me know if I am missing something here, as the details can be quite tricky.
I’m confused, does this claim that pointers can never be decremented? Or have ptr[-1] sorts of constructs? (very handy in poking around in strings) Or is it just that you can’t use GEP to do these things?
But ptr + ((uint32_t)-1) won’t decrement a pointer, as indicated by the generated IR for the index_uint32_t case above for example. The new flag/semantics are targeting the case where unsigned offsets with the same bit width as the index width are used and the offset isn’t zero-extended.
Maybe I’m off base here but wouldn’t the signext/zeroext attributes make more sense as a way of indicating whether the offset is signed than nsw/nuw, since the latter is normally associated with wrapping behavior?
But at the moment we can’t do the same reasoning for @index_uint64_t , because in the IR there’s nothing indicating that the offset must be non-negative/does not wrap in unsigned sense
ISTM that it makes sense that an “inbounds” GEP offset is always signed; if you are providing an unsigned value in the source language, it should not be greater than signed-max. Which means some kind of non-negative value assertion on the index seems like it may be more appropriate than nuw.
Although, it worries me that Clang ubsan doesn’t check this. Given an unsigned idx, currently it checks that ptr + idx u>= ptr – but not that zext idx to ptr-width s>= 0. So, either that’s a missed validation, or I’m wrong somehow…
The sanitizer checks seems to only check for no unsigned wrap. I am not sure if the size restriction to half the address space is actually something that’s provided on the C/C++ level.
I think the main property that we need for reasoning about GEPs is nuw. Having a signed-positive index is a stronger guarantee which in turn implies nuw, but reducing it to nuw probably makes it easier for other frontends/languages to use, e.g. because the indices may not have to be signed-positive in those languages.
I put together a prototype to add nuw/nsw to GetElementPtr: ⚙ D136494 [IR] Add NUW/NSW to GetElementPtr. (WIP). The patch only adds the new fields (using SubclassData, all bits for SubclassOptionalData are already used unfortunately AFAICT), updates a few places in Clang to set nuw when possible and make use of it in the ConstraintElimination pass.
This should give a rough idea of what support for the new fields could look like. The patch currently has inbounds imply nsw, but nsw isn’t printed yet. If we would print it, I guess we would have to update every test containing an inbounds GEP which would be quite a lot.
It looks like nuw should be enough to cover the motivating cases I had, so I think it seems like going with encoding nuw instead of signed positive should make it easier for more frontends to use the new flag.