[RFC] LLVM IR should allow bitcast between address spaces with the same size

TL;DR

We propose the following change to LLVM IR:

  • Allow bitcast to support no-op pointer cast between pointers from different address spaces.
  • This bitcast is valid if the bit widths queried for addressspaces from datalayout match.
  • Overload CastIsValid call with datalayout argument to check validity of cast.
  • Update CastIsValid to allow bitcast between vector of pointers from different address spaces if total bit widths match.
  • GVN pass introduces ptrtoint/inttoptr for load which reinterprets bits from previous store.
    Instead use a no-op bitcast of ptrs from different address spaces.

Motivation

When addrspacecast was introduced, abilty to do no-op pointer bitcast from different address spaces has been removed.
Pointer sizes are always known from DataLayout which is now made mandatory in LLVM IR.
So, Bitcast can be analysed to be no-op cast by matching the pointer sizes from DataLayout.

Since there is no other way to do no-op reinterpret of bits, in some cases GVN pass introduces a ptrtoint/inttoptr pair.
After proper analysis, that a no-op bitcast can be done is concluded, then a bitcast can be introduced.
Usage of no-op pointer bitcast between addrspaces can be restricted to be used only by IR Transform passes but not by frontend.

For example consider the below IR:
GVN pass has discovered a reinterpretation of bits via a store followed by a load.

%struct.S.coerce = type { i32 addrspace(1)* }
%s.sroa.0 = alloca i32*, align 8, addrspace(5)
%0 = extractvalue %struct.S.coerce %s.coerce, 0
%1 = bitcast i32* addrspace(5)* %s.sroa.0 to i32 addrspace(1)* addrspace(5)*
%2 = addrspacecast i32 addrspace(1)* addrspace(5)* %1 to i32 addrspace(1)
store i32 addrspace(1)* %0, i32 addrspace(1) %2, align 8

%3 = load i32*, i32* addrspace(5)* %s.sroa.0, align 8, !tbaa !2

;GVN pass currently introduces no-op ptrotoint/inttoptr for load.
%3 = ptrtoint i32 addrspace(1)* %0 to i64
%4 = inttoptr i64 %3 to i32*

;If bitcast of pointers from different address is allowed, load can be replaced with no-op bitcast
%3 = bitcast i32 addrspace(1)* %0 to i32*

Implementation

  1. There are certain cases where standalone instructions are created without linking them to basicblock/function or module.
    In such cases DataLayout is not accessible by querying the module. To check validity of bitcast datalayout is mandatory.
    So CastInst::CastIsValid, CastInst::create etc have been overloaded to take DataLayout as argument.
   static bool castIsValid(Instruction::CastOps op, Value *S, Type *DstTy,
                           const DataLayout &DL);

   static CastInst *Create(
      Instruction::CastOps,   ///< The opcode of the cast instruction
      Value *S,               ///< The value to be casted (operand 0)
      Type *Ty,               ///< The type to which cast should be made
      const DataLayout &DL,   ///< DataLayout to check validity of bitcast
      const Twine &Name = "", ///< Name for the instruction
      Instruction *InsertBefore = nullptr ///< Place to insert the instruction
   );
  1. Verifier has been updated to check for validity of bitcast using datalayout.

  2. GVN pass has been updated to use bitcast for a load instead of emitting ptrtoint/ inttoptr.
    llvm/lib/Transforms/Utils/VNCoercion.cpp

Review link: âš™ D114533 LLVM IR should allow bitcast between address spaces with the same size.

My biggest concern here is that existing optimizations involving pointer bitcasts might not expect cross-address-space casts; they could get confused by the address-space change. They could make incorrect assumptions about aliasing, or they could just produce invalid code. Have you spent any time to try to figure out how much work is involved for this?

This example is not very good because the code is (very) likely undefined. It’s doing an implicit addrcast through load/store. That is illegal in general as addrcast may not be a NOP.
The other thing is that that ptr2int/int2ptr pair that GVN produces could well be an addrcast. So you are proposing a change to the IR just to fix a deficiency in GVN, AFAIU. Unless I’m missing something, this is not a good motivation.

Hi @nlopes. Thanks for the review.
Main motivation for this change is to give the IR a way to represent a cast we know is a no-op without resorting to ptrtoint/inttoptr
With the proposed change, we do not want to introduce new illegal address space conversions that were not already present in the original program. The original code was not performing an address space cast, it was doing type punning and reinterpreting some bits as a different address space. Since the IR transform, like in GVN case, did a no-op reinterpretation of bits between addrspaces, We are just trying to optimize it by using a no-op bitcast between addrspaces.

Hi @efriedma-quic. Thanks for the review.
In GVN case, no-op reinterpretation of bits between addrspaces is done by introducing ptrtoint/inttoptr. It was doing type punning and reinterpreting some bits as a different address space. So, with introduction of bitcast between addrspaces in-place of ptrtoint/inttoptr pair, the code still is valid since the original IR itself allowed such reinterpretation. The existing transforms produce valid IR incase of ptrtoint/inttoptr. So valid IR should be generated with the new change aswell.

I think @efriedma-quic and @nlopes raise some good concerns. They’re perhaps not necessarily show stoppers, but clearly the proposed change needs to be very careful about the semantics, in particular in terms of pointer provenance.

Even if inttoptr(ptrtoint(ptr)) with different address spaces is a no-op in terms of the resulting assembly code, my understanding is that it may not be a no-op in terms of pointer provenance.

Example: ptr refers to an underlying object A in address space 1 and ptrtoint(ptr) is 0x120. inttoptr(0x120) in address space 2 may refer to a different underlying object B, and inttoptr ensures that the resulting pointer has a provenance that matches that underlying object.

By contrast, it’s unclear whether bitcast(ptr) to addrspace(2) would result in a pointer whose provenance matches object B. I believe the proposal needs to make a choice here.

If bitcast is defined to adjust provenance when necessary, my initial thought is that this would risk the kind of alias analysis bugs that @efriedma-quic was talking about.

If bitcast is defined to never adjust provenance, then that should be safe in terms of alias analysis. The flip side is that ptrtoint(inttoptr(ptr)) cannot necessarily be replaced by bitcast. Some additional information is required to check whether that’s valid, presumably part of the DataLayout.

2 Likes

But you can represent that no-op cast with addrcast. What’s the advantage of using bitcast?
Addrcast may interact with some heuristic as they may consider all addrcasts as having non-zero cost (unlike bitcast). But that is fixable; alas, people have long wanted to express invariants about non-zero addresspaces (e.g., is the null pointer dereferenceable in addrspace=1?, etc).

We don’t really want to support type punning. That is a can of worms for the alias analysis algorithm. It’s easy to write examples that get miscompiled if you assume type punning is valid.

Perhaps the point is that inttoptr(ptrtoint(ptr)) cannot be turned into an addrspacecast unconditionally, while OP’s hope is that it could be unconditionally turned into bitcast. That would be an advantage.

However, whether that’s actually true depends on what the semantics of that bitcast would be, as I’ve written in my previous comment.

But that’s self-imposed pain, so fixable. We can change GVN to create addrcast straight away. We don’t need to reverse engineer it back.
Type punning is illegal so we can make GVN produce whatever we want. Including poison (which is my inclination here to avoid confusions!). The code that originates those load/store with different addr spaces should be fixed instead IMHO.

Depends on the source language and doesn’t apply to LLVM IR, at least not in general? Perhaps this specific case runs into pointer provenance issues, but even that isn’t clear to me.

I agree that helping GVN understand when it can just produce an addrspacecast sounds like the cleaner path compared to enabling bitcasts between address spaces.

In this case we don’t even to go to the complicated case of provenance. Let’s just use the fact that addrcast may not be a no-op.
If that’s the case, then if you do a store and then load of ptrs with different address space, which operation (load or store) will do the addrcast? None, because the compiler doesn’t know the value wasn’t stored with the right addr space or that it will be loaded by a different addr space.

Hence, we can’t convert an addrcast into a load/store with type punning. That must be ilegal.

1 Like

In this case we don’t even to go to the complicated case of provenance. Let’s just use the fact that addrcast may not be a no-op.

Presumably @skc7 cares about well-formed programs, so for the purpose of discussion it’s reasonable to assume that the store of pointer-to-addrspace(A) followed by a load of pointer-to-addrspace(B) is well-defined LLVM IR. This ought to be the case when an addrspacecast from A to B is a no-op.

(And as a corollary, an unconditional transform of such a store/load sequence to poison sounds incorrect.)

[…] Hence, we can’t convert an addrcast into a load/store with type punning. That must be ilegal.

It is illegal as an unconditional transform, yes, but it seems legal to me if the addrspacecast is known to be a no-op. If it isn’t, there needs to be some more subtle reason that goes beyond the underlying bit pattern of the pointer.

1 Like

“addrspacecast may not be a no-op” is preventing from turning it into a store/load sequence unconditionally, but it’s not clear to me that this prevents someone who knows more about specific addrspacecast being no-op to emit IR with store/load?
Basically, LangRef says: " It can be a no-op cast or a complex value modification, depending on the target and the address space pair" ; so why can’t I use the known no-op property of a target and a given source/dest address space to emit such store/load?

I would say that such store/load implies (for the IR to be valid) that an addrspacecast in the same place must be a no-op.

(edit: my message crossed @nhaehnle’s latest post, I think we’re saying the same thing?)

1 Like

It’s not as simple as that.
Once you allow type punning, alias analysis needs to be made weaker for everyone.

See this example:

p = malloc()
store p, q

... (no addrspacecast)

v = load w, as=1  // w and q don't alias, but AA can't prove it

Can v alias with p? If type punning is not allowed, then it’s a trivial no, as there’s no addrspace cast in between. Otherwise, it may alias!
So, do you want to make AA weaker for everyone just to allow the convenience of type punning? It’s a big no from me!

In general, the more implicit stuff we allow the worse it is for analyses as they need to be made more powerful to recover from the convenience offered. In this case, it’s clear the semantics of the IR become weaker and we lose the opportunity to make some trivial reasonings.

Other than trivial code, won’t you always have the possibility that somewhere there is a sequence that load your pointer, addrspacecast it, and store it back before the load?
If AA can’t prove that p and w don’t alias, then it is also likely that you can’t prove that the sequence above does not exists and just relying on the address space on the load won’t be enough.

Isn’t the whole point of this for when addrspacecast isn’t a no-op cast? In that the two pointers have different semantics but the same machine type. If addrspacecast were a no-op then you could just use that to stand in for the bitcast, but if it’s not, you want to reinterpret the bits in the right address space without applying an addrspacecast that would mutate the value. As a stupid example, you could imagine addrspace(1) <-> addrspace(2) being a two’s complement operation, and end up with IR where you want to type pun ptr addrspace(1) to ptr addrspace(2), likely due to union shenanigans in the original C source. Type punning should reinterpret the raw bits, like a bitcast, but addrspacecast would negate the bits.

The point is this explicitly is not an addrspacecast and it doesn’t actually make sense to be talking about addrspacecast in the context of this change. It’s doing a raw reinterpret of bits, not something which may change the pointer value as addrspacecast. At no point would we want to try to turn this into an addrspacecast because that’s not what’s happening here.

Using bitcast here avoids the aliasing issues associated with using ptrtoint; the point is this bitcast preserves provenance unlike the current solution of introducing the cast pair.

If this was an invalid reinterpret based on the target usage of address spaces, the code was undefined to begin with. We don’t care how the pointer is loaded or stored later, since we’re only observing the pointer bits in this context. Target knowledge of the address spaces would only add additional mess. That is, knowing a given addrspacecast type pair is a no-op or not would just introduce a second code path here if we were to forcibly introduce an addrspacecast, and we would still need to fall back to the ptrtoint/inttoptr path we have now.

1 Like

I don’t follow how this changes aliasing properties. Different address spaces do not imply a lack of aliasing in general. From an aliasing perspective, this should be identical to the case where the source and destination address space are the same (which is the case where we already introduce a bitcast here today)

This doesn’t address my concerns at all. I’m not worried about the IR semantics, or GVN itself; I’m worried about other passes interpreting the result of GVN. For example, consider this code in alias analysis . It assumes that the operand of a bitcast points to the same location in memory as its result. That’s fairly easy to fix, of course, but how many other places need similar changes?

The point is this explicitly is not an addrspacecast and it doesn’t actually make sense to be talking about addrspacecast in the context of this change. […] Using bitcast here avoids the aliasing issues associated with using ptrtoint; the point is this bitcast preserves provenance unlike the current solution of introducing the cast pair.

Are you really saying that the current transform of introducing inttoptr(ptrtoint(p)) is incorrect because it loses provenance? And so the point of the RFC is a bug fix?

If that is the true motivation for this RFC, then it might make sense to restart the entire discussion with an RFCv2 in which the Motivation section is crystal clear about that.

Otherwise, it’s still just not really clear what any of this solves (and an RFCv2 with a clearer Motivation section could also help).