InstCombine introduces inttoptr, violating GC assumptions


The following post is mostly relevant for GC statepoints.

InstCombine canonicalizes loads whose value is only stored so that they operate on integers. It also does simple load CSE, introducing a cast if the types don't match exactly. For example, assuming %a and %b don't alias:

   %load1 = load i8 addrspace(1)*, i8 addrspace(1)* addrspace(1)* %a
   store i8 addrspace(1)* %load1, i8 addrspace(1)* addrspace(1)* %b

   %load2 = load i8 addrspace(1)*, i8 addrspace(1)* addrspace(1)* %a
   call void @use(i8 addrspace(1)* %load2)

is transformed into:

   %1 = bitcast i8 addrspace(1)* addrspace(1)* %a to i64 addrspace(1)*
   %load11 = load i64, i64 addrspace(1)* %1, align 8
   %2 = bitcast i8 addrspace(1)* addrspace(1)* %b to i64 addrspace(1)*
   store i64 %load11, i64 addrspace(1)* %2, align 8

   %load2.cast = inttoptr i64 %load11 to i8 addrspace(1)*
   call void @use(i8 addrspace(1)* %load2.cast)

The first problem is that, in the presence of a relocating GC, the store could write the wrong value: if the object whose address was loaded from %a is relocated during "...", RewriteStatepointsForGC does not know that it has to adjust the value of %load11, which is now a plain i64. As a solution, the code doing the canonicalization could ask the GC strategy associated with the function whether this transformation is allowed. Is it possible to get the GC strategy in InstCombine? I remember that it was difficult, and that there was a plan to move the ownership of the GC strategies to LLVMContext to make it easier.
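
To make the first problem concrete, here is a sketch of the transformed code, assuming that the elided "..." contains a call that can act as a safepoint. The callee @may_gc is hypothetical and only stands in for whatever sits between the load and the store:

   %1 = bitcast i8 addrspace(1)* addrspace(1)* %a to i64 addrspace(1)*
   %load11 = load i64, i64 addrspace(1)* %1, align 8

   ; hypothetical safepoint: a relocating collector may move the object here
   call void @may_gc()

   ; %load11 is an i64, not a GC pointer, so it is not relocated
   ; and may still hold the old address of the object
   %2 = bitcast i8 addrspace(1)* addrspace(1)* %b to i64 addrspace(1)*
   store i64 %load11, i64 addrspace(1)* %2, align 8

In the original code, %load1 has type i8 addrspace(1)* and is live across the call, so it would be treated as a GC pointer and relocated.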

Even for non-relocating garbage collectors, the canonicalization creates a problem in combination with the load CSE: in the example, an inttoptr is created, on which RewriteStatepointsForGC crashes. To fix that, the canonicalization could also be forbidden for non-relocating garbage collectors. Alternatively, RewriteStatepointsForGC could be made smart enough to look through the inttoptr.
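
For comparison, this is roughly what I would expect from the original example if the canonicalization were suppressed and only the load CSE ran. It is a sketch, not actual InstCombine output. The types of the two loads match, so no cast is needed:

   %load1 = load i8 addrspace(1)*, i8 addrspace(1)* addrspace(1)* %a
   store i8 addrspace(1)* %load1, i8 addrspace(1)* addrspace(1)* %b

   ; %load2 is replaced by %load1; no inttoptr is introduced
   call void @use(i8 addrspace(1)* %load1)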