possible addrspacecast problem

Given a pointer, does an addrspacecast affect the pointer's
dereferenceability? For example,

%pm = addrspacecast float addrspace(n)* %pn to float addrspace(m)*
%r = load float, float addrspace(m)* %pm

In other words, the question is whether the following is true:

isDereferenceablePointer(pn) == isDereferenceablePointer(pm)

[Note that the function is defined as
Value.cpp:isDereferenceablePointer(const Value*, …) ]

The reason I am asking is that current LLVM thinks the answer is yes (they
are the same). But in the following case, that assumption causes a
non-speculatable load to be hoisted out of its guard. For example (assume
that isDereferenceablePointer(p) is true),

The following code:

%pm = addrspacecast float* %p to float addrspace(m)*
if (p is in addrspace(m))
// BB_then
%r0 = load float, float addrspace(m)* %pm;
else // p is in default addrspace
// BB_else
%r1 = load float, float* %p;
%r = phi float [%r0, BB_then], [%r1, BB_else]

will be transformed to

%pm = addrspacecast float* %p to float addrspace(m)*
%r0 = load float, float addrspace(m)* %pm;
%r1 = load float, float* %p;
%r = select i1 (p is in addrspace(m)), float %r0, float %r1

which is wrong, as "%r0 = load float, float addrspace(m)* %pm" would be
illegal when (p is in addrspace(m)) is false.

If we agree that the answer to the above question is no, then the fix is
simple (the fix diff is attached). I also have a small test that shows the
problem: SimplifyCFG actually does the above transformation (to reproduce,
use "opt -simplifycfg -S test3.ll -o tt.ll").
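
For reference, here is a minimal sketch of the kind of pattern involved. It is a hypothetical reconstruction, not the attached test3.ll: it assumes the pointer comes from an alloca (so it is known dereferenceable), and the address space 1 and the i1 guard parameter are made up for illustration.

; Hypothetical sketch of the guarded-load pattern (not the attached test3.ll).
; %p comes from an alloca, so isDereferenceablePointer(%p) is true.
define float @guarded_load(i1 %in_as1) {
entry:
  %p = alloca float
  %pm = addrspacecast float* %p to float addrspace(1)*
  br i1 %in_as1, label %then, label %else

then:                                   ; load through the cast pointer
  %r0 = load float, float addrspace(1)* %pm
  br label %merge

else:                                   ; load through the original pointer
  %r1 = load float, float* %p
  br label %merge

merge:
  %r = phi float [ %r0, %then ], [ %r1, %else ]
  ret float %r
}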

Any suggestions/comments?

thanks

Junjie

fix.diff (603 Bytes)

test3.ll (1.54 KB)

The LangRef says this about addrspacecast: "Note that if the address
space conversion is legal then both result and operand refer to the
same memory location." This raises two related questions:

1. what happens if you execute an "invalid" addrspacecast? Does this
    axiom make addrspacecast non-hoistable? From a quick glance at
    the code, ValueTracking seems to assume that addrspacecasts can be
    speculatively executed. (A small sketch of this concern follows the list.)

2. even if two pointers from two different address spaces point to
    the same "location", can one of them not be dereferenceable while
    the other is? In other words, does dereferenceability depend on
    the addrspace of the pointer given that the "location" is
    dereferenceable?
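
To make question 1 concrete, here is a small hypothetical sketch (the address spaces and the @use callee are made up). If addrspacecast is considered safe to speculate, nothing in principle stops a pass from hoisting the cast into %entry, ahead of the guard that establishes its legality:

declare void @use(float addrspace(2)*)

; Hypothetical: the cast is only known to be "legal" on the taken path.
define void @guarded_cast(float addrspace(1)* %p, i1 %ok) {
entry:
  br i1 %ok, label %then, label %exit

then:
  %q = addrspacecast float addrspace(1)* %p to float addrspace(2)*
  call void @use(float addrspace(2)* %q)
  br label %exit

exit:
  ret void
}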

I think the pointer should still be dereferenceable once cast, but since the testcase can be predicated on the address space, your testcase looks broken. Have you tried making addrspacecast not isSafeToSpeculativelyExecute?

%pm = addrspacecast float addrspace(n)* %pn to float addrspace(m)*

If address spaces n and m refer to the same memory, then isDereferenceablePointer(pm) is the same as isDereferenceablePointer(pn). This would be true for OpenCL on a CPU. However, if address spaces n and m are disjoint (as on some GPUs), %pn, which is in address space n, would be undefined in address space m. Once addrspacecast'ed, the result cannot derive its dereferenceability from the original pointer. My original code segment shows this as well.

My test case arranges for isDereferenceablePointer(pn) to be true (the pointer is created by an alloca on the stack, which is always dereferenceable). In this way, isDereferenceablePointer(pm) will also be true, since the current LLVM implementation assumes isDereferenceablePointer(pm) == isDereferenceablePointer(pn). The test case might not be very good code, but it is correct.

“Have you tried making addrspacecast not isSafeToSpeculativelyExecute?”
isSafeToSpeculativelyExecute() invokes isDereferenceablePointer() to decide whether the load can be speculated. So my patch will make isSafeToSpeculativelyExecute() return false for this case.

“but since the testcase can be predicated on the address space your testcase looks broken”
Could you elaborate on this ?

In our architecture, address space casts will never trap and are deterministic (unless there is some asm with unmodelled side effects). However, there are pointers that can be dereferenced in one address space but not another: speculatively performing the load would be a real bug.

Ideally, this would be a target-specific decision, with the default being not to propagate the dereferenceability information and with targets able to opt in.

I've just been debugging a related issue with regard to commutativity of address space casts with respect to other operations. One of the optimisers is turning an add after an address space cast into an add before the address space cast. On our architecture, this results in different bounds information being available on the pointer and a run-time crash. We could represent casts with a target-specific intrinsic, but we'd rather avoid that if possible. This is not fixed by changing isSafeToSpeculativelyExecute to return false for AddrSpaceCast, so I'll need to dig a bit more to understand exactly when and why it happens.

I think there's a lot of code that still assumes that address space casts are basically bitcasts.

David

In our architecture, address space casts will never trap and are
deterministic (unless there is some asm with unmodelled side effects).
However, there are pointers that can be dereferenced in one address space
but not another: speculatively performing the load would be a real bug.

Ideally, this would be a target-specific decision, with the default being
not to propagate the dereferenceability information and with targets able
to opt in.

Yes.

I've just been debugging a related issue with regard to commutativity of
address space casts with respect to other operations. One of the
optimisers is turning an add after an address space cast into an add before
the address space cast. On our architecture, this results in different
bounds information being available on the pointer and a run-time crash. We
could represent casts with a target-specific intrinsic, but we'd rather
avoid that if possible. This is not fixed by changing
isSafeToSpeculativelyExecute to return false for AddrSpaceCast, so I'll
need to dig a bit more to understand exactly when and why it happens.

I'd like to know what is causing this.

Thanks
Junjie

I should start with a disclaimer that I haven't pulled in upstream LLVM for a couple of months, so this might already be fixed...

After SROA, we're left with this:

; Function Attrs: nounwind
define i32 @main() #0 {
entry:
  %call = call i8* @malloc(i64 zeroext 168)
  %0 = bitcast i8* %call to i32*
  %1 = addrspacecast i32* %0 to i32 addrspace(200)*
  call void @set(i32 addrspace(200)* %1)
  %add.ptr = getelementptr inbounds i32 addrspace(200)* %1, i64 41
  call void @test(i32 addrspace(200)* %add.ptr)
  ret i32 0
}

This is correct and semantically valid for our architecture. The cast to AS200 creates a fat pointer with base and bounds information, and the GEP constructs a new fat pointer with the same base and bounds but a different offset within the object. This survives for a bit, but then:

*** IR Dump After Combine redundant instructions ***
; Function Attrs: nounwind
define i32 @main() #0 {
entry:
  %call = call i8* @malloc(i64 zeroext 168) #2
  %0 = bitcast i8* %call to i32*
  %1 = addrspacecast i32* %0 to i32 addrspace(200)*
  call void @set(i32 addrspace(200)* %1) #2
  %add.ptr = getelementptr inbounds i8* %call, i64 164
  %2 = bitcast i8* %add.ptr to i32*
  %3 = addrspacecast i32* %2 to i32 addrspace(200)*
  call void @test(i32 addrspace(200)* %3) #2
  ret i32 0
}

So it looks as if InstCombine is deciding that it can hoist the GEP above the addrspacecast. Unfortunately, narrowing it down to InstCombine doesn't actually narrow it down very much - I'll keep looking...

David

It looks like, when the AddrSpaceCast stuff went in, all of the Bitcast code paths in InstCombine were simply extended to also do address space casts. I believe that this is the wrong approach: at the very least, targets should be allowed to decide whether address space casts are symmetric, and ideally we'd simply define that they're not required to be unless we have strong evidence that we're killing some useful optimisations. We'll change this and add some tests in our version, but I'm happy to prepare a test for upstreaming if others agree with this assessment.
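
For illustration, here is a rough sketch of the sort of regression test this could turn into. The function name and the CHECK lines are hypothetical; the point is only that the GEP should stay on the addrspace(200) side of the cast rather than being commuted with it:

; RUN: opt -instcombine -S %s | FileCheck %s
; Hypothetical check: InstCombine should not rewrite the GEP to index the
; plain i32* pointer and re-cast; the GEP should keep indexing the
; addrspace(200) pointer.
define i32 addrspace(200)* @no_gep_reorder(i32* %p) {
  %a = addrspacecast i32* %p to i32 addrspace(200)*
  %g = getelementptr inbounds i32 addrspace(200)* %a, i64 41
  ret i32 addrspace(200)* %g
}
; CHECK-LABEL: @no_gep_reorder(
; CHECK: addrspacecast i32* %p to i32 addrspace(200)*
; CHECK: getelementptr inbounds i32 addrspace(200)*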

David

but I'm happy to prepare a test for upstreaming if others agree with this assessment.

I think it is very valuable to have that upstreamed. I think we
should also add other general test cases that show what we definitely
cannot do with addrspacecasts.

-- Sanjoy

Here's the commit in our tree:

https://github.com/CTSRD-CHERI/llvm/commit/751a6c89afd47ee8e7f0c6c3ec3ef7bc1f057806

I'll rebase on the latest LLVM soon and upload a diff to phabricator, but I'd appreciate early feedback if anyone wants to take a look. In particular, we seem to have tests that explicitly check for the old behaviour, but have no comments indicating why. If anyone depends on GEPs being reordered across AS casts then now would be a good time to speak up...

David