Any Optimization Suggestion to Get Rid of AddrSpaceCast around PHI

In the following example, for some reason the input pointer entering the loop has been cast to a generic pointer. How can the backend get rid of the
addrspacecast and use a local store in the loop?

for.body.lr.ph:                                   ; preds = %entry
  %0 = addrspacecast i32 addrspace(3)* %in to i32 addrspace(4)*
  br label %for.body

for.body:                                         ; preds = %for.body, %for.body.lr.ph
  %i.03 = phi i32 [ 0, %for.body.lr.ph ], [ %inc, %for.body ]
  %ptr.02 = phi i32 addrspace(4)* [ %0, %for.body.lr.ph ], [ %add.ptr, %for.body ]
  store i32 %i.03, i32 addrspace(4)* %ptr.02, align 4
  %add.ptr = getelementptr inbounds i32 addrspace(4)* %ptr.02, i64 4
  %inc = add i32 %i.03, 1
  %exitcond = icmp eq i32 %inc, %numElems
  br i1 %exitcond, label %for.end, label %for.body

for.end:                                          ; preds = %

Thanks;

Changpeng

It's fairly simple to find all of the casts and then walk down the use chains, identifying whether the pointers escape, and if they don't, rewrite all of the instructions to use the other address space. I've done this in an (extremely hacky) pass to allow allocas to be in an address space other than 0 (which I'm slowly replacing with much less hacky code).

As I understand it, for your target, stores in AS 4 are more expensive than stores in AS 3, but both can refer to the same memory?

David

> It’s fairly simple to find all of the casts and then walk down the use chains, identifying whether the pointers escape, and if they don’t, rewrite all of the instructions to use the other address space. I’ve done this in an (extremely hacky) pass to allow allocas to be in an address space other than 0 (which I’m slowly replacing with much less hacky code).

But I got lost when a PHI node (as in the test case) is encountered.

> As I understand it, for your target, stores in AS 4 are more expensive than stores in AS 3, but both can refer to the same memory?

Yes. Loads and stores in the generic address space are expensive. The casts also add a small overhead to the address computation.

Thanks;

Changpeng

When a PHI node is encountered, the easiest thing to do is add a second one with the correct type. Keep a list of them, and whenever the original phi shows up as a user of something that you're rewriting, add the value that you're visiting to the new PHI. If, at the end, you have any missing entries in the new phi node, then you visit the corresponding predecessor block, insert an address space cast of the value that the old phi expected, and add that cast as the incoming value.

For most things, you can just use replaceUsesOfWith and walk forward along the use chain through phi nodes, ptrtoint-arithmetic-inttoptr sequences, and GEPs.

David
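
For reference, here is a sketch of what the loop from the original question might look like once the rewrite described in this thread has been applied. It assumes the pointer does not escape and that, on the target, an AS 3 access reaches the same memory as the AS 4 access it replaces. The PHI, the store and the GEP are retyped to addrspace(3), and the addrspacecast then has no remaining users and can be deleted. The syntax matches the typed-pointer IR used in the post; the entry and for.end blocks are unchanged and left out, as in the original.

```llvm
for.body.lr.ph:                                   ; preds = %entry
  ; the addrspacecast is gone: nothing uses the generic pointer any more
  br label %for.body

for.body:                                         ; preds = %for.body, %for.body.lr.ph
  %i.03 = phi i32 [ 0, %for.body.lr.ph ], [ %inc, %for.body ]
  ; new PHI in addrspace(3): the incoming value from the preheader is %in itself
  %ptr.02 = phi i32 addrspace(3)* [ %in, %for.body.lr.ph ], [ %add.ptr, %for.body ]
  ; the store now goes directly to local memory
  store i32 %i.03, i32 addrspace(3)* %ptr.02, align 4
  ; the GEP is retyped along the use chain of the new PHI
  %add.ptr = getelementptr inbounds i32 addrspace(3)* %ptr.02, i64 4
  %inc = add i32 %i.03, 1
  %exitcond = icmp eq i32 %inc, %numElems
  br i1 %exitcond, label %for.end, label %for.body
```

In this case every incoming value of the new PHI is filled in during the walk (the preheader entry from the cast's operand, the back-edge entry from the rewritten GEP), so no extra cast has to be inserted in a predecessor block.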