[llvm-commits] Proposal: New IR instruction for casting between address spaces

Problem:
Bit casting between pointers of different address spaces only works if the pointers in all address spaces are the same size. With the changes from email chain [1][2], support for different pointer sizes breaks the bitcast instruction, since there is no guarantee that the pointers in the source and destination address spaces are the same size.

We also have this problem for our architecture. We're currently doing the ptrtoint / inttoptr dance with lots of hacks and it's very ugly.

Our tree is here, and I'd be happy to push any changes upstream:

https://github.com/CTSRD-CHERI/llvm/

Solution:
Remove the ability of bitcast to cast between pointers of different address spaces and replace it with an instruction that handles this case explicitly.

Proposed changes:

* Add restriction to the verifier on the bitcast instruction making bitcasting between address spaces illegal.

Sounds good.

* Change documentation[3] to state that a bitcast between pointers of different address spaces is illegal.

Great.

* Add in a new IR node, addrspacecast, that allows conversions between address spaces

That would simplify some of our hacks in clang significantly.
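As a rough sketch of the verifier rule proposed above, here is a small standalone C++ model. It does not use any LLVM API: PtrType, CastKind and isLegalPointerCast are hypothetical names invented for illustration, and the requirement that addrspacecast is only used when the address space actually changes is an assumption, not something the proposal states.

```cpp
#include <cassert>

// Hypothetical stand-in for a pointer type; not an LLVM class.
struct PtrType {
    unsigned addrSpace;   // address space number
    unsigned sizeInBits;  // pointer width in that address space
};

// The two cast instructions under discussion.
enum class CastKind { BitCast, AddrSpaceCast };

// Proposed rule: a bitcast between pointers must stay within one address
// space. Assumption for this sketch: addrspacecast is the only legal way to
// change the address space, and is not used when nothing changes.
constexpr bool isLegalPointerCast(CastKind kind, PtrType src, PtrType dst) {
    return kind == CastKind::BitCast ? src.addrSpace == dst.addrSpace
                                     : src.addrSpace != dst.addrSpace;
}

// Same address space: bitcast is still fine.
static_assert(isLegalPointerCast(CastKind::BitCast, PtrType{0, 64}, PtrType{0, 64}),
              "bitcast within an address space stays legal");
// Different address spaces (and sizes): bitcast is rejected.
static_assert(!isLegalPointerCast(CastKind::BitCast, PtrType{0, 64}, PtrType{1, 32}),
              "bitcast across address spaces is rejected");
```

Under this model the verifier never has to compare pointer sizes for a bitcast, because both sides share an address space and therefore a size.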

There are a few things missing from your proposal, however. The most obvious is supporting different sizes of pointers in LLVM. There are a huge number of places in LLVM where the assumption that all pointers have the same size exists (I've proposed to give a talk about this at the DevMeeting in November). The first is the target description. Beyond that, LLVM appears to have three constant folders that all need to be taught that inttoptr / ptrtoint pairs can't be replaced with bitcasts unless the pointers are in the same address space. I've done this in our tree.
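The folding condition can be modelled in a few lines of standalone C++. This is only an illustration, not the code from any LLVM constant folder: PtrType, Fold and foldIntToPtrOfPtrToInt are hypothetical names, and the extra check that the intermediate integer is at least as wide as the pointer is an added assumption (a narrower integer would truncate the value, so the pair would not be a no-op).

```cpp
#include <cassert>

// Hypothetical stand-in for a pointer type; not an LLVM class.
struct PtrType {
    unsigned addrSpace;   // address space number
    unsigned sizeInBits;  // pointer width in that address space
};

// Outcome of trying to simplify inttoptr(ptrtoint(p)).
enum class Fold { None, ToBitCast };

// src is the type of p, intBits the width of the intermediate integer,
// dst the type produced by the inttoptr.
constexpr Fold foldIntToPtrOfPtrToInt(PtrType src, unsigned intBits, PtrType dst) {
    // Only fold when both pointers live in the same address space.
    // Assumption: also require that the integer does not truncate the pointer.
    return (src.addrSpace == dst.addrSpace && intBits >= src.sizeInBits)
               ? Fold::ToBitCast
               : Fold::None;
}

// Same address space, wide enough integer: the pair collapses to a bitcast.
static_assert(foldIntToPtrOfPtrToInt(PtrType{0, 64}, 64, PtrType{0, 64}) == Fold::ToBitCast,
              "same address space folds");
// Different address spaces: the pair must be left alone.
static_assert(foldIntToPtrOfPtrToInt(PtrType{0, 64}, 64, PtrType{1, 32}) == Fold::None,
              "different address spaces do not fold");
```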

Finally, don't forget that you'd also need to add an address space cast instruction to the SelectionDAG. I added PTRTOINT and INTTOPTR ones in our tree because I didn't want to make the invasive change to the IR that was really needed.

We also needed to add an iFATPTR to the ValueTypes, but for the more general solution (the reason I haven't pushed ours upstream is that it's full of ugly hacks and FIXMEs) you'd need the possibility of multiple iPTR-like types with different sizes. Possibly just enumerating the plausible ones would work (ours are 256 bits, which LLVM really doesn't like). Things like the atomic operations are currently broken on pointers because they are lowered by some code that doesn't know how big iPTR is. The proposed solution to this was for the front end to lower them to integers for the atomic intrinsics, but this won't work for us because our architecture doesn't have i256 support, just 256-bit pointers (which are not just integers).
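To make the atomic-lowering problem concrete, here is a toy standalone C++ model of a target with per-address-space pointer widths. Everything in it is hypothetical: TargetModel, exampleTarget and integerWidthForPointerAtomic are invented names, the address space numbers and the 64-bit case are made up, and only the 256-bit pointer width and the lack of an i256 type come from the description above.

```cpp
#include <cassert>
#include <map>
#include <set>

// Hypothetical target description; not an LLVM data structure.
struct TargetModel {
    std::map<unsigned, unsigned> pointerBits;  // address space -> pointer width
    std::set<unsigned> legalIntBits;           // integer widths the target supports
};

// Made-up example: ordinary 64-bit pointers in address space 0 and
// 256-bit fat pointers in address space 1, with no 256-bit integer type.
inline TargetModel exampleTarget() {
    TargetModel t;
    t.pointerBits[0] = 64;
    t.pointerBits[1] = 256;
    t.legalIntBits = {8, 16, 32, 64};
    return t;
}

// Width of the integer type a front end could substitute for a pointer in an
// atomic operation, or 0 if no suitable integer type exists on the target.
inline unsigned integerWidthForPointerAtomic(const TargetModel &t, unsigned addrSpace) {
    auto it = t.pointerBits.find(addrSpace);
    if (it == t.pointerBits.end())
        return 0;  // unknown address space
    return t.legalIntBits.count(it->second) ? it->second : 0;
}
```

With exampleTarget(), address space 0 maps to a 64-bit integer, while address space 1 yields 0: there is no integer type to lower the 256-bit pointer to, which is the case where lowering in the front end breaks down.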

David