Clarifying the semantics of ptrtoint

nikic · September 30, 2025, 1:47pm

I believe that this discussion has clarified the semantics of ptrtoint, and we have introduced ptrtoaddr with different semantics, to cover the use cases where ptrtoint is not appropriate.

However, I don’t think we reached a consensus on what the semantics of icmp are supposed to be, when it comes to pointers where pointer width != address width.

I think that the three possible semantics here are:

1. icmp always works on the full width of the pointer.
2. icmp only works on the low address bits of the pointer.
3. icmp returns poison if the non-address bits differ.
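
To make the difference concrete, here is a rough Python model of the three options for an equality comparison. This is only a sketch: the `Ptr` type, its `addr`/`meta` split, and the use of `None` as a stand-in for poison are assumptions made for illustration, not how LLVM or any CHERI implementation represents pointers.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Ptr:
    """Hypothetical pointer whose width exceeds its address width."""
    addr: int  # the low address bits
    meta: int  # the non-address bits (e.g. capability metadata)


POISON = None  # stand-in for an LLVM poison value (assumption)


def icmp_eq_full(a: Ptr, b: Ptr) -> bool:
    """Option 1: compare the full width of the pointer."""
    return a.addr == b.addr and a.meta == b.meta


def icmp_eq_addr(a: Ptr, b: Ptr) -> bool:
    """Option 2: compare only the low address bits."""
    return a.addr == b.addr


def icmp_eq_poison(a: Ptr, b: Ptr) -> Optional[bool]:
    """Option 3: poison if the non-address bits differ."""
    if a.meta != b.meta:
        return POISON
    return a.addr == b.addr


# Same address, different non-address bits.
p = Ptr(addr=0x1000, meta=0xA)
q = Ptr(addr=0x1000, meta=0xB)

print(icmp_eq_full(p, q))    # False
print(icmp_eq_addr(p, q))    # True
print(icmp_eq_poison(p, q))  # None (poison)
```

The three options only disagree when the address bits match and the non-address bits differ, or, for option 3, whenever the non-address bits differ at all.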

I believe our current LangRef wording implies the first option (icmp working on the full pointer), but we should probably consider this more explicitly now that CHERI upstreaming is in progress.

I don’t think it’s possible to pick option 3, because that would make null pointer comparisons not work (as the null pointer would presumably have different metadata bits than the pointer it is compared against). More generally, even in C/C++ equality comparisons across objects are not UB. Making different metadata bits poison is only potentially viable for relational comparisons, and I don’t think we’ll want the semantics of equality and relational comparisons to diverge.
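
As a sketch of the null pointer problem, consider a toy Python model of option 3. Everything here is hypothetical: the `Ptr` tuple, the assumption that null carries all-zero metadata, and `None` standing in for poison.

```python
from typing import NamedTuple, Optional


class Ptr(NamedTuple):
    addr: int  # address bits
    meta: int  # non-address (metadata) bits


# Assumption for illustration: the null pointer has all-zero metadata.
NULL = Ptr(addr=0, meta=0)


def icmp_eq_option3(a: Ptr, b: Ptr) -> Optional[bool]:
    """Equality under option 3: None (poison) if the metadata bits differ."""
    if a.meta != b.meta:
        return None
    return a.addr == b.addr


def is_null(p: Ptr) -> Optional[bool]:
    """The usual `p == null` check, expressed with option 3 semantics."""
    return icmp_eq_option3(p, NULL)


# A pointer to a real object, with non-zero metadata.
valid = Ptr(addr=0x4000, meta=0x7)

print(is_null(valid))  # None: the null check itself is poison
print(is_null(NULL))   # True
```

Under these assumptions, a null check on any pointer with non-zero metadata yields poison instead of false, which is exactly the case where the check is needed.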

How do existing CHERI implementations interpret icmp ptr?
