Clarifying the semantics of ptrtoint

I believe that this discussion has clarified the semantics of ptrtoint, and we have introduced ptrtoaddr with different semantics, to cover the use cases where ptrtoint is not appropriate.

However, I don’t think we reached a consensus on what the semantics of icmp are supposed to be, when it comes to pointers where pointer width != address width.

I think that the three possible semantics here are:

  1. icmp always works on the full width of the pointer.
  2. icmp only works on the low address bits of the pointer.
  3. icmp returns poison if the non-address bits differ.
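
To make the difference concrete, here is a rough model of the three options for an equality comparison. It is only a sketch: the 128-bit pointer with a 64-bit address is an assumed CHERI-like layout, pointers are modelled as plain integers with the metadata in the high bits, and `POISON`, `make_ptr` and the `icmp_eq_*` names are made up for illustration, not anything from LLVM.

```python
ADDR_BITS = 64                      # assumed address width
PTR_BITS = 128                      # assumed pointer width (CHERI-like)
ADDR_MASK = (1 << ADDR_BITS) - 1

POISON = "poison"                   # stand-in for an LLVM poison result


def make_ptr(meta: int, addr: int) -> int:
    """Pack hypothetical metadata bits above the address bits."""
    return (meta << ADDR_BITS) | (addr & ADDR_MASK)


def icmp_eq_full(a: int, b: int):
    """Option 1: compare the full pointer width."""
    return a == b


def icmp_eq_addr(a: int, b: int):
    """Option 2: compare only the low address bits."""
    return (a & ADDR_MASK) == (b & ADDR_MASK)


def icmp_eq_poison(a: int, b: int):
    """Option 3: poison if the non-address bits differ."""
    if (a >> ADDR_BITS) != (b >> ADDR_BITS):
        return POISON
    return (a & ADDR_MASK) == (b & ADDR_MASK)


# Same address, different metadata.
p = make_ptr(0x1, 0x1000)
q = make_ptr(0x2, 0x1000)

results = (icmp_eq_full(p, q), icmp_eq_addr(p, q), icmp_eq_poison(p, q))
print(results)  # (False, True, 'poison')
```

The three options only disagree when the addresses match and the metadata does not; for pointers with identical metadata they all give the same answer.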

I believe our current LangRef wording implies the first option (icmp working on full pointer), but we should probably more explicitly consider this now that CHERI upstreaming is in progress.

I don’t think it’s possible to pick option 3, because that would make null pointer comparisons not work (a null pointer and a valid pointer would presumably have different metadata bits, so the comparison would be poison). More generally, even in C/C++ equality comparisons across objects are not UB. Making different metadata bits poison is only potentially viable for relational comparisons, and I don’t think we’ll want the semantics of equality and relational comparisons to diverge.
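
A small sketch of the null-check problem with option 3, under the same kind of toy model: pointers are integers with hypothetical metadata bits above an assumed 64-bit address, null is assumed to be all zeroes, and the metadata value `0xABC` is arbitrary. None of these names come from LLVM.

```python
ADDR_BITS = 64                      # assumed address width
ADDR_MASK = (1 << ADDR_BITS) - 1
POISON = "poison"                   # stand-in for an LLVM poison result


def icmp_eq_option3(a: int, b: int):
    """Equality under option 3: poison if the non-address bits differ."""
    if (a >> ADDR_BITS) != (b >> ADDR_BITS):
        return POISON
    return (a & ADDR_MASK) == (b & ADDR_MASK)


NULL = 0                                   # assumed: null has no metadata
valid = (0xABC << ADDR_BITS) | 0x1000      # valid pointer with some metadata

# The ordinary `p == null` check no longer yields a usable boolean.
is_null = icmp_eq_option3(valid, NULL)
print(is_null)  # poison
```

Under this model every null check on a pointer that carries metadata would be poison, which is exactly the case such checks exist for.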

How do existing CHERI implementations interpret icmp ptr?