Non-void zero pointers and -Wnull-pointer-subtraction

Hello,

I am mostly foggy on some semantics of -Wnull-pointer-subtraction and looking
for a bit of clarity.

Having just tried 13.0.1 on a project, I noticed the following expression
triggering a null pointer subtraction warning:

foo-((A*)0)

where A is different from void. Being a bit naive to this particular UB, a
little digging led me to the C11 standard (draft), which clearly lays out the
behaviour of pointer subtraction. However, it also defines a “null pointer” as
follows [0]:

An integer constant expression with the value 0, or such an expression cast
to type void *, is called a null pointer constant.

This, in isolation, would seem to imply that “null pointer” doesn’t include
integer constant expressions cast to some non-void pointer. Am I simply missing
something, or is this a legitimate reading?

Trying to grok this a little more, I poked around the clang source for a bit of
enlightenment. It appears the diagnostic is generated in
clang/lib/Sema/SemaExpr.cpp at Sema::CheckSubtractionOperands, where the
relevant block looks like:

  bool LHSIsNullPtr = LHS.get()->IgnoreParenCasts()->isNullPointerConstant(
      Context, Expr::NPC_ValueDependentIsNotNull);
  bool RHSIsNullPtr = RHS.get()->IgnoreParenCasts()->isNullPointerConstant(
      Context, Expr::NPC_ValueDependentIsNotNull);

  // Subtracting nullptr or from nullptr is suspect
  if (LHSIsNullPtr)
    diagnoseSubtractionOnNullPointer(*this, Loc, LHS.get(), RHSIsNullPtr);
  if (RHSIsNullPtr)
    diagnoseSubtractionOnNullPointer(*this, Loc, RHS.get(), LHSIsNullPtr);

which git blame attributes to commit 9cb00b9ecbe74d19389a5818d61ddee328afe031
by @jamie. So it looks like the code is explicitly ignoring casts of zero
constant expressions. This seems like a stronger interpretation of the
standard. Would you mind shedding some light on the motivations? I must be
missing something.

P.S.
As for why null pointer subtraction is UB, is this because “pointer” and
“memory location” are not the same thing?

Since pointer subtraction is defined [1] to be mostly the “difference of array
indices”, it raises the question of whether Q-P is even meaningful in the case
that Q and P are not both part of the same array (or one past the end).

However, on a system with a flat memory model, it seems like we could naively
generalize the calculation to ((intptr_t)Q - (intptr_t)P)/sizeof(*Q) or
similar. Granted, on segmented architectures or systems where “memory offset”
and “memory location” have different sizes, this seems fishy.
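
A rough sketch of that flat-model calculation, for illustration only. The helper names flat_distance and flat_matches_standard are made up here, and the code assumes an implementation that provides intptr_t and converts pointers to integers as plain byte addresses (this conversion is implementation-defined). Within a single array, where the standard subtraction is defined, the two results should agree on such a platform:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: element distance computed from integer addresses.
 * Assumes a flat address space; the pointer-to-integer conversion is
 * implementation-defined, so this is not portable C. */
static ptrdiff_t flat_distance(const int *q, const int *p)
{
    /* Divide by the element size, sizeof(*q), not the pointer size. */
    return (ptrdiff_t)(((intptr_t)q - (intptr_t)p) / (intptr_t)sizeof(*q));
}

/* Compares the flat-model result with standard pointer subtraction
 * for two pointers into the same array (the well-defined case). */
static int flat_matches_standard(void)
{
    int arr[8] = {0};
    const int *p = &arr[2];
    const int *q = &arr[7];

    assert(q - p == 5);               /* standard subtraction: index difference */
    return flat_distance(q, p) == q - p;
}
```

The comparison is only made for pointers into one array, since that is the only case where Q-P itself has defined behaviour to compare against.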

Furthermore, I am aware that some old, cranky architectures actually represent
“null pointer” by some memory location different from 0, so it seems like
(void *)0 might have the leeway to resolve into some, potentially
instruction-dependent, non-zero value in the instruction operand.

Are the above thoughts vaguely in the right direction?


The next sentence of the C11 spec is:

If a null pointer constant is converted to a pointer type, the resulting pointer, called a null pointer, is guaranteed to compare unequal to a pointer to any object or function.

Plus the following paragraph is:

Conversion of a null pointer to another pointer type yields a null pointer of that type. Any two null pointers shall compare equal.

So such an expression is not a null pointer constant, but is still a null pointer.
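
A small sketch of that distinction, assuming a made-up struct A and made-up function names. The expression (struct A *)0 is not a null pointer constant, yet the pointer it produces is a null pointer: it compares equal to NULL, stays null when converted to another pointer type, and compares unequal to a pointer to any object.

```c
#include <assert.h>
#include <stddef.h>

struct A { int x; };   /* hypothetical non-void type, as in foo-((A*)0) */

/* Returns 1 if a zero constant cast to a non-void pointer type
 * behaves as a null pointer in the ways the quoted text requires. */
static int cast_zero_is_null(void)
{
    struct A obj = {0};
    struct A *p = (struct A *)0;  /* null pointer, not a null pointer constant */
    void *q = p;                  /* converting a null pointer yields a null pointer */

    return p == NULL              /* any two null pointers compare equal */
        && q == NULL
        && p != &obj;             /* unequal to a pointer to any object */
}

/* Self-check that can be called from a test or from main(). */
static void check_null_pointer_rules(void)
{
    assert(cast_zero_is_null());
}
```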


Regarding the subtraction, both the C and C++ standards indicate that it is invalid, with differing levels of severity. The C standard states “When two pointers are subtracted, both shall point to elements of the same array object, or one past the last element of the array object”. Since a null pointer “is guaranteed to compare unequal to a pointer to any object or function”, a null pointer cannot point to an element of an array; thus, the subtraction is actually illegal in C (by my reading). The C++ standard is worded differently. About such an expression it states: “the result is the null pointer value of that type and is distinguishable from every other value of object pointer or function pointer type.” It then describes the subtraction as UB: “Unless both pointers point to elements of the same array object, or one past the last element of the array object, the behavior is undefined.”
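
For contrast, here is a minimal sketch of the case the quoted wording does allow: both pointers refer to elements of the same array, or to one past its last element. The function names are invented for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Number of elements in the half-open range [first, last).
 * Defined only when both pointers point into the same array
 * object, or one past its last element. */
static ptrdiff_t count_elements(const int *first, const int *last)
{
    return last - first;
}

/* Uses the one-past-the-end pointer, which may take part in
 * subtraction even though it must not be dereferenced. */
static ptrdiff_t whole_array_length(void)
{
    int arr[4] = {1, 2, 3, 4};
    return count_elements(arr, arr + 4);
}

/* Self-check that can be called from a test or from main(). */
static void check_same_array_subtraction(void)
{
    int arr[6] = {0};
    assert(count_elements(&arr[1], &arr[5]) == 4);
    assert(whole_array_length() == 4);
}
```

Passing a null pointer as either argument would fall outside that rule, which is the situation the warning is pointing at.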

@jrtc27 Doh. It was right there beneath my nose. Thank you.

@Jamie Thank you. I actually linked to that exact paragraph in the P.S. section.

… pointer subtraction is defined [1]

Would you have any insight into the reasons/motivations behind such a definition? Am I on the right track in the P.S. section?

I think you are. The original point of the official C standard, as I understood it, was to ensure consistency and portability across platforms without undue constraints because, as you point out, there can be different memory models and a null pointer need not necessarily be zero. “Undefined behaviour” is defined in section 3.4.3 as “behavior, upon use of a nonportable or erroneous program construct or of erroneous data, for which this International Standard imposes no requirements.” UB may work on one platform (or compiler), by design or by fluke, but it is non-standard and one cannot expect it to work elsewhere. The rules for subtraction are designed to keep pointer arithmetic useful and portable. Subtracting a null pointer, or subtracting from one, will almost always (note the “almost” to avoid someone saying “what if…”) represent a bug, hence the warning.
