[RFC] _Optional: a type qualifier to indicate pointer nullability

Thank you for putting together this RFC! For others following along: something along these same lines was previously proposed in "ISO C3X proposal: nonnull qualifier", which did not end with strong consensus but did have significant interest from folks in the community.

I have specific comments below, but a summary of my current position is: despite the importance of this topic, I don’t think it should be added to Clang at this time. I disagree with the syntactic choice made to put the qualifier on the pointee rather than the pointer. This design space is littered with attempts to solve the same problem and it’s not clear that this approach is going to be an improvement. The proposal is largely experimental in terms of design and needs a stronger indication that a standards body really believes in this design (in terms of syntactic choices, at the very least) and data demonstrating how this feature catches bugs that cannot be caught by other features in the same space (not just due to missing diagnostics that are possible to implement). The proposal also needs to work for all pointer types without requiring the user to jump through hoops (like using typedefs).

The existing features in this space that Clang supports are, at least:

  • [[gnu::nonnull]]
  • [[gnu::returns_nonnull]]
  • _Nonnull (and friends)
  • static array extents (C only)
  • references (C++ only)

The proposal touches on why the author believes some of these are deficient, but it doesn’t really talk about which ones are repairable in terms of diagnostic behavior or how the new functionality will interact with existing functionality. More examples comparing and contrasting the existing features with the proposed feature could perhaps help make this more clear.
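
For comparison, here is a rough sketch of how several of the existing spellings look on C declarations. All function names are hypothetical. The sketch uses the `__attribute__` spelling rather than `[[gnu::...]]` so it also builds in pre-C2x modes, and it assumes `_Nonnull` is only available in Clang, so that one is hidden behind a macro.

```c
#include <assert.h>
#include <stddef.h>

/* _Nonnull is assumed to be Clang-only; expand to nothing elsewhere so the
   sketch still compiles with other compilers. */
#if defined(__clang__)
#  define PTR_NONNULL _Nonnull
#else
#  define PTR_NONNULL
#endif

/* GNU attribute on the function: argument 1 must not be a null pointer. */
__attribute__((nonnull(1)))
int first_element(const int *values) {
  return values[0];
}

/* GNU attribute on the function: the returned pointer is never null. */
__attribute__((returns_nonnull))
const char *default_name(void) {
  return "unnamed";
}

/* Clang nullability qualifier, written on the pointer, not the pointee. */
int load_value(int *PTR_NONNULL value) {
  return *value;
}

/* Static array extent: the caller must supply at least two elements,
   so the argument cannot be a null pointer. */
int sum_first_two(const int values[static 2]) {
  return values[0] + values[1];
}

/* Usage: every call passes a valid pointer, as each contract requires. */
void check_existing_features(void) {
  int data[2] = {3, 4};
  assert(first_element(data) == 3);
  assert(default_name()[0] == 'u');
  assert(load_value(&data[1]) == 4);
  assert(sum_first_two(data) == 7);
}
```

Which of these produce a diagnostic when a null pointer is passed depends on the compiler and the enabled warnings, which is the kind of comparison the proposal could spell out.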

In terms of syntactic choices, as mentioned above, I disagree that the qualifier belongs on the pointee rather than the pointer. Losing the qualifier on lvalue conversion does not seem to be an issue, as after lvalue conversion the loaded value can never be changed. So once you’ve done an lvalue conversion to read the pointer value itself, optionality of the pointee is resolved (so dropping the qualifier should not lose any diagnostic fidelity).
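
To make the lvalue conversion point concrete, here is a minimal sketch in today's C with no new qualifier involved. The names `shared_slot` and `read_through_snapshot` are made up for illustration. Once the pointer value has been loaded into a local, a later store to the original object does not change the loaded value, so the null test on the loaded value is the only one that matters.

```c
#include <assert.h>
#include <stddef.h>

static int fallback_value = -1;
static int *shared_slot = NULL; /* the pointer object; may be reassigned */

int read_through_snapshot(void) {
  int *loaded = shared_slot; /* lvalue conversion: the pointer value is read here */
  shared_slot = NULL;        /* a later change to the original object... */
  if (loaded != NULL)        /* ...cannot affect the value already loaded */
    return *loaded;
  return fallback_value;
}

/* Usage: the first call sees the valid pointer, the second sees null. */
void demo_snapshot(void) {
  int value = 42;
  shared_slot = &value;
  assert(read_through_snapshot() == 42);
  assert(read_through_snapshot() == -1);
}
```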

There are also some details missing about how the feature plays with various languages. In Objective-C, object pointers are a bit special; should this qualifier be usable for them? In C++, if a member function is marked as being _Optional, what are the semantics? Separately, do you expect this qualifier to be mangled as part of a function signature? Can users in C++ overload functions (or in C with __attribute__((overloadable))) or create template specializations based on this qualifier? In C, how does type compatibility work? e.g., are these valid redeclarations? void func(int *i); void func(_Optional int *i) {} (Note, if these are not valid redeclarations, that’s another difference in behavior from other qualifiers in that position and is likely to cause confusion.) What is the ABI of passing an optional pointer and does it differ from the ABI of passing a regular pointer? What happens if one TU declares an extern pointer variable and another TU defines it as being an optional pointer (or vice versa)?

Despite my current feeling of “we should not add this at this time”, I think it’s useful to continue the discussion and iterate on the design to see if we can come to something that we think should be added at some point in the future.

This isn’t fully accurate. As you note below, static array extents are the way in which you signal that a pointer value cannot be null. However, adoption of static array extents in industry has been slow, common tooling often misses helpful diagnostics, the syntax is currently limited to only function interfaces, the syntax is awkward for void pointers, etc. But the syntax and desired semantics do exist in C already today.

The concerns around applying to parameters are a non-issue now that C has [[]] style syntax (and also given that Clang supports __attribute__ syntax on parameters directly). Also, intrusive and verbose are personal style concerns rather than technical issues with the functionality, so “I want this to be a keyword because I don’t like the way attributes look” is not very compelling, especially given how often attributes wind up hidden behind macros.
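
As a small illustration of both points, the attribute can sit directly on the parameter and be hidden behind a macro. `ARG_NONNULL` and `count_chars` are hypothetical names, and the sketch assumes only Clang accepts this parameter-level placement, so the macro expands to nothing for other compilers.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__clang__)
#  define ARG_NONNULL __attribute__((nonnull))
#else
#  define ARG_NONNULL /* assumed unsupported in this position elsewhere */
#endif

/* The annotation is attached to the parameter itself and reads as one word. */
size_t count_chars(const char *text ARG_NONNULL) {
  size_t length = 0;
  while (text[length] != '\0')
    ++length;
  return length;
}

/* Usage: callers always pass a valid string. */
void demo_count_chars(void) {
  assert(count_chars("abc") == 3);
  assert(count_chars("") == 0);
}
```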

I think all pointer types already have the property of being able to represent a null pointer value. From that perspective, a type qualifier is unnecessary, that’s just the way the language already works today. However, being able to signal (to readers, to an analyzer, to the optimizer, etc) that a pointer value is expected to never be null is useful. The converse, a type signaling that a pointer is only ever null, is already handled by the nullptr_t type in C2x. But any annotation that says “this pointer might be null” seems like a non-starter because that is how pointers already work, so incremental adoption would be very unlikely.

Hmm, this has nothing to do with pointers though. This is how lvalue conversions work, and it works that way on everything, not just pointers. This is observable via _Generic already today (try getting it to associate with a qualified type, it won’t happen unless the qualifier is on another level of the declarator).
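
A minimal sketch of that _Generic observation, assuming a compiler that applies lvalue conversion to the controlling expression (current GCC and Clang do). The enum and helper names are invented for the example.

```c
#include <assert.h>

enum type_kind { KIND_INT, KIND_INT_PTR, KIND_CONST_INT_PTR, KIND_OTHER };

/* Selects on the type of expr after lvalue conversion. */
#define KIND_OF(expr)                   \
  _Generic((expr),                      \
      int: KIND_INT,                    \
      int *: KIND_INT_PTR,              \
      const int *: KIND_CONST_INT_PTR,  \
      default: KIND_OTHER)

enum type_kind kind_of_const_int(void) {
  const int value = 1;
  return KIND_OF(value);   /* top-level const is dropped: matches int */
}

enum type_kind kind_of_const_pointer(void) {
  int target = 0;
  int *const pointer = &target;
  return KIND_OF(pointer); /* top-level const is dropped: matches int * */
}

enum type_kind kind_of_pointer_to_const(void) {
  const int target = 0;
  const int *pointer = &target;
  return KIND_OF(pointer); /* qualifier is one level down: stays const int * */
}

/* Usage: only the qualifier on another declarator level survives. */
void demo_generic(void) {
  assert(kind_of_const_int() == KIND_INT);
  assert(kind_of_const_pointer() == KIND_INT_PTR);
  assert(kind_of_pointer_to_const() == KIND_CONST_INT_PTR);
}
```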

I don’t agree with that assessment. They’re qualifying the pointer to give information about what values the pointer may have, not what objects they point to. So, to me, they’re qualifying exactly what I would expect. And lvalue conversion gives you exactly the properties I’d expect – after obtaining the value of the pointer itself, you no longer need any marking to say whether it’s nonnull or not because you already have the value and it’s too late to ask that question. Basically, lvalue conversion is the point at which you would test whether the pointer could be null or not. It can never change state after lvalue conversion.

I’m not sold on the name given the current proposal. To me, _Optional has very little to do with pointer types and everything to do with values. e.g., I would not want to close the door on being able to write:

  _Optional int get_value(struct something *ptr) {
    if (ptr)
      return ptr->field;
    return _None;
  }

  int main() {
    _Optional int val = get_value(nullptr);
    if (val)
      return *val; // Steals C++ syntax, but the * could be replaced by another syntactic marker
    return EXIT_FAILURE;
  }

This is more in line with Python’s optional functionality, as well as the same idea from C++ (though they solved it with a library feature rather than a language feature).
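
For reference, the closest thing available in C today is a hand-rolled tagged struct, which is roughly the library-style solution. This is only a sketch of the idea; `struct optional_int` and `get_value_today` are hypothetical names, and `struct something` is given a single `int field` purely for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct something {
  int field;
};

/* A value plus a flag saying whether the value is present. */
struct optional_int {
  bool has_value;
  int value;
};

struct optional_int get_value_today(const struct something *ptr) {
  if (ptr)
    return (struct optional_int){ .has_value = true, .value = ptr->field };
  return (struct optional_int){ .has_value = false };
}

/* Usage: the caller must check has_value before reading value. */
void demo_optional_int(void) {
  struct something item = { .field = 7 };
  struct optional_int present = get_value_today(&item);
  struct optional_int absent = get_value_today(NULL);
  assert(present.has_value && present.value == 7);
  assert(!absent.has_value);
}
```

Nothing forces the caller to check `has_value` here, which is the gap a language-level value optional could close.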

That said, I think:

  int * _Optional get_ptr_value(struct something *ptr) {
    if (ptr && ptr->field)
      return ptr->field;
    return _None;
  }

is along the same lines of what you’re proposing, except that the optionality is lost on lvalue conversion. But again, once you have an rvalue, the state of the original object cannot change in a way that affects the loaded rvalue and so losing the _Optional qualifier is not harmful.

To be honest, if you designed _Optional to be less about pointers and more about values in general, I think the feature becomes much more compelling.

This is novel in C; no other qualifier works this way. So I’m not convinced that this is going to be less incompatible, confusing, and error-prone than other solutions. Also, const doesn’t always indicate anything about an object’s address (it can in theory for globals, but doesn’t for things like function parameters).

This continues to be an unresolved issue showing the irregularity of the syntax choice of the proposal to qualify something other than the pointer. The fact that you need to jump through hoops for function pointers, specifically, is a serious concern – these are pointers I would expect people would very much want to annotate if they’re plausibly going to be null pointers.

FWIW, the reason for the limitation against qualified function types in C is because functions are not data, but the function pointer is. That’s why you can qualify the pointer but not the function type itself.
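
A short sketch of that distinction in current C, with hypothetical names: the qualifier goes on the pointer object, which is data, while the function type behind the typedef is left unqualified.

```c
#include <assert.h>
#include <stddef.h>

typedef int handler_fn(int); /* a function type; it is not qualified here */

static int twice(int x) {
  return 2 * x;
}

/* const applies to the pointer object, not to the function it points to. */
static handler_fn *const fixed_handler = twice;

/* A function pointer that may plausibly be null gets an ordinary null test. */
int call_if_set(handler_fn *handler, int arg, int fallback) {
  return handler ? handler(arg) : fallback;
}

/* Usage: one call with a real handler, one with a null pointer. */
void demo_handlers(void) {
  assert(call_if_set(fixed_handler, 21, -1) == 42);
  assert(call_if_set(NULL, 21, -1) == -1);
}
```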
