First of all, I don’t think this specific problem even needs an in-code solution at all. Instead, the analysis that causes the warning to appear near strcmp can be made smart enough to recognize that s1 and s2 are never null at this point.
That is true, but one of the axioms at the start of my proposal was that it should be ‘(relatively) easy to create a compiler for [C]’. I later postulated that improved null safety does not require path-sensitive analysis and mentioned some compilers which are not huge, complex and resource-hungry, but still perform a useful job of compiling C programs. I believe it also simplifies analysis if only (syntactic) dereferences need to be checked.
You need such smarts in the analysis anyway, to cover another very important case:
int foo(_Optional const char *s1, _Optional const char *s2)
{
    if (...) {
        return 0;
    }
    assert(s1);
    assert(s2);
    return strcmp(s1, s2);
}
I can only guess that the if condition you have elided is something which allows the programmer to assume that neither s1 nor s2 is null, despite not explicitly checking for that. From my point of view, the assertions are irrelevant, since I’m assuming they are not checked in release builds.
Honestly, I would be OK with forcing the programmer to check for null values of s1 and s2 in that scenario. It seems a bit silly to create an interface which explicitly allows those values to be null but does not handle the consequences (not even by casting away the qualifier).
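To illustrate what I mean by forcing the check, here is a minimal sketch. The name foo_checked and the choice to order a null string before any non-null one are my own assumptions for the example, not part of the quoted code. The _Optional qualifier is defined away as an empty macro so that the sketch builds on compilers that do not implement it.

```c
#include <stddef.h>
#include <string.h>

/* Assumption: stand-in so the sketch builds on ordinary compilers.
 * Remove this line on a compiler that implements the qualifier. */
#define _Optional

/* Hypothetical variant of foo(): null arguments are handled explicitly.
 * A null string is (arbitrarily) ordered before any non-null string. */
int foo_checked(_Optional const char *s1, _Optional const char *s2)
{
    if (!s1 || !s2) {
        /* -1 if only s1 is null, 1 if only s2 is null, 0 if both are. */
        return (s1 != NULL) - (s2 != NULL);
    }
    /* Both pointers were checked above, so &*p can drop the qualifier
     * in a way that a checker is able to verify. */
    return strcmp(&*s1, &*s2);
}
```

Nothing here relies on assert, so the behaviour is the same in debug and release builds.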
Now, obviously, sometimes you really need a “force-unwrap” operator to indicate that you’re sure the pointer can’t be null here. In this case “easy to type” isn’t necessarily valuable; say, Rust chose the syntax .unwrap() which is designed to catch the eye, be easy to notice and audit.
Sorry but I don’t see why anything needs to be built-in. This looks self-explanatory to me:
_Optional int *x = ...;
assert(x);
int *y = (int *)x;
I don’t expect it to be needed very often, so I see your proposals as a ‘nice-to-have’. It would also be nice to have an optional_cast for use in C++ code, but that wasn’t the language I was mainly concerned with.
I don’t see it as fundamentally different from the following commonplace code:
int x = ...;
assert(x >= 0);
unsigned int y = (unsigned int)x;
So like I said in the other thread, I think this attribute doesn’t need to be checked by the static analyzer. The contract behind your attribute can be much simpler, fully resolved with either purely syntactic analysis or with very basic flow-sensitive analysis.
It was one of my design goals that purely syntactic analysis should be sufficient. I used that for all my early testing. However, if you think you can implement basic flow-sensitive analysis in the compiler without requiring use of the static analyzer, I’m very interested in that. I wouldn’t know where to start.
The static analyzer can take advantage of it. You can introduce a warning about any unchecked dereference of the _Optional pointer.
I already implemented that, and my new checks catch a lot of undefined behaviour that was previously ignored. I really like that.
The easiest way to introduce such warning is to perform a state split every time the pointer is encountered: in one state the pointer is null, in the other state it’s non-null. Then the null case simply becomes a path that the analyzer has to explore.
I wanted to do that, and even had an attempt, but it didn’t seem to be necessary to make my prototype useful and I didn’t want my wife to divorce me during my paternity leave. Again, if you think you can do this, then that would be wonderful.
I considered specifying as part of my paper how static analysis should work but deliberately left the wording vague in the expectation that different implementations would diverge. If I’m honest, I think this is the biggest weakness of my paper, not the endless arguments over syntax. However, I take solace from the fact that different toolchains already generate different warnings (or none) for the same code.
Maybe there’s still room in the analyzer to warn about invalid force-unwraps, but most such warnings would be about potential execution paths that the developer has just explicitly said aren’t there, aka false positives.
That sounds like a bad idea to me, and it seems you agree. I want to maintain a strong distinction between verifiable unwraps (&*s) and force-unwraps ((int *)s).
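A small sketch of the two idioms side by side may make the distinction clearer. The function names read_or_default and read_unchecked are made up for this example, and _Optional is again defined away as an empty macro so that the code builds on compilers without the qualifier.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: stand-in so the sketch builds on ordinary compilers.
 * Remove this line on a compiler that implements the qualifier. */
#define _Optional

/* Verifiable unwrap: the null check is visible in the code, so a tool
 * can confirm that &*s is only reached when s is non-null. */
int read_or_default(_Optional int *s, int fallback)
{
    if (!s) {
        return fallback;
    }
    int *p = &*s;
    return *p;
}

/* Force-unwrap: the cast removes the qualifier on the programmer's
 * say-so. The assert documents the belief but vanishes under NDEBUG. */
int read_unchecked(_Optional int *s)
{
    assert(s);
    int *p = (int *)s;
    return *p;
}
```

The first form gives an analyzer something to verify; the second is deliberately conspicuous so that it is easy to find in an audit.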
Thank you for your thoughtful comments.