setjmp() and static analysis

Hi,

consider the following code:

#include <setjmp.h>

extern int f2(void);

int
f1(void)
{
    int result;
    jmp_buf env;

    if (setjmp(env) == 0)
    {
        result = 1;
    }
    if (f2())
    {
        result = 2;
    }

    return result;
}

Static analysis with Clang 15 emits the following warning:

$ scan-build-15 clang-15 -c ljump.c
scan-build: Using '/usr/lib/llvm-15/bin/clang' for static analysis
ljump.c:21:5: warning: Undefined or garbage value returned to caller [core.uninitialized.UndefReturn]
    return result;
    ^~~~~~~~~~~~~
1 warning generated.
scan-build: Analysis run complete.
scan-build: 1 bug found.
scan-build: Run 'scan-view /tmp/scan-build-2022-10-08-141123-8953-1' to examine bug reports.

The static analyzer does not know that setjmp() is guaranteed to return 0 when the calling environment is saved and that therefore variable result is guaranteed to be initialized. Is it possible to inform the static analyzer somehow about this behavior of setjmp() so that this warning does not appear?

Thanks
Stephan

The value of result is indeterminate if you don’t make it volatile but modify it between setjmp and longjmp. That is, if you longjmp back to here and f2 then returns 0, the returned value is indeterminate (it could be kept in a register and thus get lost; marking it volatile forces every access to go through memory). Making it volatile doesn’t seem to suppress the analyzer warning, but the warning is not a false positive in your original code.
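
To make that path concrete, here is a minimal sketch of it. It assumes a hypothetical f2() that calls longjmp() on its first call and returns 0 on its second, and it moves env to file scope so that f2() can reach it (the original keeps env local to f1()). The counter f2_calls is also made up for illustration.

```c
#include <setjmp.h>

static jmp_buf env;      /* assumption: file scope, so f2() can jump to it */
static int f2_calls;     /* static storage, so it is unaffected by longjmp() */

/* Hypothetical f2(): jumps back on its first call, returns 0 on the second. */
static int f2(void)
{
    if (f2_calls++ == 0)
    {
        longjmp(env, 1); /* setjmp() in f1() now returns 1 */
    }
    return 0;
}

int f1(void)
{
    volatile int result; /* volatile: the stored value survives the jump */

    f2_calls = 0;
    if (setjmp(env) == 0)
    {
        result = 1;      /* first (direct) return of setjmp() */
    }
    if (f2())            /* 1st call jumps back, 2nd call returns 0 */
    {
        result = 2;
    }

    return result;       /* reads the 1 stored before the longjmp() */
}
```

With the volatile qualifier, f1() returns 1 here. Without it, result would have been modified between setjmp() and longjmp(), so the final read would be of an indeterminate value.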

You are right about the need for volatile, and the original code from which I produced the reduced example actually has it. But as you say even with int volatile result; the warning is produced.

Yes, this looks like a bug; we need to teach the static analyzer about setjmp()'s behavior. It’s one of the rare cases where an if-statement doesn’t mean what it usually means: here, not all of its branches are immediately reachable, so it’s a great thing to special-case.

I think that’s an easy function to model with summaries, @martong WDYT? (there may be an annoying problem with setjmp() being defined as a macro in practice)

Let’s assume that setjmp is not a macro but a normal identifier that we can look up. Then yes, we could model that setjmp always returns 0 in the StdLibraryFunctionsChecker. It could look like this:

    Optional<QualType> JmpBufTy = lookupTy("jmp_buf");
    addToFunctionSummaryMap(
        "setjmp", Signature(ArgTypes{JmpBufTy}, RetType{IntTy}),
        Summary(EvalCallAsPure)
            .Case({ReturnValueCondition(WithinRange, SingleValue(0))},
                  ErrnoIrrelevant));

On the other hand, I am not sure this would be a sound model, considering that a call to longjmp(env, val) continues execution as if setjmp(env) had returned val. On that path the model would be wrong. So, in my opinion, we should be more clever than simply giving setjmp a single branch; we should model longjmp as well.
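
A small sketch of that second return, in plain C rather than analyzer code. The function name setjmp_returned_nonzero is made up for illustration. It shows that the same setjmp() call site returns 0 once and then a non-zero value after longjmp(), which is the path a "setjmp always returns 0" summary would cut off.

```c
#include <setjmp.h>

static jmp_buf env;

/*
 * Hypothetical helper: returns 1 once setjmp() has been seen to return a
 * non-zero value. The first (direct) return of setjmp() is always 0, so the
 * function then jumps back with `val` to force a second return.
 */
int setjmp_returned_nonzero(int val)
{
    if (setjmp(env) != 0)
    {
        return 1;        /* second return, triggered by longjmp() */
    }
    longjmp(env, val);   /* first return was 0: jump back to setjmp() */
    return 0;            /* not reached, longjmp() does not return */
}
```

Calling setjmp_returned_nonzero(5) yields 1, and so does setjmp_returned_nonzero(0), because longjmp(env, 0) makes setjmp() return 1 rather than 0.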