Detect Undefined Behavior due to Uninitialized Variables in BPF Programs

Introduction

If the code has uninitialized variable(s), the compiler may take advantage of it
by assuming any value to generate final code which may not be what user expects.
The flags “-Wall -Werror” intends to issue warnings for such cases. But currently
compiler may not issue such warnings in some cases.

In the BPF context, the generated buggy code will be load into the kernel and
likely verification will pass. When the bpf program actually runs, its result
may not be expected and user will then need to check source or asm code to
find reasons. So ideally users really want to get a signal during compilation
where compiler has taken advantage of some uninitialized variables for its
transformation.

A Test Case

Marc Suñé (Isovalent, part of Cisco) reported this issue where an uninitialized
variable caused generated bpf prog binary code not working as expected. See
clang_bpf/Makefile at main · msune/clang_bpf · GitHub
The flags “-Wall -Werror” are enabled, but there is no warning.

The following is a simple code using latest upstram clang20:

    $ cat test.c
    void bar(int *val) {
        int n = *val;
        for (int i = 0; i < 5; i++) {
                switch(n) {
                case 0:
                case 1:
                        return;
                case 2:
                        n = 3; break;
                default:
                        *val = n;       return;
                }
        }
        return;
    }
    int foo(void) {
        int val;

        bar(&val);
        if (val == 1) return 6;
        return 8;
    }
    $ clang --target=bpf -O2 -g -c -Wall -Werror test.c
    $ llvm-objdump --no-show-raw-insn -d test.o

    test.o: file format elf64-bpf

    Disassembly of section .text:

    0000000000000000 <bar>:
       0:       w2 = *(u32 *)(r1 + 0x0)
       1:       if w2 < 0x2 goto +0x3 <bar+0x28>
       2:       if w2 != 0x2 goto +0x1 <bar+0x20>
       3:       w2 = 0x3
       4:       *(u32 *)(r1 + 0x0) = w2
       5:       exit

    0000000000000030 <foo>:
       6:       w0 = 0x8
       7:       exit

You can see that even with “-Wall -Werror”, there is no warning.

Possible Solutions

The ideal case is that compiler issue warnings with uninitialized variables.
“-Wall -Werror” implies “-Wunitialized”. gcc has “-Wmaybe-unitialized” which
dumps out possible unitialized variables. Could clang supports this whenever
an unitialized variable may impact generted codes?

Another option is option “-fsanitize=undefined”. Currently it is not clear how
this can be supported in bpf. So compiler warnings are still preferred.

Any other suggestions are also welcome.

Simple implementation of this feature can lead to lots of false positives making the warning less useful. Predicate aware warning is needed: predicate aware uninitialized analysis · gcc-mirror/gcc@34f97b9 · GitHub

For the example case, interprocedural analysis is also needed if inlining does not happen for call to ‘bar’.

I agree that maybe-uninitialized may have a lot of false positives due to limited analysis. Predicate aware warning should help. Are there any plan for this?

On the other hand, during IR optimizations, if a ‘undef’ variable is used during optimization, maybe we can expose such information to users so users knows
undefined behavior may happen to the code?

For example, in the above code, if I do:

   clang --target=bpf -O2 -S -emit-llvm test.c -mllvm -print-after-all

I have the following IR dump:

...
; *** IR Dump After OpenMPOptCGSCCPass on (foo) ***
; Function Attrs: nounwind
define dso_local range(i32 6, 9) i32 @foo() local_unnamed_addr #2 {
  %1 = alloca i32, align 4
  call void @llvm.lifetime.start.p0(i64 4, ptr nonnull %1) #3
  %2 = load i32, ptr %1, align 4, !tbaa !3
  switch i32 %2, label %4 [
    i32 0, label %6
    i32 1, label %6
    i32 2, label %3
  ]

3:                                                ; preds = %0
  br label %4

4:                                                ; preds = %3, %0
  %5 = phi i32 [ %2, %0 ], [ 3, %3 ]
  store i32 %5, ptr %1, align 4, !tbaa !3
  br label %6

6:                                                ; preds = %0, %0, %4
  %7 = load i32, ptr %1, align 4, !tbaa !3
  %8 = icmp eq i32 %7, 1
  %9 = select i1 %8, i32 6, i32 8
  call void @llvm.lifetime.end.p0(i64 4, ptr nonnull %1) #3
  ret i32 %9
}
; *** IR Dump After SROAPass on foo ***
; Function Attrs: nounwind
define dso_local range(i32 6, 9) i32 @foo() local_unnamed_addr #2 {
  switch i32 undef, label %2 [
    i32 0, label %4
    i32 1, label %4
    i32 2, label %1
  ]

1:                                                ; preds = %0
  br label %2

2:                                                ; preds = %1, %0
  %3 = phi i32 [ undef, %0 ], [ 3, %1 ]
  br label %4

4:                                                ; preds = %0, %0, %2
  %5 = phi i32 [ %3, %2 ], [ undef, %0 ], [ undef, %0 ]
  %6 = icmp eq i32 %5, 1
  %7 = select i1 %6, i32 6, i32 8
  ret i32 %7
}

You can see in SROAPass, ‘undef’ is used in code transformation.
It would be great to expose such information as a warning to the user.
I guess llvm could have a flag to enable this.
WDYT?

When building the test with ubsan (-fsanitize=undefined), compiler generates unconditional trap in ‘foo’. For such cases, warnings will be useful.

; Function Attrs: mustprogress nofree noinline norecurse noreturn nosync nounwind willreturn memory(none) uwtable
define dso_local noundef range(i32 6, 9) i32 @foo() local_unnamed_addr #2 !func_sanitize !13 {
unreachable
}

The following is the IR with ‘–target=bpf’ and -fsanitize=undefined:

*** IR Dump After Safe Stack instrumentation pass (safe-stack) ***
; Function Attrs: mustprogress nofree norecurse noreturn nosync nounwind willreturn memory(none)
define dso_local noundef range(i32 6, 9) i32 @foo() local_unnamed_addr #2 !func_sanitize !10 {
  unreachable
}
# *** IR Dump After BPF DAG->DAG Pattern Instruction Selection (bpf-isel) ***:
# Machine code for function foo: IsSSA, TracksLiveness

bb.0 (%ir-block.0):

# End machine code for function foo.

That is, bpf currently not able to support ‘unreachable’ to an invalid insn. I will take a look. Another choice, we can have a BPF backend pass to specifically check undefined behavior and issue warning.

Without -fsanitize=undefined, I think that it is not guaranteed that bpf backend pass can capture undefined behavior (undef in actual IR) due to various optimizations which may already optimize before bpf backend pass checks IR. WDYT?

Leveraging/reusing existing functionality for bpf sounds like the right path forward.

I did some experiments when some uninitialized variables exist in the code. I do found that ‘unreachable’ in the IR before lowering. But it looks like some ‘unreachable’ can turn into an error during compilation time (e.g., the same in the above), but some other ‘unreachable’ cannot turn into an error at compilation time since they may be turely unreachable. I identified two kind of cases:

  1. function with ‘naked’ attribute where inline asm is the func body. Since llvm/IR didn’t really analysis inline asm, it puts a ‘unreachable’ at the end of inline asm. This may or may not be true.
  2. I see some examples like switch statement where the default target is actually not unreached.

So for the above two cases, even we see unreachable insn in the IR, we should not flag it to the user. But this requires some kind of analysis in LLVM…

I also find a C option -ftrivial-auto-var-init=# (e.g. -ftrivial-auto-var-init=zero) where the compiler can initialize simple uninitialized variable with certain value. This will at least make code not taking advantage of undefined behavior.

Right, these are really the same issues that affect GCC’s middle-end warnings overall. You need to do them late enough that you pruned obviously unreachable paths, but not so late that you lost a lot of information or optimised out a possible UB case and so on.