Question about mem2reg's handling of global variables

Description

This C code snippet:

void foo() {
  static int baz = 42;
  int *p, *q = p;
  printf("%p %p\n", p, q);
  p = &baz;
  q = p;
}

is transformed into

define void @foo() {
entry:
  ;                                                  vvvvvvvv      vvvvvvvv
  %call = call i32 (ptr, ...) @printf(ptr @.str, ptr @foo.baz, ptr @foo.baz)
  ret void
}

by mem2reg. But if the last two statements (p = &baz; and q = p;) are removed, the result is:

define void @foo() {
entry:
  ;                                                  vvvvv      vvvvv
  %call = call i32 (ptr, ...) @printf(ptr @.str, ptr undef, ptr undef)
  ret void
}

My environment:

  • WSL2, Arch Linux
  • clang 15.0.7
  • tried -O0, -O1, -O2, -O3, and -Ofast

Cause

// In file: llvm/lib/Transforms/Utils/PromoteMemoryToRegister.cpp
static bool rewriteSingleStoreAlloca(...) {
  StoreInst *OnlyStore = Info.OnlyStore;
  bool StoringGlobalVal = !isa<Instruction>(OnlyStore->getOperand(0)); // true
  ...
  for (...) {
    ...
    if (!StoringGlobalVal) { // skipped
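
To make the effect of that flag concrete, here is a minimal toy model in C. It is not LLVM code: the types and the function rewrite_load are hypothetical names invented for illustration. The branch skipped in the snippet above is elided in the post, so the ordering check modelled here is an assumption about what it does. The sketch only covers a load and a store in the same block.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model of the single-store case; not LLVM's data structures. */
enum value_kind { VALUE_GLOBAL, VALUE_INSTRUCTION };

struct single_store {
    enum value_kind kind; /* kind of value written by the only store */
    size_t index;         /* position of the store within its block */
};

enum load_result {
    LOAD_GETS_STORED_VALUE,     /* load is replaced by the stored value */
    LOAD_LEFT_FOR_GENERAL_PATH  /* load is not rewritten by this fast path */
};

/* Decide how a load at position load_index, in the same block as the
 * store, is treated. */
enum load_result rewrite_load(struct single_store st, size_t load_index)
{
    /* A load and the store cannot occupy the same position. */
    assert(load_index != st.index);

    /* Mirrors StoringGlobalVal: true when the stored value is not an
     * instruction, for example the address of a global. */
    bool storing_global = (st.kind != VALUE_INSTRUCTION);

    /* Assumed ordering check: only applied to instruction values. For a
     * global value the check is skipped entirely. */
    if (!storing_global && load_index < st.index)
        return LOAD_LEFT_FOR_GENERAL_PATH;

    return LOAD_GETS_STORED_VALUE;
}
```

In this model, a load at position 1 with the only store at position 3 still receives the stored value when that value is a global, which corresponds to printf() seeing @foo.baz even though p = &baz; comes later in the source.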

My Question

Although using uninitialized local variables is UB (per C11 Annex J.2), I wonder whether it would make more sense not to replace loads that precede any store with the stored value when the stored value is a global (static) variable?

Guaranteeing such behavior would require a lot of logic. Basically, we would often end up not replacing the load. If you want to detect UB, use the UB sanitizer; the optimizer should not be limited by “sometimes, maybe” detecting UB.

You are aware that the assignment q = p is erroneous, as p never gets a value?
You are essentially assigning an uninitialized pointer to another pointer. If you
assume that the stack in which p and q will live is completely randomized between
invocations, you can end up with any possible bit pattern in p and q.

The assignments at the end of the function disappear at return from the function
and do not persist from return to call. It is erroneous to assume that these values
persist on the stack between invocations.

The IR that passes ptr @foo.baz to printf() is similarly erroneous, since neither
p nor q has been assigned the address of baz in a way that would persist at the
call to printf(). Therefore the second snippet, which passes ptr undef, is the best
interpretation of the starting source. You are calling printf() with
2 uninitialized pointers.
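
For comparison, a minimal sketch of a variant that avoids the uninitialized read, assuming the intent was for both pointers to refer to baz. The name foo_defined and the return value are hypothetical additions so the result can be checked. Both pointers are initialized before they are read.

```c
#include <stddef.h>

/* Hypothetical well-defined variant of foo(): p is initialized before q
 * reads it, so no load precedes the only store to p. */
const int *foo_defined(void)
{
    static int baz = 42;
    int *p = &baz; /* initialized at declaration */
    int *q = p;    /* reads an initialized pointer */
    return q;      /* returned only so a caller can inspect it */
}
```

With this ordering, replacing the loads of p and q with the address of baz is correct under any interpretation, so the question of what a load before the store should produce does not arise.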