Abstract
Current memory semantics state that loading an uninitialized memory location yields undef. This prevents optimization of single branch initializations due to a mismatch in undefined behavior type between the undef of the uninitialized variable and other variables which may be poison. Converting uninitialized memory accesses to poison introduces issues with bit-fields where single member variable initialization always results in loading uninitialized memory (see GSoC topic post). This proposal outlines the current limitations of uninitialized memory loads returning undef and a potential solution of uninitialized memory loads returning poison or a nondeterministic value.
Mentee: John McIver @jmciver
Mentor: Nuno Lopes @nlopes
Introduction
Interactions between undef and poison are nuanced and can be difficult to reason about. Additionally, the interaction between the two types can prevent or make existing optimizations incorrect. For these reasons we would like to replace undef with poison wherever possible.
The constant undef was created to represent the return value of uninitialized memory but was later used as any arbitrary constant in other constructs. Later, poison was added to represent signed overflow as undef was insufficient for the desired optimizations and undefined behavior (UB) was too strong. UB is used whenever an operation may trap the CPU (e.g., divide by zero), for example.
Phi with Undef and Poison
Currently the LLVM IR uses undef when dereferencing an uninitialized memory location. The following is an example of single branch initialization:
int phi_example(bool cond, int* ptr) {
int a;
if (cond) {
a = *ptr;
}
return a;
}
After optimization, the following IR is produced:
define i32 @phi_example(i1 %cond, ptr dereferenceable(4) %ptr) {
br i1 %cond, label %if.then, label %if.end
if.then:
%load = load i32, ptr %ptr
br label %if.end
if.end:
%a.0 = phi i32 [ %load, %if.then ], [ undef, %entry ]
ret i32 %a.0
}
Ideally, the phi
could be folded to:
define i32 @phi_example(i1 %cond, ptr %ptr) {
%load = load i32, ptr %ptr
ret i32 %load
}
However, this is not correct if %cond
were false and load were to yield poison then we would be replacing undef with poison. Poison is less defined than undef and therefore the fold is not correct.
The solution is to convert loading of uninitialized memory locations to emit poison.
Current Uninitialized Memory Semantics
The following figure shows current uninitialized memory load semantics using undef. Cases 2 shows that a uninitialized byte (0x??
) results in an i8 undef. Case 3 is similar to case 4, but with 4 bytes of uninitialized memory resulting in an i32 undef. Case 4 shows one defined byte and three uninitialized bytes results in a i32 of both defined and undef values.
For example the following xor_a
function can be optimized into a single return statement of zero. If %load.2
were to come from a different pointer variable dereferencing the same or different uninitialized memory location then the return value of the optimized function would be undef.
; Example of multiple loads of the same address with undef semantics
; Unoptimized
define i32 @xor_a(ptr %a.addr) {
entry:
%load.1 = load i32, ptr %a.addr
%load.2 = load i32, ptr %a.addr
; Note: Values of %load.1 and %load.2 are the same nondeterministic value.
%xor = xor i32 %load.1, %load.2
ret i32 %xor
}
; Optimized
define i32 @xor_a(ptr %a.addr) {
ret i32 0
}
Semantic Proposal
For a majority of use cases initialized loads can be converted to poison. However, this is not possible with language features such as bit-fields. Additionally, any semantic changes need to be backwards compatible with bitcode generated with undef.
Our proposal is to add a new version of the load instruction via an unique opcode. This will provide an easily detectable trigger by which automatic semantic conversion can occur. Additionally, a flag will be added to the new load instruction allowing uninitialized memory to be converted to a nondeterministic value, this provides the semantic capability to mimic undef.
The new flag !uninit_is_nondet
will be used to convert poison to a nondeterministic value. Like undef-semantics, this nondeterminism will be applied at the byte level. The following figure shows examples of the byte level nondeterminism for single and multi-byte types.
Behavioral Discrepancies
Current uninitiated memory load semantics and load duplication results in the same nondeterministic value. Duplicate loads using the proposed load semantics and !uninint_is_nondet
may potentially not result in the same nondeterministic value. This is because each load is effectively performing a series of byte level nondeterministic generations.
Another consideration is load instruction relocation. If a load is using the !uninit_is_nondet
flag it cannot be pushed inside a loop. This is due to the potential increase in nondeterministic results. The load outside a loop will result in a single nondeterministic value as where inside the loop body the load
can potentially generate many differing nondeterministic results.
Emplace Upgrade Path
Bitcode generated using load semantics with undef can be automatically upgraded using AutoUpgrade
translation. Loads performed with the earlier opcode will be converted to the new opcode and the !uninint_is_nondet
attribute will be applied.
loadv1 <ty>, ptr <pointer> ā loadv2 <ty>, ptr <pointer>, !uninit_is_nondet !{}
Mem2reg Example
Taking the phi with undef and poison example with poison semantics, if the load were decorated with !uninit_is_nondet
then the result would always be a nondeterministic value. This would result in a definedness mismatch at the phi as poison is less defined.
define i32 @phi_example(i1 %cond, ptr dereferenceable(4) %ptr) {
br i1 %cond, label %if.then, label %if.end
if.then:
%load = load i32, ptr %ptr, !uninit_is_nondet ; Causes definedness mismatch
; at phi
br label %if.end
if.end:
%a.0 = phi i32 [ %load, %if.then ], [ poison, %entry ]
ret i32 %a.0
}
To resolve this issue a freeze poison would need to be inserted in the entry:
define i32 @phi_example(i1 %cond, ptr dereferenceable(4) %ptr) {
%freeze = freeze i32 poison ; Add nondeterministic value to
; match definedness at phi
br i1 %cond, label %if.then, label %if.end
if.then:
%load = load i32, ptr %ptr, !uninit_is_nondet
br label %if.end
if.end:
%a.0 = phi i32 [ %load, %if.then ], [ %freeze, %entry ]
ret i32 %a.0
}
Bit-field Examples
struct Bitfields {
unsigned int a : 3;
unsigned int b : 3;
};
unsigned int load_example() {
Bitfields s;
s.a = 4;
s.b = 6;
...
}
With undef
The following example shows generated IR using the current undef semantics. The initial load instruction results in a nondeterministic value, which can be cleared, set, and stored. Subsequent loads for the bit-field are a combination of the nondeterministic value and set fields.
; Attempt to store 4 into āaā bit-field
%bf.load1 = load i8, ptr %s, align 4 ; Load of uninitialized memory
; results in sampled undef,
; assume 0bXXXXXXXX
%bf.clear1 = and i8 %bf.load1, -8 ; Propagate 0bXXXXX000
%bf.set1 = or i8 %bf.clear1, 4 ; Propagate 0bXXXXX100
store i8 %bf.set1, ptr %s, align 4 ; Store 0bXXXXX100
; Attempt to store 6 into ābā bit-field
%bf.load2 = load i8, ptr %s, align 4 ; Previous store was
; 0bXXXXX100
%bf.clear2 = and i8 %bf.load2, -57 ; Propagate 0bXX000100
%bf.set2 = or i8 %bf.clear2, 48 ; Propagate 0bXX110100
store i8 %bf.set2, ptr %s, align 4 ; Store 0bXX110100
With poison
The following example shows the previous IR, but with loading of uninitialized memory using poison semantics. The initial load instruction will result in poison rather than undef. Because poison propagates until it encounters a freeze instruction, subsequent loads will result in poison. This
will result in a reduced number of optimizations.
; Attempt to store 4 into āaā bit-field
%bf.load1 = load i8, ptr %s, align 4 ; Load contiguous bit-field
; memory, result poison
%bf.clear1 = and i8 %bf.load1, -8 ; Propagate poison
%bf.set1 = or i8 %bf.clear1, 4 ; Propagate poison
store i8 %bf.set1, ptr %s, align 4 ; Store poison
; Attempt to store 6 into ābā bit-field
%bf.load2 = load i8, ptr %s, align 4 ; Previous store was poison,
; result poison
%bf.clear2 = and i8 %bf.load2, -57 ; Propagate poison
%bf.set2 = or i8 %bf.clear2, 48 ; Propagate poison
store i8 %bf.set2, ptr %s, align 4 ; Store poison
With poison and !uninit_is_nondet
The following adds the !uninit_is_nondet
to the with poison example on the first load. This provides a similar nondeterministic behavior to that of the undef example, which allows the clear, set, and store cycle to represent concrete values.
; Attempt to store 4 into āaā bit-field
%bf.load1 = load i8, ptr %s, align 4, !uninit_is_nondet ; Semantically equivalent to
; freeze i8 poison, assume
; 0bXXXXXXXX
%bf.clear1 = and i8 %bf.load1, -8 ; Propagate 0bXXXXX000
%bf.set1 = or i8 %bf.clear1, 4 ; Propagate 0bXXXXX100
store i8 %bf.set1, ptr %s, align 4 ; Store 0bXXXXX100
; Attempt to store 6 into ābā bit-field
%bf.load2 = load i8, ptr %s, align 4 ; Previous store was
; 0bXXXXX100
%bf.clear2 = and i8 %bf.load2, -57 ; Propagate 0bXX000100
%bf.set2 = or i8 %bf.clear2, 48 ; Propagate 0bXX110100
store i8 %bf.set2, ptr %s, align 4 ; Store 0bXX110100
Conclusion
We propose the use of a new load instruction that supports the use of attribute !uninit_is_nondet
. This enables poison based semantics where possible, provides a nondeterministic capability similar to that of undef where needed, and allows for an automatic upgrade path to existing bitcode. However, load duplication with !uninit_is_nondet
is not supported.
Thanks for taking the time to read and your feedback is greatly appreciated.