Summary
We propose new mechanisms to model deferred undefined behavior in MLIR using the notion of poison
. The goal is to allow dialects to better define their op semantics and reason about it in transformations. Initially, we plan to use it in ‘LLVM’, ‘arith’, and ‘vector’ and motivate this RFC with a few examples of arith and vector ops of choice. The exact use of this model in arith, vector, and other dialects is outside of the scope of this RFC.
This proposal generalizes the notion of poison
/ freeze
from LLVM IR [1, 2, 3] so that it can be applied to an open set of types and ops. Poison semantics are opt-in on a per-type/op basis and should not break existing dialects.
We also propose a new dialect ‘ub’ that provides basic poison
-related ops for use in documentation, tests, and some transformations. It includes the following ops: ub.poison
, ub.freeze
, ub.unreachable
. Dialects can define their own equivalent ops using the proposed interfaces.
Background
Not all ops have well-defined behavior for all inputs values, for example:
- Integer addition:
arith.addi
can be defined for all possible inputs as mathematical addition of the operands modulo the 2^N where N is the bit-width. - Right shift:
arith.shrui
can require the shift value to be less than the bit-width, and have no meaningful and consistent interpretation otherwise. - Integer division:
arith.divui
requires the divisor to be non-zero, and the underlying division instruction may trap.
While A and C can already be reasoned about with existing constructs (e.g., the ConditionallySpeculatable
op interface), we need some formalized framework that allows us to precisely define what happens for B. when preconditions of ops are not satisfied and how the unspecified results propagate and interact with ops that use them [4].
This framework will enable us to employ more speculative execution and related transformations (e.g., hoisting, duplication) in a way that is consistent and easy to reason about.
Semantics
We propose to introduce poison
as an extra ‘virtual’ state that types define and ops mutate. Poison representation is a property of types which affects op semantics. Poison state allows us to define ops that may not always return values with a well-defined state.
To make the presentation easier and gradually build up the semantics, we consider 4 cases:
- Builtin scalars, modeled after poison in LLVM
- Builtin vectors, modeled similarly to LLVM
- Custom types
- Compound types consisting of arbitrary subtypes
Scalar Case
We consider the poison state to be an extra ‘virtual’ bit. For example, we would say that i32
has 2^32 possible well-defined values and one extra value called poison
which is separate from those plain bitvectors. When the poison bit is on, we say that the entire value is poison
. Otherwise the value is well-defined.
A use of a poison
value can result in one of the following, depending on the op (inline constants for brevity):
- Another
poison
, e.g.,arith.addi %x, poison : i32
- A well-defined non-
poison
value, e.g.,arith.select false, poison, 1 : i32
- Immediate UB, e.g.,
arith.divui poison, 0 : i32
The ub.poison : ty
op creates a poison value of the given type. Poison propagation is unconditionally stopped by the ub.freeze %x : ty
which is a no-op for well-defined values and a nondeterministic choice for poison values. Last, we provide ub.unreachable
that when executed always exhibits immediate UB.
The ops above are said to be ‘convenience’ ops because they simply implement the new interfaces (see below). Other dialects may define their own equivalent ops, e.g., llvm.poison
.
Vector Case
For some compound types like 1-D vectors, tuples, or other aggregate-like types it would probably be more useful to support per-element poison
semantics. In this case, the poison state can be a bitvector with a poison bit per each element. For brevity, we say that a poison value of a vector is a vector with all poison state bits on:
%v = ub.poison : vector<2xi32> // Poison state: <true, true>
while ub.freeze
becomes an elementwise operation for vector types.
Vector operations (like in arith.*
and vector.*
) can then manipulate the bitvector of poison bits. For example, when we perform a vector shift arith.shrui %x, %y : vector<2xi32>
, it’s possible that only one of the shift values is greater or equal to the bitwidth. We would hope that the other result element is well-defined despite the other one being poison
. This allows us to, for example, scalarize vector operations:
%r = arith.shrui %x, %y : vector<2xi32>
%e0 = vector.extract %r[0] : vector<2xi32>
%e1 = vector.extract %r[1] : vector<2xi32>
==>
%x0 = vector.extract %x[0] : vector<2xi32>
%x1 = vector.extract %x[1] : vector<2xi32>
%y0 = vector.extract %y[0] : vector<2xi32>
%y1 = vector.extract %y[1] : vector<2xi32>
%e0 = arith.shrui %x0, %y0 : i32
%e1 = arith.shrui %x1, %y1 : i32
Or vice versa, i.e., vectorize scalar operations. Similar concerns can arise with higher-level ops like vector.mask
[7] or tcp.elementwise
[8].
(Note: the transformation above is correct in LLVM: https://alive2.llvm.org/ce/z/UPVa6x.)
Vectors in MLIR can have more than one dimension (unlike LLVM). There’s also support for scalable vectors. This may necessitate a more elaborate poison state representation than a (flat) bitvector of poison bits.
Custom Type Case
Poison semantics may not be required or make sense for all types. For example, resource handles or synchronization primitives may not have meaningful arbitrary values necessary to freeze poison values. This makes us propose poison semantics to be opt-in for types:
- By default, types do not have poison semantics.
ub.poison
andub.freeze
are not defined for these types and fail verification. - When opted-in, types can define their poison state representation.
- Types decide whether to support
ub.poison
and/orub.freeze
. This is a static property of the given type. We say that a type that supportsub.poison
is poisonable and a type that supportsub.freeze
is freezable.
Compound Type Case
Let’s consider a hypothetical pair<ty1, ty2>
type combining two arbitrary element types. This type could combine the poison semantics of its element types in the following ways:
- Per-element poison semantics:
- One poison state bit per each element type with poison semantics. By definition, only poisonable elements can be in the poison state.
- Only poisonable when all elements are poisonable.
- Only freezable when all poisonable element types are also freezable.
- Lowest common denominator of element types poison semantics:
- If either of the element types does not support poison semantics,
pair<ty1, ty2>
should not either. - The poison state is a pair of two poison states of the elements.
- If either of the types is not poisonable/freezable,
pair<ty1, ty2>
is not either.
- If either of the element types does not support poison semantics,
- Whole-value poison semantics:
- The poison state is a single bit.
- If either of the elements is poison at some point, the whole pair becomes poison.
- The pair is freezable only when both element types are freezable.
Ultimately, it’s up to the dialect that defines the pair<ty1, ty2>
type to define its semantics in a way that is compatible with its ops.
Desired Folds
// 1
%x = ub.poison : i32
%y = arith.add %x, %x : i32
==>
%y = ub.poison : i32
// 2
%x = arith.constant 0 : i32
%y = ub.freeze %x : i32
==>
%y = arith.constant 0 : i32
// 3
%x = ub.freeze %arg0 : i32
%y = ub.freeze %x : i32
==>
%y = ub.freeze %arg0 : i32
// 4
%x = arith.add %arg0, %arg0 : vector<2xi32>
%y = vector.extract %x [0] : vector<2xi32>
%z = vector.extract %x [1] : vector<2xi32>
==>
%a = vector.extract %arg0 [0] : vector<2xi32>
%b = vector.extract %arg0 [1] : vector<2xi32>
%y = arith.add %a, %a : i32
%z = arith.add %b, %b : i32
// 5
%a = vector.extract %arg0 [0] : vector<2xi32>
%b = vector.extract %arg0 [1] : vector<2xi32>
%y = arith.add %a, %a : i32
%z = arith.add %b, %b : i32
==>
%x = arith.add %arg0, %arg0 : vector<2xi32>
%y = vector.extract %x [0] : vector<2xi32>
%z = vector.extract %x [1] : vector<2xi32>
// 6
%p = ub.poison : i32
%v = arith.constant dense 0 : vector<2xi32>
%y = vector.insert %p into %v [1] : vector<2xi32>
%z = vector.extract %v [0] : vector<2xi32>
==>
%z = arith.constant 0 : i32
// 7 -- https://alive2.llvm.org/ce/z/2o8FpN
%v = ub.poison : vector<2xi32>
%c = arith.constant 0 : i32
%y = vector.insert %c into %v [0] : vector<2xi32>
%z = vector.extract %v [0] : vector<2xi32>
==>
%z = arith.constant 0 : i32
// 8
%p = ub.poison : i32
%v = vector.broadcast %p : i32 to vector<2xi32>
%x = vector.extract %v [0] : vector<2xi32>
==>
%x = ub.poison : i32
Implementation
Goals
- We want to faithfully capture the poison semantic while making the implementation performant for the open set of possible types and ops.
- Dialects should be able to opt into poison semantics.
- Compound types can define the elements with poison semantics.
- Analyses derive lattices and transfer functions based on dialect definitions.
- Allow for gradual rollout without breaking existing dialects.
Implementation Strategy
- Introduce the concept of poison and poison-related terminology to the MLIR documentation.
- Add implementation for lowest-level dialects that will benefit from poison semantics: LLVM, arith, vector.
- Introduce poison bottom-up in the dialect lowering chain.
- Gradually expand and generalize the implementation. Make it easy for other dialects to opt into poison semantics.
- Add traits for most common cases simplify the implementation.
- Add a ‘ub’. dialect with
ub.poison
andub.freeze
ops, so that other dialects do not have to reimplement those. - Add support for a generic poison dataflow analysis.
Later implementation stages (3.) may require separate RFC to agree on the details.
Initial Implementation Idea
The implementation is not 100% clear at this point. We hope that the implementation experience gained when adding support to a couple initial dialects will inform a general and pluggable implementation.
At this point, we propose for dialects to opt into poison semantics through Type and Op interfaces.
Types:
- By default, types do not have any poison states
- Type can be poison as a whole or can have its element poison
- Types with metadata (e.g., memref) can have metadata as poison but elements well-defined
- The initial type interface has two methods:
-
bool canBePoison()
– whether the type is poisonable andub.poison
can be used on this type, is allowed to recursively check the elements and other contents of the type. Intended for compound types that may include element types that do not support poison semantics. -
bool canBeFrozen()
– whether the type is freezable andub.freeze
can be used on this type, is allowed to recursively check the elements and other contents of the type. Intended for types like resource handles that cannot be frozen.
-
Ops:
- By default, if any of the input values have poison, the poison state is propagated.
- Ops can specify a more fine-grained behavior if they wish to.
- The op interface describes how poison state is propagated from inputs to outputs.
FAQ
Why not undef
?
Both LLVM and SPIR-V started with undef
values to model uninitialized values or results of instructions whose preconditions were violated without immediately trapping. Every use of undef
yields an arbitrary bit pattern, which makes some compiler optimizations difficult to get right. For these reasons, we do not propose a similar op in the ‘ub’ dialect.
Related discussion: [5, 6].
Do all types have a poison state?
No, poison is opt-in for types. We expect it to be most useful to basic types (integers, floats, etc.) and compound types containing the basic ones (vectors, structs, tensors, etc.).
Should all types with poison state support ub.poison
?
Some types may combine types with and without poison semantics which makes them not poisonable as a whole, e.g., pair<i32, m_dialect.ResourceHandle>
. In this case, we may have custom ops that only poison/freeze the first pair element.
Should all types with poison state support ub.freeze
?
It may not be possible to create meaningful arbitrary values of some types, e.g., synchronization primitives or resource handles.
When you add poison state to builtin types like IntegerType
, will I have to update my dialect to support poison?
Most likely not if your dialect is higher-level. If none of your ops produce poison, and you do not lower from ops that are known to produce poison, then you don’t have to worry about any poison values. This is why we intend to use a bottom-up strategy and introduce poison to the lowest-level dialects first.
When should I consider giving my types and ops poison semantics?
When your ops can benefit from refusing to define some behavior, e.g., allowing uninitialized values or ops that would otherwise require inserting runtime checks when lowering. In such cases, we can gain extra optimization power at the expense of increased semantic complexity. Using poison semantics has 2 main benefits:
- It gives us a framework to talk and reason about such cases and define the precise semantics (documentation and dialect definitions)
- It allows for code hoisting, duplication, and speculation based on these semantics.
We expect this to be most useful in lower-level dialects. In those dialects, we expect the vast majority of ops to simply propagate poison elementwise.
References
[1] https://www.cs.utah.edu/~regehr/papers/undef-pldi17.pdf
[2] [llvm-dev] RFC: Killing undef and spreading poison
[3] LLVM Language Reference Manual — LLVM 16.0.0git documentation
[4] [RFC] Define precise arith semantics
[5] Why there's no "std.undef"?
[6] [RFC] Undef/uninitialized values and control flow
[7] [RFC] Vector Masking Representation in MLIR
[8] [RFC] Elementwise ops in TCP
-Jakub Kuderski (@kuhar), with input from Alex Zinenko (@ftynse), Sanjoy Das (@sanjoyd), and Nuno Lopes (@nlopes)