[RFC] Add nopoison attribute/metadata

This is a great question :) I have some thoughts on it.

First, in the context of metadata, one important secondary use-case for !noundef is how it acts in conjunction with other metadata like !nonnull or !range. In that case, it indicates that the value after application of the other metadata is noundef. In particular, this means that a violation of !nonnull etc. becomes immediate undefined behavior instead of returning poison.
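To make the distinction concrete, here is a minimal sketch of the two forms on a pointer load. The function names are made up for illustration; the metadata syntax is what I understand the LangRef to specify for `load`.

```llvm
; If the loaded pointer is null, %v is poison. That is only a problem
; if the poison later reaches an operation where it triggers UB.
define ptr @load_nonnull(ptr %p) {
  %v = load ptr, ptr %p, align 8, !nonnull !0
  ret ptr %v
}

; With !noundef added, a null result is immediate undefined behavior
; at the load itself, rather than a poison value.
define ptr @load_nonnull_noundef(ptr %p) {
  %v = load ptr, ptr %p, align 8, !nonnull !0, !noundef !0
  ret ptr %v
}

!0 = !{}
```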

As such, even if we say that memory itself cannot contain poison, we would still need an attribute to promote from poison to UB semantics. Currently !noundef is fine for that. If we remove undef at some point, we’d need !nopoison then. But it would be fine to defer it until that time, as it would just be a rename at that point.

Second, there is the question of how we will go about removing the last major user of undef: uninitialized memory. I think the plan that @nlopes originally had in mind is that we will consider uninitialized memory to be poison, and compensate for this with some variation on “freezing load” in cases where it is relevant.

If we go down that road, we obviously have to allow poison in memory. However, I think that making uninitialized memory poison causes a lot of complications that we don’t have great answers for, and I think the most recent iteration on this ([RFC] Load Instruction: Uninitialized Memory Semantics) has essentially reintroduced a concept of uninit memory that is distinct from poison.

What I am leaning towards nowadays is to leave uninitialized memory as undef (maybe with a rename to uninit) and only forbid its use as an SSA value. That is, alloca is still implicitly initialized to undef, you can have a global variable with undef in its initializer, but function IR never contains undef, only freeze poison.

I think that this still addresses the main pain point of undef, which is the inconsistent multi-use problem. The main cost relative to poison is that you cannot, in general, duplicate loads (though I guess we could keep !noundef to allow that).
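For anyone not familiar with the multi-use problem, a small sketch of the difference between an undef SSA value and freeze poison (function names are hypothetical):

```llvm
; Each use of undef may observe a different value, so this
; function is allowed to return an odd number.
define i32 @undef_uses() {
  %r = add i32 undef, undef
  ret i32 %r
}

; freeze picks one arbitrary but fixed value. Every use of %x sees
; the same bits, so the result is always even.
define i32 @frozen_uses() {
  %x = freeze i32 poison
  %r = add i32 %x, %x
  ret i32 %r
}
```

The same reasoning is why loads of undef memory cannot be duplicated in general: two loads could each observe a different value, while a single load yields one value for all of its uses.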

Third, not allowing poison in memory solves a lot of open miscompilation problems. E.g. we can combine byte-wise comparisons into a wide comparison, optimize away store undef, etc. The memcpy → load/store problem would not be fully fixed due to provenance, but it would fix one key facet.
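As a rough illustration of the byte-wise comparison case, assuming the current rule that a non-vector load yields poison if any loaded byte is poison (function names are hypothetical):

```llvm
; Byte-wise form: if the first bytes differ, the select returns false
; and a poison second byte is never observed.
define i1 @eq_bytewise(ptr %a, ptr %b) {
  %a0 = load i8, ptr %a, align 1
  %b0 = load i8, ptr %b, align 1
  %c0 = icmp eq i8 %a0, %b0
  %a1p = getelementptr inbounds i8, ptr %a, i64 1
  %b1p = getelementptr inbounds i8, ptr %b, i64 1
  %a1 = load i8, ptr %a1p, align 1
  %b1 = load i8, ptr %b1p, align 1
  %c1 = icmp eq i8 %a1, %b1
  %r = select i1 %c0, i1 %c1, i1 false
  ret i1 %r
}

; Wide form: a single poison byte makes the whole i16 poison, so the
; result is poison even when the first bytes differ.
define i1 @eq_wide(ptr %a, ptr %b) {
  %aw = load i16, ptr %a, align 1
  %bw = load i16, ptr %b, align 1
  %r = icmp eq i16 %aw, %bw
  ret i1 %r
}
```

If memory can hold poison, rewriting the first function into the second is not a refinement. If memory cannot hold poison, that particular objection goes away.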

Fourth, forbidding poison in memory closes the door on frontends exposing poison semantics. For example, some Rust developers were playing with the thought of exposing a type that does FP with FMF and can hold poison. I don’t think that LLVM should feel obligated to support this use-case though.

Overall, unless I’m missing something major, I think that forbidding poison in memory would be a good move. I’d like to hear @nlopes’s thoughts on it.
