Should data races become poison instead of undef?

nhaehnle · March 11, 2022, 5:09am

Hi all,

The “Memory Model for Concurrent Operations” section of the LangRef currently states that loads that are involved in data races return undef. Is it safe to assume that we want this to eventually be changed to poison?

Poison is stronger than undef and it’s the logical thing to do to move closer to removing undef. I don’t see anything obvious that would go wrong, and frontends could still freeze the loaded value if avoiding undefined behavior matters to them.

cc @nlopes @aqjune

nlopes · March 11, 2022, 12:33pm

Yes, agreed. I don’t see any reason for not moving them to poison. Nevertheless, it’s technically a BC change that should be listed in the release notes, but shouldn’t matter in practice as LLVM doesn’t replace racy loads with poison.

Thanks!

preames · March 14, 2022, 6:59pm

I think this is a much bigger change in semantics than you’re describing it as.

Currently, a program which branches on a racy load may have multiple output states (i.e., branching on undef can go in either direction). If the value is used multiple times, it can also have nearly arbitrarily complicated and unintuitive results, since each use of the undef can see a separate concrete value.
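A toy model of the "each use sees a separate value" point, written in Python rather than LLVM IR. The `Undef` and `Frozen` classes are hypothetical stand-ins invented for this sketch; they only mimic the rule that each use of undef may observe a different value, while `freeze` pins one arbitrary value:

```python
import random


class Undef:
    """Toy stand-in for an undef i8: every use may pick a fresh value."""

    def __init__(self, rng):
        self._rng = rng

    def use(self):
        return self._rng.randrange(256)


class Frozen:
    """Toy stand-in for `freeze`: picks one arbitrary value and keeps it."""

    def __init__(self, undef):
        self._value = undef.use()

    def use(self):
        return self._value


def xor_with_itself(v):
    # Two separate uses of the same value, as in `xor %v, %v`.
    return v.use() ^ v.use()


rng = random.Random(0)
undef_results = {xor_with_itself(Undef(rng)) for _ in range(1000)}
frozen_results = {xor_with_itself(Frozen(Undef(rng))) for _ in range(1000)}
```

In this model, `x ^ x` on an undef value is not always 0, but it is always 0 once the value has been frozen.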

However, this is different from branching on poison - which is immediate UB. Shifting our definition such that any program which branches on a racy value becomes full UB seems like a pretty major change to me.

To be clear, it’s also not one I’d support based on the current discussion. I could maybe be convinced, but I’d need to see a well argued case for it.

efriedma-quic · March 14, 2022, 7:55pm

Practically speaking, this change isn’t going to really have much effect, I think? Nothing explicitly cares what a race returns; the reason we don’t just follow the C/C++ race semantics is we want to allow speculative loads from dereferenceable locations.

We might want to hold off on hacking at the text until we resolve the whole “byte type” thing, though, so we don’t have to mess with it twice.

nikic · March 14, 2022, 8:37pm

Point of order: Branching is immediate undefined behavior for both undef and poison. From LangRef for br:

If ‘cond’ is poison or undef, this instruction has undefined behavior.

aqjune · March 15, 2022, 1:35am

+1 for this change.
A formalization of the semantics of concurrent memory operations was published at CGO by a team from MPI-SWS: https://plv.mpi-sws.org/llvmcs/paper-full.pdf
Its ‘undef’ value semantics is actually equivalent to the poison semantics in LLVM (Sec. 2.1).
The paper proved that various basic optimizations are valid under the poison-like undef semantics, so I believe a baseline level of safety for this change is already established.

preames · March 15, 2022, 3:44pm

@nikic You’re technically correct per the LangRef, but this is a case where the LangRef is out of sync with the actual implementation and has been for a while.

As a specific example, consider ScalarEvolution::isAddRecNeverPoison, which considers branching on poison to be UB but doesn’t consider undef. I remember there being other such places, but a quick skim of some doesn’t reveal them; maybe the code has since changed.

I really don’t think this change is worthwhile given the stated motivation.

Philip

nlopes · March 15, 2022, 3:51pm

Philip, the Langref is not out of sync. It was a decision that was made a while ago and it was reviewed by several folks.

Not defining br undef as UB makes GVN wrong. That was the main motivation to define it as UB rather than a non-deterministic jump.
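One way to see the GVN problem, as a hedged Python sketch rather than actual LLVM code: assume `br` on undef were a non-deterministic jump, and that GVN rewrites `%x` to `%y` inside the block guarded by `%x == %y`. The function names, the 4-value domain, and the `ELSE` sentinel below are all invented for illustration. Enumerating outcomes with `%y` undef shows that the rewritten program can return values the original never could:

```python
DOMAIN = range(4)  # tiny stand-in for an integer type
ELSE = -1          # sentinel returned by the false branch


def outcomes_original(x):
    """if (x == y) return x; else return ELSE;  with y undef."""
    results = set()
    for y_at_cmp in DOMAIN:  # value undef takes at the comparison
        results.add(x if x == y_at_cmp else ELSE)
    return results


def outcomes_after_gvn(x):
    """Same program after replacing x with y in the true branch."""
    results = set()
    for y_at_cmp in DOMAIN:
        for y_at_ret in DOMAIN:  # a second use of undef may differ
            results.add(y_at_ret if x == y_at_cmp else ELSE)
    return results


before = outcomes_original(1)  # only 1 or ELSE are possible
after = outcomes_after_gvn(1)  # any value in DOMAIN, or ELSE
```

Because the rewritten program has outcomes outside the original's set, the rewrite is not a refinement under that reading. Making `br undef` UB sidesteps this, since the original program then has no defined behavior to preserve.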
