Memory semantics, and why is DSE even legal?

I’m on a journey to document the descent in storage abstractions for LLVM’s compilation chain. This comes up because C variables, “memory”, SSA values, etc. all tend to have slightly different semantics, and this comes back to bite us with some non-functional (e.g. security) considerations.

Context aside, I’ve been searching the LangRef, docs, and such for a specification of LLVM’s memory semantics, without much success. For instance, I couldn’t find any reference stating why DSE is legal, other than by inferring from very much implied semantics and non-aliasing properties that the observable behavior of the program could not depend on a dead memory location’s value.

Note that I’m deliberately not talking about concurrent memory semantics, which is a whole other can of worms with the unfortunate tendency to dominate search results for memory semantics keywords. ^^" This is strictly about single-threaded code.

Questions that I’m interested in include, but are not limited to:

  • Exactly what constructs of the language allow for the observation of memory?
  • Are non-volatile memory accesses associated with any level of side effect?
  • If so, why is DSE legal? (maybe this touches upon what observable behavior LLVM guarantees it preserves)
  • If not, does that mean aliasing is the only consideration when reordering/removing/transforming memory accesses (consider e.g. the many attributes related to specifying the memory behavior of intrinsics)?
  • Are there conditions besides aliasing for reordering memory accesses around each other and side-effecting calls/intrinsics?
  • How is mem2reg/sroa even legal around side-effecting calls/intrinsics that could observe memory?

This seems to especially cross into side-effects for LLVM as I’ve seen comments discussing how LLVM IR doesn’t really have a notion of side-effect and uses memory for it instead.

So are there any references, documentation, etc. that address these non-concurrent memory semantics?

I’ll try to keep it short, but others will likely elaborate.

Assuming non-volatile, non-atomic accesses as you did:

A memory write is a side-effect that can be observed by reading any part of the written location.
If the user accesses something that is not accessible, e.g., non-allocated memory, or uses the wrong alignment, the access is UB.
Thus, if the access is UB, or if there is no read that can observe the write, the write can be eliminated.
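A hypothetical C sketch of that rule (the function name is invented for illustration): the first store below is overwritten before any load can observe it, so a compiler may delete it without changing observable behavior.

```c
/* Sketch: why removing a dead store is legal.
   Assumes nothing else can observe 'x' (its address never escapes). */
int overwritten_before_read(void) {
    int x;
    x = 1;    /* dead: no load observes this value before the next store */
    x = 2;    /* live: the load below observes it */
    return x; /* only this value is observable, so deleting `x = 1` is safe */
}
```

Compiling this at -O1 or higher and inspecting the IR should show only the second value surviving.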

SROA works on stack allocations and verifies they do not “escape”, thus their address is never visible to the user. After some other checks the pass can be certain the loads will only observe the value they are replaced with, and the entire alloca + stores + loads chains can be removed.
Calls, intrinsics, etc. are all taken into account.
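A hedged C sketch of the escape distinction (all names here are made up): `local` never has its address taken, so SROA/mem2reg can promote it to an SSA value; `escaped` has its address passed to a call, so the compiler must keep the memory alive around that call.

```c
/* Stand-in for code the optimizer treats as opaque. */
static void opaque_add_two(int *p) { *p += 2; }

int promotable(void) {
    int local = 40;  /* address never escapes this function */
    local += 2;      /* SROA can turn this into pure SSA arithmetic */
    return local;    /* the alloca, stores, and loads can all vanish */
}

int not_promotable(void) {
    int escaped = 40;
    opaque_add_two(&escaped); /* address escapes: callee may read/write it */
    return escaped;           /* must reload from memory after the call */
}
```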

We have various attributes to help with the reasoning; memory, readnone, nocapture, and noalias are a few important ones.
If you want to hoist accesses, you need to take potential UB into account again, so as not to introduce UB where there was none.
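For instance (a made-up C sketch): hoisting the load of *p out of this loop, above the guard, would introduce UB for a null pointer that the original program never dereferences.

```c
#include <stddef.h>

/* Sums *p up to n times, but only dereferences p when it is non-null. */
int guarded_sum(const int *p, int n) {
    int s = 0;
    for (int i = 0; i < n; i++) {
        if (p != NULL)  /* the load of *p is guarded... */
            s += *p;
    }
    /* ...so the compiler must not hoist the load above the null check:
       that would introduce UB for guarded_sum(NULL, n), which is
       well-defined in the original program. */
    return s;
}
```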

In my experience, semantics tend to start out as wishy-washy and heavily
implied–and ultimately internally inconsistent–until people start
trying to apply a formal-semantics approach and uncover all
of the problems. I am by no means an expert on formal
semantics, but my understanding is that memory is intrinsically more
difficult to tackle formally, and so we’re still in the
process of trying to actually iron out all of these semantics (with
progress being glacially slow).

There is a basic underlying, if not fully stated, principle in LLVM IR
with regards to memory: if you’re not properly “told” about a memory
location, it is UB to access that memory. The mechanics of how to “tell”
someone a memory location are somewhat laid out in the “Pointer Aliasing
Rules” section of the LangRef, but LLVM does not correctly implement
those rules, and AFAICT there’s agreement that the rules are more
incorrect than the implementation (both are wrong, in different ways,
however).

As a practical matter, the local analyses of memory effects follow a
baseline model of “escaping” pointers, as Johannes details: if the
pointer value of a stack allocation doesn’t “escape” to unknown code,
the compiler assumes that it knows about all of the reads and writes of
that allocation, and can eliminate dead stores or forward from stores to
loads at will. For most optimizations, the question of whether or not
memory addresses are escaping (or perhaps have already done so) is the
most useful way to frame thinking about memory operations.
Unfortunately, looking at the LangRef, we never properly define this
“escaping” model, even as several attributes explicitly detail their
effect in terms of it.

Here is the summary of memory-access instructions. In addition to that, there are instructions with “hidden” behavior (i.e. those that can cause other code to be executed, such as function calls). In the absence of any additional information, these need to be assumed to read/modify memory. The relevant information can be provided through attributes (which the other post mentions).

Generally no, but it’s not a simple question. A simple load from memory will likely modify the cache on a specific system, but the definition of the load instruction in the LLVM IR reference doesn’t mention anything about cache. In terms of the abstractions defined by the LLVM IR, memory-access instructions don’t have any side-effects that are not described by their definitions. Calls, again, need to be treated specially.

For a single-threaded execution, the assumption is that if an instruction stores to memory, then (in the absence of other memory-modifying instructions) any following load from that location will read the stored value.
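In C terms (a minimal sketch; the function name is invented), this is the assumption that licenses store-to-load forwarding:

```c
int store_then_load(void) {
    int buf[2];
    buf[0] = 7;     /* store */
    buf[1] = 9;     /* writes a distinct location; cannot affect buf[0] */
    return buf[0];  /* must read 7; the compiler may forward the stored
                       value and never touch memory at all */
}
```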

Pretty much. The only consideration is whether the program can observe any difference resulting from such a change.
For reordering stores, if they access distinct memory, then the state of memory will be the same regardless of the order in which they happened. For this check, aliasing information is sufficient, but it’s not always a simple question either (e.g. when iterating over an array, do a[i] and a[i+1] alias?).
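Sketched in C (hypothetical function): if the two pointers are known not to alias, the final memory state is identical for either store order, so the compiler may swap them freely.

```c
/* 'restrict' is the caller's promise that a and b do not alias, so the
   two stores below may be emitted in either order. */
void independent_stores(int *restrict a, int *restrict b) {
    *a = 1;
    *b = 2;  /* if a == b were allowed, the order would matter: the last
                store would win, so reordering would change the result */
}
```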

DSE is currently limited to basic blocks and uses:

  • escape analysis to track pointer escapes, and
  • https://llvm.org/docs/MemorySSA.html to reason about the order of memory accesses.

There are already plans for global DSE.

Thanks for all this info. My understanding is converging towards the following. Does that seem right to you?

Assumptions: single-threaded; volatile and non-volatile resources disjoint.

  1. Non-volatile memory is your basic storage resource, holds what you last wrote (“strong” memory in concurrent terms, I believe), with the added complication of reading/writing parts of objects which adds the usual considerations about addressing, layout, alignment.

  2. Only loads can observe memory; and, by extension, opaque code that might load, such as calls and intrinsics. Whether they actually do load is first over-approximated by kind-of-access attributes (readnone, readonly, writeonly).

  3. As a result, accessing non-volatile memory is a “weak” type of side effect which only affects other memory-loading code. In particular, it is never a C-style, compiler-must-preserve-traces side effect (e.g. I/O). On the other hand, C-style side effects are assumed to access memory unless specified otherwise.

  4. Individual allocations are separated by default (in the sense of separation logic), meaning accesses to memory are only valid when the pointer they dereference data-flows from the allocation site to the access site according to the rules of pointer capture/“escaping”/MemorySSA. This leads to another over-approximation of opaque code behavior with kind-of-location attributes (argmemonly, inaccessiblememonly, etc.).

  5. Volatile memory accesses are C-style side effects that do not obey the algebraic laws of memory (i.e. “reads give you the last value written”).
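Point 5 in C terms (the register name is a made-up stand-in for a memory-mapped device): both volatile reads below must be performed, in order, even though an equivalent non-volatile pair could be merged into a single load.

```c
/* Hypothetical stand-in for a memory-mapped device register. */
volatile int fake_device_reg;

int read_reg_twice(void) {
    int a = fake_device_reg; /* each volatile access is an observable    */
    int b = fake_device_reg; /* side effect: neither load may be merged,
                                reordered, or deleted                    */
    return a + b;
}
```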

Could you please elaborate on memory and its relationship with side effects? I’m thinking specifically of this post.

With the current state of PL research, it feels like we should know by now to think formally early, but even modern projects show otherwise. Heh, I guess one can dream. ^^ I hope these efforts with LLVM succeed.

If you’re looking for definitions, they are described in the LLVM IR reference, in the section about function attributes.

I was hoping for a clarification of how C-style side effects are mapped to LLVM IR. C requires the trace of volatile accesses and I/O to be preserved. Apparently, IntrHasSideEffects isn’t directly that concept (as it would seem on a first read) and memory is used for that purpose, that is, if I understand Johannes’ comment correctly. This suggests that the memory effect of a common non-volatile access vs. the memory effect of an actual I/O call should be different, and it’s not obvious to me how at the moment.

Side effects are modeled as memory accesses to inaccessible memory, in the sense of memory(inaccessiblemem: readwrite).
