[RFC] Volatile access to non-dereferenceable memory may be well-defined

Overview and Problem Statement

Volatile access is commonly used to interact with objects that, although reached through memory interfaces, have special semantics that distinguish them from ordinary memory bytes. One example is peripheral registers mapped into the address space as a memory region (MMIO registers). These don’t fit the definition of dereferenceable, since accessing them may trap or interrupt, and they cannot be loaded from speculatively (two reads in a row may return different values). They are therefore non-dereferenceable memory, yet the semantics of accessing them are well-defined once target-specific context is taken into account. This RFC proposes to make that observation explicit in the LLVM Language Reference, in the section on volatile and in the description of the ptr type, including the possibility of accessing typically forbidden addresses (such as bit-value 0), but only through volatile operations.

Background and Motivation

Certain architectures map their peripherals to hard-coded addresses, and the mapping may include bit value 0. An example is present here; another is the AVR architecture, some of whose chips map peripherals starting at data address 0x0. See the thread on the Rust Zulip chat for the full discussion. In summary, the Language Reference may lead the reader to believe that volatile access to these special addresses is undefined behavior and forbidden, as it is for non-volatile access, when in practice it may be defined and valid in certain cases. The Rust compiler currently assumes that null cannot be passed to a volatile access: when the address is known to be null, it emits a panic instead of the access. However, if the compiler is modified to drop that check, it produces the following IR, and LLVM, as it stands, generates a binary that correctly accesses null. The behavior therefore already matches what this proposal describes, but the Language Reference does not.

%0 = load volatile i8, ptr null, align 32768

We’d like to remove the condition, but thought it should be discussed first to make sure LLVM’s understanding is the same. An implementation of the relevant changes to the LLVM repo has been pushed here.
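As a rough illustration of the Rust side, here is a minimal sketch of volatile register access through `core::ptr::read_volatile` and `core::ptr::write_volatile`. The helper names `read_reg` and `write_reg` are hypothetical and are not taken from the Rust compiler or from the linked changes. The comment about lowering to `load volatile` / `store volatile` reflects my understanding of the usual codegen, not a guarantee.

```rust
use core::ptr;

/// Hypothetical helper: read one byte from a memory-mapped register.
///
/// # Safety
/// `addr` must be valid for a volatile read on the current target.
pub unsafe fn read_reg(addr: *const u8) -> u8 {
    // Expected to lower to `load volatile i8` in LLVM IR,
    // so the access is not elided or merged with neighbouring reads.
    unsafe { ptr::read_volatile(addr) }
}

/// Hypothetical helper: write one byte to a memory-mapped register.
///
/// # Safety
/// `addr` must be valid for a volatile write on the current target.
pub unsafe fn write_reg(addr: *mut u8, value: u8) {
    // Expected to lower to `store volatile i8` in LLVM IR.
    unsafe { ptr::write_volatile(addr, value) }
}
```

On an AVR chip whose peripherals start at data address 0x0, the pointer passed in would be the register address itself. On a hosted target the helpers can only be exercised against ordinary memory, such as the address of a local variable, so the address-0 case is deliberately not shown running here.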

I’m open to any feedback regarding the changes proposed and the RFC text/reasoning.


We last discussed this in D53184, “[LangRef] Clarify semantics of volatile operations”. That was supposed to cover your use-case, I think? If it isn’t clear, we can try to revise it.

I see. In the Rust Zulip I was instructed to post this thread because we were unsure about the implications of the LangRef text, but I think the discussion you linked covers most of it. Then, I think the only doubt that remains is whether the pointer with bit-value 0 is already included as a possibly valid operand to volatile operations. In other words, whether there is some optimization, valid for volatile access, that assumes 0 will not be passed to load volatile/store volatile.

From my testing with LLVM IR, it seems this use-case is OK. (I also searched the code, but since I haven’t worked with LLVM directly before, that wasn’t conclusive.) Therefore, if we allow null in Rust for volatile reads and writes, that wouldn’t break LLVM’s assumptions, correct?

Edit:

For some extra context, this came up when developing peripheral support for an AVR device which, unlike the others supported thus far, starts its peripheral registers at 0x0.

It’s probably worth adding some text to LangRef to specifically say that volatile operations on null are allowed. But yes, LLVM optimizations should be okay with volatile operations on a null pointer.

I’ve sent pull request #139803 to llvm/llvm-project, “[llvm] Comment on validity of volatile ops on null”, to add said comment. I also added a test case showing that it works for AVR (I had this test ready from before; if it’s not useful I can remove it).