Overview and Problem Statement
Volatile access is commonly used when interacting with objects that, although accessed via memory interfaces, have special semantics that distinguish them from a typical memory byte. An example is peripheral registers mapped into the address space as a memory region (MMIO registers). These don’t fit the definition of dereferenceable, since accessing them may trap or interrupt, and they cannot be loaded from speculatively (two reads in a row may return different values). They therefore represent non-dereferenceable memory, but the semantics of accessing them are well-defined once target-specific context is introduced. This RFC proposes to make that observation explicit in the LLVM Language Reference, under the section for volatile and in the description of the ptr type, including the possibility of accessing typically forbidden addresses (such as bit-value 0), but only through volatile access.
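To make the "two reads in a row" point concrete, here is a minimal Rust sketch of back-to-back volatile reads. The function name `read_status_twice` and the idea of an 8-bit status register are hypothetical and used only for illustration; the sketch assumes the caller supplies a pointer that is valid for volatile reads on the target, and it does not exercise address 0.

```rust
use core::ptr;

/// Reads a (hypothetical) 8-bit status register twice in a row.
///
/// # Safety
/// `reg` must be valid for volatile reads on the current target.
pub unsafe fn read_status_twice(reg: *const u8) -> (u8, u8) {
    // Each volatile read is a separate access: the compiler may not
    // merge the two loads, reorder them relative to each other, or
    // drop one of them, because a real register may change in between.
    let first = unsafe { ptr::read_volatile(reg) };
    let second = unsafe { ptr::read_volatile(reg) };
    (first, second)
}
```

On real hardware `reg` would be the address of a memory-mapped register and the two values could differ. When the pointer refers to an ordinary variable, as in a host-side test, both reads simply return the stored value.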
Background and Motivation
Certain architectures map their peripherals to hard-coded addresses, and the mapping may include bit value 0. An example is present here, as well as in the AVR architecture, some of whose chips map peripherals starting at data address 0x0. See the thread on the Rust Zulip chat for the full discussion. In summary, the Language Reference may lead the reader to believe that volatile access to these special addresses is undefined behavior and forbidden, just as it is for ordinary accesses, when in practice it may be defined and valid in certain cases. The Rust compiler currently assumes that null cannot be passed to a volatile access: if the address passed is known to be null, it generates a panic instead of the access. However, if the compiler is modified to drop that check, the following IR is produced, and LLVM, as it stands, generates a binary with a correct access to null. The behavior therefore already matches what is described in this proposal, but not the Language Reference.
%0 = load volatile i8, ptr null, align 32768
We’d like to remove the condition, but thought this should be discussed first to make sure LLVM’s understanding is the same. An implementation of the relevant changes in the LLVM repo has been pushed here.
I’m open to any feedback regarding the changes proposed and the RFC text/reasoning.