Summary
Various semantic information available at source or at LLVM IR level may be lost when lowering and generating target-specific code. As such, it becomes impossible to recover such information without additional metadata stored elsewhere that can map instruction and function addresses, viz. program counters (PCs), to metadata of interest.
Such metadata can aid in more accurate runtime binary analysis that requires knowledge of source-level information (e.g. atomic vs. plain accesses in data race detection). Similarly, source-level debug information must be stored (e.g. as DWARF) alongside the binary to recover useful debugging information. Unfortunately, debug information is not guaranteed to be present in a binary (it may be stripped), nor is it efficient to arbitrarily extend and store new metadata: both factors are crucial for metadata that is required at runtime affecting the correct and fast operation of a program.
We propose a mechanism to efficiently generate and store arbitrary PC-keyed metadata associated with IR instructions that can be retrieved at runtime. The following discusses background and motivation in more detail, followed by design of the core feature, followed by the first concrete use case.
An earlier discussion that led to this RFC may be found here.
Background and Motivation
To perform certain detailed runtime binary analysis on an otherwise unmodified binary, semantic metadata is required that is lost when generating machine code. For example, data race detection requires knowledge of atomic accesses to avoid false positives. For deployment in production, however, this metadata needs to be stored in the binary and needs to be accessible efficiently at runtime: the presence of the metadata should not affect performance of the binary unless it is accessed, and overall binary size should be minimally impacted. Therefore the metadata will require storage in separate loadable sections, with size having priority over extensibility, backwards compatibility, or human readability. Crucially, for some deployment scenarios, the presence of the metadata is required for the correct and fast operation of a program (this is unlike traditional debug information, which may be stripped).
Use cases. Most of the immediate use cases are to generate PC-keyed semantic metadata for sampling-based error detectors (sanitizers) that have zero overhead when disabled. The first such sanitizer will be a variant of GWP-TSan, but other GWP-Sanitizers (such as UBSan and MSan variants) that require language-level semantic information are planned. Other binary instrumentation tools, such as Valgrind, Helgrind, or DRD, could also benefit from PC-keyed metadata.
Challenges. The main challenge here is that instruction PCs will only be known in the backend during code generation, yet the semantic information of interest is only known in the frontend or middleend: telling the backend which instructions need PC-keyed metadata is non-trivial. The implementation should also take care to work well with the linker garbage collector (GC), such that if associated code is dropped, the metadata is dropped, too. Finally, the encoded PCs should be stored as efficiently as possible, avoiding relocations if possible (which add size and linker overheads).
Related Features
Similar metadata is emitted by some of the following:
- SanitizerCoverage’s PC Table feature constructs a list of basic block entry PCs with attached metadata in the __sancov_pcs section of the binary.
- The -basic-block-sections feature records metadata about each basic block in the .llvm_bb_addr_map section of the binary for use by profilers and debuggers.
The commonality here is that these only work on basic block addresses, and not individual instructions. No existing feature easily allows emitting PCs of individual instructions.
Design
The most scalable design is to allow attaching MDNodes to arbitrary IR instructions and functions, where the attached metadata is propagated through to the AsmPrinter which then interprets the metadata and generates code to emit the metadata in the binary. The metadata itself is stored in arbitrary sections determined by the information stored in the metadata.
More concretely, we introduce PC sections metadata which can be attached to IR instructions and functions, for which addresses, viz. program counters (PCs), are to be emitted in specially encoded binary sections. Metadata is assigned as an MDNode of the MD_pcsections kind (!pcsections). The format and encoding (see below) of !pcsections metadata is kept generic, so that different kinds of PC-keyed metadata can be translated to a !pcsections metadata node. Therefore, we only need to take care to propagate !pcsections metadata from IR instructions to replacement IR instructions and generated machine IR (MIR), and no special logic is required for different kinds of PC-keyed metadata.
Metadata propagation. The biggest challenge is to losslessly propagate !pcsections through IR transformations, from IR to machine IR (MIR), and through MIR transformations in the backend, through to the AsmPrinter. The problem is similar to propagating debug info. In many cases both debug info and !pcsections metadata should be copied together: for generation of MachineInstrs, we modify BuildMI() to simplify the propagation of debug info and !pcsections metadata together.
- IR-to-IR transformations: The current use cases only intend to add !pcsections metadata after all IR optimizations. As such, no special care is taken to preserve !pcsections metadata through IR transformations yet. One notable exception is the AtomicExpandPass, which runs after optimizations right before instruction selection, and which we update to preserve !pcsections metadata for all replacement instructions (see patch).
- IR-to-MIR lowering: MachineInstrs will allow setting !pcsections metadata via MachineInstr::setPCSections(), which stores the MDNode pointer out-of-line in MachineInstr::ExtraInfo, to avoid bloating MachineInstr in the common case (see patch). The BuildMI() MachineInstr builder is updated to take a bundle of debug info and !pcsections metadata as MIMetadata, which simplifies copying both from IR and MIR instructions (see patch).
  - SelectionDAG: Before lowering to MachineInstrs, SelectionDAG lowers instructions to SDNodes. As such, we need to introduce the ability to store !pcsections metadata in SDNodes during IR-to-SD lowering. SelectionDAG provides several callbacks that simplify propagating metadata on DAG transformations (via ReplaceAllUsesWith, see patch; and via DAGUpdateListener, see patch).
  - FastISel: Because there is no intermediate representation between LLVM IR instructions and MIR instructions, on instruction selection with FastISel the metadata is copied through MIMetadata and all BuildMI() calls are updated. Implementing FastISel support is relatively straightforward: FastISel::DbgLoc is replaced with an MIMetadata instance to copy debug info and !pcsections metadata together (see patch).
  - GlobalISel: Like FastISel, requires updating BuildMI() calls in various locations (see patch).
Metadata format. An arbitrary number of interleaved MDString and constant operands can be added, where a new MDString always denotes a section name, followed by an arbitrary number of auxiliary constants that are encoded alongside the PC of the instruction or function. The first operand must be an MDString denoting the first section.
!0 = metadata !{
metadata !"<section#1>"
[ , iXX <aux-consts#1> ... ]
[ metadata !"<section#2>"
[ , iXX <aux-consts#2> ... ]
... ]
}
The occurrence of “section#1”, “section#2”, …, “section#N” in the metadata causes the backend to emit the PC for the associated instruction or function to all named sections. For each emitted PC in a section #N, the constants aux-consts#N will be emitted after the PC.
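The grouping rule above (a section name opens a new group, and the constants that follow belong to it) can be sketched in plain C++. This is only an illustration and does not use the LLVM API: the Operand and SectionRequest types and the groupBySection function are hypothetical stand-ins for MDNode operands and for whatever the AsmPrinter does internally.

```cpp
#include <cstdint>
#include <string>
#include <variant>
#include <vector>

// Hypothetical stand-in for a metadata operand: either a section name
// (MDString) or an auxiliary integer constant.
using Operand = std::variant<std::string, std::uint64_t>;

// One section named in the metadata, plus the constants to emit after the PC.
struct SectionRequest {
  std::string name;
  std::vector<std::uint64_t> aux;
};

// Groups interleaved operands by section. Returns an empty vector if the
// first operand is not a section name, which the format does not allow.
std::vector<SectionRequest> groupBySection(const std::vector<Operand> &ops) {
  std::vector<SectionRequest> result;
  for (const Operand &op : ops) {
    if (const auto *name = std::get_if<std::string>(&op)) {
      // A new string always starts a new section group.
      result.push_back(SectionRequest{*name, {}});
    } else {
      if (result.empty())
        return {};  // constant before any section name: malformed
      result.back().aux.push_back(std::get<std::uint64_t>(op));
    }
  }
  return result;
}
```

With operands "sec_a", 1, 2, "sec_b", this yields two groups: sec_a with constants {1, 2} and sec_b with none, so the PC would be emitted to both sections and only the sec_a entry would carry trailing constants.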
Binary encoding. Instructions result in emitting a single PC, and functions result in emission of the start of the function and a 32-bit size. This is followed by the auxiliary constants that followed the respective section name in the MD_pcsections metadata.
To avoid relocations in the final binary, each PC stored at address entry is encoded relative to the entry itself, computed as pc - entry. To decode, a user has to compute entry + *entry. The size of each entry depends on the code model. With large and medium sized code models, the entry size matches pointer size. For any smaller code model the entry size is just 32 bits.
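As a minimal sketch of the pc - entry scheme, assuming the 32-bit entry size used for the smaller code models, the following helpers write and read one entry. The function names encodeRelativePC and decodeRelativePC are hypothetical; in the real feature the offset is produced by the compiler and linker, and only the decoding side would run at runtime.

```cpp
#include <cstdint>
#include <cstring>

// Stores `pc` at `entry` as a signed 32-bit offset relative to the entry's
// own address (pc - entry). Assumes the distance fits in 32 bits.
inline void encodeRelativePC(void *entry, std::uintptr_t pc) {
  const auto offset = static_cast<std::int32_t>(
      pc - reinterpret_cast<std::uintptr_t>(entry));
  std::memcpy(entry, &offset, sizeof(offset));
}

// Recovers the PC as entry + *entry.
inline std::uintptr_t decodeRelativePC(const void *entry) {
  std::int32_t offset;
  std::memcpy(&offset, entry, sizeof(offset));
  return reinterpret_cast<std::uintptr_t>(entry) +
         static_cast<std::uintptr_t>(static_cast<std::intptr_t>(offset));
}
```

Because the stored value is a difference between two addresses in the same image, it does not change when the image is loaded at a different base address, which is why no load-time relocation is needed.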
With the metadata emitted by the SanitizerBinaryMetadata pass (discussed in the next section), a study on several of the largest binaries deployed at Google showed that a naive implementation without relative relocations (and with regular 64-bit entries) resulted in an overall binary size increase of more than 10%, which was unacceptable. The proposed version with relative relocations results in an overall binary size increase of less than 2%.
Use case
The first use case will be a middleend pass, SanitizerBinaryMetadata (see patch), that will emit PC-keyed metadata for use by a set of new sanitizers. The first such sanitizer will be a variant of GWP-TSan, but other GWP-Sanitizers (such as UBSan and MSan variants) that require language-level semantic information are planned.
GWP-TSan will require knowledge of which instructions have been lowered from C11 and C++11 atomics, to avoid generating false positive data race reports. For now, the new pass supports generating PC-keyed metadata about atomic instructions, and which semantic features have been analyzed per function. The latter metadata enables mixing code for which no PC-keyed metadata exists with code where PC-keyed metadata has been enabled without producing false positive reports.
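To illustrate how the two kinds of metadata could work together at runtime, here is a hedged sketch of a lookup that a tool might perform for a sampled PC. Everything in it is an assumption made for illustration: the CoveredFunction layout, the kFeatureAtomics bit, and the classifyAccess function are hypothetical and do not describe GWP-TSan or the actual section contents.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

enum class AccessKind { Unknown, Plain, Atomic };

// Hypothetical decoded function entry: start PC, 32-bit size, and a bitmask
// of the semantic features that were analyzed for this function.
struct CoveredFunction {
  std::uintptr_t start;
  std::uint32_t size;
  std::uint32_t features;
};

constexpr std::uint32_t kFeatureAtomics = 1u << 0;  // hypothetical bit

// `functions` must be sorted by start and non-overlapping; `atomicPCs` must
// be sorted. Returns Unknown for PCs in code without atomics metadata, so a
// tool can skip them instead of reporting a false positive.
AccessKind classifyAccess(const std::vector<CoveredFunction> &functions,
                          const std::vector<std::uintptr_t> &atomicPCs,
                          std::uintptr_t pc) {
  // Find the first function starting after pc, then step back one.
  auto it = std::upper_bound(
      functions.begin(), functions.end(), pc,
      [](std::uintptr_t value, const CoveredFunction &f) {
        return value < f.start;
      });
  if (it == functions.begin())
    return AccessKind::Unknown;
  --it;
  if (pc - it->start >= it->size || !(it->features & kFeatureAtomics))
    return AccessKind::Unknown;
  return std::binary_search(atomicPCs.begin(), atomicPCs.end(), pc)
             ? AccessKind::Atomic
             : AccessKind::Plain;
}
```

The per-function check comes first on purpose: a PC that is absent from the atomics table only means "plain access" if the enclosing function was actually analyzed for atomics.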
The plan is to open source a stable, production-quality version of GWP-TSan and other GWP-Sanitizers; their development, however, requires upstream compiler support. Until the first tool has been open sourced, we mark this kind of instrumentation as “experimental” and reserve the option to change the binary format, remove features, and similar. Until that time, PC-keyed metadata via SanitizerBinaryMetadata can be emitted with the frontend flag -fexperimental-sanitize-metadata.
Implementation
Phabricator patch series:
- [Metadata] Introduce MD_pcsections
- [MachineInstr] Allow setting PCSections in ExtraInfo
- [MCObjectFileInfo] Add getPCSection() helper
- [AsmPrinter] Emit PCs into requested PCSections
- [SelectionDAG] Rename CallSiteDbgInfo into SDNodeExtraInfo
- [SelectionDAG] Properly copy ExtraInfo on RAUW
- [SelectionDAG] Propagate PCSections through SDNodes
- [MachineInstrBuilder] Introduce MIMetadata to simplify metadata propagation
- [FastISel] Propagate PCSections metadata to MachineInstr
- [AtomicExpandPass] Always copy pcsections Metadata to expanded atomics
- [GlobalISel] Propagate PCSections metadata to MachineInstr
- [SanitizerBinaryMetadata] Introduce SanitizerBinaryMetadata instrumentation pass
- [Clang] Introduce -fexperimental-sanitize-metadata=
Additionally a Git tree with the implementation is available here.