Summary
Type-based segregation of allocations is increasingly adopted as a low-cost mitigation for memory errors in unsafe languages. To support systemic adoption of such allocators, we propose a new attribute that lets existing allocation calls adopt them automatically. The attribute lets a developer annotate a C API to state that a type-segregating version of an allocation function is available, and to redirect calls to that typed variant with an inferred type tag that the allocator can use for automatic type-based segregation.
Motivation
As unsafe languages, C and C++ are prone to a variety of memory error classes with severe security implications. Reducing and mitigating these issues must be done without requiring large-scale rewriting of existing code and without breaking existing ABI, and many different approaches are needed in concert. The existing -fbounds-safety proposal[1] works to prevent security problems caused by logic errors that result in out-of-bounds access to memory. Another major attack vector is logic errors that result in temporal memory safety (lifetime) bugs, most typically use-after-free; this proposal aims to provide tools to mitigate attacks built on these errors.
One class of systemic mitigation that can achieve this is type-based segregation of dynamic allocations, an approach that Apple has already used to great benefit on its platforms[2], as have other projects[3]. The core problem with existing approaches to type segregation is that they require significant source adoption work, do not support pure C, or offer only limited or ad hoc type segregation (often based on call stack introspection), which limits their applicability to general platform-wide adoption. The attribute we are proposing allows automatic exposure of type information in C APIs, where the language otherwise limits the ability to communicate type information automatically.
The increased flexibility and type awareness of C++ provide options for more general solutions, and to that end we are proposing extensions for operator new and operator delete to the C++ language committee that expose the actual type being allocated to the relevant allocation APIs.[4]
Apple’s experience has shown that type-based segregation is a generally effective and low-cost mitigation for many common attacks on C and C++ code, and we expect the use of such segregation to increase over time.
The intent of this proposal is to allow libraries and platforms to provide type segregating allocation APIs, and have those APIs be adopted automatically and transparently without requiring any adoption effort by downstream consumers to switch to the new APIs or to provide explicit type information.
Programming model
General Usage
To adopt this feature, the author of an allocation API uses the attribute to specify the typed variant of the function to call, as well as the parameter over which to perform type inference.
For example, the declaration
void *malloc(size_t sz);
is updated to
void *typed_malloc(size_t sz, uint64_t type_descriptor);
void *malloc(size_t sz) __attribute__((typed_memory_operation(typed_malloc, 1)));
The result of these declarations and annotations is transparent redirection of calls to annotated allocation functions, functionally equivalent to rewriting
ptr = malloc(sizeof(SomeType));
to
ptr = typed_malloc(sizeof(SomeType), /* type descriptor for SomeType */);
For a developer using such annotated APIs this redirection is largely transparent. The major caveat is that maintaining source compatibility requires that the redirection occur only for direct calls: the type of the type-segregating function is necessarily distinct from that of the original, so indirect calls must use the unsegregated interface.
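To make the direct-call caveat concrete, the sketch below emulates the redirection by hand in portable C. The names demo_malloc, demo_typed_malloc, and POINT_DESCRIPTOR are hypothetical stand-ins, the descriptor value is an arbitrary placeholder rather than anything a compiler would produce, and the attribute appears only in a comment so the code builds with any C compiler.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

struct point { int x, y; };

/* Hypothetical placeholder; a real descriptor is computed by the compiler. */
#define POINT_DESCRIPTOR UINT64_C(0x1234)

static uint64_t last_descriptor; /* records what the typed entry point saw */
static int untyped_calls;        /* counts calls that reached the untyped API */

/* Typed variant: receives the descriptor right after the size argument. */
void *demo_typed_malloc(size_t sz, uint64_t type_descriptor) {
    last_descriptor = type_descriptor;
    return malloc(sz);
}

/* Original API. With the proposed extension its declaration would carry
 * __attribute__((typed_memory_operation(demo_typed_malloc, 1))). */
void *demo_malloc(size_t sz) {
    untyped_calls++;
    return malloc(sz);
}

/* What a direct call demo_malloc(sizeof(struct point)) is rewritten to. */
struct point *alloc_point_direct(void) {
    return demo_typed_malloc(sizeof(struct point), POINT_DESCRIPTOR);
}

/* An indirect call keeps the original signature, so it stays untyped. */
struct point *alloc_point_indirect(void) {
    void *(*alloc_fn)(size_t) = demo_malloc;
    return alloc_fn(sizeof(struct point));
}
```

Only the direct form carries type information to the allocator; the function pointer still has the one-argument type, so nothing can be inserted at that call.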
Attribute semantics
The typed_memory_operation attribute takes two parameters. The first is the type-segregating function to use as the new target for typed calls; the second is the position of the parameter over which to perform type inference (counting from 1, as in the malloc example above). Earlier implementations allowed the target function to be an entirely opaque symbol, which avoided the need to declare the typed interface. In practice it proved beneficial to expose the typed interface explicitly, as doing so allows semantic checks that prevent errors due to silent API mismatches.
Type inference and type descriptors
By design, this proposal does not require developers to explicitly specify the types being allocated. Instead, it introduces an inference step performed over the call expression to determine what type[s] are being allocated. This inference is based on local analysis of the call site, assuming idiomatic coding practices, to determine the set of C types being allocated and whether the allocation is fixed size.
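The call shapes that such local analysis can plausibly recognise are illustrated below. This is a sketch only: struct node, struct packet, and the three helper functions are hypothetical, and the comments relate each shape to the tmo_callsite_semantics flags listed in the ABI section rather than to documented compiler behaviour.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct node { struct node *next; int value; };
struct packet { size_t count; int payload[]; }; /* header + flexible array */

/* Fixed size: sizeof names exactly one type. */
struct node *new_node(void) {
    return malloc(sizeof(struct node));
}

/* Array: an element count multiplied by sizeof of the element type. */
struct node *new_node_array(size_t n) {
    return malloc(n * sizeof(struct node));
}

/* Header-prefixed array: a fixed header followed by n trailing elements. */
struct packet *new_packet(size_t n) {
    struct packet *p = malloc(sizeof(struct packet) + n * sizeof(int));
    if (p != NULL)
        p->count = n;
    return p;
}
```

In each case the type appears inside the size expression at the call site, which is what makes inference possible without any annotation by the caller.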
To reduce the performance burden on the allocator from explosive growth of “distinct” types, the type descriptors in this proposal are produced by first coalescing the relevant C types to unique structural types based on the type[s] of data in each byte of a data type rather than the type name, point of declaration, or similar source level properties.
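As an illustration of structural coalescing, the sketch below builds a per-byte signature for two differently named structs and shows that they collapse to the same structural type. The classification characters, the struct field table, and structural_signature are assumptions made for this example; the proposal does not specify how the compiler represents structural types internally.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical per-byte classification used only in this sketch. */
enum { KIND_PAD = '.', KIND_DATA = 'D', KIND_PTR = 'P' };

struct field { size_t offset, size; char kind; };

/* Two nominally distinct types with the same byte-level shape. */
struct list_node { struct list_node *next; long value; };
struct tree_leaf { void *parent; long weight; };

/* Fill out (type_size + 1 bytes) with one kind character per byte. */
void structural_signature(char *out, size_t type_size,
                          const struct field *fields, size_t nfields) {
    memset(out, KIND_PAD, type_size);
    for (size_t i = 0; i < nfields; i++)
        memset(out + fields[i].offset, fields[i].kind, fields[i].size);
    out[type_size] = '\0';
}

#define FIELD(T, member, kind) \
    { offsetof(T, member), sizeof(((T *)0)->member), (kind) }

const struct field list_node_fields[] = {
    FIELD(struct list_node, next, KIND_PTR),
    FIELD(struct list_node, value, KIND_DATA),
};
const struct field tree_leaf_fields[] = {
    FIELD(struct tree_leaf, parent, KIND_PTR),
    FIELD(struct tree_leaf, weight, KIND_DATA),
};
```

Because the signature depends only on what kind of data occupies each byte, the two structs above yield identical strings and would share one descriptor hash under this model.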
Having developed the structural type for an allocation, we need the descriptor that is actually used to be sufficiently compact that it does not impact code size or call performance. To that end this proposal does not use or provide any complex type metadata, but rather uses a single 64-bit type descriptor that contains flags to indicate core properties of the type (whether the object contains pointers, vtables, etc.) and a hash of the structural type. The flags allow allocators to adjust their segregation policies according to the data contained in those types, and the hash provides the core mechanism to segregate distinct types.
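How an allocator might consume the two halves of the descriptor can be sketched as follows. The bit positions (hash in the low 32 bits, layout flags in the top 16), the pointer mask, and the bucket count are assumptions chosen for illustration; choose_bucket is a hypothetical policy, not the policy of any shipping allocator.

```c
#include <assert.h>
#include <stdint.h>

/* Assumptions for this sketch only: the hash sits in the low 32 bits and the
 * 16 layout flags in the top 16 bits of the descriptor. */
#define DESC_HASH(d)   ((uint32_t)((d) & UINT64_C(0xFFFFFFFF)))
#define DESC_LAYOUT(d) ((uint16_t)((d) >> 48))

/* The four pointer-related layout flags occupy bits 0-3 of the layout enum. */
#define LAYOUT_ANY_POINTER UINT16_C(0x000F)

#define BUCKETS_PER_CLASS 4u

/* Pick a bucket: pointer-bearing and pointer-free types never share one,
 * and the hash spreads distinct types across buckets within each class. */
unsigned choose_bucket(uint64_t descriptor) {
    unsigned has_pointers =
        (DESC_LAYOUT(descriptor) & LAYOUT_ANY_POINTER) != 0;
    unsigned slot = DESC_HASH(descriptor) % BUCKETS_PER_CLASS;
    return has_pointers * BUCKETS_PER_CLASS + slot;
}
```

The flags decide which class of bucket is eligible, and the hash decides which bucket within that class, so two types with equal hashes but different pointer content still land apart.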
As this proposal supports existing code and performs heuristic-based type inference, it is possible for the inference to fail. In that case the redirection is still performed, but the type descriptor is set to indicate that inference failed and the descriptor hash is derived from the call location, so that the allocator can segregate allocations from different call sites even when inference fails.
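The fallback can be pictured as hashing the call location instead of the type. The sketch below does this at run time with 32-bit FNV-1a over a file name and line number. Both the hash function and the (file, line) encoding are assumptions for illustration: the proposal computes the value statically and does not specify the algorithm, nor how the failed-inference indication is encoded.

```c
#include <assert.h>
#include <stdint.h>

/* 32-bit FNV-1a over a string; chosen here for brevity, not because the
 * proposal specifies it. */
uint32_t fnv1a32(const char *s, uint32_t h) {
    while (*s != '\0') {
        h ^= (unsigned char)*s++;
        h *= UINT32_C(16777619);
    }
    return h;
}

/* Hash a (file, line) pair identifying one call site. */
uint32_t callsite_hash(const char *file, unsigned line) {
    uint32_t h = fnv1a32(file, UINT32_C(2166136261));
    for (int i = 0; i < 4; i++) { /* mix in the line number bytewise */
        h ^= (line >> (8 * i)) & 0xFFu;
        h *= UINT32_C(16777619);
    }
    return h;
}

/* Expands at the point of use, so each call site gets its own value. */
#define CALLSITE_HASH() callsite_hash(__FILE__, (unsigned)__LINE__)
```

Repeated executions of one call site always produce the same value, while neighbouring call sites produce different ones, which is the property the allocator relies on when no type is known.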
Rewrite target ABI and semantics
The rewrite target for an annotated function logically acquires an additional type descriptor parameter, which must be declared immediately after the parameter targeted for inference. In other words, an annotated API such as
void *allocator_function(T1 Arg1, T2 Arg2, ..., TN ArgN, TN1 ArgN1, ...) __attribute__((typed_memory_operation(typed_allocator_function, N)));
requires that the typed_allocator_function target function be declared as
void *typed_allocator_function(T1 Arg1, T2 Arg2, ..., TN ArgN, uint64_t Descriptor, TN1 ArgN1, ...);
Inferring the type descriptor value passed as Descriptor does not involve any runtime evaluation: it is determined entirely statically, and the call rewrite does not affect the evaluation order or side effects of any argument expression. Because the number and position of arguments to the target function differ from the original, the register and/or stack locations of parameters necessarily change. This should not affect any existing code, as by definition the new target function is aware of this ABI from the time of initial adoption.
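The positional rule is easiest to see when the inferred parameter is not the last one. In the sketch below, inference targets parameter 1, so the descriptor is inserted between the two original arguments. All function names are hypothetical, the alignment argument is ignored for simplicity, and the attribute is shown only in a comment so the example builds without compiler support.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

static uint64_t seen_descriptor; /* lets a caller observe what was passed */

/* Original two-argument API; with the extension it would carry
 * __attribute__((typed_memory_operation(demo_typed_alloc_aligned, 1)))
 * because inference runs over parameter 1 (size). */
void *demo_alloc_aligned(size_t size, size_t alignment);

/* Typed target: the descriptor is inserted immediately after parameter 1,
 * so alignment moves from position 2 to position 3. */
void *demo_typed_alloc_aligned(size_t size, uint64_t descriptor,
                               size_t alignment) {
    seen_descriptor = descriptor;
    (void)alignment; /* alignment is ignored in this sketch */
    return malloc(size);
}

/* Untyped entry point: forwards with a hypothetical "unknown" value of 0. */
void *demo_alloc_aligned(size_t size, size_t alignment) {
    return demo_typed_alloc_aligned(size, UINT64_C(0), alignment);
}
```

Every argument after the inferred one shifts by one position, which is why the typed target has to be written against the new parameter order from the start.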
Portability with toolchains that do not support the extension
If a toolchain does not recognize this extension, the attribute will either be ignored, or API providers will need to place appropriate macro guards around the declarations to prevent breakage under -Werror -Wunknown-attributes and similar configurations. As adoption of the type-segregating APIs is an automatic translation of the original call, end users of these APIs do not need to maintain different code paths for platforms that support type segregation.
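A macro guard of the kind described might look like the sketch below. DEMO_TYPED, demo_alloc, and demo_typed_alloc are hypothetical names; __has_attribute is the standard feature-test macro offered by Clang and recent GCC, and the nested #if keeps the header valid on preprocessors that lack it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Expand to the attribute only where the compiler reports support for it;
 * elsewhere the macro vanishes and the declaration stays plain C. */
#if defined(__has_attribute)
#  if __has_attribute(typed_memory_operation)
#    define DEMO_TYPED(fn, n) __attribute__((typed_memory_operation(fn, n)))
#  endif
#endif
#ifndef DEMO_TYPED
#  define DEMO_TYPED(fn, n)
#endif

void *demo_typed_alloc(size_t sz, uint64_t type_descriptor);
void *demo_alloc(size_t sz) DEMO_TYPED(demo_typed_alloc, 1);

void *demo_typed_alloc(size_t sz, uint64_t type_descriptor) {
    (void)type_descriptor; /* a real allocator would segregate on this */
    return malloc(sz);
}

void *demo_alloc(size_t sz) {
    return malloc(sz);
}

/* Caller code is identical on every toolchain. */
int *make_counter(void) {
    int *counter = demo_alloc(sizeof(int));
    if (counter != NULL)
        *counter = 0;
    return counter;
}
```

The guard lives entirely in the API provider's header; make_counter compiles unchanged whether or not the redirection takes place.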
Implementation
We have implemented this proposed extension and have deployed it on the codebases of multiple large consumer operating systems with no meaningful source compatibility impact or code size regressions. Runtime overhead depends on the design decisions and trade-offs made by the allocator; in our deployment we were able to adopt trade-offs that resulted in no overall runtime or memory regression while providing the segregation properties we felt were necessary.
Our implementation of this extension performs the call retargeting during the CodeGen pass in Clang. As a result, any compiler passes, warnings, or other analyses over the AST or during Sema produce feedback that matches the call site as written, rather than the implicitly rewritten call that the user is not necessarily aware of.
ABI
In addition to the typed target ABI, it is also necessary to specify the ABI for the type descriptor that is exposed to the platform or allocator library vending the typed entry points. In our current implementation this information is passed via the following structure.
enum tmo_layout_semantics : uint16_t {
  tmo_layout_none = 0,
  tmo_layout_data_pointer = 1 << 0,
  tmo_layout_struct_pointer = 1 << 1,
  tmo_layout_immutable_pointer = 1 << 2,
  tmo_layout_anonymous_pointer = 1 << 3,
  tmo_layout_reference_count = 1 << 4,
  tmo_layout_resource_handle = 1 << 5,
  tmo_layout_spatial_bounds = 1 << 6,
  tmo_layout_tainted_data = 1 << 7,
  tmo_layout_generic_data = 1 << 8,
};
enum tmo_type_semantics : uint8_t {
  tmo_type_semantics_none = 0,
  tmo_type_semantics_is_polymorphic = 1 << 0,
  tmo_type_semantics_has_mixed_unions = 1 << 1,
};
enum tmo_type_kind : uint8_t {
  tmo_type_kind_c = 0,
  tmo_type_kind_objc = 1,
  tmo_type_kind_swift = 2,
  tmo_type_kind_cxx = 3
};
enum tmo_callsite_semantics : uint8_t {
  tmo_callsite_semantics_none = 0,
  tmo_callsite_semantics_fixed_size = 1 << 0,
  tmo_callsite_semantics_array = 1 << 1,
  tmo_callsite_semantics_header_prefixed_array = 1 << 2,
};
struct tmo_type_descriptor {
  tmo_layout_semantics layout_semantics : 16;
  tmo_type_semantics type_semantics : 4;
  tmo_type_kind kind : 2;
  tmo_callsite_semantics callsite_semantics : 4;
  unsigned unused : 4;
  unsigned version : 2;
  uint32_t hash : 32;
};
This structure is then flattened to a 64-bit integer as [layout:16][type:4][kind:2][callsite:4][unused:4][version:2][hash:32].
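The flattening can be sketched with explicit shifts. The code below assumes the field list reads from most-significant to least-significant bit, so the hash occupies bits 0-31 and the layout flags bits 48-63. That ordering is an assumption, as the text does not state it, and struct demo_descriptor with its pack and unpack helpers are hypothetical names rather than part of the proposed ABI.

```c
#include <assert.h>
#include <stdint.h>

/* Unpacked view of the descriptor fields, used only in this sketch. */
struct demo_descriptor {
    uint16_t layout;   /* 16 bits */
    uint8_t  type;     /*  4 bits */
    uint8_t  kind;     /*  2 bits */
    uint8_t  callsite; /*  4 bits */
    uint8_t  version;  /*  2 bits */
    uint32_t hash;     /* 32 bits */
};

uint64_t demo_descriptor_pack(struct demo_descriptor d) {
    return ((uint64_t)d.layout            << 48) |
           ((uint64_t)(d.type     & 0xFu) << 44) |
           ((uint64_t)(d.kind     & 0x3u) << 42) |
           ((uint64_t)(d.callsite & 0xFu) << 38) |
           /* bits 34-37 are the unused field and stay zero */
           ((uint64_t)(d.version  & 0x3u) << 32) |
           (uint64_t)d.hash;
}

struct demo_descriptor demo_descriptor_unpack(uint64_t v) {
    struct demo_descriptor d;
    d.layout   = (uint16_t)(v >> 48);
    d.type     = (uint8_t)((v >> 44) & 0xFu);
    d.kind     = (uint8_t)((v >> 42) & 0x3u);
    d.callsite = (uint8_t)((v >> 38) & 0xFu);
    d.version  = (uint8_t)((v >> 32) & 0x3u);
    d.hash     = (uint32_t)(v & UINT64_C(0xFFFFFFFF));
    return d;
}
```

Using shifts rather than reading the bit-field struct directly avoids depending on the compiler's bit-field allocation order, which C leaves implementation-defined.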
Builtin support
To support interfaces where explicit type information is available, the __builtin_tmo_get_type_descriptor(type or expression) builtin is provided. It produces a type descriptor for the given type, or for the type of the given expression, without performing any heuristic-driven inference. This supports explicit adoption in environments where exact types are known, for example macro-based and C++ template-based allocators.
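A macro-based allocator using the builtin might be written along the lines below. DEMO_DESCRIPTOR_OF, DEMO_NEW, and demo_typed_malloc are hypothetical; the cast to uint64_t assumes the builtin's result converts to a 64-bit integer, and the fallback value of 0 on compilers without the builtin is a placeholder chosen for this sketch, not a defined "unknown" descriptor.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Use the builtin where the compiler provides it; otherwise fall back to a
 * hypothetical "no type information" value of 0 so the code still builds. */
#if defined(__has_builtin)
#  if __has_builtin(__builtin_tmo_get_type_descriptor)
#    define DEMO_DESCRIPTOR_OF(T) \
        ((uint64_t)__builtin_tmo_get_type_descriptor(T))
#  endif
#endif
#ifndef DEMO_DESCRIPTOR_OF
#  define DEMO_DESCRIPTOR_OF(T) UINT64_C(0)
#endif

void *demo_typed_malloc(size_t sz, uint64_t type_descriptor) {
    (void)type_descriptor; /* a real allocator would segregate on this */
    return malloc(sz);
}

/* Macro allocator: the type is spelled out, so no inference is needed. */
#define DEMO_NEW(T) \
    ((T *)demo_typed_malloc(sizeof(T), DEMO_DESCRIPTOR_OF(T)))

struct session { int id; char *user; };

struct session *session_create(int id) {
    struct session *s = DEMO_NEW(struct session);
    if (s != NULL) {
        s->id = id;
        s->user = NULL;
    }
    return s;
}
```

Because the macro receives the type name itself, the descriptor is exact even though the allocation call is hidden behind the macro.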
Limitations
As neither C nor C++ provide full object introspection, and C does not support C++ style compile time code execution, it is not possible to specify this extension in a manner that allows developers to customise construction of the type descriptor. We have endeavoured to define the descriptor in a manner that makes it generically usable, however doing so necessarily loses some granularity.
Allocation wrappers are another common idiom, and they frequently separate the expression that contains type information from the allocator call site. As a result, such wrappers make allocation types opaque and coalesce the allocator call sites, so the site-based hash fallback also fails to provide information that can support allocation segregation. The solution to this specific issue is for wrapper authors to use this attribute to provide typed wrappers that explicitly forward the inferred type descriptor to the underlying typed allocation APIs.
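A typed wrapper of this kind can be sketched as follows, using an abort-on-failure wrapper as the example. The names demo_xmalloc, demo_typed_xmalloc, and demo_typed_malloc are hypothetical, the attribute is shown only in a comment so the code builds without compiler support, and the 0 passed by the untyped path is a placeholder value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

static uint64_t forwarded_descriptor; /* what the allocator finally received */

/* Stand-in for the platform's typed allocation entry point. */
void *demo_typed_malloc(size_t sz, uint64_t type_descriptor) {
    forwarded_descriptor = type_descriptor;
    return malloc(sz);
}

/* Typed wrapper: keeps the abort-on-failure policy and forwards the
 * descriptor it was given instead of discarding it. */
void *demo_typed_xmalloc(size_t sz, uint64_t type_descriptor) {
    void *p = demo_typed_malloc(sz, type_descriptor);
    if (p == NULL)
        abort();
    return p;
}

/* Untyped wrapper. With the extension its declaration would carry
 * __attribute__((typed_memory_operation(demo_typed_xmalloc, 1))), so direct
 * callers are redirected to the typed wrapper above. */
void *demo_xmalloc(size_t sz) {
    return demo_typed_xmalloc(sz, UINT64_C(0)); /* hypothetical "unknown" */
}
```

Inference then happens at the wrapper's callers, where the sizeof expression is still visible, and the descriptor travels through the wrapper unchanged.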
Future directions
The current inference model is derived from local analysis of idiomatic use of the sizeof operator and similar constructions, and as a result misses cases where local inference could still be performed, such as casts and out parameters.
The semantic information currently provided in the type descriptor is limited by extensive use of opaque types (untyped pointers and intptr_ts) in objects, so adding a mechanism that allows such data and types to be annotated with semantic information would potentially be beneficial, though doing so in a way that is ergonomic and compatible with the required constraints may prove challenging.
Citations
[1] RFC: Enforcing Bounds Safety in C (-fbounds-safety)
[2] Towards the next generation of XNU memory safety: kalloc_type
[3] Efficient And Safe Allocations Everywhere!
[4] P2719R0: Type-aware allocation and deallocation functions
Clang consensus called in this message.