[RFC 0/2] Propose a new pointer trait.

Hi,
I'm part of an engineering team doing research on persistent memory support and
we have stumbled upon an interesting problem. The issue is, we would like to be
able to use the standard library containers in a persistent memory context
(think NVDIMM-N). What I mean is that you allocate a container from said
memory and use it like you normally would. After the application terminates,
expectedly or otherwise, you can map the container back into the address
space of a newly created process, where it will be in the last consistent
state it had before termination (interruption/crash/exit). The obvious way to
achieve this is to substitute the allocator. The programming model is based on
memory mapped files and because of that the pointer type used by the allocator
is a fancy pointer, so that it can do all the offset calculations.
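
To make the offset calculations concrete, here is a minimal sketch of a
pool-relative pointer. The names pool_base and offset_ptr are hypothetical
and are not the API of our library; a real fancy pointer would also need the
full pointer_traits and iterator interface.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical: the address at which the pool file is mapped in this process.
inline char*& pool_base() {
    static char* base = nullptr;
    return base;
}

// Stores an offset from pool_base() instead of a raw address, so the stored
// value stays meaningful when the file is mapped at a different address.
template <class T>
class offset_ptr {
    std::ptrdiff_t off_;  // a negative offset encodes a null pointer
public:
    offset_ptr() : off_(-1) {}
    explicit offset_ptr(T* p)
        : off_(p ? reinterpret_cast<char*>(p) - pool_base() : -1) {}

    T* get() const {
        return off_ < 0 ? nullptr : reinterpret_cast<T*>(pool_base() + off_);
    }
    T& operator*() const { return *get(); }
    T* operator->() const { return get(); }
    explicit operator bool() const { return off_ >= 0; }
};

// Simulates a restart: the pool contents are copied to a second buffer that
// plays the role of the new mapping, and the same offset_ptr still resolves.
inline int remap_demo() {
    static int first[16];
    static int second[16];
    pool_base() = reinterpret_cast<char*>(first);
    first[4] = 42;
    offset_ptr<int> p(&first[4]);

    std::memcpy(second, first, sizeof first);
    first[4] = 0;                                   // the old mapping is gone
    pool_base() = reinterpret_cast<char*>(second);  // the new mapping address
    assert(p.get() == &second[4]);
    return *p;
}
```

The point of the design is that only the offset is stored in the pool, so
nothing inside the file depends on where the file happens to be mapped.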

User data consistency is ensured by a transaction mechanism provided by the
library. We snapshot data before modifying it so that we can roll the
transaction back in case of an error. The usage model we are proposing is this
(pseudocode, not the actual API):

int main() {
    auto handle = pool::open("path/to/file", ...);
    using pvector = std::vector<foo, persistent_allocator<foo>>;
    {
        auto tx = transaction::start(handle); // transactions are per pool file
        auto vec_ptr = allocate_from_persistent_memory<pvector>();
        vec_ptr->emplace_back(foo, parameters);
    }
}

The problem occurs when standard library containers keep metadata of their
own. For example, red-black tree nodes carry their color, which will not be
included in the transaction's undo log. To address this we propose to define a
new type in std::pointer_traits, which could be used to wrap the containers'
metadata in a user-defined type. This should be fully backward compatible and
as such not mandatory.
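
The failure mode can be reproduced with a toy undo log. This is only a
sketch: undo_log, snapshot, rollback and rb_node_sketch are hypothetical
names, and the real transaction machinery also has to make the log itself
persistent.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Toy undo log: remembers the old bytes of every region that was snapshotted.
class undo_log {
    struct entry {
        void* addr;
        std::vector<unsigned char> old_bytes;
    };
    std::vector<entry> entries_;
public:
    // Record the current bytes of [addr, addr + size) before they change.
    void snapshot(void* addr, std::size_t size) {
        unsigned char* p = static_cast<unsigned char*>(addr);
        entries_.push_back({addr, std::vector<unsigned char>(p, p + size)});
    }
    // Restore every snapshot, newest first, then forget them.
    void rollback() {
        for (auto it = entries_.rbegin(); it != entries_.rend(); ++it)
            std::memcpy(it->addr, it->old_bytes.data(), it->old_bytes.size());
        entries_.clear();
    }
};

// Stand-in for a tree node: user data plus container-owned metadata.
struct rb_node_sketch {
    int value;
    bool is_black;
};

inline rb_node_sketch partial_snapshot_demo() {
    rb_node_sketch n{1, false};
    undo_log log;
    log.snapshot(&n.value, sizeof n.value);  // the user data is logged
    n.value = 2;
    n.is_black = true;  // the container recolors the node, nobody logs it
    log.rollback();
    return n;  // value is restored, the color is not
}
```

After the rollback the node holds the old value with the new color, which
is exactly the inconsistent state the wrapper type is meant to prevent.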

This is a standard-level change and we would like to gather feedback on this
idea before we start the whole process. This feature is not persistent memory
specific, even though we use it as part of providing transactional data
consistency.

In fact, a working proof of concept is available at
(https://github.com/pmem/libcxx), combined with the allocator from
(https://github.com/pmem/nvml).

I will be happy to answer any questions you might have and await your
feedback.

Thanks,
Tom

Tomasz Kapela (2):
  pm: change deque typedef
  pm: add persistency_type typedef to pointer_traits

include/__hash_table | 20 ++++++++------
include/__tree | 8 +++---
include/deque | 8 +++---
include/list | 7 +++--
include/map | 7 +++--
include/memory | 31 ++++++++++++++++++++++
include/set | 9 +++++--
.../pointer.traits.types/persistency_type.pass.cpp | 25 +++++++++++++++++
8 files changed, 94 insertions(+), 21 deletions(-)
create mode 100644 test/std/utilities/memory/pointer.traits/pointer.traits.types/persistency_type.pass.cpp

Necessary to make the deque understand fancy pointers.

Signed-off-by: Tomasz Kapela <tomasz.kapela@intel.com>

The typedef by default binds to value_type, and to
pointer_traits::persistency_type if available.

Signed-off-by: Tomasz Kapela <tomasz.kapela@intel.com>
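
The default binding described in this commit message can be sketched with
the usual detection idiom. The shape below (a nested template alias on the
pointer type, and the helper name persistency_type_for) is an assumption
made for illustration and may differ from what the patch does.

```cpp
#include <cassert>
#include <type_traits>

// Local void_t so the sketch also builds as C++11.
template <class...> struct sketch_make_void { using type = void; };
template <class... Ts>
using sketch_void_t = typename sketch_make_void<Ts...>::type;

// Fallback: the pointer type offers no persistency_type, so the metadata
// keeps its plain type T.
template <class Ptr, class T, class = void>
struct persistency_type_for {
    using type = T;
};

// Selected when Ptr provides a nested template alias persistency_type<U>.
template <class Ptr, class T>
struct persistency_type_for<
    Ptr, T, sketch_void_t<typename Ptr::template persistency_type<T>>> {
    using type = typename Ptr::template persistency_type<T>;
};

// Hypothetical wrapper and fancy pointer, used only to exercise the trait.
template <class T>
struct logged {
    T value;
};

template <class T>
struct fancy_ptr_sketch {
    template <class U>
    using persistency_type = logged<U>;
};

// Raw pointers fall back to T, the fancy pointer opts in to the wrapper.
static_assert(
    std::is_same<persistency_type_for<int*, bool>::type, bool>::value,
    "raw pointers keep the plain metadata type");
static_assert(
    std::is_same<persistency_type_for<fancy_ptr_sketch<int>, bool>::type,
                 logged<bool>>::value,
    "fancy pointers can substitute a wrapper");
```

With this arrangement existing allocators and pointer types are unaffected,
since they never define the nested alias.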

A couple of comments/questions:

  1. I don’t see why persistency_type is tied to pointer_traits at all. Wouldn’t it work the same as a separate trait?

  2. Given an arbitrary T, what are the requirements on the “persistent type for T” (e.g. pointer_traits<T*>::persistency_type)?
    Is it intended to be usable exactly like T itself? If so, how do you plan to achieve this, and do you have an example?
    If not, how are implementations supposed to access the actual underlying object?

  3. Changing the types of internal node data members is somewhat problematic, at least in terms of implementation.

__tree and __hash_table don’t actually invoke the constructors for the node types, and instead only construct the
“value” member directly using allocator_traits::construct, as required by [container.requirements.general].
If the “persistent” types have non-trivial constructors or destructors, these won’t be called, leading to UB. This isn’t
an impossible problem to solve, but it would take a lot of work.
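
A rough sketch of that construction pattern, to show where a wrapper's
constructor gets skipped. The names (counting_flag, list_node_sketch,
make_node, destroy_node) are made up for illustration and this is not the
libc++ code. Touching members of a node that was never constructed is
formally undefined when the node type is not trivial, which is the concern.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Stand-in for a "persistent" metadata wrapper with a non-trivial
// constructor. It counts how many times that constructor actually runs.
struct counting_flag {
    static int& constructed() { static int n = 0; return n; }
    bool value;
    counting_flag() : value(false) { ++constructed(); }
};

template <class T>
struct list_node_sketch {
    list_node_sketch* next;
    counting_flag meta;  // container-owned metadata
    T value;             // the user's element
};

// Allocates raw storage for a node and constructs only the value member.
template <class T>
list_node_sketch<T>* make_node(const T& v) {
    using node = list_node_sketch<T>;
    using traits = std::allocator_traits<std::allocator<node>>;
    std::allocator<node> na;
    node* n = traits::allocate(na, 1);  // no node constructor runs here
    traits::construct(na, std::addressof(n->value), v);  // only the value
    return n;
}

template <class T>
void destroy_node(list_node_sketch<T>* n) {
    using node = list_node_sketch<T>;
    using traits = std::allocator_traits<std::allocator<node>>;
    std::allocator<node> na;
    traits::destroy(na, std::addressof(n->value));  // only the value again
    traits::deallocate(na, n, 1);
}

// Returns how many counting_flag constructors ran while building one node.
inline int constructors_run_for_metadata() {
    counting_flag::constructed() = 0;
    list_node_sketch<std::string>* n = make_node<std::string>("payload");
    int count = counting_flag::constructed();
    destroy_node(n);
    return count;
}
```

The count stays at zero: the metadata member exists in memory but was never
initialized by its own constructor.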

/Eric

PS. Email regarding libc++ should be sent to cfe-dev as well.

Hi Eric,
thanks for the swift reply. As for your questions:

ad.1) In fact it doesn't have to be tied to pointer_traits. Initially I
had it in allocator_traits, because that made more sense. However,
__tree_node_base in the implementation of __tree doesn't know anything
about the allocator. The only thing it has is the _VoidPtr template
parameter, which is a rebound allocator_traits::pointer. I wanted to
have a working implementation without rewriting the whole library. I
believe the most suitable solution would be to add something like
memory_traits from which you could retrieve the persistency_type (which
probably isn't the best name), the pointer and maybe even the
allocator. But that's a massively invasive change.

ad.2) I believe that externally it should behave like T. In my
implementation it falls back to T for a T* pointer. Additionally it should
be default constructible, have an implicit converting constructor, and
be convertible to T and assignable from anything convertible to T.
In fact, in our GitHub organization we have a working implementation of
libcxx (https://github.com/pmem/libcxx), plus a complementary allocator
with a custom fancy pointer and a template class p<> to be used as the
persistency_type
(https://github.com/pmem/nvml/tree/master/src/include/libpmemobj%2B%2B).
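
To illustrate the requirements listed under ad.2, here is a minimal sketch
of such a wrapper. It is not the p<> class from our repository: p_sketch,
before_write and writes_logged are hypothetical, and the counter only stands
in for adding the member to the transaction's undo log.

```cpp
#include <cassert>

template <class T>
class p_sketch {
    T val_;
    // Hypothetical hook: a real wrapper would snapshot val_ into the active
    // transaction here. The sketch only counts the calls.
    void before_write() { ++writes_logged(); }
public:
    static int& writes_logged() { static int n = 0; return n; }

    p_sketch() = default;                       // default constructible
    p_sketch(const T& v) : val_(v) {}           // implicit converting ctor
    operator const T&() const { return val_; }  // reads behave like a T

    // Assignable from anything convertible to T (via conversion to T).
    p_sketch& operator=(const T& v) {
        before_write();
        val_ = v;
        return *this;
    }
    p_sketch& operator=(const p_sketch& other) {
        before_write();
        val_ = other.val_;
        return *this;
    }
};

// Stand-in for a tree node whose color is wrapped.
struct tree_node_sketch {
    p_sketch<bool> is_black;
    int key;
};

// Recolors a node once and reports how many writes went through the hook.
inline int recolor_demo() {
    p_sketch<bool>::writes_logged() = 0;
    tree_node_sketch n{};  // value-initialized, so is_black starts as false
    assert(!n.is_black);   // reading does not go through the hook
    n.is_black = true;     // writing does
    assert(n.is_black);
    return p_sketch<bool>::writes_logged();
}
```

Container code that only reads and assigns the member compiles unchanged,
which is what lets the wrapper be swapped in through a typedef.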

ad.3) Unfortunately this whole approach does have caveats, and this is one
of them. All of the member data that I have seen in the standard
library's containers are fundamental types. Therefore I think any
persistency_type wrapper should be default constructible and trivially
destructible. There is one other issue: since the member access operator
(operator.) cannot be overloaded, the wrapped types cannot be class types.
There are probably other issues I'm not seeing right now. That is why I'm
consulting you as standard library experts.
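
Both constraints from ad.3 can be checked at compile time. The sketch below
uses a hypothetical wrapper named wrapped<> and a detection trait has_first
that I made up for the example; neither is part of the patch.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

template <class T>
class wrapped {
    T val_;
public:
    wrapped() = default;
    wrapped(const T& v) : val_(v) {}
    operator const T&() const { return val_; }
};

// The properties suggested for any persistency_type wrapper.
static_assert(std::is_default_constructible<wrapped<bool>>::value,
              "wrapper should be default constructible");
static_assert(std::is_trivially_destructible<wrapped<bool>>::value,
              "wrapper should be trivially destructible");

// Detects whether the expression x.first compiles for an lvalue of type X.
template <class X, class = void>
struct has_first : std::false_type {};
template <class X>
struct has_first<X, decltype(void(std::declval<X&>().first))>
    : std::true_type {};

// Fundamental types keep working through the implicit conversion.
inline int arithmetic_demo() {
    wrapped<int> w = 2;
    return w + 1;
}
```

A std::pair exposes .first, but wrapped<std::pair<int, int>> does not,
because member access never goes through the conversion operator. The
arithmetic on wrapped<int> is unaffected.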

I hope this answers your questions and comments. Looking forward to more
feedback.

Tom