[Reproducers] SBReproducer RFC

Hi Everyone,

In September I sent out an RFC [1] about adding reproducers to LLDB. Over the
past few months, I landed the reproducer framework, support for the GDB remote
protocol and a bunch of preparatory changes. There’s still an open code review
[2] for dealing with files, but that one is currently blocked by a change to
the VFS in LLVM [3].

The next big piece of work is supporting user commands (e.g. in the driver) and
SB API calls. Originally I expected these two things to be separate, but Pavel
made a good case [4] that they’re actually very similar.

I created a prototype of how I envision this to work. As usual, we can
differentiate between capture and replay.

SB API Capture

When capturing a reproducer, every SB function/method is instrumented using a
macro at function entry. The added code tracks the function identifier
(currently we use its name, via __PRETTY_FUNCTION__) and its arguments.

It also tracks when a function crosses the boundary between internal and
external use. For example, when someone (be it the driver, the Python bindings,
or the RPC server) calls SBFoo, and in its implementation SBFoo calls SBBar, we
don’t need to record SBBar. When invoking SBFoo during replay, it will itself
call SBBar.
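
For illustration, the boundary tracking can be as simple as a thread-local
depth counter (a sketch with invented names, not the actual code from the
patch [5]):

#include <cstddef>

static thread_local size_t g_sb_call_depth = 0;

struct SBRecorder {
  // Only a call entering the API from the outside sees a depth of zero.
  SBRecorder() : should_record(g_sb_call_depth++ == 0) {}
  ~SBRecorder() { --g_sb_call_depth; }
  bool should_record;
};

The macro at function entry would put an SBRecorder on the stack and only
serialize the call when should_record is true; when SBFoo calls SBBar
internally, SBBar's recorder sees a non-zero depth and stays quiet.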

When a boundary is crossed, the function name and arguments are serialized to a
file. This is trivial for basic types. For objects, we maintain a table that
maps pointer values to indices and serialize the index.
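
To illustrate, the object table can be a simple pointer-to-index map (again
a sketch with made-up names):

#include <map>

class ObjectTable {
  std::map<const void *, unsigned> m_ids;

public:
  // Return a stable index for the object, assigning a fresh one the first
  // time this pointer crosses the boundary.
  unsigned GetIndexForObject(const void *object) {
    auto it = m_ids.find(object);
    if (it == m_ids.end())
      it = m_ids.emplace(object, m_ids.size() + 1).first;
    return it->second;
  }
};

Only the index ends up in the serialized file; the pointer value itself is
meaningless across runs.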

To keep our table consistent, we also need to track return values for functions that
return an object by value. We have a separate macro that wraps the returned
object.

The index is sufficient because every object that is passed to a function has
crossed the boundary and hence was recorded. During replay (see below) we map
the index to an address again which ensures consistency.

SB API Replay

To replay the SB function calls we need a way to invoke the corresponding
function from its serialized identifier. For every SB function, there’s a
counterpart that deserializes its arguments and invokes the function. These
functions are added to the map and are called by the replay logic.

Replaying is just a matter of looping over the function identifiers in the
serialized file, dispatching the right deserialization function, until no more
data is available.

The deserialization function for constructors or functions that return by value
contains additional logic for dealing with the aforementioned indices. The
resulting objects are added to a table (similar to the one described earlier)
that maps indices to pointers. Whenever an object is passed as an argument, the
index is used to get the actual object from the table.
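
The replay side could be skeletonized like this (hypothetical names again;
the Deserializer methods are declared but left unimplemented here):

#include <map>

struct Deserializer {
  unsigned ReadID(); // next function identifier, 0 when the data runs out
  // ... plus methods to read each supported argument type ...
};

using Replayer = void (*)(Deserializer &);

static std::map<unsigned, Replayer> g_replayers; // identifier -> replayer
static std::map<unsigned, void *> g_objects;     // index -> live object

static void Replay(Deserializer &data) {
  // Dispatch deserialization functions until no more data is available.
  // Each replayer reads its arguments, looks up object arguments in
  // g_objects, invokes the SB function, and registers any object it
  // creates back into g_objects.
  while (unsigned id = data.ReadID())
    g_replayers[id](data);
}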

Tool

Even when using macros, adding the necessary capturing and replay code is
tedious and scales poorly. For the prototype, we did this by hand, but we
propose a new clang-based tool to streamline the process.

For the capture code, the tool would validate that the macro matches the
function signature, suggesting a fixit if the macros are incorrect or missing.
Compared to generating the macros outright, this has the advantage that we
don't end up with "configured" files that are harder to debug (short of
faking line numbers, etc.).

The deserialization code would be fully generated. As shown in the prototype
there are a few different cases, depending on whether we have to account for
objects or not.

Prototype Code

I created a differential [5] on Phabricator with the prototype. It contains the
necessary methods to re-run the gdb remote (reproducer) lit test.

Feedback

Before moving forward I’d like to get the community’s input. What do you think
about this approach? Do you have concerns or can we be smarter somewhere? Any
feedback would be greatly appreciated!

Thanks,
Jonas

[1] http://lists.llvm.org/pipermail/lldb-dev/2018-September/014184.html
[2] https://reviews.llvm.org/D54617
[3] https://reviews.llvm.org/D54277
[4] https://reviews.llvm.org/D55582
[5] https://reviews.llvm.org/D56322

[Adding Tamas for his experience with recording and replaying APIs.]

Thank you for sharing the prototype Jonas. It looks very interesting, but there are a couple of things that worry me about it.

The first one is the usage of __PRETTY_FUNCTION__. That sounds like a non-starter even for an initial implementation, as the string it expands to is going to differ between compilers (gcc and clang will probably agree on it, but I know for a fact it will be different on msvc). If that was just an internal property of the serialization format, then it might be fine, but it looks like you are hardcoding the values in code to connect the methods with their replayers, which is going to be a problem.

I've been thinking about how this could be done better, and the best (though not ideal) way I came up with is using the function's address as the key. That's guaranteed to be unique everywhere. Of course, you cannot serialize that to a file, but since you already have a central place where you list all intercepted functions (to register their replayers), that place can also be used to assign unique integer IDs to these functions. So then the idea would be that the SB_RECORD macro takes the address of the current function, that gets converted to an ID in the lookup table, and the ID gets serialized.

The part that bugs me about this is that taking the address of an overloaded function is extremely tedious (you have to write something like static_cast<function_prototype>(&SBFoo::Bar)). That would mean all of these things would have to be passed to the RECORD macro. OTOH, the upshot of this would be that the macro would now have sufficient information to perform pretty decent error checking on its invocation. Another nice thing about this could be that once you already have a prototype and an address of the function, it should be possible (with sufficient template-fu) to synthesize replay code for the function automatically, at least in the simple cases, which would avoid the repetitiveness of the current replay code. Together, this might obviate the need for any clang plugins or other funny build steps.

The second thing I noticed is the usage of pointers for identifying objects. A pointer is good for that, but only while the object it points to is alive. Once the object is gone, the pointer can (and most likely will) be reused. So, it sounds to me like you also need to track the lifetime of these objects. That may be as simple as intercepting constructor/destructor calls, but I haven't seen anything like that yet (though I haven't looked at all details of the patch).

Tying into that is the recording of return values. It looks like the current RECORD_RETURN macro will record the address of the temporary object in the frame of the current function. However, that address will become invalid as soon as the function returns as the result object will be copied into a location specified by the caller as a part of the return processing. Are you handling this in any way?

The final thing I noticed is the lack of any sign of threading support. I'm not too worried about that, as that sounds like something that could be fitted into the existing framework incrementally, but it is something worth keeping in mind, as you're going to run into that pretty soon.

Thanks Pavel for looping me in. I haven't looked into the actual implementation of the prototype yet, but reading your description I have some concerns regarding the amount of data you capture, as I feel it isn't sufficient to reproduce a number of use cases.

One problem is when the behavior of LLDB is not deterministic for whatever reason (e.g. multithreading, unordered maps, etc.). Let's take SBModule::FindSymbols(), which returns an SBSymbolContextList without any specific order (I haven't checked the implementation, but I would consider a random order to be valid). Suppose a user calls this function, iterates through the elements to find an index I, calls GetContextAtIndex(I), and passes the result into a subsequent function. What do we do then? Do we capture what GetContextAtIndex(I) returned in the trace and use that value, or do we capture the value of I, call GetContextAtIndex(I) during reproduction, and use that value? The first would be correct in this case, but would mean we never actually call GetContextAtIndex(I); the second would mean we call GetContextAtIndex(I) with a wrong index if the order in SBSymbolContextList is non-deterministic. In this case, since we know GetContextAtIndex is just an accessor into a vector, the first option is the correct one, but I can imagine cases where this is not true (e.g. if GetContextAtIndex had some useful side effect).

Another interesting question is what to do with functions taking raw binary data in the form of a pointer + size (e.g. SBData::SetData). I think we will have to annotate these APIs to make the reproducer system aware of the amount of data it has to capture, and then allocate these buffers with the correct lifetime during replay. I am not sure what would be the best way to attach these annotations, but I think we might need a fairly generic framework, because I won't be surprised if there are more situations where we have to add annotations to the API. A slightly related question: if a function returns a pointer to a raw buffer (e.g. const char* or void*), do we have to capture its content or the pointer to it, and in either case, what is the lifetime of the returned buffer (e.g. SBError::GetCString() returns a buffer that goes out of scope when the SBError goes out of scope)?

Additionally, I am pretty sure we have at least some functions returning various indices that require remapping besides the pointers, either because they are just indexing into a data structure with undefined internal order or because they reference some other resource. Just by randomly browsing some of the SB APIs, I found for example SBHostOS::ThreadCreate, which returns the pid/tid for the newly created thread, which will have to be remapped (it also takes a function as an argument, which is a problem as well). Because of this, I am not sure we can get away with an automatically generated set of API descriptions instead of writing one with explicit annotations for the various remapping rules.

If there is interest I can try to take a deeper look into the topic sometime later but I hope that those initial thoughts are useful.

Tamas

Thanks for the feedback Pavel!

[Adding Tamas for his experience with recording and replaying APIs.]

Thank you for sharing the prototype Jonas. It looks very interesting,
but there are a couple of things that worry me about it.

The first one is the usage of __PRETTY_FUNCTION__. That sounds like a
non-starter even for an initial implementation, as the string it
expands to is going to differ between compilers (gcc and clang will
probably agree on it, but I know for a fact it will be different on
msvc). If that was just an internal property of the serialization
format, then it might be fine, but it looks like you are hardcoding the
values in code to connect the methods with their replayers, which is
going to be a problem.

I should've made it clear that I didn't intend to have that checked in. Originally the idea was that we would always use the clang-based tool to generate the IDs (__PRETTY_FUNCTION__ was just a placeholder for the prototype). I didn't give it much more thought, but if the tool only checks the macro instead of generating the ID, that becomes a valid concern again.

I've been thinking about how this could be done better, and the best
(though not ideal) way I came up with is using the function's address as
the key. That's guaranteed to be unique everywhere. Of course, you
cannot serialize that to a file, but since you already have a central
place where you list all intercepted functions (to register their
replayers), that place can also be used to assign unique integer IDs to
these functions. So then the idea would be that the SB_RECORD macro
takes the address of the current function, that gets converted to an ID
in the lookup table, and the ID gets serialized.

It sounds like you would generate the indices at run-time. How would that work with regard to the reverse mapping?

The part that bugs me about this is that taking the address of an
overloaded function is extremely tedious (you have to write something
like static_cast<function_prototype>(&SBFoo::Bar)). That would mean all
of these things would have to be passed to the RECORD macro. OTOH, the
upshot of this would be that the macro would now have sufficient
information to perform pretty decent error checking on its invocation.
Another nice thing about this could be that once you already have a prototype
and an address of the function, it should be possible (with sufficient
template-fu) to synthesize replay code for the function automatically,
at least in the simple cases, which would avoid the repetitiveness of
the current replay code. Together, this might obviate the need for any
clang plugins or other funny build steps.

See my previous question. I see how the signature would help with decoding, but I'm still missing how you'd get the mapping.

The second thing I noticed is the usage of pointers for identifying
objects. A pointer is good for that, but only while the object it points
to is alive. Once the object is gone, the pointer can (and most likely
will) be reused. So, it sounds to me like you also need to track the
lifetime of these objects. That may be as simple as intercepting
constructor/destructor calls, but I haven’t seen anything like that yet
(though I haven’t looked at all details of the patch).

This shouldn’t be a problem. When a new object is created it will be recorded in the table with a new identifier.

Tying into that is the recording of return values. It looks like the
current RECORD_RETURN macro will record the address of the temporary
object in the frame of the current function. However, that address will
become invalid as soon as the function returns as the result object will
be copied into a location specified by the caller as a part of the
return processing. Are you handling this in any way?

We capture the temporary and the call to the copy constructor. This is not super efficient, but it's the best we can do.

The final thing I noticed is the lack of any sign of threading
support. I’m not too worried about that, as that sounds like something
that could be fitted into the existing framework incrementally, but it
is something worth keeping in mind, as you’re going to run into that
pretty soon.

Yup, I've intentionally ignored this for now.


It sounds like you would generate the indices at run-time. How would that work with regard to the reverse mapping?

In the current implementation, SBReplayer::Init contains a list of all intercepted methods, right? Each of the SB_REGISTER calls takes two arguments: The method name, and the replay implementation.

I would change that so that this macro takes three arguments:
- the function address (the "runtime" ID)
- an integer (the "serialized" ID)
- the replay implementation

This creates a link between the function address and the serialized ID. So when, during capture, a method calls SB_RECORD_ENTRY and passes in the function address, that address can be looked up and translated to an ID for serialization.
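
For illustration, the two mappings could look something like this
(a hypothetical sketch, not code from the patch; the replayer type is
simplified to a plain function pointer):

#include <cstdint>
#include <map>

static std::map<uintptr_t, unsigned> g_runtime_to_id; // used when recording
static std::map<unsigned, void (*)()> g_replayers;    // used when replaying

template <typename Fn>
void RegisterMethod(Fn *function, unsigned serialized_id, void (*replayer)()) {
  // Works for free functions; member functions and constructors need
  // extra care, as discussed later in the thread.
  g_runtime_to_id[reinterpret_cast<uintptr_t>(function)] = serialized_id;
  g_replayers[serialized_id] = replayer;
}

During capture we serialize g_runtime_to_id[address of current function];
during replay we invoke g_replayers[id read from the file].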

The only thing that would need to be changed is to have SBReplayer::Init execute during record too (which probably means it shouldn't be called SBReplayer, but whatever..), so that the ID mapping is also available when capturing.

Does that make sense?

    The part that bugs me about this is that taking the address of an
    overloaded function is extremely tedious (you have to write something
    like static_cast<function_prototype>(&SBFoo::Bar)). That would mean all
    of these things would have to be passed to the RECORD macro. OTOH, the
    upshot of this would be that the macro would now have sufficient
    information to perform pretty decent error checking on its invocation.
    Another nice thing about this could be that once you already have a prototype
    and an address of the function, it should be possible (with sufficient
    template-fu) to synthesize replay code for the function automatically,
    at least in the simple cases, which would avoid the repetitiveness of
    the current replay code. Together, this might obviate the need for any
    clang plugins or other funny build steps.

See my previous question. I see how the signature would help with decoding, but I'm still missing how you'd get the mapping.

    The second thing I noticed is the usage of pointers for identifying
    objects. A pointer is good for that, but only while the object it points
    to is alive. Once the object is gone, the pointer can (and most likely
    will) be reused. So, it sounds to me like you also need to track the
    lifetime of these objects. That may be as simple as intercepting
    constructor/destructor calls, but I haven't seen anything like that yet
    (though I haven't looked at all details of the patch).

This shouldn't be a problem. When a new object is created it will be recorded in the table with a new identifier.

Ok, sounds good.

    Tying into that is the recording of return values. It looks like the
    current RECORD_RETURN macro will record the address of the temporary
    object in the frame of the current function. However, that address will
    become invalid as soon as the function returns as the result object
    will
    be copied into a location specified by the caller as a part of the
    return processing. Are you handling this in any way?

We capture the temporary and the call to the copy constructor. This is not super efficient, but it's the best we can do.

Ok, cool. I must have missed that part in the code.

    The final thing I noticed is the lack of any sign of threading
    support. I'm not too worried about that, as that sounds like something
    that could be fitted into the existing framework incrementally, but it
    is something worth keeping in mind, as you're going to run into that
    pretty soon.

Yup, I've intentionally ignored this for now.

Awesome.

Thanks Pavel for looping me in. I haven't looked into the actual implementation of the prototype yet, but reading your description I have some concerns regarding the amount of data you capture, as I feel it isn't sufficient to reproduce a number of use cases.

Thanks Tamas!

One problem is when the behavior of LLDB is not deterministic for whatever reason (e.g. multithreading, unordered maps, etc.). Let's take SBModule::FindSymbols(), which returns an SBSymbolContextList without any specific order (I haven't checked the implementation, but I would consider a random order to be valid). Suppose a user calls this function, iterates through the elements to find an index I, calls GetContextAtIndex(I), and passes the result into a subsequent function. What do we do then? Do we capture what GetContextAtIndex(I) returned in the trace and use that value, or do we capture the value of I, call GetContextAtIndex(I) during reproduction, and use that value? The first would be correct in this case, but would mean we never actually call GetContextAtIndex(I); the second would mean we call GetContextAtIndex(I) with a wrong index if the order in SBSymbolContextList is non-deterministic. In this case, since we know GetContextAtIndex is just an accessor into a vector, the first option is the correct one, but I can imagine cases where this is not true (e.g. if GetContextAtIndex had some useful side effect).

Indeed, in this scenario we would replay the call with the same I, resulting in an incorrect value. I think the only solution is fixing the non-determinism. This should be straightforward for lists (some kind of sensible ordering), but maybe there are other issues I'm not aware of.

Another interesting question is what to do with functions taking raw binary data in the form of a pointer + size (e.g. SBData::SetData). I think we will have to annotate these APIs to make the reproducer system aware of the amount of data it has to capture, and then allocate these buffers with the correct lifetime during replay. I am not sure what would be the best way to attach these annotations, but I think we might need a fairly generic framework, because I won't be surprised if there are more situations where we have to add annotations to the API. A slightly related question: if a function returns a pointer to a raw buffer (e.g. const char* or void*), do we have to capture its content or the pointer to it, and in either case, what is the lifetime of the returned buffer (e.g. SBError::GetCString() returns a buffer that goes out of scope when the SBError goes out of scope)?

This is a good concern and not something I have a good solution for at this point. For const char* strings we work around this by serializing the actual string. Obviously that won't always work. We also have the void* batons for callbacks, which is another tricky thing that wouldn't be supported. I'm wondering if we can get away with ignoring these at first (maybe printing something in the replay logic that warns the user that the reproducer contains an unsupported function?).

Additionally, I am pretty sure we have at least some functions returning various indices that require remapping besides the pointers, either because they are just indexing into a data structure with undefined internal order or because they reference some other resource. Just by randomly browsing some of the SB APIs, I found for example SBHostOS::ThreadCreate, which returns the pid/tid for the newly created thread, which will have to be remapped (it also takes a function as an argument, which is a problem as well). Because of this, I am not sure we can get away with an automatically generated set of API descriptions instead of writing one with explicit annotations for the various remapping rules.

Fixing the non-determinism should also address this, right?

If there is interest I can try to take a deeper look into the topic sometime later but I hope that those initial thoughts are useful.

Thank you. I’ll start by incorporating the feedback and ping the thread when the patch is ready for another look.


In the current implementation, SBReplayer::Init contains a list of all intercepted methods, right? Each of the SB_REGISTER calls takes two arguments: The method name, and the replay implementation.

I would change that so that this macro takes three arguments:

  • the function address (the “runtime” ID)
  • an integer (the “serialized” ID)
  • the replay implementation

This creates a link between the function address and the serialized ID. So when, during capture, a method calls SB_RECORD_ENTRY and passes in the function address, that address can be looked up and translated to an ID for serialization.

The only thing that would need to be changed is to have SBReplayer::Init execute during record too (which probably means it shouldn’t be called SBReplayer, but whatever…), so that the ID mapping is also available when capturing.

Does that make sense?

I think I understand what you’re explaining, and the mapping side of things makes sense. But I’m concerned about the size and complexity of the SB_RECORD macro that will need to be written. IIUC, those would need to take the address of the current function and the prototype, which is a lot of cumbersome text to type. It seems like having a specialized tool to generate those would be nice, but once you have a tool you also don’t need all this complexity, do you?

Fred

Yes, if the tool generates the IDs for you and checks that the macro invocations are correct, then you don't need the function prototype. However, that tool also doesn't come for free: Somebody has to write it, and it adds complexity in the form of an extra step in the build process.

My point is that this extended macro could provide all the error-checking benefits of this tool. It's a tradeoff, of course, and the cost here is a more complex macro invocation. I think the choice here is mostly down to personal preference of whoever implements this. However, if I was implementing this, I'd go for an extended macro, because I don't find the extra macro complexity to be too much. For example, this should be the macro invocation for SBData::SetDataFromDoubleArray:

SB_RECORD(bool, SBData, SetDataFromDoubleArray, (double *, size_t),
     array, array_len);

It's a bit long, but it's not that hard to type, and all of this information should be present on the previous line, where SBData::SetDataFromDoubleArray is defined (I deliberately made the macro argument order match the function definition syntax).
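
Given that information, the signature check itself could be little more
than a static_assert (a sketch of the idea; overloaded methods would still
need a static_cast-style disambiguation):

#include <type_traits>

#define SB_CHECK_SIGNATURE(Result, Class, Method, Signature)                  \
  static_assert(std::is_same<Result (Class::*) Signature,                     \
                             decltype(&Class::Method)>::value,                 \
                "recorded signature does not match " #Class "::" #Method)

// e.g. SB_CHECK_SIGNATURE(bool, SBData, SetDataFromDoubleArray,
//                         (double *, size_t));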

And this approach can be further tweaked. For instance, if we're willing to take the hit of having "weird" function definitions, then we can avoid the repetition altogether, and make the macro define the function too:

SB_METHOD2(bool, SBData, SetDataFromDoubleArray, double *, array,
     size_t, array_len, {
   // Method body
})

This would also enable you to automatically capture method return values for the "object" results.

pl

    One problem is when the behavior of LLDB is not deterministic for
    whatever reason (e.g. multithreading, unordered maps, etc.). Let's
    take SBModule::FindSymbols(), which returns an SBSymbolContextList
    without any specific order (I haven't checked the implementation,
    but I would consider a random order to be valid). Suppose a user
    calls this function, iterates through the elements to find an index
    `I`, calls `GetContextAtIndex(I)`, and passes the result into a
    subsequent function. What do we do then? Do we capture what
    `GetContextAtIndex(I)` returned in the trace and use that value, or
    do we capture the value of `I`, call `GetContextAtIndex(I)` during
    reproduction, and use that value? The first would be correct in
    this case, but would mean we never actually call
    `GetContextAtIndex(I)`; the second would mean we call
    `GetContextAtIndex(I)` with a wrong index if the order in
    SBSymbolContextList is non-deterministic. In this case, since we
    know GetContextAtIndex is just an accessor into a vector, the first
    option is the correct one, but I can imagine cases where this is
    not true (e.g. if GetContextAtIndex had some useful side effect).

Indeed, in this scenario we would replay the call with the same `I`, resulting in an incorrect value. I think the only solution is fixing the non-determinism. This should be straightforward for lists (some kind of sensible ordering), but maybe there are other issues I'm not aware of.

For this, I think we should adopt the same rules that llvm has for nondeterminism: returning entries in an unspecified order is fine as long as that order is always the same for the given set of inputs. So, using unordered maps is fine as long as it doesn't use any runtime-dependent values (pointers), or this nondeterminism is removed (explicit sort) before letting the values out.

This way, when you're debugging something and your replay doesn't work because of nondeterminism, you fix the nondeterministic bug. It may not have been the bug you set out to fix, but you still reduce the overall number of bugs.
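
As a made-up example of scrubbing such nondeterminism before values escape
(Symbol here is a stand-in, not an actual LLDB type):

#include <algorithm>
#include <vector>

struct Symbol { unsigned addr; };

// Results gathered in hash-map order may differ from run to run; sorting
// by a key derived only from the inputs makes the order reproducible.
void Canonicalize(std::vector<Symbol> &symbols) {
  std::sort(symbols.begin(), symbols.end(),
            [](const Symbol &a, const Symbol &b) { return a.addr < b.addr; });
}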

    Another interesting question is what to do with functions taking
    raw binary data in the form of a pointer + size (e.g.
    SBData::SetData). I think we will have to annotate these APIs to
    make the reproducer system aware of the amount of data it has to
    capture, and then allocate these buffers with the correct lifetime
    during replay. I am not sure what would be the best way to attach
    these annotations, but I think we might need a fairly generic
    framework, because I won't be surprised if there are more
    situations where we have to add annotations to the API. A slightly
    related question: if a function returns a pointer to a raw buffer
    (e.g. const char* or void*), do we have to capture its content or
    the pointer to it, and in either case, what is the lifetime of the
    returned buffer (e.g. SBError::GetCString() returns a buffer that
    goes out of scope when the SBError goes out of scope)?

This is a good concern and not something I have a good solution for at this point. For const char* strings we work around this by serializing the actual string. Obviously that won't always work. We also have the void* batons for callbacks, which is another tricky thing that wouldn't be supported. I'm wondering if we can get away with ignoring these at first (maybe printing something in the replay logic that warns the user that the reproducer contains an unsupported function?).

Overall, I think we should leave some space to enable hand-written record/replay logic for the tricky cases. Batons will be one of those cases, but I don't think they're unsolvable. The only thing we can do with a user-specified void* baton is pass it back to the user callback. But we're not going to do that, since we don't have the code for the callback anyway. Nor do we need to, since we're not interested in what happens outside of the SB boundary.

For this, I think the right solution is to replay the *effects* of the call to the user callback by re-executing the SB calls it made, which can be recorded as usual, when they cross the SB boundary.

    Additionally, I am pretty sure we have at least some functions
    returning various indices that require remapping besides the
    pointers, either because they are just indexing into a data
    structure with undefined internal order or because they reference
    some other resource. Just by randomly browsing some of the SB APIs,
    I found for example SBHostOS::ThreadCreate, which returns the
    pid/tid for the newly created thread, which will have to be
    remapped (it also takes a function as an argument, which is a
    problem as well). Because of this, I am not sure we can get away
    with an automatically generated set of API descriptions instead of
    writing one with explicit annotations for the various remapping
    rules.

Fixing the non-determinism should also address this, right?

I think threads should be handled the same way as callbacks, by replaying the contained SB calls. Since this will certainly require some custom replay logic, as a part of that logic we can remap the thread IDs.


Yes, if the tool generates the IDs for you and checks that the macro invocations are correct, then you don't need the function prototype. However, that tool also doesn't come for free: Somebody has to write it, and it adds complexity in the form of an extra step in the build process.

Definitely agreed, the complexity has to be somewhere.

My point is that this extended macro could provide all the error-checking benefits of this tool. It's a tradeoff, of course, and the cost here is a more complex macro invocation. I think the choice here is mostly down to personal preference of whoever implements this. However, if I was implementing this, I'd go for an extended macro, because I don't find the extra macro complexity to be too much. For example, this should be the macro invocation for SBData::SetDataFromDoubleArray:

SB_RECORD(bool, SBData, SetDataFromDoubleArray, (double *, size_t),
   array, array_len);

Yeah, this doesn’t seem so bad. For some reason I imagined it much more verbose. Note that a verification tool that checks that every SB method is instrumented correctly would still be nice (but it can come as a follow-up).

It's a bit long, but it's not that hard to type, and all of this information should be present on the previous line, where SBData::SetDataFromDoubleArray is defined (I deliberately made the macro argument order match the function definition syntax).

And this approach can be further tweaked. For instance, if we're willing to take the hit of having "weird" function definitions, then we can avoid the repetition altogether, and make the macro define the function too:

SB_METHOD2(bool, SBData, SetDataFromDoubleArray, double *, array,
   size_t, array_len, {
// Method body
})

I personally don’t like this.

Fred


It sounds like this should work but we should try it out to be sure. I’ll rework the prototype to use the function address and update the thread with my findings. I also like the idea of using templates to generate the parsers so I’ll try that as well.

Before I got around to coding this up I realized you can’t take the address of constructors in C++, so the function address won’t work as an identifier.

You gave up way too easily. :P

I realized that constructors are going to be tricky, but I didn't want to dive into those details until I knew if you liked the general idea. The most important thing to realize here is that for the identifier thingy to work, you don't actually need to use the address of that method/constructor as the identifier. It is sufficient to have something that can be deterministically computed from the function. Then you can use the address of *that* as the identifier.

I've created a very simple prototype <https://godbolt.org/z/_xDt5r>, where I do just that. The way I handle constructors there is that I create a special class template (construct), whose instantiations are going to be unique for each constructor (I achieve that by making the class name and the constructor argument types the template parameters of that class template). Then I can take the address of the static member function inside this class (&construct<class, arguments...>::doit), and use *that* as the ID.

As a nice side-effect, the "doit" method actually does invoke the constructor in question, so I can also use that in the replay code to treat constructors like any other method that returns an object.

I also do the same thing for (non-static) member functions via the "invoke" template, because even though it is possible to take the address of those, it is very hard to do anything else with the retrieved pointer. So the effect of this is that in the rest of the code, I only have to work with free functions, as both constructors and member functions are converted into equivalent free functions. I haven't tried to handle destructors yet, but I don't think those should pose any problems that we haven't encountered already.
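
Condensed, the trick looks roughly like this (my paraphrase of the godbolt
code, so details may differ):

template <typename Class, typename... Args>
struct construct {
  // Unique per (class, argument types); &construct<...>::doit serves as
  // the ID, and calling it replays the constructor.
  static Class doit(Args... args) { return Class(args...); }
};

template <typename Signature> struct invoke;

template <typename Result, typename Class, typename... Args>
struct invoke<Result (Class::*)(Args...)> {
  template <Result (Class::*Method)(Args...)> struct method {
    // Wraps a member function into an equivalent free function.
    static Result doit(Class &object, Args... args) {
      return (object.*Method)(args...);
    }
  };
};

// IDs: &construct<SBFoo, int, int>::doit
//      &invoke<decltype(&SBFoo::Method)>::method<&SBFoo::Method>::doit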

The example also shows how you can use templates to automatically generate replay code for "simple" (i.e. those where you can (de)serialize each argument independently) functions, and then uses that to record/replay a very simple API.

You can see it in action like this:
$ g++ a.cc # compile
$ ./a.out 2>/tmp/recording # generate the recording
SBFoo 47 42
Method 1 2
Static 10 11
$ cat /tmp/recording
0 # ID of the constructor
47 # constructor arg 1
42 # constructor arg 2
0x7ffd74d9a0f7 # constructor result
1 # id of SBFoo::Method
0x7ffd74d9a0f7 # this
1 # arg 1
2 # arg 2
2 # id of SBFoo::Static
10 # arg 1
11 # arg 2
$ ./a.out 1 < /tmp/recording # replay the recording
SBFoo 47 42
SBFoo 42 47
Method 1 2
Static 10 11

Note that when replaying the SBFoo constructor is called twice. This is because this code does not attempt to track the object instances in any way... it just creates a new one each time. This obviously needs to be fixed, but that's independent of the function ID issue.

hope you find that useful,
pl

Before I got around to coding this up I realized you can’t take the
address of constructors in C++, so the function address won’t work as an
identifier.

You gave up way too easily. :P

I counted on you having something in mind; it sounded too obvious for you to have missed. ;)

I realized that constructors are going to be tricky, but I didn’t want
to dive into those details until I knew if you liked the general idea.
The most important thing to realize here is that for the identifier
thingy to work, you don’t actually need to use the address of that
method/constructor as the identifier. It is sufficient to have something
that can be deterministically computed from the function. Then you can
use the address of that as the identifier.

I was thinking about that yesterday. I still feel like it would be better to have this mapping all done at compile time. I was considering some kind of constexpr hashing but that sounded overkill.

I’ve created a very simple prototype <https://godbolt.org/z/_xDt5r>,
where I do just that. The way I handle constructors there is that I
create a special class template (construct), whose instantiations are
going to be unique for each constructor (I achieve that by making the
class name and the constructor argument types the template parameters of
that class template). Then I can take the address of the static member
function inside this class (&construct<class, arguments…>::doit), and
use that as the ID.

Clever!

As a nice side-effect, the “doit” method actually does invoke the
constructor in question, so I can also use that in the replay code to
treat constructors like any other method that returns an object.

This is really neat.

hope you find that useful,
pl

Definitely, thank you for taking the time to code up a prototype.

Cheers,
Jonas

Well... most of this is done through template meta-programming, which _is_ compile-time. And the fact that I have a use for the new construct/invoke functions I create this way means that even the space used by those isn't completely wasted (although I'm sure this could be made smaller with hard-coded IDs). The biggest impact of this I can think of is the increased number of dynamic relocations that need to be performed by the loader, as it introduces a bunch of function pointers floating around. But even that shouldn't be too bad, as we have plenty of other sources of dynamic relocs (currently about 4% of the size of liblldb and 10% of lldb-server).

Yeah, of course, it wasn't my intention to critique your approach. I was talking specifically about the mapping (the std::map) in the prototype, something I asked about earlier in the thread. FWIW I think this would be an excellent trade-off if we don't need a tool to generate code for us. I'm hopeful that we can have the bulk of the deserialization code generated this way; most of the "special" cases are still pretty similar and deal with basic types. Anyway, that should become clear later today as I integrate this into the lldb prototype.

I’ve updated the patch with a new version of the prototype: https://reviews.llvm.org/D56322

It uses Pavel's suggestion to use the function address as a runtime ID. All the deserialization code is generated using templates, with automatic mapping of objects to indices during serialization and deserialization.
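
The gist of the generated deserializers is something like the following
sketch (simplified: it assumes a deserializer type with a Read<T>() method,
and ignores the special handling of object arguments and results):

#include <tuple>
#include <utility>

template <typename D, typename Result, typename... Args>
Result ReplayCall(Result (*function)(Args...), D &data) {
  // Brace-initializing the tuple reads the arguments left-to-right,
  // which matters because they come out of a sequential stream.
  std::tuple<Args...> args{data.template Read<Args>()...};
  return std::apply(function, std::move(args));
}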

I (again) manually added the macros for the same set of functions I had in the original prototype. Unsurprisingly this is very error-prone. It's easy to forget to add the right macros for the registry, the function, and the return type. Some of these things can be detected at compile time, others only blow up at run-time. I strongly believe that a tool to add the macros is the way forward. It would be more of a developer tool rather than something that hooks into the build process.

Note that it's still a prototype; there are outstanding issues like void pointers, callbacks, and other types of arguments that require some kind of additional information to serialize. I also didn't get around to the lifetime issue that was discussed on IRC last week.

Please let me know what you think.

Thanks,
Jonas

I’ve put up a (WIP) patch for the tool (https://reviews.llvm.org/D56822) in case anybody is curious about that.

Thank you for the update Jonas. I haven't looked at the patch in detail, but (for better or worse) it seems more-or-less like what I would expect at first glance.

One of the things that occurred to me while looking at this is that it would be great to be able to test this code in greater isolation instead of just the full "run lldb.exe" test. (I've already said this on the file provider patch, but I think that is doubly true here.) There shouldn't be anything (except the prolific use of the string SB) tying this code to the SB API, and so we should be able to use it to record/replay a dummy API for testing purposes.

That would make it easier to test some of the trickier corner cases. For instance you could use placement new to make sure you get two objects at the same address instead of hoping that this will eventually happen during the full lldb run. It would also enable you to make incremental patches while you work out the remaining issues instead of having to have a fully working system from the get-go.
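
Something along these lines, for example (hypothetical test code; Recorder
and Dummy are stand-ins for whatever the extracted API ends up being):

#include <new>

void TestAddressReuse(Recorder &recorder) {
  alignas(Dummy) unsigned char storage[sizeof(Dummy)];

  Dummy *first = new (storage) Dummy();
  recorder.Record(first); // gets some index N
  first->~Dummy();

  Dummy *second = new (storage) Dummy(); // same address, new object
  recorder.Record(second); // must get a fresh index, not N
}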

I think the only big change that you would need to make for this to work is to move the core of this code out of the API folder (maybe all the way down to Utility?), as in the way the build is set up now, it won't get exported in a way that can be used by anyone else.

I think the patch tool you've made strikes a good compromise between ease of use and ease of developing it. I expect it will be particularly useful during the initial stages, when you need to annotate a large number of functions in batches.