[libc++] debug mode

A few years back, prior to the open sourcing of libc++, I attempted to create a debug mode. That first effort had a little bit of good work in it, but was basically still-born. It is defined under _LIBCPP_DEBUG and doesn't even compile today.

Lately I've been working on a new debug mode. It is currently defined under _LIBCPP_DEBUG2. The plan is that if this experiment is successful, then _LIBCPP_DEBUG2 will be renamed to _LIBCPP_DEBUG, and the original _LIBCPP_DEBUG will either be integrated, or it will disappear.

The initial check-in for _LIBCPP_DEBUG2 was committed revision 139711.

There isn't a lot there yet. I've tried to get vector working (just the primary, not vector<bool> so far), and I believe I have done so. Though I don't have tests for it yet (I've just been spot checking). At this point I'm simply exploring whether the basic design is viable or not.

A major goal of the design of this debug mode is to keep the ABI stable as debug mode is turned on/off. This would enable clients to turn debug mode on/off on a translation unit basis, and not have to worry whether data structures such as vector which are visible across translation units are the debug version or not. To achieve this, all debug data which typically relates iterators to their containers (and vice-versa) is stored in an external, global database. This database is initially empty and compiled into libc++.dylib unconditionally. But it is only referenced when _LIBCPP_DEBUG2 is defined by the client.

The database is currently protected by a mutex. However it is currently set up such that it would be easy to switch to a shared_mutex should testing reveal that this would offer a performance benefit.

Every time a container is constructed (just vector so far), then the address of the container is added to the database. When the container is destructed it is removed from the database. And every time an iterator is constructed within a container, then the address of the iterator is added to the database. Additionally the information of which container the iterator references is also stored in the database. The database can be searched either by iterator address, or by container address. And given a container entry, all iterators which reference the container can be enumerated. And given an iterator entry, which container it references (if any) can be queried.

The database is a custom hash table.

So far everything is quite immature. I've only added API to the database as has become necessary to add debug mode to vector. I don't expect this API to remain stable during the development of debug mode. I do expect the API to remain as small as possible, and to only address the needs of libc++. I don't expect clients to be able to plug their own containers into the libc++ debug mode database (at least I don't view that as a primary goal).

In addition to the database, there is an assert-like macro:

    _LIBCPP_ASSERT(true/false, "message");

I decided not to use assert itself as preliminary clients advised me that they would like to turn libc++ debug mode on/off independently of NDEBUG.

Suggestions concerning debug mode and/or contributions are welcome.

Howard

I have often seen interest on a "cheap" debugging mode, which would add checks which were possible without breaking complexity requirements, or "excessive" cost.

For vector I would include in this (as a non-complete list)

check in front(), back() and pop_back() the vector is non-empty.
Check operator accesses a valid index.
scribble over memory in destructor (or set to a clearly invalid value, such as vector memory is at (void*)1). Perhaps this last point could/should be implemented by the compiler, or malloc implementation?

I don't know how much code this could share with a larger debugger. obviously the same checks could be left in, but they might be redundant.

Also, I am not 100% sure if people wanted this previously because it would not cause ABI breakage. I would still find it worthwhile, as container debugging modes can get very expensive.

Chris

Thanks Chris. I've checked in an experimental implementation of this.

-D_LIBCPP_DEBUG2

gets you the "standard" debugging mode, which includes checking for iterator invalidation.

-D_LIBCPP_DEBUG2=0

gets you "debug lite". This only checks things that are really cheap such as those you mention.

I've left the door open for:

-D_LIBCPP_DEBUG2=2

which would make even more expensive checks (e.g. check the invariants of a red/black tree). But that isn't done at this point.

I'm still concentrating only on vector at this point.

Howard

Howard Hinnant wrote:

There isn't a lot there yet. I've tried to get vector working (just the primary, not vector<bool> so far), and I believe I have done so. Though I don't have tests for it yet (I've just been spot checking). At this point I'm simply exploring whether the basic design is viable or not.

A major goal of the design of this debug mode is to keep the ABI stable as debug mode is turned on/off.

Sounds good, but it'd be useful to have a more detailed design document somewhere so that people can see where you're going and how.

... and Christopher Jefferson replied:

I have often seen interest on a "cheap" debugging mode, which would add checks which were possible without breaking complexity requirements, or "excessive" cost.

Without seeing Howard's design, I don't know how "excessive" the costs of his design might be, but in principle, you can do some checks like iterator invalidation very cheaply -- if you have the right design.

This may be obvious (in which case I apologize), it may even be what Howard has already done (in which case, oops, and, uh, cool), but in case it isn't, let me outline how:

- For every container, associate a 64-bit tag (a.k.a. version stamp).

- For every iterator, also associate a 64-bit tag.

- When you create a new container, just pick a random 64-bit value.

- When you create an iterator, copy the 64-bit tag from the associated container. This represents the container/version the iterator belongs to. Any access via the iterator checks that the tag of the iterator matches the tag of the container. If it doesn't, BANG!

- When iterators are invalidated, generate a new tag for container. (For speed, you could just increment it, but the important point is that the value is a new and different one)

This method is probabilistic -- there is a 1 in 18446744073709551616 chance that it won't catch an invalid access, but personally I like those odds. It also imposes very very little in runtime overhead -- you can invalidate N iterators in O(1) time.

I've always stored the tag in the containers/iterators, but it should also work with a database-based scheme.

    M.E.O.

Howard Hinnant wrote:

There isn’t a lot there yet. I’ve tried to get vector working (just the primary, not vector so far), and I believe I have done so. Though I don’t have tests for it yet (I’ve just been spot checking). At this point I’m simply exploring whether the basic design is viable or not.

A major goal of the design of this debug mode is to keep the ABI stable as debug mode is turned on/off.

Sounds good, but it’d be useful to have a more detailed design document somewhere so that people can see where you’re going and how.

… and Christopher Jefferson replied:

I have often seen interest on a “cheap” debugging mode, which would add checks which were possible without breaking complexity requirements, or “excessive” cost.

Without seeing Howard’s design, I don’t know how “excessive” the costs of his design might be, but in principle, you can do some checks like iterator invalidation very cheaply – if you have the right design.

This may be obvious (in which case I apologize), it may even be what Howard has already done (in which case, oops, and, uh, cool), but in case it isn’t, let me outline how:

  • For every container, associate a 64-bit tag (a.k.a. version stamp).

  • For every iterator, also associate a 64-bit tag.

  • When you create a new container, just pick a random 64-bit value.

  • When you create an iterator, copy the 64-bit tag from the associated container. This represents the container/version the iterator belongs to. Any access via the iterator checks that the tag of the iterator matches the tag of the container. If it doesn’t, BANG!

  • When iterators are invalidated, generate a new tag for container. (For speed, you could just increment it, but the important point is that the value is a new and different one)

This method is probabilistic – there is a 1 in 18446744073709551616 chance that it won’t catch an invalid access, but personally I like those odds. It also imposes very very little in runtime overhead – you can invalidate N iterators in O(1) time.

Why not just use the cool c++11 random number generators, if performance would be an issue, time to optimise those :wink: I don’t like the idea of probabilistic odds, especially in hugely iterated and containered program this would make debugging useless (especially for long-running server-like programs that would need to be debugged)

Ruben

Ruben Van Boxem wrote:

I don't like the idea of probabilistic odds, especially in hugely iterated and containered program this would make debugging useless (especially for long-running server-like programs that would need to be debugged)

You don't like an undefined-behavior detection scheme that has a 1 in 2**64 chance of not detecting erroneous use?

So you're saying that 99.99999999999999999457899% reliability isn't good enough? You realize that at one operation per nanosecond, that's an expected time to failure-to-detect-undefined-behavior of about 585 years? Somehow, I doubt anything else in your system behaves with that level of reliability.

Put another way, if each time we *do* detect undefined behavior, that's a distinct bug we need to fix, we can expect to find 18446744073709551615 bugs before we get one that we fail to find. If we were to task every human being who has ever lived (106.5 billion people) with fixing those bugs, that gives each of them 173 million bugs to fix.

I'd say if your code is *that* buggy, you have other worries.

    M.E.O.

P.S. Now I'll be struck by lightning. Twice. In unrelated incidents. And gored by a stray bull.

if I correctly understand the issue, I would say this is because a random generator would not increase the number of possible values, and would instead increase the chance of collision.

With 64 bit counter, even if you increase it 1 billion times per second, you need more than 500 years to overflow.

– Jean-Daniel

Thanks M.E.O.

This is an interesting approach. But one problem with it is that it only allows the system to invalidate all iterators referring to a container. It can't invalidate a subset of them. And we need to be able to invalidate subsets of iterators (e.g. during vector::erase).

Or have I missed something?

Howard

I wrote:

let me outline how:

- For every container, associate a 64-bit tag (a.k.a. version stamp).

- For every iterator, also associate a 64-bit tag.

- When you create a new container, just pick a random 64-bit value.

- When you create an iterator, copy the 64-bit tag from the associated container. This represents the container/version the iterator belongs to. Any access via the iterator checks that the tag of the iterator matches the tag of the container. If it doesn't, BANG!

- When iterators are invalidated, generate a new tag for container. (For speed, you could just increment it, but the important point is that the value is a new and different one)

and Howard Hinnant replied:

This is an interesting approach. But one problem with it is that it only allows the system to invalidate all iterators referring to a container. It can't invalidate a subset of them. And we need to be able to invalidate subsets of iterators (e.g. during vector::erase).

Or have I missed something?

As outlined, yes, I was doing all-iterator invalidation, as happens in std::string, because that's simpler and quick to explain (and I've experience implementing it!).

But I think the method does generalize. In situations where *some* of the iterators get invalidated, you need to do a little bit more work. You need to maintain an invalidation log:

- For each (active) tag value, associate an optional record. Initially there is no record for a tag.

- When an object updates in a way that invalidates iterators, add a record to its (now former) tag saying enough about what the operation was that you can know what got invalidated (e.g., for vector, the invalidated range, or possibly better, the range that remains valid). Also, record the container's new tag.

Now when the tags don't match, instead of just blowing up, you use your tag to follow update chain through all the tags and see if you're still valid through the operations that have occurred. You'll either find the new tag you need, or find you're invalid.

There are details, of course, such as reference counting things so that it all cleans up nicely (i.e., you only keep as much of the invalidation log as you actually need), and handling tag collisions for record information (because you really shouldn't *ever* fail for valid code, even if it is more likely that the user will be struck by lightning while in the bathroom). But I think those details are pretty straightforward.

    M.E.O.

Howard Hinnant wrote:

This is an interesting approach. But one problem with it is that it only allows the system to invalidate all iterators referring to a container. It can't invalidate a subset of them. And we need to be able to invalidate subsets of iterators (e.g. during vector::erase).

Or have I missed something?

and I replied:

In situations where *some* of the iterators get invalidated, you need to do a little bit more work. You need to maintain an invalidation log [...]

Incidentally, I was thinking about things like std::vector here. For some other cases, such as std::list or the tree-based data structures, things are easier. For those, I think you just have to associate the tags with the "nodes" of the container rather than the whole container.

    M.E.O.

AMDG

As outlined, yes, I was doing all-iterator invalidation, as happens in std::string, because that's simpler and quick to explain (and I've experience implementing it!).

But I think the method does generalize. In situations where *some* of the iterators get invalidated, you need to do a little bit more work. You need to maintain an invalidation log:

- For each (active) tag value, associate an optional record. Initially there is no record for a tag.

- When an object updates in a way that invalidates iterators, add a record to its (now former) tag saying enough about what the operation was that you can know what got invalidated (e.g., for vector, the invalidated range, or possibly better, the range that remains valid). Also, record the container's new tag.

Now when the tags don't match, instead of just blowing up, you use your tag to follow update chain through all the tags and see if you're still valid through the operations that have occurred. You'll either find the new tag you need, or find you're invalid.

There are details, of course, such as reference counting things so that it all cleans up nicely (i.e., you only keep as much of the invalidation log as you actually need), and handling tag collisions for record information (because you really shouldn't *ever* fail for valid code, even if it is more likely that the user will be struck by lightning while in the bathroom). But I think those details are pretty straightforward.

Reference counting the history doesn't sound
at all straightforward to me. The methods I
can think of all have some problem:

a) When a history element is created, set the
   reference count to the number of iterators
   it affects. Counting these iterators requires
   as much work as just marking them as invalid.
b) When a history element is created, set the
   reference count to the total number of
   iterators that currently exist.
   1) When an iterator is destroyed, it has
      to adjust the reference counts of all
      the iterators
   2) A single long-lived iterator can cause
      the history to accumulate indefinitely.
c) We can solve b.2 by storing the reference
   count incrementally. i.e. If the reference
   count of history item 1 is 3 and the reference
   count of history item 2 is 5, then store
   the values 3 and 2. Then we one need to
   count in the first item in the list. This
   still leaves problem b.2, however.

In Christ,
Steven Watanabe

I wrote:

There are details, of course, such as reference counting things so that it all cleans up nicely (i.e., you only keep as much of the invalidation log as you actually need), and handling tag collisions for record information (because you really shouldn't *ever* fail for valid code, even if it is more likely that the user will be struck by lightning while in the bathroom). But I think those details are pretty straightforward.

and Steven Watanabe replied:

[ideas deleted]

I was certainly thinking more along the lines of your second idea. The reference count is the number of direct references, either from iterators directly using that tag, or the reference from the tag's predecessor in history. (*)

A single long-lived iterator can cause the history to accumulate indefinitely.

Yes, this is true. But in practice it only happens if you keep doing lots of erases and you have one long-lived iterator you never actually use. Any actual full invalidation breaks the chain. Likewise actually using the iterator will update it. I suspect it would be more of a problem in theory than practice.

Feel free to suggest alternative designs or variations; my preference though is for schemes where mass iterator invalidation is O(1) (and actually fairly cheap), and the common case (valid iterators) is fast.

    M.E.O.

AMDG

I wrote:

There are details, of course, such as reference counting things so that it all cleans up nicely (i.e., you only keep as much of the invalidation log as you actually need), and handling tag collisions for record information (because you really shouldn't *ever* fail for valid code, even if it is more likely that the user will be struck by lightning while in the bathroom). But I think those details are pretty straightforward.

and Steven Watanabe replied:

[ideas deleted]

I was certainly thinking more along the lines of your second idea. The reference count is the number of direct references, either from iterators directly using that tag, or the reference from the tag's predecessor in history. (*)

A single long-lived iterator can cause the history to accumulate indefinitely.

Yes, this is true. But in practice it only happens if you keep doing lots of erases and you have one long-lived iterator you never actually use. Any actual full invalidation breaks the chain. Likewise actually using the iterator will update it. I suspect it would be more of a problem in theory than practice.

It's quite easy to have a usage pattern where
you do a lot of erases without ever invalidating
the entire vector. That's exactly what happens
when the vector is used as a stack. I can't
think of any obvious cases where I would have
a long-lived iterator, but it seems to me that
the cost of invalidating the iterators doesn't
really matter unless you have a significant
number of them and are operating on the container
at the same time. Under these conditions
I wouldn't be at all surprised to find a
long-lived iterator that isn't used for a
while.

Feel free to suggest alternative designs or variations; my preference though is for schemes where mass iterator invalidation is O(1) (and actually fairly cheap), and the common case (valid iterators) is fast.

mass iterator invalidation is O(1) even if
you implement it by marking all valid iterators
as invalid. Recall that in the standard, O(1)
means /amortized/ constant time. The work done
invalidating the iterators can always be paired
off with the work done creating them in the
first place.

In Christ,
Steven Watanabe

I wrote:

Feel free to suggest alternative designs or variations; my preference though is for schemes where mass iterator invalidation is O(1) (and actually fairly cheap), and the common case (valid iterators) is fast.

... and Steven Watanabe replied:

mass iterator invalidation is O(1) even if you implement it by marking all valid iterators as invalid. Recall that in the standard, O(1) means /amortized/ constant time. The work done invalidating the iterators can always be paired off with the work done creating them in the first place.

The key with any amortized algorithm is that you shouldn't fall into the trap of repeating work in such a way as to break the amortization.

Thus, if you pair each invalidation with creation, yes, it's O(1). But if you have to *check* multiple times whether to invalidate an iterator or not, then you exceed O(1).

For vector erase, it's not clear to me how you would (at negligible cost) know which of your zillion iterators should be invalidated and which should not. Maybe Howard's design does this in some clever way?

    M.E.O.

Howard's not so clever. Maybe I should try to describe in more detail what I'm doing. This description should be prefaced with: nothing is set in stone, and will likely change when I do a node-based container such as list. But here's what's there right now:

There is a new header: <__debug>, and a new source: debug.cpp. The source gets built unconditionally. The header gets implicitly included if -D_LIBCPP_DEBUG2 is on the command line (or equivalent).

The header defines an assert-like macro, _LIBCPP_ASSERT(x, m), which really isn't important for this discussion, but I include it for completeness. The x is a run time boolean expression and m is a const char* message.

If _LIBCPP_DEBUG2 >= 1, then a "container" is defined. This is a singleton container with two factory functions to expose it: const version and non-const version. The container is hash based, and can be looked up by two keys instead of the traditional one: By iterator* or by container*. As we're dealing with several types of containers and iterators, everything that is stored is actually a void*, the address of the iterators and containers.

The API of this container is not stable at this time. And it is also not a general purpose container. The API is growing or changing only in direct response to the needs of the debug mode as it is developed for libc++.

Each iterator* is stored in a node that also has a pointer to a container node in the database. That pointer is null if the iterator does not reference a container (has been invalidated).

Each container* is stored in a node that includes a dynamically sized array (vector-like but not std::vector) of iterator node pointers which contain valid iterators into the referenced container.

Thus you can look up an iterator and find what, if any container it points to. And you can look up a container and find a list of valid iterators that refer to it.

Iterator constructors create iterator entries in the database. Iterator destructors remove iterator entries in the database. Container constructors create container entries in the database and container destructors remove container entries in the database.

The act of removing a container from the database will include locating each valid iterator in the database and invalidating it by zeroing out its container node pointer. The act of removing an iterator from the database will include removing the reference (if any) of this iterator from a container node in the database.

In a few places the container node in the database has to call back into the container to get the information it needs. For example even though an iterator is a valid reference into a container, it may not be dereferenceable. To find out, the container has to be queried. This involves type-erasing the container when storing it into the database as a void*, and then calling back into the container via a virtual function. The algorithm for determining whether a valid iterator is dereferenceable is going to be different for each container.

Issues:

* Iterators must sometimes be invalidated en-mass, e.g. on destruction or buffer reallocation. And sometimes only certain iterators need to be invalidated, such as those that point beyond end() after an erase().

* Iterators sometimes need to switch which containers they refer to but otherwise remain valid, such as during a swap, move or splice.

* For the node-based containers, not all iterators switch containers during a swap, move or splice. Iterators referring to the end-node always stick with their container.

* For string, iterators referring into the internal buffer will never switch containers.

Howard

AMDG

I wrote:

Feel free to suggest alternative designs or variations; my preference though is for schemes where mass iterator invalidation is O(1) (and actually fairly cheap), and the common case (valid iterators) is fast.

... and Steven Watanabe replied:

mass iterator invalidation is O(1) even if you implement it by marking all valid iterators as invalid. Recall that in the standard, O(1) means /amortized/ constant time. The work done invalidating the iterators can always be paired off with the work done creating them in the first place.

The key with any amortized algorithm is that you shouldn't fall into the trap of repeating work in such a way as to break the amortization.

Thus, if you pair each invalidation with creation, yes, it's O(1). But if you have to *check* multiple times whether to invalidate an iterator or not, then you exceed O(1).

For vector erase, it's not clear to me how you would (at negligible cost) know which of your zillion iterators should be invalidated and which should not. Maybe Howard's design does this in some clever way?

In any event, the messy case is invalidating
some iterators. This is still the same under
your scheme. The only difference is that you've
made it lazy, so we sometimes avoid the extra
work. The worst case is still the same.

In Christ,
Steven Watanabe

Steven Watanabe wrote:

In any event, the messy case is invalidating some iterators. This is still the same under your scheme. The only difference is that you've made it lazy, so we sometimes avoid the extra work. The worst case is still the same.

Actually, I'm not nearly so sure. Let's do a worked example.

WLoG, let's imagine simple linear tags, in an array with 20 elements.

- iter1 refers to tag v1 (actually element 3)
- iter2 refers to tag v1 (actually element 11)
  - v1 is partially invalid, elements 0..15 remain valid
  - v1's successor is v2
  - v1's refcount is 2

- iter3 refers to tag v2 (actually element 16)
- iter4 refers to tag v2 (actually element 9)
- iter5 refers to tag v2 (actually element 12)
  - v2 is partially invalid, elements 0..9 remain valid
  - v2's successor is v3
  - v2's predecessor is v1
  - v2's refcount is 3

- iter6 refers to tag v3 (actually element 14)
  - v3 is partially invalid, elements 0..17 remain valid
  - v3's successor is v4
  - v3's predecessor is v2
  - v3's refcount is 1

- iter7 refers to tag v4 (actually element 5)
  - v4 is valid
  - v4's predecessor is v3
  - v4's refcount is 2 (one iterator, plus the container)

Notice that although there is a "long" history (v1..v4), its length is <= the number of active iterators.

Now, suppose iter6 goes away. At this point, v3's refcount drops to zero, so no one needs it except earlier things in the history. BUT, you can fold the info from v3 down into v2. In this case, since v2 already has a smaller range, there's no change to v2's valid range. Here's the new history with v3 removed.

- iter1 refers to tag v1 (actually element 3)
- iter2 refers to tag v1 (actually element 11)
  - v1 is partially invalid, elements 0..15 remain valid
  - v1's successor is v2
  - v1's refcount is 2

- iter3 refers to tag v2 (actually element 16)
- iter4 refers to tag v2 (actually element 9)
- iter5 refers to tag v2 (actually element 12)
  - v2 is partially invalid, elements 0..9 remain valid
  - v2's successor is v4
  - v2's predecessor is v1
  - v2's refcount is 3

- iter7 refers to tag v4 (actually element 5)
  - v4 is valid
  - v4's predecessor is v2
  - v4's refcount is 2 (one iterator, plus the container)

Now suppose that iter3, iter4 and iter5 are also destroyed. v2's refcount drops to zero, so we remove the history node and merge. v1's valid range drops from 0..15 to 0..9.

- iter1 refers to tag v1 (actually element 3)
- iter2 refers to tag v1 (actually element 11)
  - v1 is partially invalid, elements 0..9 remain valid
  - v1's successor is v4
  - v1's refcount is 2

- iter7 refers to tag v4 (actually element 5)
  - v4 is valid
  - v4's predecessor is v1
  - v4's refcount is 2 (one iterator, plus the container)

Now, if iter1 is used, everything is good -- we follow the chain and find everything is valid. After the access, we've updated iter1's tag, so the world looks like this:

- iter2 refers to tag v1 (actually element 11)
  - v1 is partially invalid, elements 0..9 remain valid
  - v1's successor is v4
  - v1's refcount is 1

- iter1 refers to tag v4 (actually element 3)
- iter7 refers to tag v4 (actually element 5)
  - v4 is valid
  - v4's predecessor is v1
  - v4's refcount is 3 (two iterators, plus the container)

But, if we attempt to use iter2, we'll find it to be invalid.

As far as I can tell (and I'll admit, I may not be giving this as much attention as I should), no "unreasonably long" history chains in this case, but perhaps I'm missing some nuance somewhere...?

    M.E.O.