libc++ std::map self-assignment bug

Hi Howard, etc,

libc++3.4 (rc1/trunk) std::map doesn't support self-assignment for std <
c++11. The relevant code is:

    _LIBCPP_INLINE_VISIBILITY
    map& operator=(const map& __m)
        {
#if __cplusplus >= 201103L
            __tree_ = __m.__tree_;
#else
            __tree_.clear();
            __tree_.value_comp() = __m.__tree_.value_comp();
            __tree_.__copy_assign_alloc(__m.__tree_);
            insert(__m.begin(), __m.end());
#endif
            return *this;
        }

Maybe should be like:

    _LIBCPP_INLINE_VISIBILITY
    map& operator=(const map& __m)
        {
#if __cplusplus >= 201103L
            __tree_ = __m.__tree_;
#else
            if (this != &__m) {
                __tree_.clear();
                __tree_.value_comp() = __m.__tree_.value_comp();
                __tree_.__copy_assign_alloc(__m.__tree_);
                insert(__m.begin(), __m.end());
            }
#endif
            return *this;
        }

I see the same issue with unordered_map& operator=(const unordered_map&
__u).

Thanks!
Kal

--- end quoted text ---

Hi Kal,
I don't speak for Howard or anyone else. But my understanding is the stl library
generally doesn't perform checks like that with the goal of best possible performance.
I see a lot of code in stdlibc++ just like that.

enjoy,
Karen

Actually this does look like a bug to me (for __cplusplus < 201103L). I think Kal has the correct fix. A post-condition of copy assignment is that both lhs and rhs should be equivalent to the previous value of rhs. For move assignment this is not the case. But this is copy assignment.

Howard

Hi Howard,
Thanks for the reply. Do you think a fix for this will be committed for
the 3.4 release?

Also could you explain why the special check for __cplusplus < 201103L
is necessary at all. Both branches are doing almost the same thing
(looking into the implementation of __tree). Is there some subtle
difference in the standard that requires a different implementation here?

Kal

Note: This issue is present in at least: map, multimap, unordered_map,
unordered_multimap.

Kal

Hi Howard, etc,

libc++3.4 (rc1/trunk) std::map doesn't support self-assignment for std <
c++11. The relevant code is:

  _LIBCPP_INLINE_VISIBILITY
  map& operator=(const map& __m)
      {
#if __cplusplus >= 201103L
          __tree_ = __m.__tree_;
#else
          __tree_.clear();
          __tree_.value_comp() = __m.__tree_.value_comp();
          __tree_.__copy_assign_alloc(__m.__tree_);
          insert(__m.begin(), __m.end());
#endif
          return *this;
      }

Maybe should be like:

  _LIBCPP_INLINE_VISIBILITY
  map& operator=(const map& __m)
      {
#if __cplusplus >= 201103L
          __tree_ = __m.__tree_;
#else
          if (this != &__m) {
              __tree_.clear();
              __tree_.value_comp() = __m.__tree_.value_comp();
              __tree_.__copy_assign_alloc(__m.__tree_);
              insert(__m.begin(), __m.end());
          }
#endif
          return *this;
      }

I see the same issue with unordered_map& operator=(const unordered_map&
__u).

Thanks!
Kal

--- end quoted text ---

Hi Kal,
I don't speak for Howard or anyone else. But my understanding is the stl library
generally doesn't perform checks like that with the goal of best possible performance.
I see a lot of code in stdlibc++ just like that.

Actually this does look like a bug to me (for __cplusplus < 201103L). I think Kal has the correct fix. A post-condition of copy assignment is that both lhs and rhs should be equivalent to the previous value of rhs. For move assignment this is not the case. But this is copy assignment.

Howard

Hi Howard,
Thanks for the reply. Do you think a fix for this will be committed for
the 3.4 release?

I don't know.

Also could you explain why the special check for __cplusplus < 201103L
is necessary at all. Both branches are doing almost the same thing
(looking into the implementation of __tree). Is there some subtle
difference in the standard that requires a different implementation here?

C++11 introduces the possibility of recycling nodes of [multi]map under the copy assignment operator by pushing the requirements on [multi]map::value_type down to the key_type and mapped_type (23.2.4 [associative.reqmts]/p7). To actually implement this optimization I used new C++11 language features of union, which do not compile when -std=c++03.

In C++11, using libc++, if two equal sized map<int, int> are assigned, there are zero trips to the heap. The nodes in the lhs are assigned new values, and rebalancing takes place as necessary. In C++03, the lhs is first clear()'d, and then one does an insert for every value in the rhs.

Same design (and bug as you note) in unordered_[multi]map.

Howard

>>
>>>> Hi Howard, etc,
>>>>
>>>> libc++3.4 (rc1/trunk) std::map doesn't support self-assignment for
std <
>>>> c++11. The relevant code is:
>>>>
>>>> _LIBCPP_INLINE_VISIBILITY
>>>> map& operator=(const map& __m)
>>>> {
>>>> #if __cplusplus >= 201103L
>>>> __tree_ = __m.__tree_;
>>>> #else
>>>> __tree_.clear();
>>>> __tree_.value_comp() = __m.__tree_.value_comp();
>>>> __tree_.__copy_assign_alloc(__m.__tree_);
>>>> insert(__m.begin(), __m.end());
>>>> #endif
>>>> return *this;
>>>> }
>>>>
>>>> Maybe should be like:
>>>>
>>>> _LIBCPP_INLINE_VISIBILITY
>>>> map& operator=(const map& __m)
>>>> {
>>>> #if __cplusplus >= 201103L
>>>> __tree_ = __m.__tree_;
>>>> #else
>>>> if (this != &__m) {
>>>> __tree_.clear();
>>>> __tree_.value_comp() = __m.__tree_.value_comp();
>>>> __tree_.__copy_assign_alloc(__m.__tree_);
>>>> insert(__m.begin(), __m.end());
>>>> }
>>>> #endif
>>>> return *this;
>>>> }
>>>>
>>>> I see the same issue with unordered_map& operator=(const
unordered_map&
>>>> __u).
>>>>
>>>> Thanks!
>>>> Kal
>>> --- end quoted text ---
>>>
>>> Hi Kal,
>>> I don't speak for Howard or anyone else. But my understanding is the
stl library
>>> generally doesn't perform checks like that with the goal of best
possible performance.
>>> I see a lot of code in stdlibc++ just like that.
>> Actually this does look like a bug to me (for __cplusplus < 201103L).
I think Kal has the correct fix. A post-condition of copy assignment is
that both lhs and rhs should be equivalent to the previous value of rhs.
For move assignment this is not the case. But this is copy assignment.
>>
>> Howard
>>
> Hi Howard,
> Thanks for the reply. Do you think a fix for this will be committed for
> the 3.4 release?

I don't know.

>
> Also could you explain why the special check for __cplusplus < 201103L
> is necessary at all. Both branches are doing almost the same thing
> (looking into the implementation of __tree). Is there some subtle
> difference in the standard that requires a different implementation here?

C++11 introduces the possibility of recycling nodes of [multi]map under
the copy assignment operator by pushing the requirements on
[multi]map::value_type down to the key_type and mapped_type (23.2.4
[associative.reqmts]/p7). To actually implement this optimization I used
new C++11 language features of union, which do not compile when -std=c++03.

In C++11, using libc++, if two equal sized map<int, int> are assigned,
there are zero trips to the heap. The nodes in the lhs are assigned new
values, and rebalancing takes place as necessary.

In what situation would rebalancing need to happen during this operation?
If they are the same size, wouldn't you just simultaneously inorder walk
both, assigning element by element (or the equivalent)? The structure of an
rb-tree is independent of the actual values contained.

-- Sean Silva

One of the things this allows for is a state-full comparator where the state is not assigned under the copy assignment operator:

#include <iostream>
#include <iterator>
#include <set>
#include <cassert>

template <class C>
void
display(const C& c)
{
    std::cout << "{";
    if (!c.empty())
    {
        std::cout << *c.begin();
        for (auto i = ++c.begin(); i != c.end(); ++i)
            std::cout << ", " << *i;
    }
    std::cout << "}\n";
}

template <class T>
class my_compare
{
    bool reversed_ = false;
public:
    my_compare() = default;
    my_compare(bool r) : reversed_(r) {}
    my_compare& operator=(const my_compare&) {return *this;} // don't assign comparator state
    bool operator()(const T& x, const T& y) const
    {
        if (reversed_)
            return y < x;
        return x < y;
    }
};

int
main()
{
    std::set<int, my_compare<int> > s1;
    for (int i = 1; i <= 3; ++i)
        s1.insert(i);
    std::set<int, my_compare<int> > s2(my_compare<int>(true));
    for (int i = 4; i <= 6; ++i)
        s2.insert(i);
    display(s1);
    display(s2);
    s2 = s1;
    display(s2);
    assert(s2.find(3) != s2.end());
}

{1, 2, 3}
{6, 5, 4}
{3, 2, 1}

This is pretty weird code, and I might flunk it on a code review. One could even argue that the standard does not permit such code. But it works if you don't force a structural copy under copy assignment. And my experience is that I am continually surprised at what my customers want to do (and for very clever reasons). And so if I want to disallow code like this, I need to come up with a very strong motivation for doing so. Rebalancing is generally fairly fast, and with red-black logic, not actually done that often (one path from root to child can be up to twice as long as another). My current engineering judgement is that this capability is worth the cost of rebalancing under copy assignment.

With an implementation that does do a structural copy under assignment the output of this program is:

{1, 2, 3}
{6, 5, 4}
{1, 2, 3}
Assertion failed: (s2.find(3) != s2.end()), function main, file test.cpp, line 49.
Abort trap: 6

If this example is not standard conforming, I consider it a valuable conforming extension to support it. I find it far more preferable than putting the rb-tree into an invalid state.

Howard

>> In C++11, using libc++, if two equal sized map<int, int> are assigned,
there are zero trips to the heap. The nodes in the lhs are assigned new
values, and rebalancing takes place as necessary.
>>
> In what situation would rebalancing need to happen during this
operation? If they are the same size, wouldn't you just simultaneously
inorder walk both, assigning element by element (or the equivalent)? The
structure of an rb-tree is independent of the actual values contained.
>

One of the things this allows for is a state-full comparator where the
state is not assigned under the copy assignment operator:

Ah, that makes perfect sense! Kind of devilish, but it doesn't seem
unreasonable to support it.

-- Sean Silva

Here is a patch to resolve this issue. Please apply :slight_smile:

-K

libcxx-map-self-assignment-fix.patch (3.06 KB)

I just got around to reading my backlog of cfe-dev mail and noticed this. So, forgive me for replying but obscure is awesome.

I think the following paragraph makes the standard require rebalancing, (which suggests that the container requirements are wrong, since rebalancing isn't linear time.

23.2.4 [associative.reqmts]/p11

When an associative container is constructed by passing a comparison object the container shall not store a pointer or reference to the passed object, even if that object is passed by reference. When an associative container is copied, either through a copy constructor or an assignment operator, the target container shall then use the comparison object from the container being copied, as if that comparison object had been passed to the target container in its constructor.

This suggests that your implementation below isn't conforming, since assigning the comparison object below doesn't behave the same way as passing it to the target container's constructor. But, this would be conforming:

template <class T>
class my_compare
{
bool reversed_ = false;
public:
my_compare() = default;
my_compare(bool r) : reversed_(r) {}
my_compare(const my_compare& o) : reversed_(!o.reversed_) {}
my_compare& operator=(const my_compare& o) {
reversed_ = !o.reversed_;
return *this;
}
bool operator()(const T& x, const T& y) const
{
if (reversed_)
return y < x;
return x < y;
}
};

Note, the important thing is that both copy operations do the same thing, consistently.

I'm amused by the fact that this suggests that if the constructor copies the comparison object twice, that the copy constructor/copy assignment operator would have to also do it twice.

I wonder if I can usefully use this comparison object in algorithms...

I just got around to reading my backlog of cfe-dev mail and noticed this. So, forgive me for replying but obscure is awesome.

I think the following paragraph makes the standard require rebalancing, (which suggests that the container requirements are wrong, since rebalancing isn't linear time.

23.2.4 [associative.reqmts]/p11

When an associative container is constructed by passing a comparison object the container shall not store a pointer or reference to the passed object, even if that object is passed by reference. When an associative container is copied, either through a copy constructor or an assignment operator, the target container shall then use the comparison object from the container being copied, as if that comparison object had been passed to the target container in its constructor.

This suggests that your implementation below isn't conforming, since assigning the comparison object below doesn't behave the same way as passing it to the target container's constructor. But, this would be conforming:

template <class T>
class my_compare
{
  bool reversed_ = false;
public:
  my_compare() = default;
  my_compare(bool r) : reversed_(r) {}
  my_compare(const my_compare& o) : reversed_(!o.reversed_) {}
  my_compare& operator=(const my_compare& o) {
    reversed_ = !o.reversed_;
    return *this;
  }
  bool operator()(const T& x, const T& y) const
  {
    if (reversed_)
      return y < x;
    return x < y;
  }
};

Note, the important thing is that both copy operations do the same thing, consistently.

But what they do is not copying; they violate the CopyConstructible
requirement that the associative containers' constructors place on
their Compare type (when a comparator is passed, at least... there may
be a gap in the wording, where the standard should say that copying an
associative container requires that its comparator be
CopyConstructible).

Note that CopyConstructible is not a merely syntactic requirement: it
implied that for after copying a value v, "the value of v is unchanged
and is equivalent to [the copy]".

I'm amused by the fact that this suggests that if the constructor copies the comparison object twice, that the copy constructor/copy assignment operator would have to also do it twice.

There should be no such requirement, as copying once gets a value
equivalent to the original (and hence equivalent to copying n times,
for any n).

-- James

This sounds like the kind of thing that should be written up as an LWG issue.
http://isocpp.org/std/submit-a-library-issue

— Marshall

----------------------------------------

Date: Thu, 27 Feb 2014 00:09:51 -0800
Subject: Re: [cfe-dev] libc++ std::map self-assignment bug
From: james.dennett@gmail.com
To: acharles@outlook.com
CC: silvas@purdue.edu; howard.hinnant@gmail.com; cfe-dev@cs.uiuc.edu

I just got around to reading my backlog of cfe-dev mail and noticed this. So, forgive me for replying but obscure is awesome.

I think the following paragraph makes the standard require rebalancing, (which suggests that the container requirements are wrong, since rebalancing isn't linear time.

23.2.4 [associative.reqmts]/p11

When an associative container is constructed by passing a comparison object the container shall not store a pointer or reference to the passed object, even if that object is passed by reference. When an associative container is copied, either through a copy constructor or an assignment operator, the target container shall then use the comparison object from the container being copied, as if that comparison object had been passed to the target container in its constructor.

This suggests that your implementation below isn't conforming, since assigning the comparison object below doesn't behave the same way as passing it to the target container's constructor. But, this would be conforming:

template <class T>
class my_compare
{
bool reversed_ = false;
public:
my_compare() = default;
my_compare(bool r) : reversed_(r) {}
my_compare(const my_compare& o) : reversed_(!o.reversed_) {}
my_compare& operator=(const my_compare& o) {
reversed_ = !o.reversed_;
return *this;
}
bool operator()(const T& x, const T& y) const
{
if (reversed_)
return y < x;
return x < y;
}
};

Note, the important thing is that both copy operations do the same thing, consistently.

But what they do is not copying; they violate the CopyConstructible
requirement that the associative containers' constructors place on
their Compare type (when a comparator is passed, at least... there may
be a gap in the wording, where the standard should say that copying an
associative container requires that its comparator be
CopyConstructible).

I tried to find the place in the standard the describes the requirements on the comparison type, but I wasn't able to. Granted, I didn't spend too much time on this. If you can point to the section, that would be great.

The table of associative container requirements has some rows
describing various required constructors, and those require that the
comparator be CopyConstructible if one is passed in.

(The small defect that I think is present is that the copy operations
of the associative container itself don't appear to have this
requirement.)

It's also not obvious that CopyConstructible is related to EqualityComparable either.

Indeed, it's not so closely related. CopyConstructible doesn't imply
EqualityComparable; a copy is not required to compare equal to the
original, it's required to be "equivalent" to it, in some ill-defined
sense.

I've never seen a place in the standard that says that copying needs to maintain the same value, though it does allow elision to occur, which means that your program can't usefully rely on the side effects of the copy actually having happened.

The CopyConstructible requirements are where you want to look, I
think. (Table 21 in my draft of C++14.) I quoted them in my original
message (still quoted below), though maybe not very explicitly so:

Note that CopyConstructible is not a merely syntactic requirement: it
implied that for after copying a value v, "the value of v is unchanged
and is equivalent to [the copy]".

Hope this helps,

James