[libc++] Should IO manipulators be noncopyable?

Recently, I was trying out the new std::quoted IO manipulator and came across an interesting issue: since libc++'s std::quoted is copyable (as are all the other IO manipulators I looked at), passing a temporary as the first argument introduces a lifetime bug in the following code:

   auto ensure_printable(const std::wstring &s) {
     std::wstring_convert<std::codecvt_utf8<wchar_t>, wchar_t> conv;
     return std::quoted(conv.to_bytes(s));
   }

   // ...

   std::cout << ensure_printable(std::wstring(L"hi"));

In C++98, it wasn't possible to do something like this without making obviously non-standard code (i.e. explicitly writing out the implementation-defined named for the return type of std::quoted()). In C++11, and even moreso in C++14, this is no longer necessary thanks to the `auto` keyword. Hence, (mostly) innocent-looking code like above can introduce subtle bugs.

Obviously, since the return type of std::quoted() is unspecified, the above code relies on undefined behavior: there's no guarantee in the standard that it's copyable in the first place! After some discussion, this got me to thinking about why any of the IO manipulators are implemented as copyable types; presumably, this is mostly because it didn't matter back in C++98 when most of them were introduced. However, with deduced return types, this matters quite a bit more.

If the IO manipulators were written to return a const version of their proxy type, and said type were move-only, I think this would resolve the issue, and only allow the IO manipulators to be used as temporaries (much like they were in C++98). Something like so:

   struct my_manipulator_proxy {
     my_manipulator_proxy(/* ... */) {}
     my_manipulator_proxy(const my_manipulator_proxy &) = delete;
     my_manipulator_proxy(my_manipulator_proxy &&) = default;
     /* ... */
   };

   std::ostream & operator << (std::ostream &s,
                               const my_manipulator_proxy &m) {
     /* ... */
   }

   const my_manipulator_proxy my_manipulator(/* ... */) {
     return my_manipulator_proxy(/* ... */);
   }

Having looked at the implementation of std::quoted, this seems like an uncomplicated change (I imagine it's so for the other manipulators as well), and helps prevent undefined behavior. Does this sound like a reasonable way to go? I'd be happy to write a patch if others agree that this is sensible.

- Jim

+mclow

do these even need to be movable?

In my sample implementation, I think they would need to be movable so that the wrapper function (my_manipulator) can move-construct its return value from the proxy object (my_manipulator_proxy). Then the wrapper makes sure that its return value is const so it can't be used by another move-constructor, effectively making it non-copyable and non-movable.

If you have a better idea for how to implement this, that's ok too, as long as the following are disallowed:

   auto foo() {
     return my_manipulator(/* ... */);
   }

   auto x = my_manipulator(/* ... */);

- Jim

+mclow

do these even need to be movable?

They need to be moveable, because they get returned from std::quoted.

As to Jim’s question; I don’t have an answer off the top of my head.
Though I will point out that string_view has the exact same lifetime problem,
since it doesn’t own its’ storage, but rather holds a reference to other storage.

— Marshall

Recently, I was trying out the new std::quoted IO manipulator and came
across an interesting issue: since libc++'s std::quoted is copyable (as are
all the other IO manipulators I looked at), passing a temporary as the first
argument introduces a lifetime bug in the following code:

auto ensure_printable(const std::wstring &s) {
   std::wstring_convert<std::codecvt_utf8<wchar_t>, wchar_t> conv;
   return std::quoted(conv.to_bytes(s));
}

// ...

std::cout << ensure_printable(std::wstring(L"hi"));

Great example, btw.

Recently, I was trying out the new std::quoted IO manipulator and came across an interesting issue: since libc++'s std::quoted is copyable (as are all the other IO manipulators I looked at), passing a temporary as the first argument introduces a lifetime bug in the following code:

auto ensure_printable(const std::wstring &s) {
   std::wstring_convert<std::codecvt_utf8<wchar_t>, wchar_t> conv;
   return std::quoted(conv.to_bytes(s));
}

// ...

std::cout << ensure_printable(std::wstring(L"hi"));

There’s an easy fix, btw - the object returned by quoted could capture the string by value, rather than reference.
Make a copy. :frowning:

BTW - even simpler example

auto boom ( const std::string &s ) { return std::quoted(s); }
auto foo = boom ( “Hi Mom!” );

foo now has a reference to a destructed string.

— Marshall

Recently, I was trying out the new std::quoted IO manipulator and came across an interesting issue: since libc++'s std::quoted is copyable (as are all the other IO manipulators I looked at), passing a temporary as the first argument introduces a lifetime bug in the following code:

  auto ensure_printable(const std::wstring &s) {
    std::wstring_convert<std::codecvt_utf8<wchar_t>, wchar_t> conv;
    return std::quoted(conv.to_bytes(s));
  }

  // ...

  std::cout << ensure_printable(std::wstring(L"hi"));

There’s an easy fix, btw - the object returned by quoted could capture the string by value, rather than reference.
Make a copy. :frowning:

That's true, but I'm not sure how useful that is. If we're copying the string, it's just about as bad, memory-wise, as stringifying the result of std::quoted and storing that:

   std::string ensure_printable(const std::wstring &s) {
     std::stringstream ss;
     std::wstring_convert<std::codecvt_utf8<wchar_t>, wchar_t> conv;
     ss << std::quoted(conv.to_bytes(s));
     return ss.str();
   }

This is obviously a bit uglier for people trying to indirectly use std::quoted like I was, but I think this is how it was intended to be used by the standard; making a copy of the result of std::quoted is technically undefined behavior, since the type is unspecified (and thus, it's unspecified if it's copyable).

Besides, making a copy of the string probably wouldn't help if the original code were slightly changed:

   auto ensure_printable(const std::wstring &s) {
     std::wstring_convert<std::codecvt_utf8<wchar_t>, wchar_t> conv;
     std::string converted = conv.to_bytes(s);
     return std::quoted(converted);
   }

That would pick up the third overload of std::quoted, which needs to hold a reference so that `stream >> std::quoted(blah)` works. Then we're back to the original problem.

BTW - even simpler example

auto boom ( const std::string &s ) { return std::quoted(s); }
auto foo = boom ( “Hi Mom!” );

foo now has a reference to a destructed string.

Good catch!

- Jim