[libc++] using std::wstring_convert and std::codecvt_utf16 to convert const char16_t* to wstring


I'm learning about Unicode support in C++ and I'm trying to convert a
const char16_t* (UTF-16) to a wstring (UCS4). If I understand the
standard correctly, this can be done through codecvt_utf16:

"For the facet codecvt_utf16:
— The facet shall convert between UTF-16 multibyte sequences and UCS2
or UCS4 (depending on the size of Elem) within the program."

So I tried to use std::wstring_convert to do the conversion by doing
the following:

#include <iostream>
#include <locale>
#include <codecvt>
#include <string>

using namespace std;

int main()
{
    u16string s;
    // ... explicit push_back calls filling s ...

    wstring_convert<codecvt_utf16<wchar_t>, wchar_t> conv;
    wstring ws = conv.from_bytes(reinterpret_cast<const char*>(s.c_str()));

    wcout << ws << endl;

    return 0;
}

Note: the explicit push_backs are there to get around the fact that my
version of clang (Xcode 4.2) doesn't have Unicode string literals.

When the code runs, from_bytes throws an exception and the program terminates. Am I
misunderstanding something and doing something illegal here? I was
thinking it should work because the const char* that I passed to
wstring_convert is UTF-16 encoded. I have also considered endianness
being the issue, but I have checked that it's not the case.
If that is indeed not going to work, what would be the best approach
to convert UTF-16 to UCS4 using standard C++11?


Cubbi beat me to figuring this out by 58 minutes. :wink:
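
For readers who land here without the StackOverflow answer: this thread does not restate what Cubbi found, so treat the following as a likely explanation rather than the confirmed one. The from_bytes(const char*) overload reads a null-terminated byte string, and UTF-16 text containing ASCII characters has a zero byte in every code unit, so the converter sees a truncated input. Giving it an explicit length avoids that. Below is a minimal sketch; utf16_to_wide is a hypothetical helper name, and it assumes wchar_t is wide enough for the characters involved (32 bits on Linux and macOS). It builds the byte buffer in big-endian order, which is what codecvt_utf16 expects by default, so it does not depend on the host's byte order. Note that wstring_convert and codecvt_utf16 were deprecated in C++17.

```cpp
#include <codecvt>
#include <locale>
#include <string>

// Hypothetical helper (not from the thread): converts UTF-16 code units to a
// wide string by way of an explicit big-endian byte buffer.
std::wstring utf16_to_wide(const std::u16string& s)
{
    // Serialize each 16-bit code unit as two bytes, high byte first.
    // codecvt_utf16 reads big-endian by default, so no mode flag is needed
    // and the result does not depend on the host's byte order.
    std::string bytes;
    bytes.reserve(s.size() * 2);
    for (char16_t unit : s) {
        bytes.push_back(static_cast<char>((unit >> 8) & 0xFF));
        bytes.push_back(static_cast<char>(unit & 0xFF));
    }

    std::wstring_convert<std::codecvt_utf16<wchar_t>, wchar_t> conv;

    // The std::string overload carries its own length, so embedded zero
    // bytes are converted instead of being treated as a terminator.
    return conv.from_bytes(bytes);
}
```

Calling utf16_to_wide(s) on the u16string from the original program should then yield the wstring to print. If you would rather keep the reinterpret_cast, the two-pointer overload from_bytes(first, last) also takes an explicit length, but on a little-endian machine the facet would then need the std::little_endian mode.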



I'm sorry for being impatient and asking it here again before I got
the answer =). Now that I know you (and Cubbi) are on StackOverflow,
I'll use that for questions like these. =)


I could've just as easily missed your question on StackOverflow (or been on vacation or whatever). Don't hesitate to post here too. Glad the problem got solved. Sorry for the API you're having to deal with. Wouldn't have been my first choice.