I'm learning about Unicode support in C++ and I'm trying to convert a
const char16_t* (UTF-16) to a wstring (UCS4). Reading the standard (if
I understand it correctly), it seems this can be done through
codecvt_utf16:
"For the facet codecvt_utf16:
— The facet shall convert between UTF-16 multibyte sequences and UCS2
or UCS4 (depending on the size of Elem) within the program."
So I tried to use std::wstring_convert to do the conversion:

    #include <codecvt>
    #include <iostream>
    #include <locale>
    #include <string>

    using namespace std;

    int main()
    {
        // Build the UTF-16 string one code unit at a time (see the note below).
        u16string s;
        s.push_back('h');
        s.push_back('e');
        s.push_back('l');
        s.push_back('l');
        s.push_back('o');

        // Convert the UTF-16 data to wchar_t (UCS4 on this platform).
        wstring_convert<codecvt_utf16<wchar_t>, wchar_t> conv;
        wstring ws = conv.from_bytes(reinterpret_cast<const char*>(s.c_str()));
        wcout << ws << endl;
        return 0;
    }
Note: the explicit push_back calls are there to get around the fact
that my version of clang (Xcode 4.2) doesn't have Unicode string
literals.
When the code runs, from_bytes throws and the program terminates. Am I
misunderstanding something and doing something illegal here? I was
thinking it should work because the const char* that I pass to
wstring_convert points at UTF-16-encoded data. I have also considered
endianness being the issue, but I have checked that it isn't.
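For reference, here is a variant I have been experimenting with. It is
only a sketch based on two guesses of mine: that the single-pointer
overload of from_bytes stops at the first zero byte (and every ASCII
character stored in a char16_t contains one), and that codecvt_utf16
reads big-endian input unless std::little_endian is requested. The
helper name utf16_to_ucs4 is just something I made up, and I don't know
whether either guess is actually relevant here:

    #include <codecvt>
    #include <locale>
    #include <string>

    // Sketch: pass an explicit [first, last) byte range so embedded zero
    // bytes can't truncate the input, and request little-endian input to
    // match the in-memory byte order of char16_t on x86.
    std::wstring utf16_to_ucs4(const std::u16string& s)
    {
        const char* first = reinterpret_cast<const char*>(s.c_str());
        const char* last  = first + s.size() * sizeof(char16_t);

        std::wstring_convert<
            std::codecvt_utf16<wchar_t, 0x10ffff, std::little_endian>,
            wchar_t> conv;

        // The (first, last) overload does not rely on a terminating zero
        // byte to find the end of the UTF-16 byte sequence.
        return conv.from_bytes(first, last);
    }

Calling this on the push_back-built string above at least avoids
relying on a terminating null byte, but I'm not confident it's the
right or idiomatic way to do the conversion.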
If codecvt_utf16 is indeed not going to work here, what would be the
best approach to convert UTF-16 to UCS4 using standard C++11?