[libc++] FYI: FreeBSD libc++ status

For anyone who's interested, I've imported a recent libc++ / libcxxrt into FreeBSD and Dimitry has merged it into the 9.2 release branch, so it should be showing up in the next release (Real Soon Now). This will be the first FreeBSD release to ship libc++ in the default install (it was in the tree for 9.1 but required users to build from source). The remaining test failures were as follows:

/root/libc++/test/localization/locale.categories/category.ctype/locale.ctype.byname/tolower_many.pass.cpp failed at run time
/root/libc++/test/localization/locale.categories/category.ctype/locale.ctype.byname/toupper_1.pass.cpp failed at run time
/root/libc++/test/localization/locale.categories/category.ctype/locale.ctype.byname/toupper_many.pass.cpp failed at run time

I exchanged some emails with Howard about the tolower / toupper tests, but I don't fully understand what the correct behaviour is. I suspect it may be due to differing interpretations of what a wchar_t contains.
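
A rough sketch of how the wchar_t question might be probed: ask the ctype<wchar_t> facet of a named locale to upper-case a single wide character, then compare the result across platforms. The helper name upper_in_locale is hypothetical and is not part of libc++ or its test suite; which locale names are installed is system-dependent.

```cpp
#include <locale>
#include <stdexcept>

// Hypothetical helper (not from libc++): upper-case one wide character
// using the ctype<wchar_t> facet of the locale called `name`.
// Returns false if the locale cannot be constructed on this system.
bool upper_in_locale(const char* name, wchar_t in, wchar_t& out) {
    try {
        std::locale loc(name);
        out = std::use_facet<std::ctype<wchar_t> >(loc).toupper(in);
        return true;
    } catch (const std::runtime_error&) {
        // std::locale throws std::runtime_error for an unknown locale name.
        return false;
    }
}
```

Calling this with the same non-ASCII code point on FreeBSD and OS X (for a locale installed on both) would show whether the two libraries agree on what the wchar_t value means.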

/root/libc++/test/localization/locale.categories/category.monetary/locale.money.get/locale.money.get.members/get_long_double_fr_FR.pass.cpp failed at run time
/root/libc++/test/localization/locale.categories/category.monetary/locale.money.get/locale.money.get.members/get_long_double_ru_RU.pass.cpp failed at run time
/root/libc++/test/localization/locale.categories/category.monetary/locale.money.put/locale.money.put.members/put_long_double_fr_FR.pass.cpp failed at run time
/root/libc++/test/localization/locale.categories/category.monetary/locale.money.put/locale.money.put.members/put_long_double_ru_RU.pass.cpp failed at run time

These all seem to fail with the facet::get() call. I need to check what's going on there: the locale code does a lot of redundant creation of locale_ts for the facets, but I'd expect that to result in fewer bugs (just less performance) as it's more defensive than it needs to be.

For these in particular, some C test cases from someone who understands the <locale> code would help me determine whether the bug is in libc++ or our libc (I'm quite familiar with the locale code in our libc, so I definitely wouldn't rule it out as a source of bugs).

/root/libc++/test/localization/locale.categories/category.monetary/locale.moneypunct.byname/curr_symbol.pass.cpp failed at run time

This one fails because we return " Eu" as the currency symbol for France, which is probably a legacy thing from before applications were expected to understand UTF-8.
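
To separate locale-data problems from libc++ problems, one option is to read the currency symbol twice: once straight from libc via localeconv(), and once through the C++ moneypunct facet. This is only a sketch under assumptions: both function names are hypothetical, and the locale name to pass (for example a French UTF-8 locale) depends on what is installed.

```cpp
#include <clocale>
#include <locale>
#include <stdexcept>
#include <string>

// Hypothetical helper: currency symbol as reported by libc's localeconv().
// Temporarily switches the global LC_MONETARY category, then restores it.
std::string libc_currency_symbol(const char* name) {
    const char* current = std::setlocale(LC_MONETARY, nullptr);
    std::string saved = current ? current : "C";  // copy before it is overwritten
    if (std::setlocale(LC_MONETARY, name) == nullptr)
        throw std::runtime_error("locale not available");
    std::string symbol = std::localeconv()->currency_symbol;
    std::setlocale(LC_MONETARY, saved.c_str());
    return symbol;
}

// Hypothetical helper: currency symbol as reported by the C++ facet.
// std::locale throws std::runtime_error if the name is unknown.
std::string facet_currency_symbol(const char* name) {
    std::locale loc(name);
    return std::use_facet<std::moneypunct<char> >(loc).curr_symbol();
}
```

If the two functions disagree for the same locale, the C++ library is the more likely suspect; if they agree but differ from the test's expectation, the locale data is.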

/root/libc++/test/localization/locale.categories/category.monetary/locale.moneypunct.byname/grouping.pass.cpp failed at run time
/root/libc++/test/localization/locale.categories/category.monetary/locale.moneypunct.byname/neg_format.pass.cpp failed at run time
/root/libc++/test/localization/locale.categories/category.monetary/locale.moneypunct.byname/pos_format.pass.cpp failed at run time

These also seem to be due to differing locale data between FreeBSD and OS X. I need to spend some time to work out which is correct: previous explorations indicated that OS X was shipping with a very old set of locale data (e.g. using the Soviet era currency symbol for the Russian Ruble), but some other differences have been bugs in the FreeBSD locale data.

/root/libc++/test/re/re.traits/translate_nocase.pass.cpp failed at run time

This seems to be testing exactly the same thing as one of the toupper / tolower tests.

/root/libc++/test/strings/string.conversions/stof.pass.cpp failed at run time

This one is failing because std::stof(L" - 8", &idx) is not throwing an exception. It sets idx to 2, so I suspect a bug in the conversion of L" - 8" to a std::wstring. Again, this may be related to some difference between clang, libc++, and FreeBSD libc over what a wchar_t contains. I need to investigate this a bit more.
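
One way to narrow this down, sketched below, is to run the same input through the C library's wide-character conversion directly and through std::stof. The C parsing rules accept optional leading whitespace and then an optional sign immediately followed by the number, so a sign followed by a space should yield no conversion at all. The helper names are hypothetical, and wcstod is used here as a stand-in for whichever wcsto* function the library calls internally.

```cpp
#include <cstddef>
#include <cwchar>
#include <stdexcept>
#include <string>

// Hypothetical helper: number of wide characters the C library consumes.
// Zero means no conversion was performed.
std::size_t wcstod_consumed(const wchar_t* s) {
    wchar_t* end = nullptr;
    std::wcstod(s, &end);
    return static_cast<std::size_t>(end - s);
}

// Hypothetical helper: true if std::stof rejects the input with
// std::invalid_argument, which is what the failing test expects.
bool stof_rejects(const std::wstring& s) {
    try {
        std::size_t idx = 0;
        std::stof(s, &idx);
        return false;
    } catch (const std::invalid_argument&) {
        return true;
    }
}
```

If wcstod_consumed(L" - 8") is already non-zero, the problem lies in libc rather than in libc++ or the compiler's wide string literal.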

numerics/complex.number/complex.transcendentals/cos.pass.cpp failed to compile
numerics/complex.number/complex.transcendentals/cosh.pass.cpp failed to compile
numerics/complex.number/complex.transcendentals/sin.pass.cpp failed to compile
numerics/complex.number/complex.transcendentals/sinh.pass.cpp failed to compile
numerics/complex.number/complex.transcendentals/tan.pass.cpp failed to compile
numerics/complex.number/complex.transcendentals/tanh.pass.cpp failed to compile

These functions are missing from our libm, so math.h omits them and they don't work. Hopefully we'll have implementations of these for 10.0.

strings/c.strings/cuchar.pass.cpp failed to compile
strings/c.strings/version_cuchar.pass.cpp failed to compile

These fail because, although FreeBSD has a <uchar.h>, libc++ lacks <cuchar>.

All in all, a pretty good set of results. Almost all of the bugs are in localisation, and if people actually care about localisation I'd expect them to use ICU and not STL. Once libc++ gets a <cuchar> and FreeBSD libm gets the remaining C99 functions, we'll be in a very happy place. The wchar interpretation issue concerns me somewhat, and I'll have to spend a little bit of time investigating it.

David

Thanks for this most informative update David.

I've wondered if libc++'s locale ought to be built on ICU instead of xlocale. I don't know the answer to that question.

I've hesitated in adding <cuchar> because I can't test it. However if you can add and test it, that would be a welcome addition.

Howard

> I've wondered if libc++'s locale ought to be built on ICU instead of xlocale. I don't know the answer to that question.

Definitely not. ICU is a huge dependency, has a relatively rapidly changing API and does vastly more than is required. The C++11 locale APIs were designed to be easy to implement on top of the POSIX 2008 locale APIs (which were originally xlocale on OS X). We just didn't do so in the most efficient way - the locale_t is already reference counted and so we don't need to do as much copying as we do for the facets.
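
For readers unfamiliar with the POSIX 2008 per-object locale interface, here is a minimal sketch of a facet-like operation built on a locale_t. It is not libc++'s implementation; the function name is hypothetical, and the header comment about <xlocale.h> is an assumption about platform differences.

```cpp
#include <locale.h>  // newlocale, freelocale, locale_t, LC_ALL_MASK (POSIX 2008)
#include <wctype.h>  // towupper_l (POSIX 2008); some systems may also need <xlocale.h>

// Hypothetical helper: upper-case one wide character using an explicit
// locale object instead of the global locale.
// Returns the input unchanged if the named locale is unavailable.
wint_t upper_with_locale_t(const char* name, wint_t c) {
    locale_t loc = newlocale(LC_ALL_MASK, name, static_cast<locale_t>(0));
    if (loc == static_cast<locale_t>(0))
        return c;
    wint_t result = towupper_l(c, loc);
    freelocale(loc);  // release this handle on the locale object
    return result;
}
```

A real facet would create the locale_t once and keep it for its lifetime rather than creating and freeing it per call; the per-call form here just keeps the example short.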

> I've hesitated in adding <cuchar> because I can't test it. However if you can add and test it, that would be a welcome addition.

My version of the draft spec says almost nothing about this header, other than it exists.

David

Hi David,
Yes, libicu is a very significant dependency, but it all just works for every locale and every
facet. I can only speak about the g++ c++11 libstdc++-4.7.2 locale support. It has serious
limitations. The boost::locale documentation discusses a few of them.

I suggest folks ought to examine the boost::locale implementation. It is just a wrapper around
libicu. I use it and am quite pleased. I get all the advantages of libicu without ever needing
to maintain any of it. While it might not be appropriate for clang, I do believe it is an
excellent example of how clang might develop its own wrapper around libicu, alleviating any
need to maintain it. Just a suggestion. Not my issue.

enjoy,
Karen

> Yes, libicu is a very significant dependency, but it all just works for every locale and every
> facet. I can only speak about the g++ c++11 libstdc++-4.7.2 locale support. It has serious
> limitations. The boost::locale documentation discusses a few of them.

ICU would be an unacceptable dependency for the FreeBSD base system. It is also massively overkill for the requirements of <locale>, which were carefully designed to only rely on existing POSIX libc features. FreeBSD and Darwin libc both implement all of the required features and glibc probably will soon, if it doesn't already: they're all in POSIX2008. There may be errors in your operating system's locale files (the libc++ test suite has so far found errors in both FreeBSD and OS X locale descriptions), but I have also found errors in the locale descriptions shipped with ICU (especially in date formats), so using ICU would not be a panacea here either - especially as newer versions of ICU have managed to fix some errors and introduce new ones, so you would not even guarantee consistent errors across platforms.

ICU would also be a circular dependency for libc++, as ICU uses STL internally.

> I suggest folks ought to examine the boost::locale implementation. It is just a wrapper around
> libicu. I use it and am quite pleased. I get all the advantages of libicu without ever needing
> to maintain any of it. While it might not be appropriate for clang, I do believe it is an
> excellent example of how clang might develop its own wrapper around libicu, alleviating any
> need to maintain it. Just a suggestion. Not my issue.

I'd strongly recommend using ICU for any GUI code that requires localisation; however, this is something that you typically get for free from your GUI toolkit, as most of them use ICU internally. As such, there is little benefit in adopting it for libc++, as people who actually care significantly about localisation will bypass <locale> and use ICU directly. All that we would be doing is adding an extra dependency for code that doesn't depend on localisation.

David

Hi David,
I agree. I am not aware of any application frameworks that rely on the locales provided by
the underlying system library. I am personally aware of a number of full custom application
stacks in C++ that don't rely on the underlying system library locales either. It seems to
be a consensus that teams prefer other options in place of the locales provided by system
libraries such as libstdc++, etc. I actually made an effort to use the C++11 libstdc++
locale subsystem. But after doing my due diligence, I decided to use libicu via boost::locale.

I can appreciate your point of view. Bottom line is that folks have numerous options and
can decide which one best suits their needs. I don't think developers can ask for more
than that.

I appreciate your comments. Thanks.

enjoy,
Karen