clang-format bug with u8 character literal

Hi all,

clang-format does not handle the u8 character literal (u8'c-char') well. Example:

clang-format <<END
auto c1 = u8'a';
auto c2 = u'a';
END

auto c1 = u8 'a';
auto c2 = u'a';

As you can see, clang-format adds an extra space between u8 and 'a'.
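For reference, here is a minimal sketch of what the two literals mean to a compiler, assuming it is run in C++17 mode or later. The type of u8'a' is char in C++17 and char8_t from C++20 on, so the checks below only look at its size and value.

```cpp
#include <type_traits>

// u8'a' is a single UTF-8 character literal (valid since C++17).
// Its type is char in C++17 and char8_t from C++20 on, so only the
// size and value are checked here.
constexpr auto c1 = u8'a';
// u'a' is a UTF-16 character literal of type char16_t (since C++11).
constexpr auto c2 = u'a';

static_assert(sizeof(c1) == 1, "a UTF-8 character literal occupies one byte");
static_assert(c1 == 0x61, "U+0061 encodes as the single code unit 0x61");
static_assert(std::is_same<decltype(c2), const char16_t>::value,
              "u'a' has type char16_t");
```

Since u8'a' is one token to the compiler, a formatter that inserts a space after u8 changes the meaning of the code.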

clang-format --version
clang-format version 4.0.0 ( 559aa046fe3260d8640791f2249d7b0d458b5700) ( 4423e351176a92975739dd4ea43c2ff5877236ae)

Not sure if this is the best place to report a bug.

Denis Gladkikh

The fix seems very simple: clang-format currently does not enable C++1z at all. With the patch below I verified that it recognizes u8'a' as a UTF-8 character literal. Without it, u8'a' is lexed as two tokens: an identifier and a character literal.
My guess is that .clang-format should now allow specifying LanguageStandard: LS_Cpp17. Does that sound good? I can try to contribute this to clang-format. It would be my first contribution, and I would be happy to try it out.
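To illustrate the two-tokens-versus-one behaviour described above, here is a toy tokenizer. It is not clang's lexer and not the patch itself; `tokenize` and the `cpp17` flag are hypothetical names invented for this sketch. With the flag off, u8'a' splits into an identifier and a character literal; with it on, the u8 prefix is absorbed into the literal.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Toy illustration only: NOT clang's lexer. `tokenize` and `cpp17` are
// hypothetical names made up for this sketch.
std::vector<std::string> tokenize(const std::string &src, bool cpp17) {
  std::vector<std::string> tokens;
  std::size_t i = 0;
  while (i < src.size()) {
    unsigned char ch = static_cast<unsigned char>(src[i]);
    if (std::isspace(ch)) {
      ++i;
      continue;
    }
    std::size_t start = i;
    if (std::isalpha(ch) || ch == '_') {
      // Scan an identifier-like run of letters, digits, and underscores.
      while (i < src.size() &&
             (std::isalnum(static_cast<unsigned char>(src[i])) ||
              src[i] == '_'))
        ++i;
      // Treat "u8" as a literal prefix only when C++17 is enabled and a
      // single quote follows it directly.
      bool isPrefix = cpp17 && src.compare(start, i - start, "u8") == 0 &&
                      i < src.size() && src[i] == '\'';
      if (!isPrefix) {
        tokens.push_back(src.substr(start, i - start));
        continue;
      }
    }
    if (i < src.size() && src[i] == '\'') {
      // Consume through the closing quote (escape sequences are ignored
      // for brevity). The token starts at `start`, so a prefix is included.
      std::size_t close = src.find('\'', i + 1);
      i = (close == std::string::npos) ? src.size() : close + 1;
      tokens.push_back(src.substr(start, i - start));
      continue;
    }
    // Any other character becomes a one-character token.
    ++i;
    tokens.push_back(src.substr(start, 1));
  }
  return tokens;
}
```

Calling tokenize("u8'a'", false) yields the two tokens u8 and 'a', which is where a formatter would be free to insert a space; tokenize("u8'a'", true) yields the single token u8'a'.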

tools/clang ✓ 13:45:12
╰─ svn diff
Index: lib/Format/Format.cpp

The patch you posted looks good to me. Can you add a test to unittests/Format/FormatTest.cpp too? (Run tests with ninja FormatTests && ./tools/clang/unittests/Format/FormatTests).

Of course. I also updated the documentation to state that Cpp11 mode actually enables C++14 and C++17 as well. I added two unit tests: one where Cpp03 works as before, and one where Cpp11 mode recognizes the UTF-8 character literal.
Thank you for looking into this so quickly!
I assume it is too late to include this change in clang 4.0.0, but maybe it can be merged into 4.0.1 or a later point release.

Index: docs/ClangFormatStyleOptions.rst

Hi Nico,

By any chance, have you had time to merge this change?


Denis Gladkikh

Landed in r299574:

Sorry for the delay, I missed your first reply back in March.

Awesome, thanks!

Any possibility of porting this change to 4.0.x?

documents how to request merges to 4.0.1 as far as I can tell. If you have an llvm bugzilla login, you can request a merge yourself.