Asher
February 18, 2022, 9:55pm
1
I recompiled LLVM from git using exactly the same process as in the past. Previous binaries have had no problem with my code.
This time, the resulting binary chokes on the first Unicode character it encounters, at “namespace \u2063 {”: error: expected identifier or ‘{’.
What has changed? Is this a bug?
See D104975, “Implement P1949”, and the corresponding standard proposal, “C++ Identifier Syntax using Unicode Standard Annex 31”.
Could you provide a little more context for what you’re doing?
Asher
February 18, 2022, 11:02pm
3
What context would you like?
I have a namespace whose name is the Unicode character ‘\u2063’. This has worked for years. Suddenly (with a newly compiled LLVM from git), it will not compile. Simply declaring a namespace named ‘\u2063’ is sufficient to cause a compile error.
Asher
February 19, 2022, 3:52am
4
So, if I understand correctly, the TL;DR is:
previously, UTF-8 character support was part of Clang, not the standard
now it is being normalized
the result is that some characters that were previously allowed are no longer allowed
this includes U+2063, which is below U+FFFF