I’ve just found that if an UnsavedFile has Russian characters in its content, SourceLocation’s start/end is calculated wrong: each character’s length is counted as 2 bytes instead of 1. This leads to a wrong end offset for that token and wrong locations for all following tokens when calling clang_tokenize().
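To make the size of the shift concrete: Cyrillic letters take 2 bytes each in UTF-8, so byte offsets and character indices drift apart by one for every such letter. The Python sketch below only illustrates that arithmetic. It assumes the unsaved buffer is UTF-8 and that the reported offsets are byte offsets, neither of which is verified here; the helper name and the sample source string are made up for the illustration and are not part of libclang.

```python
def utf8_byte_offset_to_char_index(text: str, byte_offset: int) -> int:
    """Convert a byte offset into the UTF-8 encoding of `text` to a character index.

    Hypothetical helper for illustration; not a libclang function.
    """
    encoded = text.encode("utf-8")
    if not 0 <= byte_offset <= len(encoded):
        raise ValueError("byte offset out of range")
    # Decoding the prefix raises UnicodeDecodeError if the offset
    # falls in the middle of a multi-byte sequence.
    return len(encoded[:byte_offset].decode("utf-8"))


# Sample buffer with a Russian comment (assumed content, for illustration only).
source = "int x; // привет\nint y;"

char_count = len(source)                  # counts characters
byte_count = len(source.encode("utf-8"))  # counts UTF-8 bytes

# Position of the second declaration, as a character index and as a byte offset.
char_index = source.index("int y")
byte_offset = len(source[:char_index].encode("utf-8"))

print(char_count, byte_count)   # the byte count is larger by the number of Cyrillic letters
print(char_index, byte_offset)  # same drift for everything after the comment
print(utf8_byte_offset_to_char_index(source, byte_offset))  # back to the character index
```

If the offsets really are byte-based, converting them this way before using them as character positions would remove the apparent shift for the tokens after the Russian text.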
I’ve checked this on 3.3 on Mac and Linux (Android via JNI), since I can’t find the 3.4 release notes or downloads (though it was scheduled to be released in December).
PS. I’ve also posted a bug report, but I can’t see it yet.