proposed change to remove conditional WCHAR support in libedit wrapper

Hello,

I was thinking of removing the conditional compilation that selects libedit’s WCHAR support on non-Windows platforms. The conditional preprocessor logic is definitely the correct way to support both wide and narrow character APIs, but editline has enabled the wide APIs by default for some time, so there might be an opportunity today to simplify the wrapper.

I’m thinking that using the wide-character API and converting to a narrow-character string whenever interacting with the rest of LLDB makes the most sense. One hopefully small issue is that the history filename is determined by whether WCHAR is used or not, so history might not work unless people rename their file after switching to a version with this change. From what I could tell, the format of the file is actually the same whether you use history() or history_w(), so it doesn’t have to be converted from narrow to wide characters, although I only tested that on OS X.
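
To make the "convert at the boundary" idea concrete, here is a minimal sketch of a wide-to-narrow conversion. It assumes the narrow side is UTF-8 and that each wchar_t holds one Unicode code point (a 32-bit wchar_t, as on Linux and OS X). ToUTF8 is a hypothetical name for illustration, not an existing LLDB or libedit function.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not an LLDB API): encode a wide string as UTF-8.
// Assumes each wchar_t is a single Unicode code point (32-bit wchar_t).
std::string ToUTF8(const std::wstring &wide) {
  std::string out;
  for (wchar_t wc : wide) {
    const unsigned long cp = static_cast<unsigned long>(wc);
    assert(cp <= 0x10FFFF && "not a Unicode code point");
    if (cp < 0x80) {
      // 1 byte: plain ASCII.
      out += static_cast<char>(cp);
    } else if (cp < 0x800) {
      // 2 bytes: 110xxxxx 10xxxxxx
      out += static_cast<char>(0xC0 | (cp >> 6));
      out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
      // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
      out += static_cast<char>(0xE0 | (cp >> 12));
      out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
      out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
      // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
      out += static_cast<char>(0xF0 | (cp >> 18));
      out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
      out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
      out += static_cast<char>(0x80 | (cp & 0x3F));
    }
  }
  return out;
}
```

With something like this, the wrapper could keep wide strings internally and call the helper only where a line is handed to the rest of LLDB. A real change would also need the reverse direction for prompts and completions.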

There is a WIP of what this might look like here: https://github.com/llvm/llvm-project/compare/main…nealsid:lldb-editline-remove-wchar Note that it currently uses the narrow-character libedit API, but it gives an idea of what changes will be necessary.

Does anyone else think this could be useful? Or, conversely, does anyone see something I missed that requires the conditional compilation to stay? Since a broken shell would be bad for everyone, I am not sure what the best way is to verify who it might break, e.g. whether people who use LLDB from head want to test it first, or whether there are other platforms to consider besides OS X/Linux/Windows.

Thanks,

Neal

The oldest platform Red Hat builds LLDB on is RHEL-7 (and its copies). It
already contains el_winsertstr, and your branch builds fine there.
  https://copr.fedorainfracloud.org/coprs/jankratochvil/lldb/build/2321456/

OTOH wide-character input does not work there, and not even on Fedora 34 x86_64:
typing: žščř
(lldb) \U+017E\U+0161\U+010D\U+0159
error: 'žščř' is not a valid command.
typing: áéíóůúý
(lldb) \U+016F
error: 'ů' is not a valid command.

While mariadb client works fine on Fedora 34 x86_64:
MariaDB [(none)]> žščřáéíóůúý;
ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MariaDB server version for the right syntax to use near 'žščřáéíóůúý' at line 1

Do you have an idea what can be wrong in LLDB?

Thanks,
Jan

I forgot to check OSX lldb, which does work, so I will have to find it out
myself:

Apple Swift version 5.4.2 (swiftlang-1205.0.28.2 clang-1205.0.19.57)
(lldb) žščř
error: 'žščř' is not a valid command.

Jan

What libedit version/port is Fedora using? I think the usual Linux port[1] is suffering from two issues. One is that most Unicode characters with code points <= 256 are mapped to ‘unassigned’ instead of ‘insert character’, which prevents them from being entered. The other is that the remaining characters are just echoed as their escaped code point, and I have not figured out why yet.

[1] https://thrysoee.dk/editline/

> What libedit version/port is Fedora using?
> [1] https://thrysoee.dk/editline/

Source RPM : libedit-3.1-37.20210522cvs.fc34.src.rpm
URL : https://www.thrysoee.dk/editline/

> I think the usual Linux port[1] is suffering from two issues:

The question is why the same libedit library on the same OS works with mariadb
client.

Jan

Maybe mariadb is not using libedit in emacs mode?

LLDB also provides a custom callback for libedit to read characters from the command line (which mariadb probably doesn’t), so maybe we implemented that incorrectly.

- Raphael

The cause was a missing setlocale() call; I filed it as:
  RFC: [lldb] Fix editline unicode on Linux
  https://reviews.llvm.org/D105779
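
For context, a C or C++ program starts in the "C" locale until it calls setlocale(), and multibyte decoding functions such as mbrtowc() follow the current locale. The sketch below only illustrates that general mechanism. It is not the patch from the review, and both helper names are hypothetical.

```cpp
#include <cassert>
#include <clocale>
#include <cstddef>
#include <cwchar>
#include <string>

// Hypothetical helper: adopt the locale named by the environment
// (LANG, LC_ALL, ...). Returns false if that locale is unavailable,
// in which case the previous locale stays in effect.
bool UseEnvironmentLocale() {
  return std::setlocale(LC_ALL, "") != nullptr;
}

// Hypothetical helper: decode a multibyte string using the current locale.
// Returns false if the bytes are invalid or truncated in that locale.
bool DecodeMultibyte(const std::string &in, std::wstring &out) {
  std::mbstate_t state{};
  const char *p = in.data();
  std::size_t left = in.size();
  out.clear();
  while (left > 0) {
    wchar_t wc;
    std::size_t n = std::mbrtowc(&wc, p, left, &state);
    if (n == static_cast<std::size_t>(-1) ||
        n == static_cast<std::size_t>(-2))
      return false; // invalid or incomplete sequence
    if (n == 0)
      n = 1; // an embedded NUL consumes one byte
    assert(n <= left);
    out.push_back(wc);
    p += n;
    left -= n;
  }
  return true;
}
```

Calling UseEnvironmentLocale() once at startup, before any input is read, is the usual pattern. Whether non-ASCII bytes decode as expected afterwards depends on the environment, for example on LANG naming a UTF-8 locale.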

Jan

Thank you for the info and for checking the Linux build. I unfortunately do not know much about locale issues on Linux, but I tested your change with mine on OS X and it worked for all the inputs I tried.

Neal