[RFC] Adding a char set converter to Support library

Hi!

On z/OS, strings need to be converted from EBCDIC to UTF-8 and vice
versa. Using the POSIX iconv functions has some challenges, so I created
a small wrapper around this functionality to get the same result on all
platforms. This functionality is required for reading and writing GOFF
object files and can also be used in the frontend.
I put up the code on Phabricator: D88741 "[SystemZ/z/OS] Add utility
class for char set conversion" (https://reviews.llvm.org/D88741). Please
add your comments to the review if you are interested in this topic.
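
For readers unfamiliar with the API, here is a minimal sketch of a conversion through the POSIX iconv functions. It is not the code from the review. The helper name convertWithIconv is hypothetical, and the encoding names passed to iconv_open are assumptions, because the names each iconv implementation accepts are host-specific.

```cpp
#include <cassert>
#include <cerrno>
#include <iconv.h>
#include <stdexcept>
#include <string>

// Hypothetical helper (not from the review): convert In from encoding From
// to encoding To using the POSIX iconv API.
std::string convertWithIconv(const char *From, const char *To,
                             const std::string &In) {
  assert(From && To && "encoding names must not be null");
  iconv_t CD = iconv_open(To, From); // note the (to, from) argument order
  if (CD == (iconv_t)(-1))
    throw std::runtime_error("encoding pair not supported on this host");

  std::string Out;
  char Buf[256];
  // iconv takes a non-const char **, although it does not modify the input.
  char *InPtr = const_cast<char *>(In.data());
  size_t InLeft = In.size();
  while (InLeft > 0) {
    char *OutPtr = Buf;
    size_t OutLeft = sizeof(Buf);
    size_t Rc = iconv(CD, &InPtr, &InLeft, &OutPtr, &OutLeft);
    int Err = errno; // save errno before any other library call
    Out.append(Buf, sizeof(Buf) - OutLeft);
    // E2BIG only means Buf is full; any other error is a real failure.
    if (Rc == static_cast<size_t>(-1) && Err != E2BIG) {
      iconv_close(CD);
      throw std::runtime_error("invalid or incomplete input sequence");
    }
  }
  // Stateful encodings would need a final flush call; omitted in this sketch.
  iconv_close(CD);
  return Out;
}
```

A call such as convertWithIconv("ISO-8859-1", "UTF-8", Text) works only if the host's iconv knows both names, which is one source of the portability concern.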

Best regards,
Kai

Kai Nacke
IT Architect

IBM Deutschland GmbH
Vorsitzender des Aufsichtsrats: Sebastian Krause
Geschäftsführung: Gregor Pillen (Vorsitzender), Agnes Heftberger, Norbert
Janzen, Markus Koerner, Christian Noll, Nicole Reimer
Sitz der Gesellschaft: Ehningen / Registergericht: Amtsgericht Stuttgart,
HRB 14562 / WEEE-Reg.-Nr. DE 99369940

As far as I remember, libiconv is under LGPL. Will this cause any trouble?

My understanding is that dynamic linking should pose no problem, but I
am no lawyer. On Linux, glibc is also licensed under the LGPL, and LLVM
usually links against it.
(There is really no need for us to depend on libiconv. If it is deemed
too risky, then I can drop it.)

iconv is a POSIX API; the license of any particular implementation shouldn't matter.

I'd be more concerned that the set of encodings supported by any particular implementation of iconv isn't portable. I'd like to avoid adding host-specific behavior here; ideally, the list of supported encodings should be hardcoded into LLVM, and we should support all of those encodings on every host.
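
One way to picture a hardcoded, host-independent list of encodings is sketched below. This is only an illustration, not code from the review. The names TextEncoding, encodingName, and lookupEncoding are hypothetical, and the choice of IBM-1047 as the EBCDIC code page is an assumption.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical fixed set of encodings, identical on every host.
enum class TextEncoding { UTF8, IBM1047 };

// Canonical spelling for each supported encoding.
const char *encodingName(TextEncoding E) {
  switch (E) {
  case TextEncoding::UTF8:
    return "UTF-8";
  case TextEncoding::IBM1047:
    return "IBM-1047";
  }
  assert(false && "unhandled TextEncoding");
  return "";
}

// Lower-case Name and drop '-' and '_' so that "utf8", "UTF-8", and "utf_8"
// all compare equal.
static std::string normalize(const std::string &Name) {
  std::string Key;
  for (unsigned char C : Name)
    if (C != '-' && C != '_')
      Key.push_back(static_cast<char>(std::tolower(C)));
  return Key;
}

// Returns true and sets Result if Name is in the hardcoded list.
bool lookupEncoding(const std::string &Name, TextEncoding &Result) {
  const TextEncoding All[] = {TextEncoding::UTF8, TextEncoding::IBM1047};
  const std::string Key = normalize(Name);
  for (TextEncoding E : All) {
    if (normalize(encodingName(E)) == Key) {
      Result = E;
      return true;
    }
  }
  return false;
}
```

Because the list is compiled in, a name is accepted or rejected the same way on every host, regardless of which iconv (if any) is installed.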

-Eli

> I'd be more concerned that the set of encodings supported by any
> particular implementation of iconv isn't portable. I'd like to
> avoid adding host-specific behavior here; ideally, the list of
> supported encodings should be hardcoded into LLVM, and we should
> support all of those encodings on every host.

An additional problem is that the mappings between encodings can
also differ. Most iconv implementations map the ASCII LF to the
wrong EBCDIC character, which causes a lot of problems. My
implementation therefore includes a conversion table to work around
this problem.
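
As a rough illustration of the table-based approach (not the table from the review), the sketch below covers a small ASCII subset. The thread names neither the EBCDIC code page nor the code points involved. The sketch assumes IBM-1047, and assumes that ASCII LF (0x0A) should become EBCDIC NL (0x15) rather than EBCDIC LF (0x25). All function names are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Placeholder for characters this illustrative table does not cover.
static const uint8_t Unmapped = 0x3F; // EBCDIC SUB

// Fill Table[First..Last] with consecutive EBCDIC codes starting at Start.
// Assumes the source is compiled with an ASCII execution character set.
static void fillRange(std::array<uint8_t, 128> &Table, char First, char Last,
                      uint8_t Start) {
  assert(First <= Last && "range must not be empty");
  for (int C = First; C <= Last; ++C)
    Table[static_cast<std::size_t>(C)] =
        static_cast<uint8_t>(Start + (C - First));
}

// Build a partial ASCII -> EBCDIC table (IBM-1047 assumed).
static std::array<uint8_t, 128> makeAsciiToEbcdicTable() {
  std::array<uint8_t, 128> Table;
  Table.fill(Unmapped);
  Table[static_cast<std::size_t>(' ')] = 0x40;
  fillRange(Table, '0', '9', 0xF0);
  fillRange(Table, 'A', 'I', 0xC1);
  fillRange(Table, 'J', 'R', 0xD1);
  fillRange(Table, 'S', 'Z', 0xE2);
  fillRange(Table, 'a', 'i', 0x81);
  fillRange(Table, 'j', 'r', 0x91);
  fillRange(Table, 's', 'z', 0xA2);
  // Assumption: ASCII LF maps to EBCDIC NL (0x15), not EBCDIC LF (0x25).
  Table[static_cast<std::size_t>('\n')] = 0x15;
  return Table;
}

// Convert an ASCII string byte by byte through the fixed table.
std::string asciiToEbcdic(const std::string &In) {
  static const std::array<uint8_t, 128> Table = makeAsciiToEbcdicTable();
  std::string Out;
  Out.reserve(In.size());
  for (unsigned char C : In)
    Out.push_back(static_cast<char>(C < 128 ? Table[C] : Unmapped));
  return Out;
}
```

With a fixed table like this, the line-feed mapping is decided once in the source and no longer depends on the iconv implementation of the host.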

I think this is also the only way to avoid host-specific behavior.
On some platforms (Windows and macOS), several different iconv
implementations are available. If a library other than the one used
for building is picked up at run time (e.g. after copying the
executable to another machine), then there is no guarantee that the
encodings checked for at configuration time are still available.

Regards,
Kai


From: llvm-dev <llvm-dev-bounces@lists.llvm.org> On Behalf Of Kai Peter Nacke via llvm-dev
Sent: Friday, October 2, 2020 10:45 AM
To: Anton Korobeynikov <anton@korobeynikov.info>
Cc: llvm-dev <llvm-dev@lists.llvm.org>; Yusra Syeda <Yusra.Syeda@ibm.com>
Subject: [EXT] Re: [llvm-dev] [RFC] Adding a char set converter to Support library

My understanding is that dynamic linking should pose no problem, but I
am no lawyer. On Linux, glibc is also licensed under the LGPL, and LLVM
usually links against it.
(There is really no need for us to depend on libiconv. If it is deemed
too risky, then I can drop it.)

From: Anton Korobeynikov <anton@korobeynikov.info>
To: Kai Peter Nacke <kai.nacke@de.ibm.com>
Cc: llvm-dev <llvm-dev@lists.llvm.org>, Yusra Syeda <Yusra.Syeda@ibm.com>
Date: 02.10.2020 19:08
Subject: [EXTERNAL] Re: [llvm-dev] [RFC] Adding a char set converter to Support library

As far as I remember, libiconv is under LGPL. Will this cause any
trouble?


--
With best regards, Anton Korobeynikov
Department of Statistical Modelling, Saint Petersburg State University

_______________________________________________
LLVM Developers mailing list
llvm-dev@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev