RFC: Only change libclang.so SONAME when the ABI changes

Hi,

I would like to propose that we only change the SONAME of libclang.so
(this is the library that contains the C API for clang) when the ABI changes.
Currently, we change the SONAME whenever we bump the major version of LLVM,
but the C API tends to not change that often.

This change will allow operating system maintainers to update the version
of libclang.so in their operating system without forcing rebuilds of
all programs that depend on it.
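
To make the rebuild argument concrete, here is a minimal Python sketch comparing the two SONAME policies. It is a simplified model, not LLVM build code: the function names, the example version numbers, and the idea that a dependent binary needs a rebuild exactly when the name it was linked against differs from the installed SONAME are all assumptions made for illustration.

```python
from typing import Iterable


def soname_per_release(llvm_major: int) -> str:
    """Current policy: the SONAME follows the LLVM major version."""
    return f"libclang.so.{llvm_major}"


def soname_abi_based(llvm_major: int,
                     abi_break_majors: Iterable[int],
                     base_major: int = 13) -> str:
    """Proposed policy: the SONAME only moves at releases that broke the ABI.

    abi_break_majors is a hypothetical list of LLVM major versions in which
    the libclang ABI changed; base_major is the frozen starting SONAME.
    """
    breaks = [m for m in abi_break_majors if base_major < m <= llvm_major]
    return f"libclang.so.{max(breaks, default=base_major)}"


def needs_rebuild(linked_soname: str, installed_soname: str) -> bool:
    # Simplification: a dependent program records the SONAME it was linked
    # against and can only load a library that provides that same name.
    return linked_soname != installed_soname


if __name__ == "__main__":
    linked = "libclang.so.13"  # program built against LLVM 13
    # Upgrade to LLVM 14, assuming no ABI break happened in between.
    print(needs_rebuild(linked, soname_per_release(14)))
    print(needs_rebuild(linked, soname_abi_based(14, abi_break_majors=[])))
```

Under these assumptions, the per-release policy forces a rebuild on every major upgrade, while the ABI-based policy only does so once a release listed in `abi_break_majors` is installed.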

The steps for implementing this change would be:

- Hard-code the SONAME for libclang.so to libclang.so.13 (which is the current
   SONAME).
- Hard-code the symbol versions to LLVM_13 (which is the current symbol version)
   for all existing symbols.
- Add a test case that checks if a new symbol has been added and ensures it has
   the correct symbol version.
- Add a buildbot that uses abi-compliance-checker[1] to ensure that ABI/API does
   not change unexpectedly.
- The next time the ABI of libclang.so is changed, the SONAME will be updated to
   libclang.so.$LLVM_MAJOR_VERSION.
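
The symbol-version test from the steps above could be sketched roughly as follows. This is a hedged Python illustration, not the test from the proposal. It assumes the exports are listed in a GNU ld style version script, that existing symbols must keep their version node, and that a newly added symbol should carry the node of the release that introduces it (for example `LLVM_14`). The parser is deliberately minimal (no nested `extern` blocks), and `clang_hypotheticalNewFunction` is a made-up symbol name.

```python
import re
from typing import Dict, List

# Matches "NODE { body };" or "NODE { body } PARENT;"
NODE_RE = re.compile(r"([\w.]+)\s*\{(.*?)\}\s*[\w.]*\s*;", re.S)


def parse_version_script(text: str) -> Dict[str, str]:
    """Map each exported (global) symbol to its version node."""
    # Strip comments (C-style and '#' to end of line, if present).
    text = re.sub(r"/\*.*?\*/|#[^\n]*", "", text, flags=re.S)
    symbols: Dict[str, str] = {}
    for node, body in NODE_RE.findall(text):
        scope = "global"
        for token in body.replace(":", ": ").replace(";", " ").split():
            if token.endswith(":"):
                scope = token[:-1]          # "global:" or "local:"
            elif scope == "global" and token != "*":
                symbols[token] = node
    return symbols


def check_symbol_versions(baseline: Dict[str, str],
                          current: Dict[str, str],
                          new_version: str) -> List[str]:
    """Return a list of problems; an empty list means the check passed."""
    errors = []
    for sym, ver in baseline.items():
        if sym not in current:
            errors.append(f"{sym}: removed (was {ver})")
        elif current[sym] != ver:
            errors.append(f"{sym}: version changed {ver} -> {current[sym]}")
    for sym, ver in current.items():
        if sym not in baseline and ver != new_version:
            errors.append(f"{sym}: new symbol must use {new_version}, found {ver}")
    return errors


BASELINE = """
LLVM_13 {
  global:
    clang_createIndex;
    clang_disposeIndex;
  local: *;
};
"""

CURRENT = BASELINE + """
LLVM_14 {
  global:
    clang_hypotheticalNewFunction;
} LLVM_13;
"""

if __name__ == "__main__":
    problems = check_symbol_versions(parse_version_script(BASELINE),
                                     parse_version_script(CURRENT),
                                     new_version="LLVM_14")
    print(problems or "symbol versions OK")
```

A real test would more likely inspect the built library's dynamic symbol table rather than the version script, but the comparison logic would be similar.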

What do you think?

-Tom

[1] ABI Compliance Checker

This sounds like a good plan in general (not only for libclang :), but
how strict is this ABI check? Does *any* change (even like adding a new
function) trigger an ABI check error?

E.g., what are the criteria for bumping the version?

-Dimitry

>
> I would like to propose that we only change the SONAME of libclang.so
> (this is the library that contains the C API for clang) when the ABI changes.
> Currently, we change the SONAME whenever we bump the major version of LLVM,
> but the C API tends to not change that often.
>
> This change will allow operating system maintainers to update the version
> of libclang.so in their operating system without forcing rebuilds of
> all programs that depend on it.
>
> The steps for implementing this change would be:
>
> - Hard-code the SONAME for libclang.so to libclang.so.13 (which is the current
> SONAME).
> - Hard-code the symbol versions to LLVM_13 (which is the current symbol version)
> for all existing symbols.
> - Add a test case that checks if a new symbol has been added and ensures it has
> the correct symbol version.
> - Add a buildbot that uses abi-compliance-checker[1] to ensure that ABI/API does
> not change unexpectedly.
> - The next time the ABI of libclang.so is changed, the SONAME will be updated to
> libclang.so.$LLVM_MAJOR_VERSION.
>
> What do you think?

Is there an estimate of how many packages use libclang.so?

Symbol versioning seems fine for Linux glibc and FreeBSD.

> This sounds like a good plan in general (not only for libclang :), but
> how strict is this ABI check? Does *any* change (even like adding a new
> function) trigger an ABI check error?
>
> E.g., what are the criteria for bumping the version?
>
> -Dimitry

Sounds fine for libclang.so. (For the C++ libraries libLLVM-13git.so and
libclang-cpp.so, the ABI changes very frequently, so I don't see
how we could avoid a DT_SONAME bump.)

> I would like to propose that we only change the SONAME of libclang.so
> (this is the library that contains the C API for clang) when the ABI changes.
> Currently, we change the SONAME whenever we bump the major version of LLVM,
> but the C API tends to not change that often.
>
> This change will allow operating system maintainers to update the version
> of libclang.so in their operating system without forcing rebuilds of
> all programs that depend on it.
>
> The steps for implementing this change would be:
>
> - Hard-code the SONAME for libclang.so to libclang.so.13 (which is the current
>   SONAME).
> - Hard-code the symbol versions to LLVM_13 (which is the current symbol version)
>   for all existing symbols.
> - Add a test case that checks if a new symbol has been added and ensures it has
>   the correct symbol version.
> - Add a buildbot that uses abi-compliance-checker[1] to ensure that ABI/API does
>   not change unexpectedly.
> - The next time the ABI of libclang.so is changed, the SONAME will be updated to
>   libclang.so.$LLVM_MAJOR_VERSION.
>
> What do you think?

> Is there an estimate of how many packages use libclang.so?

In Fedora there are 12 packages that use libclang.so.

> Symbol versioning seems fine for Linux glibc and FreeBSD.
>
> This sounds like a good plan in general (not only for libclang :), but
> how strict is this ABI check? Does *any* change (even like adding a new
> function) trigger an ABI check error?
>
> E.g., what are the criteria for bumping the version?
>
> -Dimitry
>
> Sounds fine for libclang.so. (For the C++ libraries libLLVM-13git.so and
> libclang-cpp.so, the ABI changes very frequently, so I don't see
> how we could avoid a DT_SONAME bump.)

Right, it only makes sense to do for a C library.

-Tom

>>
>> I would like to propose that we only change the SONAME of libclang.so
>> (this is the library that contains the C API for clang) when the ABI changes.
>> Currently, we change the SONAME whenever we bump the major version of LLVM,
>> but the C API tends to not change that often.
>>
>> This change will allow operating system maintainers to update the version
>> of libclang.so in their operating system without forcing rebuilds of
>> all programs that depend on it.
>>
>> The steps for implementing this change would be:
>>
>> - Hard-code the SONAME for libclang.so to libclang.so.13 (which is the current
>> SONAME).
>> - Hard-code the symbol versions to LLVM_13 (which is the current symbol version)
>> for all existing symbols.
>> - Add a test case that checks if a new symbol has been added and ensures it has
>> the correct symbol version.
>> - Add a buildbot that uses abi-compliance-checker[1] to ensure that ABI/API does
>> not change unexpectedly.
>> - The next time the ABI of libclang.so is changed, the SONAME will be updated to
>> libclang.so.$LLVM_MAJOR_VERSION.
>>
>> What do you think?
>
> This sounds like a good plan in general (not only for libclang :), but
> how strict is this ABI check? Does *any* change (even like adding a new
> function) trigger an ABI check error?
>

New functions don't trigger an ABI check error. The things I've seen
it flag in the past are: function signature changes, struct size changes,
and enum value changes.
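
The distinction drawn here (additions are fine, changes to existing entities are not) can be illustrated with a small Python model. This is only a sketch of the classification, not how abi-compliance-checker works internally. The API description format, the `demo_*` names, and the `DemoCursor*` structs are hypothetical, and removals are ignored for brevity.

```python
import ctypes
from typing import Dict, List


class DemoCursorV1(ctypes.Structure):
    _fields_ = [("kind", ctypes.c_int), ("data", ctypes.c_void_p)]


class DemoCursorV2(ctypes.Structure):
    # Same struct with one extra pointer member: its size grows.
    _fields_ = [("kind", ctypes.c_int), ("data", ctypes.c_void_p),
                ("extra", ctypes.c_void_p)]


def abi_breaks(old: Dict[str, dict], new: Dict[str, dict]) -> List[str]:
    """Report changes to entities present in both versions.

    Entities that exist only in `new` (e.g. newly added functions) are not
    reported, mirroring the behaviour described above.
    """
    breaks = []
    for name, sig in old["functions"].items():
        if name in new["functions"] and new["functions"][name] != sig:
            breaks.append(f"function signature changed: {name}")
    for name, struct in old["structs"].items():
        if name in new["structs"] and \
                ctypes.sizeof(new["structs"][name]) != ctypes.sizeof(struct):
            breaks.append(f"struct size changed: {name}")
    for name, value in old["enums"].items():
        if name in new["enums"] and new["enums"][name] != value:
            breaks.append(f"enum value changed: {name}")
    return breaks


OLD_API = {
    "functions": {"demo_create": "void *(int, int)"},
    "structs": {"DemoCursor": DemoCursorV1},
    "enums": {"Demo_First": 0, "Demo_Second": 1},
}

# Only adds a function: compatible under this model.
NEW_API_COMPATIBLE = {
    "functions": {"demo_create": "void *(int, int)",
                  "demo_destroy": "void (void *)"},
    "structs": {"DemoCursor": DemoCursorV1},
    "enums": {"Demo_First": 0, "Demo_Second": 1},
}

# Grows a struct and renumbers an enumerator: both are flagged.
NEW_API_BREAKING = {
    "functions": {"demo_create": "void *(int, int)"},
    "structs": {"DemoCursor": DemoCursorV2},
    "enums": {"Demo_First": 0, "Demo_Second": 2},
}

if __name__ == "__main__":
    print(abi_breaks(OLD_API, NEW_API_COMPATIBLE))
    print(abi_breaks(OLD_API, NEW_API_BREAKING))
```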

-Tom

> E.g., what are the criteria for bumping the version?
>
> -Dimitry

I find the following two paragraphs in
https://www.debian.org/doc/debian-policy/ch-sharedlibs.html#run-time-shared-libraries
useful

Hi,

> I would like to propose that we only change the SONAME of libclang.so
> (this is the library that contains the C API for clang) when the ABI changes.
> Currently, we change the SONAME whenever we bump the major version of LLVM,
> but the C API tends to not change that often.
>
> This change will allow operating system maintainers to update the version
> of libclang.so in their operating system without forcing rebuilds of
> all programs that depend on it.
>
> The steps for implementing this change would be:
>
> - Hard-code the SONAME for libclang.so to libclang.so.13 (which is the current
>   SONAME).
> - Hard-code the symbol versions to LLVM_13 (which is the current symbol version)
>   for all existing symbols.
> - Add a test case that checks if a new symbol has been added and ensures it has
>   the correct symbol version.

Here is a patch that implements this first part:
https://reviews.llvm.org/D105527

I wasn't able to figure out how to write the test case I had in mind, so
I left it out. I'm open to suggestions if someone has an idea how to do it.

-Tom

We had some feedback from users that this change was confusing. Also, in practice, having a stable libclang.so ABI does not really help, since the libLLVM.so ABI is not stable, so apps that use both will need to be rebuilt for a new llvm/clang version anyway.

I propose that we revert this change.

Hey all, I just opened a thread about the context for the reversion of this change: "Rationale for removing versioned libclang? --> Middle ground to keep it behind option?"

Not sure where it’s best to discuss this now, but in short, the problems that caused the reversion seem more social than technical, and they didn’t receive much discussion.

As a middle-ground, it would be good to keep the CLANG_SOVERSION information, even if it’s not used by default anymore.