Rationale for removing versioned libclang? --> Middle ground to keep it behind option?

I’m not familiar with phab (and too short on time to start now…), so here’s two branches:

  • one for main (no release note)
  • one for release/15.x (with release note in last commit); full diff

Doing separate commits should also make the review much easier I hope, but I’ll have to leave it to you how you deal with that. Hope this helps nevertheless.

I think I did a diff in phab here for several commits (I am also not a phab expert) - can you check if it looks alright? ⚙ D132486 Revert “libclang.so: Make SONAME the same as LLVM version”

Thanks a lot. Commented there, LGTM

@mgorny @sylvestre do you guys have any thoughts on this?

I don’t think ⚙ D132486 SONAME introduce option CLANG_FORCE_MATCHING_LIBCLANG_SOVERSION is the right approach.
I think we should aim for consistency in terms of packaging of the LLVM toolchain across distros, not more differences.
Giving this option will fragment it a bit more.

Now, about the naming discussion, I think llvm SONAME == clang SONAME, ABI is hard, no need to add more complexity.

Distros are quite insular in what standards they pursue across many different packages (often overruling upstreams directly; e.g. shared vs. static builds, C++ standard versions, build flags for arches and many more). Consistency is good, but can it be expected today to exchange artefacts across distros and have them work across the board…? I’d be very doubtful of that.

Whether the libclang soversion matches between distros is IMO way less impactful than making it impossible for those who are able to deal with different soversions to benefit from the advantages of not having to rebuild & redistribute everything quite so often.

Yes, ABI is hard. So why throw away useful work (and information) by those who take care of it (and take pains not to change it)…? I feel the complexity/utility trade-off is highly subjective here. I’m aware that this affects me equally, but “ABI is hard” is not really a thorough argument IMO.

I don’t find the change to a fixed version (13) particularly useful, it breaks existing software [1] and IMO risks confusion. In the FreeBSD ports context the need to rebuild to use a new version is just not a concern and people typically have several versions of LLVM installed due to differences between versions required for e.g., Mesa and Firefox.

If D132486 is committed, I will build FreeBSD packages with CLANG_FORCE_MATCHING_LIBCLANG_SOVERSION enabled.

[1] Sure, one could argue (and people have) that the software has a broken build system, but the change was not widely telegraphed and software stopped working as a result…

True, but libclang.so has a pretty strict ABI compatibility guarantee, which is one of the reasons why the library exists at all. Doesn’t Debian in fact use libclang.so.1? (Which ideally we would have done in upstream from the beginning.)

Could you be more specific here? I’m not aware of a single breakage caused by this among openSUSE packages. I’m just curious what kind of software makes assumptions about the SO version here and what such an assumption might look like. The packages that I’m aware of are basically only doing -lclang or -l$LIBDIR/libclang.so, then hardcode the builtin header directory from the LLVM version or determine it at runtime (using a call that technically wasn’t meant for this).

I can’t recall immediately and am skeptical I’d find it in my email archives. IIRC it was software that assumed it could depend on manually adding the .so.<LLVM_VERSION> to the link path. I could let the version stay at 13, but doing so provides no obvious value to our users, as the different versions of LLVM are installed at different paths.

On an unrelated note, IMO if we are going to fix the version we should fix it to 100 or something that clearly doesn’t match LLVM releases.

That’s indeed a bit weird, usually you link with the unversioned library (often a link like libclang.so -> libclang.so.X) and let the linker add a NEEDED to .dynamic corresponding to the library’s SONAME. But perhaps this software wanted to support builds against multiple LLVM versions in some way.
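To make the usual flow concrete, here is a minimal sketch of how an unversioned development symlink leads to a versioned dependency. It is only a model: the helper names `make_demo_dir` and `needed_entry` are hypothetical, and the resolved file name stands in for the SONAME, whereas a real linker reads DT_SONAME from the ELF file itself.

```python
import os
import tempfile


def make_demo_dir(soversion: int) -> str:
    """Create a throwaway lib dir with libclang.so -> libclang.so.<soversion>."""
    lib_dir = tempfile.mkdtemp(prefix="libclang-demo-")
    real_name = f"libclang.so.{soversion}"
    # An empty placeholder file stands in for the real shared object.
    open(os.path.join(lib_dir, real_name), "w").close()
    os.symlink(real_name, os.path.join(lib_dir, "libclang.so"))
    return lib_dir


def needed_entry(lib_dir: str, link_name: str) -> str:
    """Return the file name that the unversioned link resolves to.

    Assumption: the file name equals the library's SONAME. A real linker
    takes the NEEDED entry from DT_SONAME, not from the file name.
    """
    resolved = os.path.realpath(os.path.join(lib_dir, link_name))
    return os.path.basename(resolved)


if __name__ == "__main__":
    demo = make_demo_dir(13)
    print(needed_entry(demo, "libclang.so"))
```

Software that follows this pattern never spells out the version number, so it is unaffected by which number the SONAME carries. Only build systems that append `.so.<LLVM_VERSION>` themselves would notice the difference.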

Indeed 13 is an awfully confusing number to get stuck at, but I think as time goes by the dust will settle on this. If we had a time machine or had no worries about the fallout I’d be for going to 1 as Debian seems to have wisely done, but we’ll get used to this.

I have been thinking about this for a bit now - and discussed it a bit with @hansw2000.

Just so we are all on the same page about what we are talking about:

  • abiversion: SOVERSION == ABIVERSION would be 13 in this case
  • llvmversion : SOVERSION == LLVMVERSION would be 15 in this case

In 14 we changed it to be abiversion, but it was reverted quickly after 14 by Tom and @MaskRay because it created a lot of problems and confusion for our users, with several issues filed as linked above. I don’t think this situation has really changed that much, and if we released 15 with abiversion it would probably create similar problems, especially since llvmversion has been the default for quite some time (a month) in main now.

With that in mind, and given that the 15.x release is just 2 weeks away and we don’t want to delay it unless it is absolutely necessary, I think the most pragmatic thing to do is to accept the patch above but switch the default setting to keep llvmversion. This still allows packagers and downstream vendors to keep abiversion.

Yes, this ping-pongs between the releases, but it was reverted for a reason, and while I see the benefits of both sides, I think I would err on the side that has worked for many, many releases and has not created much confusion.

When an option was first suggested I was very against it, since I had the same fears of introducing fragmentation that @sylvestre expressed. But considering where we are in the release cycle, and that committing hard to one side would create problems for someone, I think it’s the most workable idea for now, and we should commit to a solution during the development period leading up to 16. So an option is a punt for me right now.

I really want to get rc3 out soon - so I’ll do the change of the default value to ON (llvmversion by default) and hope that we can come to a consensus around that for 15. And I strongly suggest that people interested in switching to abiversion should advocate for that for the main branch and make sure it becomes the default in good time before 16 branching happens.
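As a rough illustration of the two policies and the proposed default, the sketch below models how the resulting library name would differ. It is not the CMake logic from D132486: the function names are hypothetical, and the only facts taken from this thread are the ABI version 13 and that ON for CLANG_FORCE_MATCHING_LIBCLANG_SOVERSION means llvmversion.

```python
# ABI version quoted in this thread for the abiversion policy.
LIBCLANG_ABI_VERSION = 13


def libclang_soversion(llvm_major: int, force_matching: bool = True) -> int:
    """Pick the SOVERSION for libclang.

    force_matching models CLANG_FORCE_MATCHING_LIBCLANG_SOVERSION:
    True  -> llvmversion (SOVERSION follows the LLVM major version)
    False -> abiversion  (SOVERSION stays at the fixed ABI version)
    """
    return llvm_major if force_matching else LIBCLANG_ABI_VERSION


def libclang_soname(llvm_major: int, force_matching: bool = True) -> str:
    """Build the library name that would result from the chosen policy."""
    return f"libclang.so.{libclang_soversion(llvm_major, force_matching)}"


if __name__ == "__main__":
    for major in (14, 15, 16):
        print(major, libclang_soname(major), libclang_soname(major, False))
```

With the default left at ON, every major release gets a new name, while a downstream vendor that turns the option off keeps the same name across releases.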

I wasn’t talking about exchanging artifacts but more about consistency in terms of usage.
This is why @serge-sans-paille and I started:

with a set of tests to ensure that we have similar behaviors and options in Debian/Ubuntu & Fedora/Redhat

Just to note that “quickly” here means almost 4 months after the release of LLVM 14, and 1 year after the change landed in tree.

I think a switch that defaults to == is still better than nothing (so happy to move forward with that), even though I think it’s unfortunate to change defaults back again without having had time to discuss this thoroughly. Inevitably, this will be used as an argument against changing the default yet again in LLVM 16… :person_shrugging:

Yeah - I can see the issue here and it probably comes back to the “how to form a consensus” question that often comes up with changes like these. It is probably in cases like this where the “quick revert” policy might not be to our benefit.

But I think we have to work with the situation we have right now and in the place we are right now I think this is probably the best choice.

Is there a similar argument to be had for “libLLVM-15.so.1”, shouldn’t this be “libLLVM.so.15” ?

Note our setup is different than typical packaging systems, and is designed to allow runtime version and tool selection to match historic build contexts.

Our tree is dynamically set up at tool update time and has infrastructure to dynamically pick software versions. The dynamically generated tree has stuff like:

libclang.so -> ../sw/3P/llvm/current/lib64/libclang.so
libclang.so.13 -> ../sw/3P/llvm/=libclang=13/lib64/libclang.so.13
libclang.so.8 -> ../sw/3P/llvm/=libclang=8/lib64/libclang.so.8

The “=libclang=##” is a fix for the version numbers not necessarily matching release versions; otherwise we would just use a major version number symbolic link. The most recently updated stuff has priority, so libclang still would have flipped back and forth with toolset respins even with that.
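The naming scheme in the listing can be sketched as a small helper. This is only a guess at the pattern from the three entries shown above: `versioned_link` is a hypothetical name, and the real infrastructure surely does more (priority handling, respins, and so on).

```python
from typing import Tuple


def versioned_link(soversion: int, root: str = "../sw/3P/llvm") -> Tuple[str, str]:
    """Return (link name, link target) for one libclang SO version.

    The "=libclang=<N>" directory is keyed on the SO version rather than
    on the LLVM release, following the pattern in the listing above.
    """
    name = f"libclang.so.{soversion}"
    target = f"{root}/=libclang={soversion}/lib64/{name}"
    return name, target


if __name__ == "__main__":
    for version in (13, 8):
        name, target = versioned_link(version)
        print(f"{name} -> {target}")
```

Keying the directory on the SO version means that two LLVM releases sharing one SO version compete for the same slot, which matches the flipping described above.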

The …/llvm/ directory has symbolic links that point to read-only installs for a particular release.

We don’t load multiple versions of a library in a single binary (no RTLD_LOCAL tricks); we just might call different versions of the same tool, often setting LD_LIBRARY_PATH via wrappers.

We use similar infrastructure (along with path rewriting) to handle the builtin headers and match exact patch level versions.

I’m not sure what the scope of your packaging system is, but libclang.so would not be the only library that doesn’t increase the SO version with every release. Just staying in the field of compilers, my GCC 12 comes with libasan.so.8, libatomic.so.1, libgcc_s.so.1, libgccjit.so.0, libgfortran.so.5, libstdc++.so.6, and a couple of others. On our side there is libc++.so.1.
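A quick way to check this for the GCC libraries listed above is to split each name into stem and major version and compare against the compiler release. The sketch below assumes the simple `<stem>.so.<major>` form; `split_soname` and `tracks_release` are hypothetical helper names.

```python
import re
from typing import List, Tuple

# Matches names of the form <stem>.so.<major>, e.g. libstdc++.so.6.
_SONAME = re.compile(r"^(?P<stem>.+)\.so\.(?P<major>\d+)$")

# Library names quoted in the post above for GCC 12.
GCC12_LIBS = [
    "libasan.so.8",
    "libatomic.so.1",
    "libgcc_s.so.1",
    "libgccjit.so.0",
    "libgfortran.so.5",
    "libstdc++.so.6",
]


def split_soname(soname: str) -> Tuple[str, int]:
    """Split 'libfoo.so.N' into ('libfoo', N)."""
    match = _SONAME.match(soname)
    if match is None:
        raise ValueError(f"not a <stem>.so.<major> name: {soname!r}")
    return match.group("stem"), int(match.group("major"))


def tracks_release(sonames: List[str], release_major: int) -> List[str]:
    """Return the names whose SO version equals the release major version."""
    return [s for s in sonames if split_soname(s)[1] == release_major]


if __name__ == "__main__":
    print(tracks_release(GCC12_LIBS, 12))
```

For the six names listed, none carries the compiler's major version, so tooling that assumes SO version equals release version would already need special handling for them.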

We have patched the file name to libLLVM.so.XX on our end ever since the decision to not do minor releases and keep ABI stability in patch level releases. I would be in favor of doing that upstream.

We’ve had a full release cycle with LLVM version ≠ ABI version, so wouldn’t most of those problems have been solved by now? I’ve spent quite a bit of time on this when 14 came out, but now I only have the usual work that goes into a release. So why is 13 ≠ 15 different from 13 ≠ 14?
