[RFC] Modernize CMake LLVM "Components"/libLLVM Facility

stellaraccident · January 3, 2021, 9:49pm

Hi folks, happy new year!

Proposal:

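  * See comments at the top of LLVMComponents.cmake
    <https://github.com/stellaraccident/llvm-project/blob/newcomponents/llvm/cmake/modules/LLVMComponents.cmake>
    in my fork <https://github.com/stellaraccident/llvm-project/tree/newcomponents>.
  * Draft phab: https://reviews.llvm.org/D94000 (D94000 DRAFT: Teach components to link into shared libs)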

Background:
As I’ve been working on NPCOMP <https://github.com/llvm/mlir-npcomp>, trying to come up with a release flow for MLIR-derived Python projects (see py-mlir-release <https://github.com/stellaraccident/mlir-py-release>), I’ve repeatedly run into issues with how the LLVM build system generates shared libraries. While the problems have been varied, I pattern match most of them to a certain “pragmatic” nature in how components/libLLVM/libMLIR have come to be: in my experience, you can fix most individual dynamic linkage issues with another work-around, but the need for the work-around tends to be rooted in a lack of definition and structure in the libraries themselves, which causes problems that don’t arise when libraries are developed to stricter standards. (This isn’t a knock on anyone – I know how these things tend to grow. My main observation is that we have outgrown the ad-hoc nature of shared libraries in the LLVM build.)

I think I’m hitting this because reasonable Python projects and releases presuppose a robust dynamic linkage story. Also, I use Windows and am very aware that LLVM basically does not support dynamic linking on Windows – and cannot without more structure (and in my experience, this structure would also make dynamic linking more robust on the other platforms).

Several of us got together to discuss this in November <https://llvm.discourse.group/t/meeting-notes-mlir-build-install-and-shared-libraries/2257>. We generally agreed that BUILD_SHARED_LIBS was closer to what we wanted than libLLVM/libMLIR, but the result is really only factored for development (i.e. not every add_library should result in a shared object – the shared library surface should mirror public interface boundaries, whereas add_library mirrors private boundaries). The primary differences between the two are:

  * BUILD_SHARED_LIBS preserves the invariant that every translation unit will be “homed” in one library at link time (either .so/.dll or .a), and the system will never try to link together shared and static dependencies of the same thing (which is what libLLVM/libMLIR do today). It turns out that this is merely a good idea on most platforms but is a core requirement on native Windows (leaving out mingw, which uses some clever and dirty tricks to try to blend the worlds).
  * LLVM_BUILD_LLVM_DYLIB treats libLLVM.so as a “bucket” into which to throw things that might benefit from shared linkage, but end binaries also end up needing to link against the static libraries in case what they want isn’t in libLLVM.so. When this is done just right, it can work (on Unix), but it is very fragile and prone to multiple-definition and other linkage issues that can be extremely hard to track down.
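
To make the “bucket” hazard concrete, here is a minimal CMake sketch. Every name in it (DemoSupport, DemoBucket, demo_tool, and the .cpp files) is made up for illustration; these are not LLVM's actual targets, and the real libLLVM setup is considerably more involved.

```cmake
cmake_minimum_required(VERSION 3.13)
project(bucket_hazard_sketch CXX)

# A static library whose objects will also end up inside the shared "bucket".
# PIC is enabled so its objects can be linked into a shared object on ELF.
add_library(DemoSupport STATIC support.cpp)
set_target_properties(DemoSupport PROPERTIES POSITION_INDEPENDENT_CODE ON)

# The shared "bucket": whatever bucket.cpp references from DemoSupport is
# copied into the resulting shared library.
add_library(DemoBucket SHARED bucket.cpp)
target_link_libraries(DemoBucket PRIVATE DemoSupport)

# The executable links the bucket AND the static library directly, so the
# same translation units can now be present in two places in one process.
add_executable(demo_tool main.cpp)
target_link_libraries(demo_tool PRIVATE DemoBucket DemoSupport)
```

Depending on the platform and on which symbols are exported, a setup like this can link cleanly yet leave two copies of global state, or fail with multiple-definition errors. Under the BUILD_SHARED_LIBS invariant from the first bullet, demo_tool would only ever see DemoSupport through one library.
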

What I did:

1. Well, first, I tried looking the other way for a few months and hoping someone else would fix it.
2. When I started trying to generalize some of the shared library handling for MLIR and NPCOMP, I noted that LLVM_LINK_COMPONENTS (as in named groups of things) is a step in the right direction toward giving the libraries structure, and I found that I could actually rebase all of what LLVM_LINK_COMPONENTS was trying to do on the same facility, relegating the existing LLVM_LINK_COMPONENTS to a name normalization layer on top of a more generic “LLVM Components” facility that enforces stricter layering and offers more control than the old libLLVM.so facility did.
3. I rewrote it twice in progressively more modern CMake and was able to eliminate all of the ad-hoc dependency tracking in favor of straightforward use of INTERFACE libraries and $<TARGET_PROPERTY> generator expressions for selecting static or dynamic component trees based on global flags and the presence (or absence) of per-executable LLVM_LINK_STATIC properties.
     1. Note that since this is rooted only in CMake features and not LLVM macros, out-of-tree, non-LLVM projects should be able to depend on LLVM components in their own targets.
4. I hacked up AddLLVM/LLVM-Build/LLVM-Config to (mostly) use the new facility (leaving out a few things that can be fixed but aren’t conceptual issues), applied a bunch of fixes to the tree that were revealed by stricter checks, and got all related tests passing for LLVM and MLIR (on X86 – some mechanical changes need to be made to other targets) for both dynamic and static builds.
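
For readers unfamiliar with the technique in step 3, the following is a minimal sketch of how an INTERFACE library plus a $<TARGET_PROPERTY> generator expression can pick static or shared linkage per executable. It is not the code in LLVMComponents.cmake: the target names (DemoSupport, DemoSupport_static, DemoSupport_shared, demo_tool) and source files are placeholders, and only the LLVM_LINK_STATIC property name comes from the post. The global-flag half of the selection is left out.

```cmake
cmake_minimum_required(VERSION 3.13)
project(component_selection_sketch CXX)

# Two builds of the same hypothetical component: one static, one shared.
add_library(DemoSupport_static STATIC support.cpp)
add_library(DemoSupport_shared SHARED support.cpp)

# The name that consumers depend on. It has no objects of its own; it only
# forwards to one of the two real libraries.
add_library(DemoSupport INTERFACE)
target_link_libraries(DemoSupport INTERFACE
  # $<TARGET_PROPERTY:prop> without a target name is evaluated against the
  # consuming target, so each executable selects its own linkage. An unset
  # property evaluates to an empty string, which $<BOOL:...> treats as false.
  $<IF:$<BOOL:$<TARGET_PROPERTY:LLVM_LINK_STATIC>>,DemoSupport_static,DemoSupport_shared>
)

# This executable opts in to static linkage; without the property it would
# link against DemoSupport_shared instead.
add_executable(demo_tool main.cpp)
target_link_libraries(demo_tool PRIVATE DemoSupport)
set_target_properties(demo_tool PROPERTIES LLVM_LINK_STATIC ON)
```

Because nothing here relies on LLVM-specific macros, an out-of-tree project could consume such a target with a plain target_link_libraries call, which is the point made in the sub-item of step 3.
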

What I’d like to do:

  * Get some consensus that we’d like to improve things in this area and that the approach I’m taking makes sense. I can do a lot of the work, but I don’t want to waste my time, and this stuff is fragile if we keep it in an intermediate state for too long (I’m already paying this price downstream).
  * Land LLVMComponents.cmake as the basis of the new facility.
  * Finish implementing the “Redirection” feature that would allow us to emulate an aggregate libLLVM as it is today.
  * Start pre-staging the various stricter constraints to the build tree that will be needed to swap AddLLVM to use the new facility.
  * Rewrite component-related AddLLVM/LLVM-Build/LLVM-Config bits in a more principled way to use the new facility (or remove features entirely that are no longer needed) – what I did in the above patch was just a minimal amount of working around for a POC.
  * Agree on whether we should try to have the two co-exist for a time or make a cleaner break with the old.
  * Start applying the facility to downstream projects like MLIR and NPCOMP.

What I would need:

  * Help, testing and expertise. I am reasonably confident in my understanding of how to make shared libraries work and how to use CMake, but the legacy in LLVM here is deep – I likely pattern matched some old features as no longer needed when they actually are (I am not clear at all on how much of LLVM-Config is still relevant).
  * Pointers to who the stakeholders are that I should be coordinating with.

Comments?

Thanks!

Stella

mehdi_amini · January 4, 2021, 4:45am

Looks great! In particular it is interesting to see how more modern CMake features could replace some of the custom-LLVM CMake macros that are likely almost a decade old now.

One thing I wonder about in trying to see BUILD_SHARED_LIBS as something desirable for a production environment: I seem to remember that there were non-trivial performance regressions when using many .so files instead of a single libLLVM.so (even a single libLLVM.so was showing a measurable performance impact for clang IIRC).

stellaraccident · January 4, 2021, 6:58am

It’s a good question, and one for which any hard data I can recall is hopelessly out of date. I know that back in the 32-bit x86 days, there were non-trivial costs related to PIC and the various data access indirections it induced, but my mental model has these costs as having been greatly reduced or eliminated on x64 (and not a factor on other architectures). On Windows, I know, it is very easy to end up with extra call and data access indirections unless exports and imports are paired properly so that the compiler can eliminate them.

In the modern era, aside from fixed startup costs (which are obviously higher for shared libraries), I suspect that the biggest performance impacts will come from missed optimizations and export bloat. It is my understanding that a primary offender on that front is exporting everything with default visibility, which causes too many should-have-been-internal entry points to be fully materialized as black boxes and correspondingly shrinks the scope of any cross-module optimizations. As an example, the X86 component aggregates 5 different libraries. In a BUILD_SHARED_LIBS setup, this would be 5 shared objects that expose a lot of internal symbols and potential for indirection. Since Targets are compiled with visibility=hidden, I did measure the impact of that. IIRC, for a stripped libX86.so with visibility=hidden, the size was about 10MiB, and it was >11MiB for visibility=default. I did not measure the performance impact, but considering even just dynamic-link time, an estimated 10% increase in the exports of one large target is going to have a measurable impact on a short-lived process like clang. The finer the granularity of your shared objects, the more unavoidable exported-symbol bloat you are going to have (in fact, there is a special carve-out in the Targets to export everything in BUILD_SHARED_LIBS mode because it doesn’t work otherwise). The coarser your shared objects, the more fixed startup overhead you will incur.
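
As an illustration of the visibility point, this is roughly how hidden-by-default visibility is expressed with stock CMake properties. It is a sketch under assumptions: DemoTarget and target.cpp are placeholders, and LLVM's Targets set their visibility through LLVM's own build machinery, which this does not reproduce.

```cmake
cmake_minimum_required(VERSION 3.13)
project(visibility_sketch CXX)
include(GenerateExportHeader)

add_library(DemoTarget SHARED target.cpp)

# Hide every symbol by default (-fvisibility=hidden on GCC/Clang), so only
# explicitly annotated entry points are exported from the shared object.
set_target_properties(DemoTarget PROPERTIES
  CXX_VISIBILITY_PRESET hidden
  VISIBILITY_INLINES_HIDDEN ON)

# Writes demotarget_export.h into the build directory. It defines the
# DEMOTARGET_EXPORT macro used to mark the public entry points.
generate_export_header(DemoTarget)
target_include_directories(DemoTarget PUBLIC ${CMAKE_CURRENT_BINARY_DIR})
```

With a setup like this, anything not marked with the export macro stays internal to the shared object, which is the kind of boundary that keeps the export table small.
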

To be clear, what I am proposing is a facility that will let us break the shared-library granularity down to the component level, but I expect most distributions will elect some coarser granularity (up to what libLLVM.so is today). For the Python MLIR distribution, I probably want something like Support, Core, per-target, and OrcJit libraries. Aside from making for potentially smaller packages that can omit components, lazy loading some of that stands to avoid the startup costs of massive shared libraries. I suspect that the sweet spot, performance-wise, is more fine-grained than libLLVM is today and more coarse-grained than BUILD_SHARED_LIBS allows (and it will vary depending on whether the user biases towards modularity over raw size/performance).

Stephen_Neuendorffer · January 4, 2021, 7:38am

I’m curious if you’ve prototyped how this would affect llvm-dependent projects? I’m thinking in particular about CIRCT and NPCOMP, where properly dealing with libMLIR and libCIRCT requires duplicating a lot of boilerplate. Would this simplify things? Generally, I like the idea and would like to see this move forward. (But I’m hardly one of those people with deep experience in the legacy of LLVM here…)

Steve

stellaraccident · January 4, 2021, 7:50am

I stopped short of actually prototyping that (just due to time and scope before checking in with the community), but in fact, those examples (and MLIR) were my primary motivations for seeing if there was any way to generalize the libLLVM machinery: I wanted to use something like LLVM components in the other projects (but more flexible, because then you have to consider different needs). It was a bit of a happy accident that I discovered that, implemented with today’s CMake, LLVM components can be made a lot simpler and more flexible virtually for free.

I think the next place to go on the downstream projects for this is to componentize MLIR. I would probably start by creating an MLIRIR component, MLIRTransforms, and a component per dialect. I believe this could largely be accomplished by swapping calls to add_mlir_library with equivalent calls to something that invokes llvm_component_add_library (possibly with project-local wrappers to enforce extra things and add sugar). I would also need to find the direct library deps and replace them with component deps. We would also delete mlir-shlib, as the component stuff would take care of it.

If that works for MLIR (I don’t see why it wouldn’t), then the same exact thing should work for MLIR derived projects like CIRCT and NPCOMP.

It may be possible to switch the order of operations and apply the new component support to MLIR et al. first. There are complexities, though, when it comes to having that depend on the layering-unclean libLLVM. If possible, I’d like to just fix the whole stack, starting with LLVM.

tstellar · January 4, 2021, 7:04pm

> Land LLVMComponents.cmake
> <https://github.com/stellaraccident/llvm-project/blob/newcomponents/llvm/cmake/modules/LLVMComponents.cmake>
> as the basis of the new facility.

Do you have a proposed list of components yet for LLVM?

It sounds like what you are proposing is BUILD_SHARED_LIBS=ON but with fewer total libraries, is this an accurate summary?

I would prefer for any large change like this that we do not add any net new configuration options to LLVM (meaning if we add a new option we should remove an old one), as we already have too many. Would this be able to replace BUILD_SHARED_LIBS=ON?

- Tom

stellaraccident · January 4, 2021, 7:41pm

> It sounds like what you are proposing is BUILD_SHARED_LIBS=ON but with
> fewer total libraries, is this an accurate summary?

I think that is a reasonable summary for the level that most people care about. It might be a bit pedantic, but what I’m aiming for is for us to be able to define the shared library set to correspond with our notion of component boundaries (follows public APIs), as that is what opens up the ability to optimize them in the future (BUILD_SHARED_LIBS is just a 1:1 add_library call → shared library approach and leaks a lot of private boundaries). Also, it preserves the ability for executables to choose to link statically or dynamically, which is important for some things (and likely will remain so, especially when considering downstream).

> I would prefer for any large change like this that we do not add any net
> new configuration options to LLVM (meaning if we add a new option we
> should remove an old one), as we already have too many. Would this be
> able to replace BUILD_SHARED_LIBS=ON?

Completely agree in the end state. I would like to converge on one configuration option that enables shared linking and then remove the others. I suspect that downstreams may want to customize things a bit more, but we should avoid adding those options to the extent possible in favor of seeing if we can make the default way workable before fragmenting.

Note that BUILD_SHARED_LIBS is a published way in the CMake ecosystem to tell a project to build in shared library mode. If we get this all fixed, we may still want to recognize when users set it and do the right thing (i.e. make it more of an alias). This viewpoint would argue for removing LLVM_BUILD_LLVM_DYLIB and just supporting BUILD_SHARED_LIBS (but with new behavior). Either way, we should keep the variants to a minimum.
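
As a rough sketch of the “alias” idea: a project can recognize BUILD_SHARED_LIBS while still choosing library types explicitly. The option name DEMO_SHARED_COMPONENTS, the DEMO_COMPONENT_TYPE variable, and the DemoCore target are all hypothetical; none of them is an existing or proposed LLVM option.

```cmake
cmake_minimum_required(VERSION 3.13)
project(shared_alias_sketch CXX)

# Hypothetical project-level switch for shared component libraries.
option(DEMO_SHARED_COMPONENTS "Build components as shared libraries" OFF)

# Honor the ecosystem-standard switch by treating it as an alias.
if(BUILD_SHARED_LIBS)
  set(DEMO_SHARED_COMPONENTS ON)
endif()

# Pick the library type explicitly, so that BUILD_SHARED_LIBS does not
# silently turn every add_library() call into its own shared object.
if(DEMO_SHARED_COMPONENTS)
  set(DEMO_COMPONENT_TYPE SHARED)
else()
  set(DEMO_COMPONENT_TYPE STATIC)
endif()

add_library(DemoCore ${DEMO_COMPONENT_TYPE} core.cpp)
```

Configuring with -DBUILD_SHARED_LIBS=ON or -DDEMO_SHARED_COMPONENTS=ON would then take the same path, which keeps the number of distinct variants down.
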

tstellar · January 4, 2021, 8:03pm

     > * Land LLVMComponents.cmake
     > <https://github.com/stellaraccident/llvm-project/blob/newcomponents/llvm/cmake/modules/LLVMComponents.cmake>
     > as the basis of the new facility.

    Do you have a proposed list of components yet for LLVM?

    It sounds like what you are proposing is BUILD_SHARED_LIBS=ON but with
    fewer total libraries, is this an accurate summary?

I think that is a reasonable summary for the level that most people care about. It might be a bit pedantic, but what I'm aiming for is for us to be able to define the shared library set to correspond with our notion of component boundaries (follows public APIs), as that is what opens up the ability to optimize them in the future (BUILD_SHARED_LIBS is just a 1:1 add_library call -> shared library approach and leaks a lot of private boundaries). Also, it preserves the ability for executables to choose to link statically or dynamically, which is important for some things (and likely will remain so, especially when considering downstream).

As part of this change, were you planning to explicitly define what the public APIs are for LLVM? Currently, we just define this as 'everything', which is not great. It would be a nice improvement if we could limit the number of exported symbols. In addition to improving shared library performance, a smaller API would mean fewer fixes we have to reject from the stable branch due to API changes.

    I would prefer for any large change like this that we do not add any net
    new configuration options (meaning if we add a new option we should
    remove an old one) to LLVM as we already have too many. Would this be
    able to replace BUILD_SHARED_LIBS=ON?

Completely agree in the end state. I would like to converge on one configuration option that enables shared linking and then remove the others. I suspect that downstreams may want to customize things a bit more, but we should avoid adding those options to the extent possible in favor of seeing if we can make the default way workable before fragmenting.

Note that BUILD_SHARED_LIBS is a published way in the CMake ecosystem to tell a project to build in shared library mode. If we get this all fixed, we may still want to recognize when users set it and do the right thing (i.e. make it more of an alias). This viewpoint would argue for removing LLVM_BUILD_LLVM_DYLIB and just supporting BUILD_SHARED_LIBS (but with new behavior). Either way, we should keep the variants to a minimum.

I would be in favor of having BUILD_SHARED_LIBS being the only shared library related option that we support, if it produced the new behavior you described (and also libLLVM.so). I know some people (not me though) use BUILD_SHARED_LIBS, because it reduces the build times when just changing a single file, so I think we would need to make sure that anything that replaces it does not regress build times too much.

-Tom

stellaraccident · January 4, 2021, 8:17pm

As part of this change, were you planning to explicitly define what the
public APIs are for LLVM? Currently, we just define this as
‘everything’ which is not great. It would be a nice improvement if we
could limit the number of exported symbols. In addition to improving
shared library performance, a smaller API would mean fewer fixes we have
to reject from the stable branch due to API changes.

I had planned to make it possible to do this at the granularity of a component by use of a new library option (EXPORT_EXPLICIT_SYMBOLS). Then we could crank through and tighten things up where appropriate. Doing it in one step seems a bit too herculean for me, but I would like to lay down a path to get there. Currently, the target components kind of do this by way of an explicit check if building LIBLLVM and then setting visibility to hidden for everything in the lib/Target directory. Since these are visibility-safe, I would remove this carve-out and just mark the component libraries with EXPORT_EXPLICIT_SYMBOLS. We could then extend this pattern to other components in less of an ad-hoc fashion than what we do now.
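A minimal sketch of how such a per-component option might be wired up is below. Only the option name EXPORT_EXPLICIT_SYMBOLS comes from the description above; the demo_add_component function, its SOURCES argument, and the way the option is mapped onto CMake visibility properties are assumptions for illustration, not the code in the draft patch.

```cmake
# Hypothetical helper: declares a shared component and, when asked,
# hides every symbol that is not explicitly marked for export.
function(demo_add_component name)
  cmake_parse_arguments(ARG "EXPORT_EXPLICIT_SYMBOLS" "" "SOURCES" ${ARGN})
  add_library(${name} SHARED ${ARG_SOURCES})
  if(ARG_EXPORT_EXPLICIT_SYMBOLS)
    # Standard CMake properties; on GCC/Clang they translate to
    # -fvisibility=hidden and -fvisibility-inlines-hidden.
    set_target_properties(${name} PROPERTIES
      CXX_VISIBILITY_PRESET hidden
      VISIBILITY_INLINES_HIDDEN ON)
  endif()
endfunction()
```

A visibility-safe component would then be declared with something like `demo_add_component(DemoTargetX86 EXPORT_EXPLICIT_SYMBOLS SOURCES x86.cpp)`, and its public entry points would need an explicit export annotation in the source (for example a default-visibility attribute or an export macro). Components that are not yet visibility-safe simply omit the option and keep exporting everything.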

I would be in favor of having BUILD_SHARED_LIBS being the only shared
library related option that we support, if it produced the new behavior
you described (and also libLLVM.so). I know some people (not me though)
use BUILD_SHARED_LIBS, because it reduces the build times when just
changing a single file, so I think we would need to make sure that
anything that replaces it does not regress build times too much.

+1 - I can verify but I think it will end up being ok. The fan out from library → component tends to not be more than 3-5x, and the largest components link to ~10MiB. For the big ones that are already visibility controlled, it may turn out to be a net savings in link time because currently, when doing fine grained linking, way too much gets exported. We’ll see, but I suspect that the worst case will not be too bad and is still an order of magnitude less than the full static link that drives the current costs.

Michael_Kruse2 · January 5, 2021, 2:05am

Thank you for the proposal. Would you also consider unifying the
library handling of LLVM and clang (and potentially other subprojects)
as well? I found the differences always annoying, e.g. there is no
equivalent to llvm-config in clang, or clangXYZ/MLIRxyz libraries
creating object libraries under some conditions (e.g. to create static
as well as shared libraries in the same build), but LLVM itself not
doing that.

Michael

stellaraccident · January 5, 2021, 3:43am

It would be great if some of the approaches we use in the core llvm/ project help unify things in the others! While I don’t have the bandwidth to go down all of those rabbit holes (I’m responsible for three downstream projects myself), I would be happy to work with folks on the clang side to help improve the situation once we get the core a bit more upgraded in this fashion.
