Supporting LLVM_BUILD_LLVM_DYLIB on Windows

Hello llvm-dev,

One of the current limitations of LLVM on Windows is that you cannot use LLVM_BUILD_LLVM_DYLIB: https://github.com/llvm/llvm-project/blob/main/llvm/tools/llvm-shlib/CMakeLists.txt#L14-L16 I am interested in seeing whether we can lift this limitation. Others in the community also seem interested in being able to use LLVM as a DLL on Windows, and the topic comes up on the mailing lists every so often.

When you build a distribution of an LLVM-based toolchain currently, the result on Windows is ~2GiB for a trimmed-down toolset. This is largely due to the static linking used for all the tools. I would like to be able to use the shared LLVM build for building a toolset on Windows.

Unlike Unix platforms, the default on Windows is that all symbols are treated as dso_local (that is, as if built with -fvisibility=hidden). Symbols which are meant to participate in dynamic linking must be annotated with __declspec(dllexport) inside the module and __declspec(dllimport) outside of it. This is similar to Unix platforms built with -fvisibility=hidden, where __attribute__((__visibility__(...))) controls the same type of behaviour.

For distributions, it would remain valuable to minimize the number of shared objects, both to reduce the number of files that need to be shipped and to minimize the number of cross-module calls, which are not entirely free (i.e. PLT+GOT or IAT costs). At the same time, the number of symbols which can be exported from a single module on Windows is limited to 64K. Experience from MSYS2 indicates that LLVM with all the backends is likely to exceed this count (with a subset of targets, the number is already close to 60K). This means that we may need two libraries on Windows.
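
As a rough illustration of the arithmetic behind that last point, the sketch below computes how many DLLs a given number of exported symbols would need. The constant assumes that the 64K limit means 65535 exports per module, and the symbol counts are illustrative placeholders, not measurements; modulesNeeded is a hypothetical helper name.

```cpp
#include <cstddef>

// Assumed value for the "64K" per-module export limit discussed above.
constexpr std::size_t MaxExportsPerModule = 65535;

// Hypothetical helper: number of DLLs needed to export SymbolCount symbols,
// computed as a ceiling division by the per-module limit.
constexpr std::size_t modulesNeeded(std::size_t SymbolCount) {
  return (SymbolCount + MaxExportsPerModule - 1) / MaxExportsPerModule;
}

// Illustrative counts only: one just under the limit, one over it.
static_assert(modulesNeeded(60000) == 1, "fits in a single DLL");
static_assert(modulesNeeded(70000) == 2, "needs a second DLL");
```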

With the LLVM community being diverse, people often build on different platforms with different configurations, and I am concerned that adding more differences in how we build libraries makes LLVM harder to maintain. I would suggest that we actually change the behavior of the Unix builds to match that of Windows by building with -fvisibility=hidden. Although this is a change, it is not without value. By explicitly marking the interfaces which are vended by a library and making everything else internal, it does enable some potential optimization options for the compiler and linker (to be clear, I am not suggesting that this will have a guaranteed benefit, just that it can potentially enable additional opportunities for optimizations and size reductions). This should incidentally help static linking.

In order to achieve this, we would need to have a module specific annotation to indicate what symbols are meant to be used outside of the module when built in a shared configuration. The same annotation would apply to all targets and is expected to be applied uniformly. This of course has a cost associated with it: the public interfaces would need to be decorated appropriately. However, by having the same behaviour on all the platforms, developers would not be impacted by the platform differences in their day-to-day development. The only time that developers would need to be aware of this is when they are working on the module boundary, that is, changes which do not change the API surface of LLVM would not need to consider the annotations.

Concretely, what I believe is required to enable building with LLVM_BUILD_LLVM_DYLIB on Windows is:

  • introduce module specific decoration (e.g. LLVM_SUPPORT_ABI, …) to mark public interfaces of shared library modules
  • decorate all the public interfaces of the shared library modules with the new decoration
  • switch the builds to use -fvisibility=hidden by default

I believe that these can be done mostly independently and staged in the order specified. Until the last phase, there would be no actual impact on the builds. Staging the work this way would let others experiment with the option while it is under development and would make for an easier path when switching the builds over.
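
To give a feel for the first two steps, here is a minimal sketch of what a module-specific decoration might look like. It is not taken from the actual patch: LLVM_SUPPORT_ABI is the name suggested above, while LLVM_SUPPORT_STATIC, LLVMSupport_EXPORTS, exampleEntryPoint, and exampleHelper are hypothetical names chosen for illustration.

```cpp
// Assumption: the build system defines LLVMSupport_EXPORTS while compiling the
// library itself. It is defined here only so the sketch compiles as one file.
#define LLVMSupport_EXPORTS 1

#if defined(LLVM_SUPPORT_STATIC)
// Static library: nothing crosses a module boundary, so no decoration.
#define LLVM_SUPPORT_ABI
#elif defined(_WIN32)
#if defined(LLVMSupport_EXPORTS)
// Building the DLL: export the symbol.
#define LLVM_SUPPORT_ABI __declspec(dllexport)
#else
// Using the DLL: import the symbol.
#define LLVM_SUPPORT_ABI __declspec(dllimport)
#endif
#else
// ELF/Mach-O: stay visible even when built with -fvisibility=hidden.
#define LLVM_SUPPORT_ABI __attribute__((__visibility__("default")))
#endif

// A public interface carries the decoration.
LLVM_SUPPORT_ABI int exampleEntryPoint(int Value);

// Implementation details stay internal and need no annotation.
namespace {
int exampleHelper(int Value) { return Value * 2; }
} // namespace

int exampleEntryPoint(int Value) { return exampleHelper(Value) + 1; }
```

A consumer of the library would see the same declaration without LLVMSupport_EXPORTS defined, so on Windows the decoration would expand to the dllimport form instead.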

Although this would enable LLVM_BUILD_LLVM_DYLIB on Windows, give us better uniformity between Windows and non-Windows platforms, potentially enable additional optimization benefits, improve binary sizes for a distribution of the toolchain (though less on Linux where distributors are already using the build configuration ignoring the official suggestions in the LLVM guides), and help with runtime costs of the toolchain (by making the core of the tools a shared library, the backing pages can now be shared across multiple instances), it is not entirely without downsides. The primary downsides that I see are:

  • it becomes less enticing to support both LLVM_BUILD_LLVM_DYLIB and BUILD_SHARED_LIBS: while technically possible, interfaces will need to be decorated for both forms of the build
  • LLVM_DYLIB_COMPONENTS becomes less tractable: in theory it is possible to apply enough CPP magic to determine where a symbol is homed, but allowing a symbol to be homed in a shared or static library is significantly more complex
  • BUILD_SHARED_LIBS becomes more expensive to maintain: the decoration is per-module, which requires that we would need to decorate the symbols of each module with module specific annotations as well

One argument that people make for BUILD_SHARED_LIBS is that it reduces the overall time of the build-test cycle. With the combination of lld, DWARF Fission, and LLVM_BUILD_LLVM_DYLIB, I believe that most of the benefits can still be had. The cost of linking all the tools is amortized across the link of a single library, which, while not as cheap as linking a single small library, is offset by the following:

  • LLVM_BUILD_LLVM_DYLIB would not require re-linking all the libraries for each tool.
  • DWARF Fission would avoid the need to relink all of the DWARF information.
  • lld is faster than the gold and bfd linkers

Header changes would still ripple through the system as before, requiring rebuilding the transitive closure. Source file changes do not have the same impact of course.

For those who would like a more concrete example of what a change like this may shape up into: https://reviews.llvm.org/D109192 contains LLVMSupportExports.h, which has the expected structure for declaring the decoration macros; the rest of the change is primarily focused on applying the decoration. Please ignore the CMake changes: they are there to ensure that the CI validates this without changing the configuration and are not intended to be part of the final version of the change.

Hello llvm-dev,

In general, I am in favor of this change and think it would be a good
improvement. A few comments below:

For distributions, it would remain valuable to minimize the number of shared objects, both to reduce the number of files that need to be shipped and to minimize the number of cross-module calls, which are not entirely free (i.e. PLT+GOT or IAT costs). At the same time, the number of symbols which can be exported from a single module on Windows is limited to 64K. Experience from MSYS2 indicates that LLVM with all the backends is likely to exceed this count (with a subset of targets, the number is already close to 60K). This means that we may need two libraries on Windows.

The backends only need to export a handful of symbols, so I don't think
enabling more targets will have that big of an impact on the number
of exported symbols.

Concretely, what I believe is required to enable building with LLVM_BUILD_LLVM_DYLIB on Windows is:
- introduce module specific decoration (e.g. LLVM_SUPPORT_ABI, ...) to mark public interfaces of shared library modules
- decorate all the public interfaces of the shared library modules with the new decoration
- switch the builds to use `-fvisibility=hidden` by default

My recommendation would be to start with a generic decoration (e.g. LLVM_EXPORTED_API)
and decorate all the currently exported symbols (which currently is all of them except
for the lib/Target symbols on Linux). This way you could switch to -fvisibility=hidden
and it would have no impact on library users. I know it's tedious work, but you can split it up
and recruit others (like me) to help make the changes.

Once you have that working we can start trimming down the ABI and determine whether
we actually need to have two libraries. If the full build with a few targets
enabled is already under the limit, I think there is a good chance we can get
a single library to work.

The advantage of this approach is that you can make a lot of progress in a very
non-invasive way.

Although this would enable LLVM_BUILD_LLVM_DYLIB on Windows, give us better uniformity between Windows and non-Windows platforms, potentially enable additional optimization benefits, improve binary sizes for a distribution of the toolchain (though less on Linux where distributors are already using the build configuration ignoring the official suggestions in the LLVM guides), and help with runtime costs of the toolchain (by making the core of the tools a shared library, the backing pages can now be shared across multiple instances), it is not entirely without downsides. The primary downsides that I see are:
- it becomes less enticing to support both LLVM_BUILD_LLVM_DYLIB and BUILD_SHARED_LIBS: while technically possible, interfaces will need to be decorated for both forms of the build

I don't think we should make any changes to the BUILD_SHARED_LIBS build.
This is intended for developer convenience only and is not recommended for
end users or distros[1].

Supporting BUILD_SHARED_LIBS will require annotating more of the API
and will create more work both initially and in ongoing maintenance,
and I don't think it is worth it.

-Tom

[1] Building LLVM with CMake — LLVM 18.0.0git documentation

The backends only need to export a handful of symbols, so I don’t think
enabling more targets will have that big of an impact on the number
of exported symbols.

That is positive at least.

My recommendation would be to start with a generic decoration (e.g. LLVM_EXPORTED_API)
and decorate all the currently exported symbols (which currently is all of them except
for the lib/Target symbols on Linux). This way you could switch to -fvisibility=hidden
and it would have no impact on library users. I know it’s tedious work, but you can split it up
and recruit others (like me) to help make the changes.

I really think that making this generic is only going to cause more churn. In the case we do run into the limit, we are going to have to split the flag. Making the flag less generic now will be easier to extend later but doesn’t cost much now. The application is going to be the same, it is merely the spelling that changes. Since the flag impacts how the library can be called, it seems that ABI might be more appropriate (and Fangrui seemed to agree with that point on the differential). I don’t care too much about the spelling being ABI or API though.

I’m less worried about the tedium (especially since I am experimenting with the possibility of trying to automate at least a large portion of the change).

Once you have that working we can start trimming down the ABI and determine whether
we actually need to have two libraries. If the full build with a few targets
enabled is already under the limit, I think there is a good chance we can get
a single library to work.

The advantage of this approach is that you can make a lot of progress in a very
non-invasive way.

I think that the invasive part is more the annotation than anything else. I don’t expect there will be many places where we will truly have to contend with library structuring.

I don’t think we should make any changes to the BUILD_SHARED_LIBS build.
This is intended for developer convenience only and is not recommended for
end users or distros[1].

If you build with hidden visibility, then BUILD_SHARED_LIBS would generate libraries which do not export any interfaces, which would be a problem. In order to support that, at least on Windows, you absolutely will need to annotate all of the APIs on a per module basis. I suppose that one could then pass additional flags to counteract the defaults to make BUILD_SHARED_LIBS work. I am merely trying to point out that having a large set of libraries with a large API surface and having the flexibility to build them in a multitude of ways has an associated cost and that this will increase that.

Hi Saleem,

I am concerned that your change will increase the maintenance burden for those of us who would prefer to develop without shared libraries. Since it is unclear a priori where the macros will be required, developers will need to build both with and without shared libraries in order to verify that they aren’t breaking the build for shared library users – in effect slowing down the development for folks who prefer to develop without shared libraries.

I think your goal should be achievable without littering the code with macros. Perhaps on Windows you can achieve your goal with a variant of Leonard Chan’s “busybox” proposal [1] with some adjustments to account for a lack of symlink support on Windows. Perhaps something like:

  • Create a <tool>_main() entry point for each tool that does not use llvm::cl to parse options.

  • Create a llvm.dll in the bin directory that links together all the <tool>_main() entry points.

  • Each tool .exe consists of:
    int main(int argc, char **argv) {
      return <tool>_main(argc, argv);
    }

  • Tools that use llvm::cl will need to be linked with all of their code in the .exe for now. However, they can be incrementally switched away from llvm::cl and moved into llvm.dll.
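
The layout in the list above might look roughly like the following. This is only a sketch under stated assumptions: opt_main and optStub are hypothetical names, the entry point body is a placeholder rather than real tool logic, and exporting via a decoration macro (TOOL_ENTRY, also hypothetical) is just one way to make the entry point reachable from the .exe.

```cpp
// --- Part that would be compiled into llvm.dll ---

#if defined(_WIN32)
#define TOOL_ENTRY extern "C" __declspec(dllexport)
#else
#define TOOL_ENTRY extern "C" __attribute__((__visibility__("default")))
#endif

// Hypothetical per-tool entry point. A real tool would run its driver here;
// this placeholder only reports whether any arguments were supplied.
TOOL_ENTRY int opt_main(int argc, char **argv) {
  (void)argv; // unused in this placeholder
  return argc > 1 ? 0 : 1;
}

// --- Part that would be the entire source of the tool's .exe ---

// In a real stub this function would be named main; it is given another name
// here so the sketch stays a single translation unit.
int optStub(int argc, char **argv) { return opt_main(argc, argv); }
```

With this shape, only one symbol per tool has to cross the DLL boundary, which is what keeps the annotation burden small in this variant.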

Peter

[1] https://lists.llvm.org/pipermail/llvm-dev/2021-June/151321.html

+1

I would even go as far as to make libLLVM.dll (and libClang.dll?) the
only (supported) configuration on Windows. The reason is that we
currently cannot properly support plugins on Windows because these
plugins need to know which file to import symbols from.
llvm_add_library has a PLUGIN_TOOL parameter to set this, but
consequently the plugin can only be used with that tool (typically
opt.exe). To have a plugin for another executable like clang.exe,
one would need another plugin binary, and yet another one
for clang-cl.exe, etc. Having a canonical libLLVM.so for all
tools+clang would finally allow us to support plugins on Windows(*).

Also note previous discussion on this, e.g. [1]. [2] already suggests
to introduce component-specific headers, from where
dllimport/dllexport could be controlled.

Michael

(*) IMHO, the plugin system is currently broken under Linux as well.
The plugin may need to link against symbols which simply do not exist
in the target executable because the tool itself does not need them
and the linker has therefore not included them.
[1] [llvm-dev] LLVM_DYLIB and CLANG_DYLIB with MSVC
[2] [llvm-dev] LLVM_DYLIB and CLANG_DYLIB with MSVC

I really think that making this generic is only going to cause more churn. In the case we do run into the limit, we are going to have to split the flag. Making the flag less generic now will be easier to extend later but doesn't cost much now. The application is going to be the same, it is merely the spelling that changes. Since the flag impacts how the library can be called, it seems that ABI might be more appropriate (and Fangrui seemed to agree with that point on the differential). I don't care too much about the spelling being ABI or API though.

OK, that makes sense.

I'm less worried about the tedium (especially since I am experimenting with the possibility of trying to automate at least a large portion of the change).

    Once you have that working we can start trimming down the ABI and determine whether or
    not we actually need to have two libraries or not. If the full build with a few targets
    enabled is already under the limit, I think there is a good chance we can get
    a single library to work.

    The advantage of this approach is that you can make a lot of progress in a very
    non-invasive way.

I think that the invasive part is more the annotation than anything else. I don't expect there will be many places where we will truly have to contend with library structuring.

     > I believe that these can be done mostly independently and staged in the order specified. Until the last phase, it would have no actual impact on the builds. However, by staging it, we could allow others to experiment with the option while it is under development, and allows for an easier path for switching the builds over.
     >
     > Although this would enable LLVM_BUILD_LLVM_DYLIB on Windows, give us better uniformity between Windows and non-Windows platforms, potentially enable additional optimization benefits, improve binary sizes for a distribution of the toolchain (though less on Linux where distributors are already using the build configuration ignoring the official suggestions in the LLVM guides), and help with runtime costs of the toolchain (by making the core of the tools a shared library, the backing pages can now be shared across multiple instances), it is not entirely without downsides. The primary downsides that I see are:
     > - it becomes less enticing to support both LLVM_BUILD_LLVM_DYLIB and BUILD_SHARED_LIBS: while technically possible, interfaces will need to be decorated for both forms of the build

    I don't think we should make any changes to the BUILD_SHARED_LIBS build.
    This is intended for developer convenience only and is not recommended for
    end users or distros[1].

If you build with hidden visibility, then BUILD_SHARED_LIBS would generate libraries which do not export any interfaces, which would be a problem. In order to support that, at least on Windows, you absolutely will need to annotate all of the APIs on a per module basis. I suppose that one could then pass additional flags to counteract the defaults to make BUILD_SHARED_LIBS work. I am merely trying to point out that having a large set of libraries with a large API surface and having the flexibility to build them in a multitude of ways has an associated cost and that this will increase that.

Do you want to support a BUILD_SHARED_LIBS build on Windows? If yes, then
I agree you need to annotate the API. All I'm saying is that if you
don't care about BUILD_SHARED_LIBS, then don't worry about trying to
make the build work with -fvisibility=hidden. Just keep -fvisibility=default
for the BUILD_SHARED_LIBS.

-Tom

Hello llvm-dev,

In general, I am in favor of this change and think it would be a good
improvement. A few comments below:

One of the current limitations on LLVM on Windows is that you cannot use LLVM_BUILD_LLVM_DYLIB: https://github.com/llvm/llvm-project/blob/main/llvm/tools/llvm-shlib/CMakeLists.txt#L14-L16 I am interested in trying to see if we can lift this limitation. There are others in the community that also seem to be interested in seeing LLVM being possible to use as a DLL on Windows and the topic does come up on the mailing lists every so often.

When you build a distribution of a LLVM based toolchain currently, the result on Windows is ~2GiB for a trimmed down toolset. This is largely due to the static linking used for all the tools. I would like to be able to use the shared LLVM build for building a toolset on Windows.

Unlike Unix platforms, the default on Windows is that all symbols are treated as dso_local (that is -fvisibility-default=hidden). Symbols which are meant to participate in dynamic linking are to be attributed as __declspec(dllexport) in the module and __declspec(dllimport) external to the module. This is similar to Unix platforms where __attribute__((__visibility__(...))) controls the same type of behaviour with -fvisibility-default=hidden.

For the case of distributions, it would remain valuable to minimize the number of shared objects to reduce the files that require to be shipped but also to minimize the number of cross-module calls which are not entirely free (i.e. PLT+GOT or IAT costs). At the same time, the number of possible labels which can be exposed from a single module on Windows is limited to 64K. Experience from MSys2 indicates that LLVM with all the backends is likely to exceed this count (with a subset of targets, the number already is close to 60K). This means that it may be that we would need two libraries on Windows.

The backends only need to export a handful of symbols, so I don’t think
enabling more targets will have that big of an impact on the number
of exported symbols.

That is positive at least.

With the LLVM community being diverse, people often build on different platforms with different configurations, and I am concerned that adding more differences in how we build libraries complicates how maintainable LLVM is. I would suggest that we actually change the behavior of the Unix builds to match that of Windows by building with -fvisibility-default=hidden. Although this is a change, it is not without value. By explicitly marking the interfaces which are vended by a library and making everything else internal, it does enable some potential optimization options for the compiler and linker (to be clear, I am not suggesting that this will have a guaranteed benefit, just that it can potentially enable additional opportunities for optimizations and size reductions). This should incidentally help static linking.

In order to achieve this, we would need to have a module specific annotation to indicate what symbols are meant to be used outside of the module when built in a shared configuration. The same annotation would apply to all targets and is expected to be applied uniformly. This of course has a cost associated with it: the public interfaces would need to be decorated appropriately. However, by having the same behaviour on all the platforms, developers would not be impacted by the platform differences in their day-to-day development. The only time that developers would need to be aware of this is when they are working on the module boundary, that is, changes which do not change the API surface of LLVM would not need to consider the annotations.

Concretely, what I believe is required to enable building with LLVM_BUILD_LLVM_DYLIB on Windows is:

  • introduce module specific decoration (e.g. LLVM_SUPPORT_ABI, …) to mark public interfaces of shared library modules
  • decorate all the public interfaces of the shared library modules with the new decoration
  • switching the builds to use -fvisibility-default=hidden by default
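As a rough illustration of the first step, a module-specific decoration might be defined along the following lines. This is a minimal sketch rather than the proposed implementation: LLVM_SUPPORT_ABI is the example name from the list above, while LLVM_SUPPORT_EXPORTS, LLVM_SUPPORT_STATIC, and the demo function are hypothetical names introduced purely for illustration.

```cpp
// Hypothetical: the build system would normally pass this with -D (or /D)
// only while compiling the Support library itself. It is defined inline
// here so that the sketch compiles as a single file.
#define LLVM_SUPPORT_EXPORTS 1

#if defined(LLVM_SUPPORT_STATIC)
// Static build: no decoration is needed.
#  define LLVM_SUPPORT_ABI
#elif defined(_WIN32)
#  if defined(LLVM_SUPPORT_EXPORTS)
// Building the module: export the symbol.
#    define LLVM_SUPPORT_ABI __declspec(dllexport)
#  else
// Consuming the module: import the symbol.
#    define LLVM_SUPPORT_ABI __declspec(dllimport)
#  endif
#else
// ELF/Mach-O: opt the symbol back in when building with hidden visibility.
#  define LLVM_SUPPORT_ABI __attribute__((__visibility__("default")))
#endif

namespace demo {
// A public interface of the module carries the decoration.
LLVM_SUPPORT_ABI int addOne(int X);

// An internal helper carries none and stays local to the module.
inline int helper(int X) { return X; }

int addOne(int X) { return helper(X) + 1; }
} // namespace demo
```

Each shared library module would get its own macro pair in this scheme, so that a header from one module is seen as an import while a different module is being built.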

My recommendation would be to start with a generic decoration (e.g. LLVM_EXPORTED_API)
and decorate all the currently exported symbols (which currently is all of them except
for the lib/Target symbols on Linux). This way you could switch to -fvisibility-default=hidden
and it would have no impact on library users. I know it’s tedious work, but you can split it up
and recruit others (like me) to help make the changes.

I really think that making this generic is only going to cause more churn. In the case we do run into the limit, we are going to have to split the flag. Making the flag less generic now will be easier to extend later but doesn’t cost much now. The application is going to be the same, it is merely the spelling that changes. Since the flag impacts how the library can be called, it seems that ABI might be more appropriate (and Fangrui seemed to agree with that point on the differential). I don’t care too much about the spelling being ABI or API though.

OK, that makes sense.

I’m less worried about the tedium (especially since I am experimenting with the possibility of trying to automate at least a large portion of the change).

Once you have that working we can start trimming down the ABI and determine whether or
not we actually need to have two libraries. If the full build with a few targets
enabled is already under the limit, I think there is a good chance we can get
a single library to work.

The advantage of this approach is that you can make a lot of progress in a very
non-invasive way.

I think that the invasive part is more the annotation than anything else. I don’t expect there will be many places where we will truly have to contend with library structuring.

I believe that these can be done mostly independently and staged in the order specified. Until the last phase, it would have no actual impact on the builds. However, by staging it, we could allow others to experiment with the option while it is under development, and allows for an easier path for switching the builds over.

Although this would enable LLVM_BUILD_LLVM_DYLIB on Windows, give us better uniformity between Windows and non-Windows platforms, potentially enable additional optimization benefits, improve binary sizes for a distribution of the toolchain (though less on Linux where distributors are already using the build configuration ignoring the official suggestions in the LLVM guides), and help with runtime costs of the toolchain (by making the core of the tools a shared library, the backing pages can now be shared across multiple instances), it is not entirely without downsides. The primary downsides that I see are:

  • it becomes less enticing to support both LLVM_BUILD_LLVM_DYLIB and BUILD_SHARED_LIBS: while technically possible, interfaces will need to be decorated for both forms of the build

I don’t think we should make any changes to the BUILD_SHARED_LIBS build.
This is intended for developer convenience only and is not recommended for
end users or distros[1].

If you build with hidden visibility, then BUILD_SHARED_LIBS would generate libraries which do not export any interfaces, which would be a problem. In order to support that, at least on Windows, you absolutely will need to annotate all of the APIs on a per module basis. I suppose that one could then pass additional flags to counteract the defaults to make BUILD_SHARED_LIBS work. I am merely trying to point out that having a large set of libraries with a large API surface and having the flexibility to build them in a multitude of ways has an associated cost and that this will increase that.

Do you want to support a BUILD_SHARED_LIBS build on Windows? If yes, then
I agree you need to annotate the API. All I’m saying is that if you
don’t care about BUILD_SHARED_LIBS, then don’t worry about trying to
make the build work with -fvisibility=hidden. Just keep -fvisibility=default
for the BUILD_SHARED_LIBS.

Personally, no, I’m not too interested in BUILD_SHARED_LIBS, though one special case where I think it would be nice is LLVMSupport. I’m currently interested in having LLVM and clang as DLLs.

Hi Saleem,

I am concerned that your change will increase the maintenance burden for those of us who would prefer to develop without shared libraries. Since it is unclear a priori where the macros will be required, developers will need to build both with and without shared libraries in order to verify that they aren’t breaking the build for shared library users – in effect slowing down the development for folks who prefer to develop without shared libraries.

Failure to annotate the API wouldn’t break the build; it would mean that the API is not available. If there are no users of the API outside of the module, everything would continue to work. It only matters if there are users of the API outside of the module. However, that implicitly tells you what needs to be annotated a priori.

I think your goal should be achievable without littering the code with macros.

In order to support that, we would need a secondary source of truth: a text file with the decorated names of any exported function. Such a model IMO is far worse. The name decoration scheme is not universal, and not in LLVM’s control (Microsoft’s scheme is owned by Microsoft and is subject to change). But yes, theoretically, a secondary source of truth could achieve this.

Perhaps on Windows you can achieve your goal with a variant of Leonard Chan’s “busybox” proposal [1] with some adjustments to account for a lack of symlink support on Windows. Perhaps something like:

I’d like to be able to link this into server processes and tools with potential for dynamic loading. Additionally, this would make execution of the tools significantly more expensive (which is also why I’m interested in a dual library approach).

If I’m mistaken about the multicall binary approach, perhaps we should be looking at removing the library options and replacing them with the multicall binary?

Let me start by saying that I think this is important work that you are undertaking. This is a huge task, and if you pull it off you will be a hero. Do you have any notion of how big the distribution would be on Windows with the DLL build? 2 GB is ridiculous!

At a previous workplace, we had a similar problem, and we solved it in much the way that you propose: by having a magic export macro that we decorated all external symbols with. The main problem was the human factor: there are those that don’t develop on Windows (we didn’t disable exporting everything by default on Linux and Mac), and it was a constant battle to get these developers to use the export macro. If we disable exporting everything on non-Windows platforms, that will at least reduce the number of people that fail to use the macro (as they will get linker failures and presumably the build bots will complain). However, there will always be a contingent of people saying things like “This problem that you are solving doesn’t affect me, and now I am being asked to do extra work! I don’t like this!” My personal opinion is that:


class LLVM_SUPPORT_ABI Foo { …

… is not materially harder to write than:


class Foo { …

… and this sort of stuff is just a fact of life in writing C++ code that is portable between Windows and Linux.

However, if you don’t dynamically link in your day-to-day work, then it’s easy to mess this up. Is it void LLVM_SUPPORT_ABI bar() or LLVM_SUPPORT_ABI void bar()? Do I have to decorate class members, or just the class? When can I omit it? Do I have to decorate templated code? If you’re statically linking, LLVM_SUPPORT_ABI preprocesses to nothing, so you can put it literally anywhere and your code will compile. I can’t think of how to do it offhand, and maybe it’s not possible, but if the system could somehow be rigged to fail to compile if you do it wrong, even for statically linked builds, that would be nice. If this could be done, it would also help with the transition. At some point you’re going to flip the switch on this and everybody is going to get random breakages for a few days while they hunt down the stragglers.
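For two of the placement questions above, here is a small sketch of spellings that, as far as I know, both the GNU attribute syntax and __declspec accept. DEMO_ABI is a hypothetical stand-in for a per-module macro such as LLVM_SUPPORT_ABI, and the sketch deliberately says nothing about templates or about when members need their own decoration.

```cpp
#include <string>

// Hypothetical stand-in for a per-module macro such as LLVM_SUPPORT_ABI.
#if defined(_WIN32)
#  define DEMO_ABI __declspec(dllexport)
#else
#  define DEMO_ABI __attribute__((__visibility__("default")))
#endif

// Free function: the decoration leads the declaration.
DEMO_ABI int bar(int X);

// Class: the decoration sits between the class-key and the class name.
class DEMO_ABI Foo {
public:
  std::string name() const;
};

// The definitions do not need to repeat the decoration.
int bar(int X) { return X * 2; }
std::string Foo::name() const { return "Foo"; }
```

Putting the macro first on a function declaration has the advantage that it reads the same way for every return type, which may make an automated check easier to write.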

All I can say is: “good luck, I’ll be rooting for you!”

Thanks,

Chris Tetreault

Hi Saleem,

I am concerned that your change will increase the maintenance burden for those of us who would prefer to develop without shared libraries. Since it is unclear a priori where the macros will be required, developers will need to build both with and without shared libraries in order to verify that they aren’t breaking the build for shared library users – in effect slowing down the development for folks who prefer to develop without shared libraries.

Failure to annotate the API wouldn’t break the build; it would mean that the API is not available. If there are no users of the API outside of the module, everything would continue to work. It only matters if there are users of the API outside of the module. However, that implicitly tells you what needs to be annotated a priori.

It will break the build if I add code to a tool that calls an API that isn’t exported. Because of inlining etc it may not be obvious that a particular API needs to be exported. Hence the need for two builds to check for these problems.

I think your goal should be achievable without littering the code with macros.

In order to support that, we would need a secondary source of truth: a text file with the decorated names of any exported function. Such a model IMO is far worse. The name decoration scheme is not universal, and not in LLVM’s control (Microsoft’s scheme is owned by Microsoft and is subject to change). But yes, theoretically, a secondary source of truth could achieve this.

This was not my proposal. The only exports would be:

<tool name 1>_main
<tool name 2>_main
<tool name 3>_main
etc.

And that can be very easily managed simply by exporting the *_main functions, e.g. via dllexport.
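To make the shape of this alternative concrete, here is a minimal sketch, assuming one exported entry point per tool and a stub executable that forwards to it. The names demo_nm_main, demo_nm_stub, and DEMO_TOOL_EXPORT are hypothetical, and both halves are shown in one file only so that the sketch compiles on its own.

```cpp
#include <cstdio>

// Hypothetical export macro used only for the per-tool entry points.
#if defined(_WIN32)
#  define DEMO_TOOL_EXPORT extern "C" __declspec(dllexport)
#else
#  define DEMO_TOOL_EXPORT extern "C" __attribute__((__visibility__("default")))
#endif

// Would live in the shared library: the former main() of the tool, renamed.
// Everything it calls can stay internal to the library.
DEMO_TOOL_EXPORT int demo_nm_main(int argc, char **argv) {
  std::printf("%s invoked with %d argument(s)\n", argv[0], argc - 1);
  return 0;
}

// Would live in the stub executable: its real main() would do nothing more
// than forward its arguments like this.
int demo_nm_stub(int argc, char **argv) { return demo_nm_main(argc, argv); }
```

With this layout the export list grows with the number of tools rather than with the size of the API surface.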

Perhaps on Windows you can achieve your goal with a variant of Leonard Chan’s “busybox” proposal [1] with some adjustments to account for a lack of symlink support on Windows. Perhaps something like:

I’d like to be able to link this into server processes and tools with potential for dynamic loading.

That seems a little too open ended, and at least has a higher cost/benefit ratio than just solving the problem of 2GiB of bloat from tools, which can be solved in a much less intrusive way than the export macros.

Additionally, this would make execution of the tools significantly more expensive (which is also why I’m interested in a dual library approach).

As long as the only exports are the *_main functions, the code in the .dll would be basically the same as in the .exe, so I don’t see how it would be more expensive.

If I’m mistaken about the multicall binary approach, perhaps we should be looking at removing the library options and replacing them with the multicall binary?

Naively making it a multicall binary on Windows would hit the problem of lack of reliable symlink support, hence the proposal to make the tools stub .exes that just call into a .dll.

Peter

I discussed this with Saleem offline. Although I still think there is scope for exploring alternative approaches as described above, it seems neither of us is willing or has the time to pursue it, so I won’t stand in the way here.

My concern remains that the rules for updating the annotations may be non-obvious at times. If the burden for updating the annotations were placed on those who care about the shared library builds, that may make things easier for day-to-day development. Perhaps one way to do that would be for the annotations to be considered “peripheral tier” in terms of our support policy, so that they aren’t tracked by normal CI and only those who care about the shared library build are responsible for updating them.

Peter

I had a thought and two questions related to this.

The thought:

We informally have three types of APIs in LLVM components. We have stable-ish C APIs, unstable C++ APIs that are expected to be used outside the component and unstable C++ APIs that are internal to the component.

In some (but not all) cases we make a technical distinction between internal to the component APIs by putting the headers for those APIs in the lib folder beside the implementation.

If we’re talking about annotating APIs for symbol export control (which I’m 100% in favor of), should we also consider a more formal designation for library-internal APIs?

One thought I had: should we adopt a policy where all APIs in the llvm namespace are required to be annotated for export, but APIs in the llvm::internal namespace are not?

And the open questions:

(1) Are there changes to the MSVC or Clang-CL toolchains that we could push for/make ourselves that would make this easier to maintain?
(2) Can we implement a clang-tidy check for however we want this to be done, and enable it as part of the LLVM clang-tidy configuration? (Surely the technical answer here is yes, it is just some amount of work)

-Chris

I think that this is likely far more heavy handed, but would absolutely help to automate. I think that if we adopt this and extend clang’s compilation database, the attributes could be added as a post commit hook.

That said, I’m not sure that predicating the improvement for Windows on such a large change to LLVM policy is something that’s entirely reasonable. I think that this happening over time and revisiting the implementation subsequently is fine though.

And the open questions:

(1) Are there changes to the MSVC or Clang-CL toolchains that we could push for/make ourselves that would make this easier to maintain?

One change that clang could make to help with this is to add the linker invocations to compile_commands.json. Additionally, we would need the module name for the output at compile time.

(2) Can we implement a clang-tidy check for however we want this to be done, and enable it as part of the LLVM clang-tidy configuration? (Surely the technical answer here is yes, it is just some amount of work)

I don’t think that there’s a good way to do this in reality. The problem is that you do not have an automated way to determine what is public and what is not. That said, I do have https://github.com/compnerd/ids to at least help with the annotation. It’s not complete and still would require some further refinement.

I had a thought and two questions related to this.

The thought:

We informally have three types of APIs in LLVM components. We have stable-ish C APIs, unstable C++ APIs that are expected to be used outside the component and unstable C++ APIs that are internal to the component.

In some (but not all) cases we make a technical distinction between internal to the component APIs by putting the headers for those APIs in the lib folder beside the implementation.

If we’re talking about annotating APIs for symbol export control (which I’m 100% in favor of), should we also consider a more formal designation for library-internal APIs?

One thought I had: should we adopt a policy where all APIs in the llvm namespace are required to be annotated for export, but APIs in the llvm::internal namespace are not?

I think that this is likely far more heavy handed, but would absolutely help to automate. I think that if we adopt this and extend clang’s compilation database, the attributes could be added as a post commit hook.

That said, I’m not sure that predicating the improvement for Windows on such a large change to LLVM policy is something that’s entirely reasonable. I think that this happening over time and revisiting the implementation subsequently is fine though.

Fair enough.

And the open questions:

(1) Are there changes to the MSVC or Clang-CL toolchains that we could push for/make ourselves that would make this easier to maintain?

One change that clang could make to help with this is to add the linker invocations to compile_commands.json. Additionally, we would need the module name for the output at compile time.

That would be a very interesting enhancement.

(2) Can we implement a clang-tidy check for however we want this to be done, and enable it as part of the LLVM clang-tidy configuration? (Surely the technical answer here is yes, it is just some amount of work)

I don’t think that there’s a good way to do this in reality. The problem is that you do not have an automated way to determine what is public and what is not. That said, I do have https://github.com/compnerd/ids to at least help with the annotation. It’s not complete and still would require some further refinement.

I actually disagree here. We do have an automated way to determine what is currently public and what is not, although we may have an overly broad definition of public. Today, any symbol declared under the include directory for a project is public. Whether or not it should be public is a different issue. We currently treat all of those symbols as public exports from component libraries.

-Chris

Then we don’t really disagree :). We can automate the annotation, it just assumes everything is public.

That is not a very good approach, but it is what I am currently proposing. This is maintaining the status quo with the other platforms. I simply see it as a means to something working rather than the desired state. That is, it gets us to an intermediate stage from where we should further refine the implementation.

Hi,

I have been working on this lately and I wanted to provide an update with some of the challenges that I’m facing.

My goal is to explicitly export symbols for all of libLLVM.so on both Windows and Linux, so I’ve been trying to find a way to achieve this with a single macro that will expand to __attribute__((__visibility__("default"))) on Linux and __declspec(dllexport) on Windows. My first approach was to add this macro to class definitions:

class LLVM_ABI foo {
};

This worked on Linux, and I was able to do a complete build of llvm, clang, lld, lldb, and mlir with these attributes. However, when I went to test this on Windows, the build failed due to C2280: attempting to reference a deleted function. This seems to be a problem with msvc generating and exporting a copy constructor for classes with containers of std::unique_ptrs. I was able to fix a few of these errors with some suggestions from online sources, but there were more that I was unable to fix.
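For readers unfamiliar with this failure, here is a minimal sketch of the pattern described above, together with one commonly suggested workaround: spelling out the deleted copy operations so that exporting the class does not ask the compiler to generate them. The class Pool and the macro DEMO_ABI are hypothetical and not taken from the LLVM tree, and, as noted above, this kind of fix did not resolve every instance.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical single export macro of the kind described above.
#if defined(_WIN32)
#  define DEMO_ABI __declspec(dllexport)
#else
#  define DEMO_ABI __attribute__((__visibility__("default")))
#endif

class DEMO_ABI Pool {
public:
  Pool() = default;

  // A vector of unique_ptr cannot be copied. Stating that explicitly keeps
  // an exporting compiler from trying to emit the implicit copy operations.
  Pool(const Pool &) = delete;
  Pool &operator=(const Pool &) = delete;

  // Moves remain available.
  Pool(Pool &&) = default;
  Pool &operator=(Pool &&) = default;

  void add(std::string S) {
    Items.push_back(std::make_unique<std::string>(std::move(S)));
  }
  std::size_t size() const { return Items.size(); }

private:
  std::vector<std::unique_ptr<std::string>> Items;
};
```

The cost of this workaround is that every affected class needs the extra declarations, which may be part of why it does not scale well across a code base of this size.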

Because of this issue, I decided to change my approach and annotate the member functions instead of the classes themselves. It turns out, though, that this does not work on Linux, because neither clang nor gcc exports the vtable unless the visibility attribute is attached to the class. msvc will export the vtable if any member functions have dllexport.

My next attempt after this was to try and annotate both the class and its member functions. This worked for gcc/clang, but msvc does not support this.

So now I’m stuck. I don’t really see a way to add visibility macros to the headers in a way that is portable across both Windows and Linux. We could solve this by having separate macros for the two OSes, but that seems like it would be too invasive and hard to maintain. I’ve also considered trying to use a linker script, but I’m not sure yet exactly how to do that.

I’m looking for advice on how to move forward now, especially if someone has solutions to the unique_ptr issue on Windows or the vtable issue on Linux.

Thanks,
Tom

Could you have a macro for the classes different than the macro for the method? Then we could just enable one or the other depending on the toolchain support?
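A minimal sketch of that suggestion, assuming two hypothetical macros (DEMO_CLASS_ABI and DEMO_MEMBER_ABI) where only one of the pair expands to anything on a given toolchain. Whether this covers every case in the tree is untested here; the class is invented for illustration.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

#if defined(_WIN32)
// Windows toolchains: decorate the members and leave the class bare, so no
// implicit copy operations are exported.
#  define DEMO_CLASS_ABI
#  define DEMO_MEMBER_ABI __declspec(dllexport)
#else
// gcc/clang: decorate the class so that the vtable is exported too.
#  define DEMO_CLASS_ABI __attribute__((__visibility__("default")))
#  define DEMO_MEMBER_ABI
#endif

class DEMO_CLASS_ABI Registry {
public:
  DEMO_MEMBER_ABI Registry();
  DEMO_MEMBER_ABI virtual ~Registry();
  DEMO_MEMBER_ABI void add(std::string Name);
  DEMO_MEMBER_ABI std::size_t size() const;

private:
  // The kind of member that triggered C2280 with class-level dllexport.
  std::vector<std::unique_ptr<std::string>> Names;
};

Registry::Registry() = default;
Registry::~Registry() = default;
void Registry::add(std::string Name) {
  Names.push_back(std::make_unique<std::string>(std::move(Name)));
}
std::size_t Registry::size() const { return Names.size(); }
```

The header is written once with both macros in place, and the per-toolchain difference is confined to the macro definitions.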

I know googletest uses class annotations, and has unique_ptr, and works fine on Windows, so I poked around in there a little bit.

If you look in third-party/unittest/googletest/include/gtest/gtest.h there’s a class AssertionResult which has a std::unique_ptr<std::string> message_; member, and the GTEST_API_ annotation on the class. There are a couple other classes in there with const std::unique_ptr<const std::string> (don’t know if the const makes any difference).
The definition of GTEST_API_ is in googletest/include/gtest/internal/gtest-port.h.

Hopefully with a working example, that will help.

Here is the problem I’m running into with unique_ptr. It actually needs to be some kind of container of unique_ptrs in order to trigger the error.

Would it work to annotate the class with type_visibility(default) for Linux and nothing for Windows? type_visibility is a Clang-only thing though. Maybe we could add visibility(default) to the class instead only for gcc?
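One way this idea could be spelled, as a sketch under the assumption that the class-level macro selects type_visibility on Clang, plain visibility on gcc, and nothing on Windows. DEMO_TYPE_ABI, DEMO_FUNC_ABI, and Shape are hypothetical names. My understanding is that type_visibility covers the type's own symbols (vtable, type info) but not its members, which is why the members carry a separate decoration here.

```cpp
#if defined(_WIN32)
// Windows: nothing on the class, dllexport on the members.
#  define DEMO_TYPE_ABI
#  define DEMO_FUNC_ABI __declspec(dllexport)
#elif defined(__clang__)
// Clang: type_visibility for the class, visibility for the members.
#  define DEMO_TYPE_ABI __attribute__((__type_visibility__("default")))
#  define DEMO_FUNC_ABI __attribute__((__visibility__("default")))
#else
// gcc has no type_visibility, so fall back to visibility on the class.
#  define DEMO_TYPE_ABI __attribute__((__visibility__("default")))
#  define DEMO_FUNC_ABI __attribute__((__visibility__("default")))
#endif

class DEMO_TYPE_ABI Shape {
public:
  DEMO_FUNC_ABI virtual ~Shape();
  DEMO_FUNC_ABI virtual int sides() const;
};

Shape::~Shape() = default;
int Shape::sides() const { return 0; }
```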