Using LLVM code in projects/compiler-rt

Hi,

tl;dr How can I include LLVM headers and use code from libLLVM*.a files when building compiler-rt libraries?

I’d like to create a symbolizer that would be used in AddressSanitizer (ASan) and ThreadSanitizer (TSan) tools which are now part of
projects/compiler-rt (as a first step, symbolizer should be able to return file/line info for a given address).
I’d like to use and gradually extend the interface in “llvm/DebugInfo/DIContext.h”
I see two obstacles:

  1. How can I include LLVM headers in source files inside projects/compiler-rt?
    As a local workaround, I modify configuration for ASan/TSan runtimes as follows:
    CFLAGS.asan-x86_64 := $(CFLAGS) -I$(PathToLLVMInclude) -I$(PathToLLVMBuildInclude)
    Note that I need both “/path/to/llvm_checkout/include” and “/path/to/llvm_build/include”, because some LLVM headers are generated when
    LLVM is built (e.g. “llvm/Support/DataTypes.h”). This looks very broken, as paths are hardcoded, and LLVM headers are
    not included in dependencies for ASan/TSan runtime.

  2. How can I use LLVM libraries when building compiler-rt?
    Currently, compiler-rt builds (linux) runtime and stores it as an .a file in “/path/to/llvm_build/Release+Asserts/lib/clang/3.2/linux”.
    Can I somehow make this runtime contain compiled LLVM libraries as well?
    Currently I need libLLVMDebugInfo.a and libLLVMSupport.a.

For now, I patch the Clang driver so that it passes all the necessary LLVM libraries to the linker whenever
it sees the -faddress-sanitizer or -fthread-sanitizer flags, but this also looks hacky.

TIA

Hi,

tl;dr How can I include LLVM headers and use code from libLLVM*.a files when building compiler-rt libraries?

Currently, there isn’t a way, but this is something I very much want to see supported. I started working on this w/ the CMake build system, but had to go work on other things. I do hope to get back to it, and if this is on the critical path, I can try to talk you through what needs to be done and review patches. I may even have some time to work on it soon.

I’d like to create a symbolizer that would be used in AddressSanitizer (ASan) and ThreadSanitizer (TSan) tools which are now part of
projects/compiler-rt (as a first step, symbolizer should be able to return file/line info for a given address).
I’d like to use and gradually extend the interface in “llvm/DebugInfo/DIContext.h”

I’ve discussed this with several others before (including Chris IIRC), and he strongly agreed with this approach FWIW.

I see two obstacles:

  1. How can I include LLVM headers in source files inside projects/compiler-rt?
    As a local workaround, I modify configuration for ASan/TSan runtimes as follows:
    CFLAGS.asan-x86_64 := $(CFLAGS) -I$(PathToLLVMInclude) -I$(PathToLLVMBuildInclude)
    Note that I need both “/path/to/llvm_checkout/include” and “/path/to/llvm_build/include”, because some LLVM headers are generated when
    LLVM is built (e.g. “llvm/Support/DataTypes.h”). This looks very broken, as paths are hardcoded, and LLVM headers are
    not included in dependencies for ASan/TSan runtime.

The CMake build system should make much of this Just Work by building the runtime as just another LLVM sub-project. I increasingly think this is the correct approach rather than trying to bootstrap the runtime library (despite being firmly on the other side of things originally). It simplifies a large number of issues, and I think all of my prior concerns were misplaced.

I’m not an expert in the Makefile build system, but my suspicion is that the change needed to get into the same position as the (rudimentary and incomplete) CMake build support is to switch compiler-rt, and especially the *san libraries, to use the normal LLVM-subproject build system structure.

However, the current Makefile build for compiler-rt has a huge special feature that won’t play nicely with this: automatic cross-compilation. In order to make this really work, in either the Makefile or CMake build systems, we’re going to have to teach the root LLVM build system how to do on-demand cross-compilation of selected LLVM libraries, so that compiler-rt can build a collection of libraries for different target platforms. From my perspective, that’s the big hurdle. It’s also likely to be quite a bit of work to figure out in either CMake or Makefiles, and I’m only really competent with the CMake side of things…

  2. How can I use LLVM libraries when building compiler-rt?
    Currently, compiler-rt builds (linux) runtime and stores it as an .a file in “/path/to/llvm_build/Release+Asserts/lib/clang/3.2/linux”.
    Can I somehow make this runtime contain compiled LLVM libraries as well?
    Currently I need libLLVMDebugInfo.a and libLLVMSupport.a.

This shouldn’t be a hard problem to solve at the build system level once you can compile against the libraries. You just have to re-run ar to fuse the archives. Let’s solve this if & when we get the other solved.

Hi,

tl;dr How can I include LLVM headers and use code from libLLVM*.a files when
building compiler-rt libraries?

LLVM and compiler-rt have different licenses (compiler-rt is dual
licensed with the MIT license). Would that be a problem?

TIA

Cheers,
Rafael

This is a good point…

Chris, I’m wondering whether putting all of the runtimes into ‘compiler-rt’ is really the best structure at this point… What would seem a somewhat less awkward fit to me these days:

  • a runtimes project which contains various runtime libraries, under the usual LLVM license
  • the original ‘compiler-rt’ bits either as a sub-library of this which happens to be buildable stand-alone and dual-licensed, or as its own entirely separate project
  • coverage profile runtime, asan, tsan, and common runtime libraries separated out from compiler-rt

We can achieve the same technical result with the current structure, but it’s inverted and awkward: the restrictive rules (dual license / stand-alone build) are enforced in an outer layer, with the permissive rules returning in an inner layer (the asan or tsan runtimes themselves).

Thoughts?

If this is the direction to go, I’m happy to do the lion’s share of the legwork to re-organize (with the help of ASan folks)… I think my personal preference would be for compiler-rt to be a separate top-level project from a generic ‘runtimes’ project.

Hi,

tl;dr How can I include LLVM headers and use code from libLLVM*.a files when
building compiler-rt libraries?

LLVM and compiler-rt have different licenses (compiler-rt is dual
licensed with the MIT license). Would that be a problem?

This is a good point…

Yes it is, it would be a problem :(

Chris, I’m wondering whether putting all of the runtimes into ‘compiler-rt’ is really the best structure at this point… What would seem a somewhat less awkward fit to me these days:

  • a runtimes project which contains various runtime libraries, under the usual LLVM license
  • the original ‘compiler-rt’ bits either as a sub-library of this which happens to be buildable stand-alone and dual-licensed, or as its own entirely separate project
  • coverage profile runtime, asan, tsan, and common runtime libraries separated out from compiler-rt

We can achieve the same technical result with the current structure, but it’s inverted and awkward: the restrictive rules (dual license / stand-alone build) are enforced in an outer layer, with the permissive rules returning in an inner layer (the asan or tsan runtimes themselves).

Thoughts?

If this is the direction to go, I’m happy to do the lion’s share of the legwork to re-organize (with the help of ASan folks)… I think my personal preference would be for compiler-rt to be a separate top-level project from a generic ‘runtimes’ project.

I’m not sure that this solves the problem. The reason we have dual licenses for the runtime stuff is that we don’t want the UIUC license (which has a binary attribution clause) to affect stuff built with the compiler. Saying that “clang -fasan produces code that has to binary attribute the LLVM license” is pretty lame.

-Chris

I think that what is traditionally thought of as compiler-rt has different needs from ASan/TSan/etc. The latter runtimes are really intended to be separate units from the binary; for example none of their code would ever be directly emitted into a function, etc. Certainly the scope and complexity of them are very different, and so it might still make sense to split these into two groups of runtime libraries.

Were I drawing an arbitrary line, I would draw it around the runtime libraries which are stand-alone and implement an available spec for which other implementations can and do exist. (libgcc, libstdc++, etc etc.) Regardless of licensing issues, I suspect making this bucketing more clear would simplify some of these projects…

Anyways, there seem to be a few, all somewhat bad options left to us with ASan/TSan and similar more “advanced” runtimes:

  1. Swallow the lame binary attribution clause requirement. Document this noisily.
  2. Require that they are built as DSOs, so the attribution is restricted to that runtime library entity.
  3. Build the functionality needed by ASan/TSan/etc independently of LLVM’s core libraries. Code duplication here, and only a dim hope that we could package in a way that lldb or others might be able to shift to depend upon the dual-licensed functionality rather than the core LLVM functionality.
  4. Start moving core LLVM libraries into a separate ‘core library’ or ‘common library’ project which has the dual-license requirement, but is a “lower-level” component than LLVM itself.

#1 and #2 are at least clear in how they would work. They have downsides, but not in terms of implementation.

#3 seems like painting ourselves into a corner, and borrowing a lot of technical debt in the future. I suspect we’ll keep having to replicate functionality here.

#4 is interesting, but a ton of work. The Object library, most of Support and System, all would have to sink into this core module, all would have to get dual-licensed (ow!!! how? some of the contributors are around to agree to new license, but not all… likely a fair amount of rewrite required to produce new versions of libraries under the correct license).

I’m not sure that this solves the problem. The reason we have dual licenses for the runtime stuff is that we don’t want the UIUC license (which has a binary attribution clause) to affect stuff built with the compiler. Saying that “clang -fasan produces code that has to binary attribute the LLVM license” is pretty lame.

I think that what is traditionally thought of as compiler-rt has different needs from ASan/TSan/etc. The latter runtimes are really intended to be separate units from the binary; for example none of their code would ever be directly emitted into a function, etc. Certainly the scope and complexity of them are very different, and so it might still make sense to split these into two groups of runtime libraries.

To be clear, compiler-rt isn’t injected into functions. Maybe a better definition is that compiler-rt is statically linked in, vs the more advanced runtimes that are dynamically linked in.

Forming the division like this might make it easier to handle the attribution issues too, with enough trickery.

Were I drawing an arbitrary line, I would draw it around the runtime libraries which are stand-alone and implement an available spec for which other implementations can and do exist. (libgcc, libstdc++, etc etc.) Regardless of licensing issues, I suspect making this bucketing more clear would simplify some of these projects…

Yeah, I completely agree with your goals. This is one of the big concerns I had back when the dual licensing happened in the first place.

Anyways, there seem to be a few, all somewhat bad options left to us with ASan/TSan and similar more “advanced” runtimes:

  1. Swallow the lame binary attribution clause requirement. Document this noisily.
  2. Require that they are built as DSOs, so the attribution is restricted to that runtime library entity.

Maybe 2a: engineer asan so that it optionally links to the DSO. If the DSO is present, functionality is enabled, if not, it is silently disabled and the app still works (at some performance cost). Could this work?

  3. Build the functionality needed by ASan/TSan/etc independently of LLVM’s core libraries. Code duplication here, and only a dim hope that we could package in a way that lldb or others might be able to shift to depend upon the dual-licensed functionality rather than the core LLVM functionality.
  4. Start moving core LLVM libraries into a separate ‘core library’ or ‘common library’ project which has the dual-license requirement, but is a “lower-level” component than LLVM itself.

#4 is somewhat independently useful anyway. The support and system libraries are the lowest level (from a layering perspective) and most reusable across sub-projects. Actually relicensing them would be a major effort though.

#3 seems like painting ourselves into a corner, and borrowing a lot of technical debt in the future. I suspect we’ll keep having to replicate functionality here.

Yeah.

#4 is interesting, but a ton of work. The Object library, most of Support and System, all would have to sink into this core module, all would have to get dual-licensed (ow!!! how? some of the contributors are around to agree to new license, but not all… likely a fair amount of rewrite required to produce new versions of libraries under the correct license).

I think that #4 is the best long term answer, but yeah… oww. If you’re interested in stack traces in particular, making pieces “optionally enabled” seems really attractive.

-Chris

#4 is interesting, but a *ton* of work. The Object library, most of Support
and System, all would have to sink into this core module, all would have to
get dual-licensed (ow!!! how? some of the contributors are around to agree
to new license, but not all... likely a fair amount of rewrite required to
produce new versions of libraries under the correct license).

You actually don't have that many contributors. I've seen this done
for projects with 200+ contributors.
Even better, most LLVM contributors are still around.
If you have to rewrite a little code along the way to account for
folks you can't find, this is probably worth the expense anyway (and
i'm pretty sure we'd be happy to fund it :P).

The more interesting question is whether you want to dual license, add
a general exception to the LLVM license, or switch wholesale to MIT
license.

I’m not sure that this solves the problem. The reason we have dual licenses for the runtime stuff is that we don’t want the UIUC license (which has a binary attribution clause) to affect stuff built with the compiler. Saying that “clang -fasan produces code that has to binary attribute the LLVM license” is pretty lame.

I think that what is traditionally thought of as compiler-rt has different needs from ASan/TSan/etc. The latter runtimes are really intended to be separate units from the binary; for example none of their code would ever be directly emitted into a function, etc. Certainly the scope and complexity of them are very different, and so it might still make sense to split these into two groups of runtime libraries.

To be clear, compiler-rt isn’t injected into functions. Maybe a better definition is that compiler-rt is statically linked in, vs the more advanced runtimes that are dynamically linked in.

Forming the division like this might make it easier to handle the attribution issues too, with enough trickery.

Were I drawing an arbitrary line, I would draw it around the runtime libraries which are stand-alone and implement an available spec for which other implementations can and do exist. (libgcc, libstdc++, etc etc.) Regardless of licensing issues, I suspect making this bucketing more clear would simplify some of these projects…

Yeah, I completely agree with your goals. This is one of the big concerns I had back when the dual licensing happened in the first place.

Anyways, there seem to be a few, all somewhat bad options left to us with ASan/TSan and similar more “advanced” runtimes:

  1. Swallow the lame binary attribution clause requirement. Document this noisily.
  2. Require that they are built as DSOs, so the attribution is restricted to that runtime library entity.

Maybe 2a: engineer asan so that it optionally links to the DSO. If the DSO is present, functionality is enabled, if not, it is silently disabled and the app still works (at some performance cost). Could this work?

DSO? You mean to make asan.so instead of asan.a?
This will work, but may cause confusion. If asan.so is not present, the asan-ified binary will crash instantly.

For tsan, making it tsan.so instead of tsan.a will cause a huge performance penalty because tsan reads TLS on every memory access.

–kcc

#4 is interesting, but a ton of work. The Object library, most of Support
and System, all would have to sink into this core module, all would have to
get dual-licensed (ow!!! how? some of the contributors are around to agree
to new license, but not all… likely a fair amount of rewrite required to
produce new versions of libraries under the correct license).

You actually don’t have that many contributors. I’ve seen this done
for projects with 200+ contributors.
Even better, most LLVM contributors are still around.
If you have to rewrite a little code along the way to account for
folks you can’t find, this is probably worth the expense anyway (and
i’m pretty sure we’d be happy to fund it :P).

The more interesting question is whether you want to dual license, add
a general exception to the LLVM license, or switch wholesale to MIT
license.

All of asan and tsan code is authored by us
(except maybe a couple of one-line hot fixes when we broke the build and someone else fixed it).
As far as I am concerned, either license is fine.

–kcc

#4 is interesting, but a ton of work. The Object library, most of Support
and System, all would have to sink into this core module, all would have to
get dual-licensed (ow!!! how? some of the contributors are around to agree
to new license, but not all… likely a fair amount of rewrite required to
produce new versions of libraries under the correct license).

You actually don’t have that many contributors. I’ve seen this done
for projects with 200+ contributors.
Even better, most LLVM contributors are still around.
If you have to rewrite a little code along the way to account for
folks you can’t find, this is probably worth the expense anyway (and
i’m pretty sure we’d be happy to fund it :P).

After talking with DannyB, I now am strongly in the camp that we should do #4 whole-sale, and make everything hold a license that works for runtimes. We can potentially move completely away from dual-licensing.

We can definitely drive this effort if the community is supportive, including re-writing parts of the codebase from authors we can’t contact.

The more interesting question is whether you want to dual license, add
a general exception to the LLVM license, or switch wholesale to MIT
license.

This is indeed the question: what should the end state be.

Because of this, and other reasons, let’s focus on option #4 and/or whole-sale re-licensing.

#4 is interesting, but a ton of work. The Object library, most of Support
and System, all would have to sink into this core module, all would have to
get dual-licensed (ow!!! how? some of the contributors are around to agree
to new license, but not all… likely a fair amount of rewrite required to
produce new versions of libraries under the correct license).

You actually don’t have that many contributors. I’ve seen this done
for projects with 200+ contributors.
Even better, most LLVM contributors are still around.
If you have to rewrite a little code along the way to account for
folks you can’t find, this is probably worth the expense anyway (and
i’m pretty sure we’d be happy to fund it :P).

After talking with DannyB, I now am strongly in the camp that we should do #4 whole-sale, and make everything hold a license that works for runtimes. We can potentially move completely away from dual-licensing.

We can definitely drive this effort if the community is supportive, including re-writing parts of the codebase from authors we can’t contact.

What will be our (asan/tsan) next steps?

–kcc

I think you can carry on with #1 in the interim. I’ll up-prioritize the build system stuff, and maybe we can chat about how to share some of that work? All of that work is necessary even if we figure out whatever license arrangement we end up with. During that time, we should document carefully that this attaches the attribution requirement, and we should be able to have the license issues fixed prior to the next LLVM release so it doesn’t have to be permanent.

#4 is interesting, but a ton of work. The Object library, most of Support
and System, all would have to sink into this core module, all would have to
get dual-licensed (ow!!! how? some of the contributors are around to agree
to new license, but not all… likely a fair amount of rewrite required to
produce new versions of libraries under the correct license).

You actually don’t have that many contributors. I’ve seen this done
for projects with 200+ contributors.
Even better, most LLVM contributors are still around.
If you have to rewrite a little code along the way to account for
folks you can’t find, this is probably worth the expense anyway (and
i’m pretty sure we’d be happy to fund it :P).

After talking with DannyB, I now am strongly in the camp that we should do #4 whole-sale, and make everything hold a license that works for runtimes. We can potentially move completely away from dual-licensing.

We can definitely drive this effort if the community is supportive, including re-writing parts of the codebase from authors we can’t contact.

What will be our (asan/tsan) next steps?

I think you can carry on with #1 in the interim.

Ok.

I’ll up-prioritize the build system stuff, and maybe we can chat about how to share some of that work?

Yes, definitely.

–kcc

I'm not sure that this solves the problem. The reason we have dual licenses for the runtime stuff is that we don't want the UIUC license (which has a binary attribution clause) to affect stuff built with the compiler. Saying that "clang -fasan produces code that has to binary attribute the LLVM license" is pretty lame.

I think that what is *traditionally* thought of as compiler-rt has different needs from ASan/TSan/etc. The latter runtimes are really intended to be separate units from the binary; for example none of their code would ever be directly emitted into a function, etc. Certainly the scope and complexity of them are very different, and so it might still make sense to split these into two groups of runtime libraries.

To be clear, compiler-rt isn't injected into functions. Maybe a better definition is that compiler-rt is statically linked in, vs the more advanced runtimes that are dynamically linked in.

Forming the division like this might make it easier to handle the attribution issues too, with enough trickery.

Were I drawing an arbitrary line, I would draw it around the runtime libraries which are stand-alone and implement an available spec for which other implementations can and do exist. (libgcc, libstdc++, etc etc.) Regardless of licensing issues, I suspect making this bucketing more clear would simplify some of these projects....

Yeah, I completely agree with your goals. This is one of the big concerns I had back when the dual licensing happened in the first place.

Anyways, there seem to be a few, all somewhat bad options left to us with ASan/TSan and similar more "advanced" runtimes:

1) Swallow the lame binary attribution clause requirement. Document this noisily.
2) Require that they are built as DSOs, so the attribution is restricted to that runtime library entity.

Maybe 2a: engineer asan so that it *optionally* links to the DSO. If the DSO is present, functionality is enabled, if not, it is silently disabled and the app still works (at some performance cost). Could this work?

DSO? You mean to make asan.so instead of asan.a?
This will work, but may cause confusion. If asan.so is not present, the asan-ified binary will crash instantly.

For tsan, making it tsan.so instead of tsan.a will cause a huge performance penalty because tsan reads TLS on every memory access.

It depends where you need to use LLVM libraries. Going back to the initial post of this thread, LLVM libraries will be used by asan/tsan to symbolicate stack traces. If that's the only use of LLVM in asan it would make a lot of sense to split that part into a DSO and link it on demand. That way we don't have to link all the (large) debug info code into every binary built with asan and have a nice workaround for the licensing issue.
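
For illustration, loading the symbolization part on demand could look roughly like the sketch below. It is only a sketch under assumptions: the DSO path passed to `example_load_symbolizer`, the `example_symbolize` entry point, and its signature are all hypothetical, not an existing ASan or LLVM interface. If the DSO is not present, the runtime keeps working and just prints the raw address.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical entry point exported by the optional symbolizer DSO.
 * Assumed to return 0 on success and write "file:line" text into out. */
typedef int (*example_symbolize_fn)(const void *addr, char *out, size_t out_size);

/* Try to load the optional DSO; return NULL if it (or the symbol) is missing. */
example_symbolize_fn example_load_symbolizer(const char *dso_path) {
  void *handle = dlopen(dso_path, RTLD_LAZY | RTLD_LOCAL);
  if (!handle)
    return NULL; /* DSO not installed: symbolization is silently disabled */
  example_symbolize_fn fn = (example_symbolize_fn)dlsym(handle, "example_symbolize");
  if (!fn)
    dlclose(handle); /* library found, but it lacks the expected entry point */
  return fn;
}

/* Describe an address: returns 1 if symbolized, 0 if only the raw address
 * could be printed. */
int example_describe_address(example_symbolize_fn fn, const void *addr,
                             char *out, size_t out_size) {
  assert(out != NULL && out_size > 0);
  if (fn && fn(addr, out, out_size) == 0)
    return 1;
  snprintf(out, out_size, "%p", (void *)addr); /* fallback: raw address only */
  return 0;
}
```

The runtime would call `example_load_symbolizer` once at startup and pass the result to `example_describe_address` for each frame. On older glibc versions this needs `-ldl` at link time.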

I'm mildly opposed to relicensing parts of LLVM. While it makes sense to have lib/Support as liberally licensed as possible I'm afraid that it will creep into other parts too. Asking all developers to agree to a license change and reimplementing stuff written by people who won't answer is feasible but it will be painful and, in my opinion, a waste of development resources that are better spent improving LLVM.

- Ben

I’m not sure that this solves the problem. The reason we have dual licenses for the runtime stuff is that we don’t want the UIUC license (which has a binary attribution clause) to affect stuff built with the compiler. Saying that “clang -fasan produces code that has to binary attribute the LLVM license” is pretty lame.

I think that what is traditionally thought of as compiler-rt has different needs from ASan/TSan/etc. The latter runtimes are really intended to be separate units from the binary; for example none of their code would ever be directly emitted into a function, etc. Certainly the scope and complexity of them are very different, and so it might still make sense to split these into two groups of runtime libraries.

To be clear, compiler-rt isn’t injected into functions. Maybe a better definition is that compiler-rt is statically linked in, vs the more advanced runtimes that are dynamically linked in.

Forming the division like this might make it easier to handle the attribution issues too, with enough trickery.

Were I drawing an arbitrary line, I would draw it around the runtime libraries which are stand-alone and implement an available spec for which other implementations can and do exist. (libgcc, libstdc++, etc etc.) Regardless of licensing issues, I suspect making this bucketing more clear would simplify some of these projects…

Yeah, I completely agree with your goals. This is one of the big concerns I had back when the dual licensing happened in the first place.

Anyways, there seem to be a few, all somewhat bad options left to us with ASan/TSan and similar more “advanced” runtimes:

  1. Swallow the lame binary attribution clause requirement. Document this noisily.
  2. Require that they are built as DSOs, so the attribution is restricted to that runtime library entity.

Maybe 2a: engineer asan so that it optionally links to the DSO. If the DSO is present, functionality is enabled, if not, it is silently disabled and the app still works (at some performance cost). Could this work?

DSO? You mean to make asan.so instead of asan.a?
This will work, but may cause confusion. If asan.so is not present, the asan-ified binary will crash instantly.

For tsan, making it tsan.so instead of tsan.a will cause a huge performance penalty because tsan reads TLS on every memory access.

It depends where you need to use LLVM libraries. Going back to the initial post of this thread, LLVM libraries will be used by asan/tsan to symbolicate stack traces. If that’s the only use of LLVM in asan it would make a lot of sense to split that part into a DSO and link it on demand.

That’ll work too and is indeed simpler.

–kcc

I’m not sure that this solves the problem. The reason we have dual licenses for the runtime stuff is that we don’t want the UIUC license (which has a binary attribution clause) to affect stuff built with the compiler. Saying that “clang -fasan produces code that has to binary attribute the LLVM license” is pretty lame.

I think that what is traditionally thought of as compiler-rt has different needs from ASan/TSan/etc. The latter runtimes are really intended to be separate units from the binary; for example none of their code would ever be directly emitted into a function, etc. Certainly the scope and complexity of them are very different, and so it might still make sense to split these into two groups of runtime libraries.

To be clear, compiler-rt isn’t injected into functions. Maybe a better definition is that compiler-rt is statically linked in, vs the more advanced runtimes that are dynamically linked in.

Forming the division like this might make it easier to handle the attribution issues too, with enough trickery.

Were I drawing an arbitrary line, I would draw it around the runtime libraries which are stand-alone and implement an available spec for which other implementations can and do exist. (libgcc, libstdc++, etc etc.) Regardless of licensing issues, I suspect making this bucketing more clear would simplify some of these projects…

Yeah, I completely agree with your goals. This is one of the big concerns I had back when the dual licensing happened in the first place.

Anyways, there seem to be a few, all somewhat bad options left to us with ASan/TSan and similar more “advanced” runtimes:

  1. Swallow the lame binary attribution clause requirement. Document this noisily.
  2. Require that they are built as DSOs, so the attribution is restricted to that runtime library entity.

Maybe 2a: engineer asan so that it optionally links to the DSO. If the DSO is present, functionality is enabled, if not, it is silently disabled and the app still works (at some performance cost). Could this work?

DSO? You mean to make asan.so instead of asan.a?
This will work, but may cause confusion. If asan.so is not present, the asan-ified binary will crash instantly.

For tsan, making it tsan.so instead of tsan.a will cause a huge performance penalty because tsan reads TLS on every memory access.

It depends where you need to use LLVM libraries. Going back to the initial post of this thread, LLVM libraries will be used by asan/tsan to symbolicate stack traces. If that’s the only use of LLVM in asan it would make a lot of sense to split that part into a DSO and link it on demand.

That’ll work too and is indeed simpler.

Yes. Is there an easy way to build that part of LLVM (Support and DebugInfo) as .so files as well as .a?

>> I'm not sure that this solves the problem. The reason we have dual licenses for the runtime stuff is that we don't want the UIUC license (which has a binary attribution clause) to affect stuff built with the compiler. Saying that "clang -fasan produces code that has to binary attribute the LLVM license" is pretty lame.
>>
>> I think that what is *traditionally* thought of as compiler-rt has different needs from ASan/TSan/etc. The latter runtimes are really intended to be separate units from the binary; for example none of their code would ever be directly emitted into a function, etc. Certainly the scope and complexity of them are very different, and so it might still make sense to split these into two groups of runtime libraries.
>
> To be clear, compiler-rt isn't injected into functions. Maybe a better definition is that compiler-rt is statically linked in, vs the more advanced runtimes that are dynamically linked in.
>
> Forming the division like this might make it easier to handle the attribution issues too, with enough trickery.
>
>> Were I drawing an arbitrary line, I would draw it around the runtime libraries which are stand-alone and implement an available spec for which other implementations can and do exist. (libgcc, libstdc++, etc etc.) Regardless of licensing issues, I suspect making this bucketing more clear would simplify some of these projects....
>
> Yeah, I completely agree with your goals. This is one of the big concerns I had back when the dual licensing happened in the first place.
>
>
>> Anyways, there seem to be a few, all somewhat bad options left to us with ASan/TSan and similar more "advanced" runtimes:
>>
>> 1) Swallow the lame binary attribution clause requirement. Document this noisily.
>> 2) Require that they are built as DSOs, so the attribution is restricted to that runtime library entity.
>
> Maybe 2a: engineer asan so that it *optionally* links to the DSO. If the DSO is present, functionality is enabled, if not, it is silently disabled and the app still works (at some performance cost). Could this work?
>
> DSO? You mean to make asan.so instead of asan.a?
> This will work, but may cause confusion. If asan.so is not present, the asan-ified binary will crash instantly.
>
> For tsan, making it tsan.so instead of tsan.a will cause a huge performance penalty because tsan reads TLS on every memory access.

It depends where you need to use LLVM libraries. Going back to the initial post of this thread, LLVM libraries will be used by asan/tsan to symbolicate stack traces. If that's the only use of LLVM in asan it would make a lot of sense to split that part into a DSO and link it on demand.

That'll work too and is indeed simpler.

Yes. Is there an easy way to build that part of LLVM (Support and DebugInfo) as .so files as well as .a?

You want to create a new .so that links in the .a files from LLVM and doesn't export LLVM's symbols. This is another advantage of using a .so over a .a: think of what would happen when you want to link LLVM trunk with an asan runtime that uses a different version of LLVM. LLVM's symbols are not versioned. With a DSO you can hide all symbols except a couple of entry points for your purposes, and none of the symbols will collide.
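
As a rough sketch of what "hide everything except a couple of entry points" can mean at the source level with GCC or Clang: the names `example_symbolize` and `example_internal_format` are hypothetical, and the internal helper merely stands in for code that would really come from the LLVM archives.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* GCC/Clang visibility attributes; the macro names are made up for this sketch. */
#define EXAMPLE_EXPORT __attribute__((visibility("default")))
#define EXAMPLE_HIDDEN __attribute__((visibility("hidden")))

/* Stand-in for implementation code inside the DSO. Hidden visibility keeps it
 * out of the dynamic symbol table, so it cannot collide with a symbol of the
 * same name in the application. */
EXAMPLE_HIDDEN int example_internal_format(uintptr_t addr, char *out, size_t out_size) {
  assert(out != NULL);
  int n = snprintf(out, out_size, "0x%" PRIxPTR, addr);
  return (n >= 0 && (size_t)n < out_size) ? 0 : -1; /* -1 if truncated */
}

/* The single exported entry point of the hypothetical symbolizer DSO. */
EXAMPLE_EXPORT int example_symbolize(const void *addr, char *out, size_t out_size) {
  if (!out || out_size == 0)
    return -1;
  return example_internal_format((uintptr_t)addr, out, out_size);
}
```

The attributes only affect code compiled with them. Objects pulled in from already-built .a files keep whatever visibility they were compiled with, so hiding those symbols would additionally need something at link time, for example GNU ld's `--exclude-libs` option or a linker version script.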

- Ben