[RFC] Libc++ taking a dependency on Boost.Math for the C++17 Math Special Functions

That’s correct, except for the fact that vendors don’t ship the test suite, so they don’t need to ship GoogleTest. Boost.Math would be different in the sense that it would be included inside the libc++ shared (and static) library, so vendors would be shipping that compiled code to their users (without any special action required on their end). Since there is no attribution requirement on the license when used in that way, that shouldn’t have any impact on vendors, but we still wanted to ask for feedback in case we missed something or someone could bring a new perspective.

But to reiterate, because of the way we’re planning to use Boost.Math, nobody should have to care or even notice it. It would be an implementation detail inside libc++ that doesn’t leak outside of libc++ compiled code and shouldn’t require any additional action from anyone to make that work.

I agree. This is the same concern that was brought up by @jyknight above. I believe this is a technical problem with a technical solution to it, and we’ll make sure that this isn’t a problem when we perform the integration. We might annotate Boost functions with a macro, or use a pragma, or something else. We’ll find a way to ensure that Boost symbols don’t leak out, and we can probably even have a test for it (something like `nm -omg | grep "boost::"`).
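
To make the idea of a leak test a bit more concrete, here is a minimal Python sketch of what such a check could look like. It only illustrates the approach: the function names `leaked_symbols` and `run_nm`, the choice of `nm` flags, and the library path in the usage note are assumptions for illustration, not something the libc++ maintainers have committed to. It also assumes the default (BSD-style) output format of GNU `nm` or `llvm-nm`, with demangling enabled.

```python
import subprocess
from typing import List


def leaked_symbols(nm_output: str, needle: str = "boost::") -> List[str]:
    """Return defined, externally visible symbol names containing `needle`.

    Assumes default BSD-style `nm` output with demangled names, i.e. lines
    of the form "<address> <type> <name>" (or "<type> <name>" when the
    symbol has no address, as for undefined symbols).
    """
    leaked = []
    for line in nm_output.splitlines():
        parts = line.split()
        if len(parts) < 2:
            continue
        if len(parts[0]) == 1:
            # No address column: the first field is the one-letter type.
            sym_type, name = parts[0], " ".join(parts[1:])
        elif len(parts) >= 3:
            sym_type, name = parts[1], " ".join(parts[2:])
        else:
            continue
        # Simplification: treat uppercase type letters (other than 'U',
        # which means undefined) as exported. A few special cases, such
        # as GNU unique symbols ('u'), are not handled here.
        if sym_type.isupper() and sym_type != "U" and needle in name:
            leaked.append(name)
    return leaked


def run_nm(library_path: str) -> str:
    """Run `nm` on a library and return its output (hypothetical helper).

    -g lists only external symbols, --defined-only skips undefined ones,
    and -C demangles C++ names.
    """
    result = subprocess.run(
        ["nm", "-g", "--defined-only", "-C", library_path],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

A test could then call something like `leaked_symbols(run_nm("path/to/libc++.so"))` (the path is a placeholder) and fail if the returned list is non-empty.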

+1 for using Boost.Math for this part of the library, assuming the Foundation approves the license.

I also agree with the proposed integration strategy.

I agree with other posters that it would be great if we could write our own implementation. Since that hasn’t happened, I would prefer to get a working solution. We can always replace the code with our own implementation if there is somebody willing and able to work on it.

+1, I would really like to understand what your objection is.

LLVM is a compiler toolchain, which puts it into a bootstrap path. It’s a C/C++ compiler, which puts it into the bootstrap path.

This means you should care about what dependencies you have. LLVM already has unnecessary dependencies that harm bootstrapping, such as Python and a very recent CMake version. In that linked repository, we patch out everything from third-party.

Even the optional ones such as zstd and libxml are pesky since they can be dragged in unintentionally. (Aside - do you really need the entire API surface and implementation complexity of libxml, or could a simpler solution have sufficed?)

~90% of our bug reports are failure to build from source due to LLVM, typically due to how Linux distributions or other packaging systems such as Homebrew have decided to package LLVM, Clang, and LLD. Most packaging guidelines have (reasonable!) rules that specify that dependencies must be managed by the system package manager, and not “vendored”. I promise you that what will happen, in practice, is somebody will file a bug report saying they can’t build Zig from source, and when we examine the error message it will be a linker error saying that some boost function isn’t found. This is the reality that users will be faced with, despite the fact that I always tell people to build LLVM from source to avoid issues.

In this case LLVM would be outsourcing its core competency. You’re really telling me that a bunch of C++ compiler engineers are unable to implement their own standard library?

I agree that bootstrapping is important, and that it is relevant that LLVM doesn’t have too many external dependencies.

However - I fail to see how this relates to the thread at hand. The suggested library would, as far as I understand it, be bundled within the llvm-project monorepo. It’d act just like regular libcxx source files, just in a different directory, with a slightly different license. And as for bootstrapping LLVM, libcxx isn’t even involved whatsoever, if you’re just building LLVM with your preexisting host C++ toolchain. (Likewise, I don’t see libcxx in your linked zig-bootstrap repo?)

(I’m not involved in this effort otherwise, and I wasn’t familiar with the Boost license beforehand, but from what I’ve read so far, it doesn’t seem like an issue to me - but I was very curious to hear about your concerns with this.)

As someone with some expertise in floating-point arithmetic, I can tell you that bootstrapping the floating-point math part of a language’s standard library requires very different skills from compiler engineering.

Why can’t it act like regular libcxx source files, not in a different directory, and with the license pasted into wherever it is relevant?

Feels like people are trying to have their cake and eat it too. It doesn’t act like a dependency, except when it does.

Why is this thread even open? You could have just implemented the functions via copy paste, without collecting any feedback. This thread’s existence is a demonstration of how this thing is acting as a dependency, and not regular code.

Why can’t it act like regular libcxx source files

I think the idea is that it will – except that it has to be in a different directory in the same repository.

Feels like people are trying to have their cake and eat it too. It doesn’t act like a dependency, except when it does.

I’d describe this proposal differently. It seems to me as if the proposal is to include code under a different license into the LLVM repository and into the built libc++.so/.a files, but to have it be an integral part of the libc++ library – not an external dependency.

However, it must be placed into a separate directory (within the llvm repository), in order to better track the fact that the license for that code is not the standard LLVM license.

Why is this thread even open? You could have just implemented the functions via copy paste, without collecting any feedback.

It would have been a copyright violation to attempt to copy paste the Boost code and claim it as being under the LLVM license, so I presume you didn’t mean to suggest OP should do that. So, given that this code is under a new, non-LLVM, license, that certainly seems to me to require opening a discussion. IMO, even if Boost.Math weren’t under a different license, integrating a large new chunk of code such as this into libc++ would deserve a discussion. So it definitely seems appropriate that OP posted this topic.

Yeah, this was my question as a downstream packager. And the feedback to me is:

(1) It is actually about adding a lot of code to libc++ but just in another directory within LLVM. We (as downstream vendors) don’t need to care about it.

(2) The point of the RFC is mainly about license issues and development policy for developers (we package and distribute libc++, but we don’t develop it, at least for now).

So the conclusion to me is, I don’t need to worry about anything.

I’m just a libcxx user and I don’t know if my idea is feasible.

I was wondering if the relicensing of Boost.Math and the integration of Boost.Math can’t happen concurrently? I mean, still do this RFC, but also try to partially relicense the Boost.Math routines and then make those relicensed routines part of libcxx.

That could work, but who is going to undertake the relicensing? If libc++ maintainers spend time on that, this would be time they didn’t spend implementing new library features or fixing open issues. (Not to say that I don’t expect them to be eager to do this kind of work, given how relicensing of the entire LLVM project has been going.)

I think there’s a lot wrong with this statement.

First, I don’t think it’s fair to say that math special functions fall into “LLVM”’s core competency. Merely suggesting that there is an “LLVM core competency” shows a lack of understanding of the structure of this community. There is a wide variety of projects under the LLVM umbrella and they require different skill sets. We have people specialized in back ends and optimizers, in compiler front ends, in libraries (both C and C++), there’s a Fortran front end, etc. While there is some intersection between those areas, folks who specialize in one area are not generally experts in other areas.

And then even within libc++, there are multiple subdomains. Some contributors are very strong at template metaprogramming, others know core language wording really well, some are better at concurrency, ABI, localization, data structures, build systems, etc. We need all of that to build a successful C++ Standard Library and the list is pretty long because the library contains a wide variety of utilities, simple as that. Implementing these mathematical functions is a very deep topic on its own, and I don’t think we currently have anyone who contributes to libc++ that has the knowledge, the time and the interest to pursue this (you need all of those at once to be successful with this project).

Finally, we also have a desire to ship a high quality and robust implementation of the math special functions, since libc++ is used in important real world applications. Identifying that we’ll ship a better implementation if we reuse Boost.Math (until we can do better) is IMO a good engineering decision. “NIH” has no place here, not just for the sake of it.

It is a demonstration that this community operates by RFCs for non-trivial decisions, and that we care to gather feedback from stakeholders instead of just assuming the right approach. It is true that we could have implemented this without an RFC and nobody would have noticed because there is effectively no change to the physical requirements to build libc++. However, that would not have been up to the standards of this community when it comes to maintaining trust and communication with our stakeholders, because including a new (compatible) license into compiled code is something new for libc++. So I think that an RFC is useful, even if it ends up serving mostly as a communication mechanism.

Finally, a meta point. I feel like the way you approached this RFC somewhat undermined your goal. We’re obviously reasonable and we wanted to gather feedback, so we opened an RFC. But jumping on this forum (for the first time? welcome!) and making bold absolute claims lacking a technical justification is not typically the best way to convince people in this community; instead, it tends to discredit the person who holds these views.

So I’ll reiterate: if you have concrete concerns about this proposal, please let us know and explain what’s problematic about it. Is there an issue with the code being in a different directory? Is there a problem with the license? Do you have maintainability concerns? Nothing is off limits, but we need something concrete to talk about.

You asked for my feedback and then criticized me when I gave it to you. Please go back and read my comment for my concrete concern, if you want it. If you don’t actually care about my feedback, then just be honest and say that you don’t care, or better yet, don’t ask for it in the first place.

A core assumption people tend to make is that “feedback” means “constructive feedback”. So far your feedback hasn’t been constructive, since you didn’t provide any technical reason this is a problem for you. The reason this RFC exists is that we do something novel, namely including code that is under a different license. The code will simply be included in a TU that is built just like any other libc++ TU. Coming in and claiming

makes it look like you didn’t even care to read the original RFC. Then claiming that we don’t care about feedback is just digging the hole deeper. Other people were perfectly capable of giving constructive feedback in the form of clarifications and concerns (specifically regarding the static library build, which I did in fact not consider before).

Thank you @ldionne for a thoughtful and well-written proposal.

I’m on the LLVM Foundation BoD and was part of the discussion. The consensus from the initial discussion is that the LLVM Foundation is comfortable exploring the idea that subprojects of LLVM add additional license dependencies that make sense for their community. For example, it makes sense that libc++ can take on a Boost dependency if its community believes this is the best way for it to achieve its goals, just like the flang subproject can take on a dependency on other code if it thought that was the best way to achieve its goals.

However, while license is in the purview of the LLVM Foundation, the foundation isn’t in charge of telling the libc++ community the best thing for their use-cases. We asked Louis to start this thread to get a sense of whether there would be any concerns from the community about adding this dependency, to see if folks with objections were motivated to offer constructive alternatives, and to explore the parameters of what the technical solution would look like.

From what I can tell, it’s working. Thank you to everyone who is engaging on this thread!

-Chris

Given the restriction

would it make sense to place the code in libcxx/third-party/boost.math instead of third-party/boost.math?

If I came as a new, outside contributor to the lldb project (or any other non-libc++ project under the LLVM umbrella), my default assumption would be that everything under the global third-party folder is allowed for global use across all LLVM projects. But this won’t be the case for Boost.Math.

If it’s in the libcxx/third-party I would not make this wrong assumption…

I guess that would be possible. Having a single place where all third-party code is seems to me the better alternative though, since you don’t have to search through every directory to figure out what third-party code is used. If e.g. clang-tidy wanted to use an external dependency, then with your suggestion it would logically end up in clang-tools-extra/clang-tidy/third-party. If any other sub-project wanted to use the same third-party code at some point they could easily miss that and we could end up with multiple copies of the same third-party code in the monorepo.

Only if that 3pp dependency also has a special licensing carve-out which allows its usage only in clang-tidy. For 3pps which are allowed to be used globally, I would keep them in third-party for the exact reason you mentioned, i.e. avoiding accidental duplication.

I would expect those license carve-outs to stay the exception?

I don’t know. FWIW I do think this argument still holds. What if another project wants Boost.Math and isn’t aware of libc++ having it? It’s not super likely that nobody would notice, but possible. It’s still much harder for people to figure out what third-party code we have if we spread it throughout the monorepo, making it in turn harder to figure out what kinds of licenses we have in the monorepo. I think the main reason we want to make this code only usable in libc++ is that nobody else needs the dependency at this point in time and it’s not clear what implications there are if other projects were to use it. If this was under the LLVM license I doubt we would care to make this libc++-specific in the first place.

While I agree that there may be some confusion for someone who’s not aware of this proposal, I think the likelihood of that being problematic is minimal. The main hurdle is that someone would need to want to use Boost.Math in the first place, which isn’t super likely I think. Then, they’d have to jump through hoops to make it possible to use in another project, at which point somebody will probably raise concerns about using third-party code that’s probably not needed.

In the end I don’t feel super strongly about where exactly to put the code. Maybe someone else has thoughts on where the code should be?

I think we get the best of both worlds by leaving a readme file in the top-level `third-party`, listing all third-party libraries used by individual subprojects. I agree with Adrian that there’s significant room for accidental usages of Boost.Math outside of the scope of this RFC, so it’s a problem worth addressing.

I thought about this further:

We absolutely need to do something like this (pulling in Boost for these things). Our standard libraries (as a C++ community) are FULL of poor implementations because the implementer decided to implement it themselves (Looking at YOU std::regex!). AND they are ones that for ABI reasons we cannot fix.

Boost.Math has decades of development, maintenance, and usage behind it; we absolutely need to pull in that knowledge here.

As far as ‘core competency’, these math functions are HARD and very specific. Many require PhD level analysis to understand correctly, and even further education to get right. Expecting our folks to get this stuff when an expert has already put together a library is not a good idea.
