RFC: SIMD math-function library

Dear LLVM contributors,

I am Naoki Shibata, an associate professor at Nara Institute of Science and Technology.

Hal Finkel and I would like to jointly propose adding my vectorized math library to LLVM.

The library has been available as public domain software for years; I am willing to dual-license it if necessary.

Hi Naoki,

SLEEF looks very promising!

Are SLEEF routines validated against libm, in addition to libmpfr? Are
performance tracking tests in place to detect execution time or code size
regressions? If these are missing, IMO it would be good to add them to the
roadmap.

best,
vedant

Hi Vedant,

Thank you for your comment.

For checking the accuracy of finite outputs and the correct handling of non-finite inputs and outputs, I believe validating against libmpfr is enough. Please tell me what kind of regressions we need to detect. Do you have concerns about the correctness of libmpfr?

What kind of execution time or code size regressions are we going to check? Since SLEEF is completely branch-free, there should be no serious execution time or code size regression unless branches are introduced.
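For readers unfamiliar with the term, the following scalar sketch illustrates what "branch-free" means in this context: both candidate results are computed, and a bit mask picks one, which mirrors how SIMD code selects per-lane results without branching. This is not SLEEF's code; the helper names (`f2u`, `u2f`, `select_f`, `relu_branchfree`) are hypothetical, and whether a compiler emits a branch for the comparison is ultimately up to the compiler.

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret bits between float and uint32_t without undefined behavior. */
static uint32_t f2u(float f) { uint32_t u; memcpy(&u, &f, sizeof u); return u; }
static float u2f(uint32_t u) { float f; memcpy(&f, &u, sizeof f); return f; }

/* Returns a if mask is all ones, b if mask is all zeros. */
static float select_f(uint32_t mask, float a, float b) {
    return u2f((f2u(a) & mask) | (f2u(b) & ~mask));
}

/* Clamp negative inputs to zero without an if/else. */
static float relu_branchfree(float x) {
    /* All ones when x < 0, all zeros otherwise. */
    uint32_t mask = (uint32_t)0 - (uint32_t)(x < 0.0f);
    return select_f(mask, 0.0f, x);
}
```

Because the control flow does not depend on the input values, the running time of such code stays essentially constant across inputs.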

It is of course okay for me to add additional regression checking, but I just want to understand the necessity.

Regards,

Naoki Shibata

Hi again,

As this RFC implies, I've been using the SLEEF library proposed here with Clang/LLVM for many years, and fully support its adoption into the LLVM project.

I'm CC'ing Matt and Xinmin from Intel, who have started working on contributing support for their SVML library to LLVM (http://reviews.llvm.org/D19544) and who, as I understand it, plan to contribute (some subset of) the vector math functions themselves. I'm also excited about Intel's planned contributions.

Here's how I currently see the situation: Regardless of what Intel contributes, we need a solution in this space for many different architectures. From personal experience, SLEEF is relatively easy to port to different architectures (i.e. different vector ISAs), and has already been ported to several. The performance is good, as is the accuracy. I think it would make a great foundation for a vector-math-function runtime library for the LLVM project. I don't know what routines Intel is planning to contribute, or for what architectures they're tuned, but I expect we'll want to use those implementations on x86 platforms where appropriate.

Matt, Xinmin, what do you think?

Thanks again,
Hal

Hi all,

Okay, the point is whether Intel will publish the source code for their SVML. If Intel makes SVML open source, there would not be much advantage in incorporating SLEEF into LLVM, since it would also be fairly easy to port SVML to other architectures. If Intel does not open-source SVML, then there could be an advantage in using SLEEF for x86 by inlining the functions.

Is it possible to ask the person in charge what exactly Intel is going to contribute?

Naoki Shibata

I agree with Hal.

Since the SLEEF library is portable across many different architectures, it will be a great addition to the LLVM community's SIMD support for all architectures.

Currently, Intel has open-sourced 6 functions (sin, cos, pow, exp, log, and sincos) for GCC and LLVM on x86 ({SSE2, SSSE3, SSE4.1, SSE4.2, AVX, AVX2, MIC, AVX512} x {mask, non-mask}; open-sourcing the AVX512 versions is still to be done). The plan is to open-source most of the Intel SVML library for LLVM x86 support.

For achieving "close to metal performance for x86", I assume Intel SVML would provide better performance and more control over accuracy for the time being, given that the SVML team has tuned SVML for many years for all x86 architectures. Note, however, that we have not yet done performance and accuracy comparisons between the SLEEF and SVML libraries.

In any case, I would suggest moving this RFC forward and starting this project. I think Intel's SVML code can be integrated into this project for x86. I will talk to the Intel SVML library owners/stakeholders and ask them to take a look at SLEEF and provide their recommendations/suggestions, both for x86 and in general.

Thanks,
Xinmin

Naoki,

Intel is planning to open-source the SVML library (most of it, if not 100%); 6 SVML functions have already been open-sourced for GCC and LLVM. However, Intel SVML is x86-centric (SSE2, SSSE3, SSE4.1, SSE4.2, AVX, AVX2, ...). Personally, I am not sure it would be fairly easy to port SVML to other architectures. The SVML library team may provide a better answer; I will double-check with them.

Given that SLEEF supports many different architectures, I think it has value for LLVM, at least until the LLVM community has finished porting SVML to other architectures after Intel open-sources it.

Thanks,
Xinmin

From: "Martin J. O'Riordan via llvm-dev" <llvm-dev@lists.llvm.org>
To: "Naoki Shibata" <shibatch.sf.net@gmail.com>, "Vedant Kumar" <vsk@apple.com>
Cc: llvm-dev@lists.llvm.org
Sent: Friday, July 15, 2016 1:09:44 AM
Subject: Re: [llvm-dev] RFC: SIMD math-function library

I am looking forward to porting it to our platform; I know that this
will be a significant benefit.

We support 'v8f16' and 'v4f32' FP vector types natively, and having
this library provide the optimised math functions for them will
definitely be very useful.

All the best,

  MartinO

From: Naoki Shibata [mailto:shibatch.sf.net@gmail.com]
Sent: 15 July 2016 05:46
To: Martin.ORiordan@Movidius.com; 'Vedant Kumar' <vsk@apple.com>
Cc: llvm-dev@lists.llvm.org
Subject: Re: [llvm-dev] RFC: SIMD math-function library

Hi Martin,

Thank you for your comment.

It is of course possible to rewrite SLEEF in a more generic way, and
I actually once tried to do that using the vector data type in GCC.
But the code generated from such source code was far less efficient
than the version with explicit SIMD intrinsics.

Adding typedefs to specify the exact types is possible.
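As an illustration of the "generic" style discussed here, the sketch below uses the GCC/Clang vector extension together with a typedef that names the exact vector type. It is not taken from SLEEF: the function `poly2_v4f32` and its coefficients are hypothetical, and the typedef name `v4f32` simply borrows the type name mentioned earlier in the thread.

```c
/* Four packed floats, declared with the GCC/Clang vector extension
   (not standard C). The typedef names the exact vector type. */
typedef float v4f32 __attribute__((vector_size(16)));

/* Evaluate c0 + c1*x + c2*x^2 on all four lanes using Horner's scheme.
   The arithmetic operators apply lane-wise to vector operands. */
static v4f32 poly2_v4f32(v4f32 x, float c0, float c1, float c2) {
    v4f32 k0 = {c0, c0, c0, c0};
    v4f32 k1 = {c1, c1, c1, c1};
    v4f32 k2 = {c2, c2, c2, c2};
    return (k2 * x + k1) * x + k0;
}
```

Code in this style is independent of any particular vector ISA, but the quality of the generated instructions depends entirely on the compiler, which is the trade-off described above.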

Regards,

Naoki Shibata

> Having support for vector equivalents to the ISO C math functions
> is very valuable, and this kind of work is of great benefit.
>
> There are a couple of things, though, that concern me about this
> proposal:
>
> 1. OpenCL C already provides a vector math binding that for the
>    most part provides this equivalence. It also supports vectors
>    of multiple types through overloading. Perhaps it might be
>    possible to align SLEEF with OpenCL C?

[+Tom]

This is an interesting point. It might certainly make sense to integrate these routines with our OpenCL library implementation as well for targets that would benefit. Currently, we have scalar implementations of many math functions (e.g. http://llvm.org/svn/llvm-project/libclc/trunk/generic/lib/math/tanh.cl), and "vectorized" versions which just call the scalar functions (http://llvm.org/svn/llvm-project/libclc/trunk/generic/lib/clcmacro.h). If nothing else, it might make sense to borrow their naming convention?

Thanks again,
Hal

Is it possible to see the source code of the open-sourced SVML? The diff file does not include the library. I searched the Internet but could not find it.

Regards,

Naoki Shibata

It was open-sourced for GCC. I will get you the contact who did the open-sourcing. Thanks.

Xinmin.

Naoki, below is the link where you can get the code.

https://sourceware.org/git/?p=glibc.git;a=tree;f=sysdeps/x86_64/fpu/multiarch;h=2c567a353c2d258dbc08c50cd6fa189b825f3257;hb=HEAD

Xinmin

Thank you. And now I understand why it is not very easy to port SVML to other architectures.

Naoki Shibata

Hi everyone,

I think that everyone is on the same page. We'll put together a patch for review.

One remaining question: There seem to be two potential homes for this library: parallel_libs and compiler-rt. Opinions on where the vectorized math functions should live? My inclination is to target it for the new parallel_libs project, in part because I feel like compiler-rt has too many things grouped together already, and in part because vectorization is a form of parallel execution. Thoughts?

Thanks again,
Hal

I don't have a strong preference between parallel_libs, compiler-rt, or a new project as the home for vectorlib. Assuming we go with parallel_libs, the structure is more or less like below, right?

                   parallel_libs

I share your preference and the basis for it.

Why is there any motivation to bundle it with unrelated stuff at all?
What's the benefit? If it's just to prop up the existence of
parallel_libs, then I don't think that makes sense.. Should we move
llvm loop optimizations over to parallel_libs as well?

If this is just a bikeshed argument, of course chandler will get his
way and nobody else matters..

Hopefully, the decision is driven by points like: maintaining a clear
modular design, repo with the same name it had before, works
independent of any compiler, clearly defined what it is and who is
working on it as well as the goals..

(Which is the exact opposite of parallel_libs which is a meta-bucket
of dumping "stuff") Another reason why parallel_libs doesn't make
sense is that it's still extremely low visibility or relevance. Was a
mailing list set up for it? If it's a real project, why wasn't that
list on cc?

I'd opt to go with what the author wants or worst case compiler-rt in
the event people refuse to create another repo. The nature of the
functions it implements is complementary to what's there already,
better visibility as well as something people may be checking out
already.

From: "C Bergström" <cbergstrom@pathscale.com>
To: "Chandler Carruth" <chandlerc@gmail.com>
Cc: "Hal Finkel" <hfinkel@anl.gov>, "llvm-dev" <llvm-dev@lists.llvm.org>, "Matt Masten" <matt.masten@intel.com>,
"Naoki Shibata" <shibatch.sf.net@gmail.com>
Sent: Wednesday, July 27, 2016 9:43:34 PM
Subject: Re: [llvm-dev] RFC: SIMD math-function library

> Why is there any motivation to bundle it with unrelated stuff at all?
> What's the benefit? If it's just to prop up the existence of
> parallel_libs, then I don't think that makes sense..

I don't think that parallel_libs needs propping - at the moment it is so new that parallel_libs-dev has zero messages. I don't see a strong need for another new top-level project, with whatever administrative overhead that implies. I'm not against it either. If the community wants a new top-level project for this library, then I'm sure we can make one.

> Should we move
> llvm loop optimizations over to parallel_libs as well?

;)

> If this is just a bikeshed argument, of course chandler will get his
> way and nobody else matters..

While many of us respect Chandler's opinion, that's not actually the way the community works.

> Hopefully, the decision is driven by points like: maintaining a clear
> modular design, repo with the same name it had before, works
> independent of any compiler, clearly defined what it is and who is
> working on it as well as the goals..

To be clear, I think the community should decide on the name. Using the name it has now is one option. That name is SLEEF (SIMD Library for Evaluating Elementary Functions). We might also wish to name it something more generic as part of the project, as is our general custom (e.g. compiler-rt, libc++, libomp, etc.).

> (Which is the exact opposite of parallel_libs which is a meta-bucket
> of dumping "stuff") Another reason why parallel_libs doesn't make
> sense is that it's still extremely low visibility or relevance. Was a
> mailing list set up for it? If it's a real project, why wasn't that
> list on cc?

Because the RFC was on this list, and as you might recall, we recently had a big discussion on this list about mailing lists, and about how cross-posting between different lists is a real pain for the list moderators. Thus, I didn't. If we target the library to the parallel_libs project, then future discussion will go there. In the meantime, I am assuming that the relevant parties are on this list.

> I'd opt to go with what the author wants or worst case compiler-rt in
> the event people refuse to create another repo. The nature of the
> functions it implements is complementary to what's there already,
> better visibility as well as something people may be checking out
> already.

I agree that it is complementary to what is already in compiler-rt. That is why I suggested it as the second option.

Thanks again,
Hal

I'm a positive +1 for inclusion, since it has users, has some ongoing
development, and overall fits with the compiler genre.

If there's a bikeshed discussion on changing the name or where it
lives, I'd hope that we start a new discussion for that so it can be
easier to filter as well as stating clearly the pros/cons of each
proposal. I'm really really bored of reading all the bikeshed
discussions lately and the opinions, sometimes without strong
technical backing for why people choose Green or Blue.