Suggestions on code generation for SIMD

Hi everyone,

I’m quite new to LLVM, but I am working on a project that may need to generate some SIMD code using LLVM. The SIMD code will use Intel MIC intrinsics, and I’m not sure about the steps and tool set I need to use to generate it.

I am also unsure about the following questions:

  1. Do people usually generate SIMD code at the source level, using types such as __m512?
  2. If not, does LLVM have corresponding IR instructions for SIMD registers and instructions?

Since I’m new, I would appreciate any help pointing me in the right direction, at any level. Some references would also help. Thanks in advance!

Hi Linchuan,

I believe clang supports the Intel AVX512 intrinsics, so it should be possible to generate vector code using those.

For 2), LLVM has first-class vector types such as <4 x i32> and can do the usual operations on those types, including masking. The vectoriser is where most of the vector code that LLVM generates originates. These types aren’t target-specific, however, and there is no notion of vector “registers” at the IR level.
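
For example, clang’s vector extension (the vector_size attribute, used here purely as an illustration) maps directly onto those IR vector types:

    /* Illustration only: a 16-byte integer vector becomes LLVM's <4 x i32>. */
    typedef int v4si __attribute__((vector_size(16)));

    v4si add4(v4si a, v4si b) {
      /* Lowered to a single IR-level `add <4 x i32>` instruction. */
      return a + b;
    }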

Cheers,
Amara

Thanks Amara so much for the info!

One more question: what do people usually do if they want to generate vectorized code for some existing C/C++ code?
Do they usually do a C/C++ source-level transformation, or do they work at LLVM’s IR level?

I know clang supports auto-vectorization, such as loop vectorization and SLP, but these are not flexible enough if we
want to do more custom vectorization or handle more complex cases. For example, SLP might not be able to handle
branches in the code (or maybe the latest version can already handle branches using masks).

The vast majority of the time people will rely on source-level pragmas [1]. LLVM IR is designed to be machine friendly; it is not intended for users to edit manually. You can do it, but it’s tedious and error prone. If you need more control over the vectorisation than the pragmas allow, then the C intrinsics are the best choice.
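
For instance, a loop hint in the source (a minimal sketch, using the pragmas documented in [1]) looks like this:

    void scale(float *a, float k, int n) {
      /* Ask clang's loop vectoriser to vectorise this loop with a width
         of 8; the hint is a request, not a guarantee. */
      #pragma clang loop vectorize(enable) vectorize_width(8)
      for (int i = 0; i < n; ++i)
        a[i] *= k;
    }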

Amara

[1] http://clang.llvm.org/docs/LanguageExtensions.html#extensions-for-loop-hint-optimizations

Thanks very much, Amara! I will take a look!

    The vast majority of the time people will rely on source-level pragmas [1].
    LLVM IR is designed to be machine friendly; it is not intended for users to
    edit manually. You can do it, but it’s tedious and error prone. If you need
    more control over the vectorisation than the pragmas allow, then the C
    intrinsics are the best choice.

    Amara

    [1] http://clang.llvm.org/docs/LanguageExtensions.html#extensions-for-loop-hint-optimizations

A large portion of users still use intrinsics too, as provided in
avxintrin.h and the like. They are then lowered to a single LLVM
instruction, or a few, with vector operands.
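
For example, a user-side snippet (hypothetical, and it needs -mavx when compiling) could look like this; dumping the IR with `clang -O2 -mavx -S -emit-llvm` shows the intrinsic reduced to a plain vector add:

    #include <immintrin.h>

    /* Hypothetical example: the AVX intrinsic below is lowered by clang to
       an IR-level fadd on <4 x double>, not to an opaque call. */
    __m256d add_pd(__m256d a, __m256d b) {
      return _mm256_add_pd(a, b);
    }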

Thanks Serge! This means that for every new intrinsic set, a systematic change has to be made to LLVM to support it, right? That change would include frontend changes, IR instruction set changes, as well as low-level code generation changes?

It really depends. In most cases, the intrinsic is implemented in terms
of generic vector instructions, directly represented at the LLVM IR level:

    static __inline __m256d __DEFAULT_FN_ATTRS
    _mm256_sub_pd(__m256d __a, __m256d __b)
    {
      return (__m256d)((__v4df)__a-(__v4df)__b);
    }

But some intrinsics cannot be modeled that way:

    static __inline __m256d __DEFAULT_FN_ATTRS
    _mm256_hadd_pd(__m256d __a, __m256d __b)
    {
      return (__m256d)__builtin_ia32_haddpd256((__v4df)__a, (__v4df)__b);
    }

In that case, the builtin is relatively opaque to the middle end and is
lowered in the backend (see include/llvm/IR/IntrinsicsX86.td).
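
To make the contrast concrete, here is a small sketch (assumptions noted in the comments) of what the two intrinsics above roughly become at the IR level:

    #include <immintrin.h>

    /* Sketch, assuming `clang -O2 -mavx -S -emit-llvm`: the subtraction
       becomes a plain `fsub <4 x double>` in the IR, while the horizontal
       add becomes a call to the target intrinsic
       `@llvm.x86.avx.hadd.pd.256`, which the X86 backend later lowers to
       a vhaddpd instruction. */
    __m256d sub_then_hadd(__m256d a, __m256d b) {
      __m256d d = _mm256_sub_pd(a, b);  /* generic IR: fsub */
      return _mm256_hadd_pd(d, b);      /* opaque target intrinsic */
    }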