Vector code

Nicolas_Capens1 · May 8, 2008, 3:24pm

Hi all,

I’m trying to use LLVM to generate SIMD code at runtime (in particular Intel SSE), but I’m having a bit of trouble understanding how to create even the simplest function: adding two vectors of four single-precision floating-point elements. I can get it to add the elements one at a time, but not with a single vector instruction.

All help much appreciated!

Nicolas Capens

clattner · May 8, 2008, 5:14pm

I'd suggest writing code in C and seeing what llvm-gcc does with it. You can also look at (for example) llvm/test/CodeGen/X86/*.ll for many examples.

-Chris

Nicolas_Capens1 · May 8, 2008, 5:09pm

Hi Chris,

Thanks for the advice, but I'm actually not trying to compile code from
text. For now I'm just trying to construct the function directly. Think of
it as the vector equivalent of the HowToUseJIT.cpp example.

Cheers,

-Nicolas

Dan_Gohman3 · May 8, 2008, 5:47pm

What is your target set to? If LLVM thinks it's targeting a processor
that doesn't have SIMD instructions, it'll split vectors into
scalars like this.

Dan

Anton_Korobeynikov · May 8, 2008, 5:58pm

Nicolas,

Thanks for the advice, but I'm actually not trying to compile code from
text. For now I'm just trying to construct the function directly. Think of
it as the vector equivalent of the HowToUseJIT.cpp example.

llvm2cpp is your friend then. It's now a separate 'target' in llc. It
will generate C++ code that constructs the provided IR.

clattner · May 8, 2008, 6:13pm

Thanks for the advice, but I'm actually not trying to compile code from
text. For now I'm just trying to construct the function directly. Think of
it as the vector equivalent of the HowToUseJIT.cpp example.

There is a one to one mapping between text and IR. If you understand what to generate it is much easier to generate it. Otherwise, if you have a specific question, we can help answer that.

-Chris

Nicolas_Capens1 · May 8, 2008, 6:35pm

Hi Chris,

I don't know how to properly create vectors and add them. I can create
arrays, take individual elements and add them, but BinaryOperator::createAdd
doesn't work on vectors for me. The documentation is very extensive for
scalar types and there are plenty of examples, but I haven't found a
straightforward way to translate scalar code to vector code yet. Please bear
with me, I've only just started exploring LLVM's capabilities and I'm still
searching through the documentation for more details about vector types.

Thanks,

-Nicolas

Nicolas_Capens1 · May 8, 2008, 6:37pm

Hi Dan,

My CPU supports up to SSSE3, and I assume LLVM uses that as a target by
default? I don't think that's the problem really, I'm just struggling to
find the right functions/classes to create and manipulate vectors...

Thank you,

-Nicolas

Nicolas_Capens1 · May 8, 2008, 6:40pm

Hi Anton,

I assume that's the same as the online demo's "Show LLVM C++ API code"
option (http://llvm.org/demo/)? I've tried that with a structure containing
four floating-point components but it also appears to add them individually
using extract/insert. Maybe I have to try an array of floats...

Thanks,

Anton

Duncan_Sands · May 8, 2008, 7:29pm

I assume that's the same as the online demo's "Show LLVM C++ API code"
option (http://llvm.org/demo/)? I've tried that with a structure containing
four floating-point components but it also appears to add them individually
using extract/insert. Maybe I have to try an array of floats...

You need to use gcc's vector extensions.

Ciao,

Duncan.

From the gcc docs:

5.43 Using vector instructions through built-in functions
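
As a rough illustration of the gcc vector extensions mentioned here, a minimal C sketch might look like the following. The names `v4sf` and `add4` are chosen for illustration only, and whether the addition ends up as a single SSE instruction depends on the compiler and the target flags in use.

```c
/* Four packed single-precision floats (16 bytes), declared with the
   gcc vector_size attribute. The typedef name is illustrative. */
typedef float v4sf __attribute__((vector_size(16)));

/* The + operator applies element-wise to the whole vector. */
v4sf add4(v4sf a, v4sf b)
{
    return a + b;
}
```

Compiling this with llvm-gcc and inspecting the emitted IR is one way to see how a vector type and a vector add are represented.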

Matthijs_Kooijman1 · May 8, 2008, 7:38pm

Hi Nicolas (at least, I suspect your signing of your mail with "Anton" was not
intentional :-p),

I assume that's the same as the online demo's "Show LLVM C++ API code"
option (http://llvm.org/demo/)? I've tried that with a structure containing
four floating-point components but it also appears to add them individually
using extract/insert. Maybe I have to try an array of floats...

Did you turn off the link-time optimization flag (or something like that)? If
not, the compiler will optimize things like small structs away (though a
struct of more than 3 elements should not be scalarized directly AFAIK...).

Gr.

Matthijs

Nicolas_Capens1 · May 8, 2008, 8:46pm

Hi Matthijs,

Yes, I've turned off the link-time optimizations (otherwise it just
propagates my constant vectors and immediately prints the result).

Here's essentially what I try to generate:

void add(float z[4], float x[4], float y[4])
{
   z[0] = x[0] + y[0];
   z[1] = x[1] + y[1];
   z[2] = x[2] + y[2];
   z[3] = x[3] + y[3];
}

And here's part of the output from the online demo:

LoadInst* float_tmp2 = new LoadInst(ptr_x, "tmp2", false, label_entry);
LoadInst* float_tmp5 = new LoadInst(ptr_y, "tmp5", false, label_entry);
BinaryOperator* float_tmp6 = BinaryOperator::create(Instruction::Add, float_tmp2, float_tmp5, "tmp6", label_entry);
StoreInst* void_20 = new StoreInst(float_tmp6, ptr_z, false, label_entry);
GetElementPtrInst* ptr_tmp10 = new GetElementPtrInst(ptr_x, const_int32_13, "tmp10", label_entry);
LoadInst* float_tmp11 = new LoadInst(ptr_tmp10, "tmp11", false, label_entry);
GetElementPtrInst* ptr_tmp13 = new GetElementPtrInst(ptr_y, const_int32_13, "tmp13", label_entry);
LoadInst* float_tmp14 = new LoadInst(ptr_tmp13, "tmp14", false, label_entry);
BinaryOperator* float_tmp15 = BinaryOperator::create(Instruction::Add, float_tmp11, float_tmp14, "tmp15", label_entry);
...

So it just processes one element at a time instead of with one (SIMD)
operation.

Thank you,

-Nicolas (not Anton)

Evan_Cheng1 · May 8, 2008, 9:30pm

llvm does not automatically vectorize your scalar code (at least for now). You have to write gcc generic vector code or use vector builtins.

Evan
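
To make the "gcc generic vector code" suggestion concrete, here is one way the `add` function from earlier in the thread could be rewritten so that the addition is expressed on a vector type. This is only a sketch: the typedef name `v4sf` is an assumption, and `memcpy` is used for the loads and stores so that no alignment is assumed for the incoming arrays.

```c
#include <string.h>

/* Four packed floats; the typedef name is illustrative. */
typedef float v4sf __attribute__((vector_size(16)));

/* Same signature as the scalar version, but a single vector add. */
void add(float z[4], float x[4], float y[4])
{
    v4sf vx, vy, vz;

    /* Copy the arrays into vector values without assuming alignment. */
    memcpy(&vx, x, sizeof vx);
    memcpy(&vy, y, sizeof vy);

    vz = vx + vy;               /* element-wise add of all four lanes */

    memcpy(z, &vz, sizeof vz);  /* write the result back */
}
```

Feeding this form to the front end, rather than the four scalar statements, is what gives the code generator a vector operation to work with.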

Frits_van_Bommel1 · May 8, 2008, 9:56pm

Nicolas Capens wrote:

Here's essentially what I try to generate:

void add(float z[4], float x[4], float y[4])
{
   z[0] = x[0] + y[0];
   z[1] = x[1] + y[1];
   z[2] = x[2] + y[2];
   z[3] = x[3] + y[3];
}

This is the vectorized llvm-assembly equivalent:

Nicolas_Capens1 · May 9, 2008, 10:10am

Hi Evan,

Please note that I'm not trying to compile from C code, I try to generate
functions at run-time directly. I want to keep it target-independent too, so
I can't use intrinsics either.

Cheers,

-Nicolas

Nicolas_Capens1 · May 9, 2008, 10:27am

Hi Frits,

Thanks for the suggestions! I was first able to successfully compile it to
bitcode (.bc format). llc doesn't support "-march=cpp", but then I ran
llvm2cpp which does give me the C++ code to directly create the intermediate
representation. Now I can study that to see what I was doing wrong
earlier...

Thanks again!

-Nicolas

Mike_Stump1 · May 10, 2008, 12:03am

Ah, but you can use any intrinsic that is target independent.... The gcc vector stuff is meant to work on all targets as I recall.

Daniel_Berlin1 · May 10, 2008, 4:38am

Yes, it does.

clattner · May 10, 2008, 4:43am

FWIW, LLVM IR supports a broad superset of them and has first class support for permutation, insertion, extraction, etc.

-Chris
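
For the insertion and extraction operations mentioned above, the closest counterpart in the gcc vector extensions is plain subscripting of a vector value. The sketch below assumes a compiler that supports vector subscripting (recent gcc and clang do); the helper names `get_lane` and `set_lane` are made up for illustration. Permutation is left out because the relevant builtins differ between compilers.

```c
/* Four packed floats; the typedef name is illustrative. */
typedef float v4sf __attribute__((vector_size(16)));

/* Extraction: read one lane, comparable to extractelement in LLVM IR. */
float get_lane(v4sf v, int i)
{
    return v[i];
}

/* Insertion: return a copy with one lane replaced, comparable to
   insertelement. The argument is passed by value, so the caller's
   vector is left unchanged. */
v4sf set_lane(v4sf v, int i, float value)
{
    v[i] = value;
    return v;
}
```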
