LLVM FunctionType cannot be returned as VectorType?

Dear all,

I am using the LLVM C++ API to generate some code. In particular, I am dealing with the AVX2 SIMD API, which uses __m256i.

My function takes a set of vectors as input, and its return type is also a vector.

///////////////////////////////////////////////////////////////////////////////////////////

arguments.push_back(VectorType::get(IntegerType::getIntNTy(TheContext, 64), 4)); // int64 * 4 = __m256i
FunctionType *proto = FunctionType::get(VectorType::get(IntegerType::getIntNTy(TheContext, 64), 4), // int64 * 4 = __m256i
                                        arguments, false);

///////////////////////////////////////////////////////////////////////////////////////////

With this prototype I can generate the expected IR for my function:

///////////////////////////////////////////////////////////////////////////////////////////

define <4 x i64> @tpchq6(<4 x i64> %leaf7, <4 x i64> %leaf8, <4 x i64> %leaf9, <4 x i64> %leaf10, <4 x i64> %leaf11, <4 x i64> %leaf12, <4 x i64> %leaf13, <4 x i64> %leaf14) {

entry:

%addtmp = add <4 x i64> %leaf14, %leaf13

%leaf8.neg = sub <4 x i64> zeroinitializer, %leaf8

%xortmp = xor <4 x i64> %addtmp, %leaf11

%addtmp1 = add <4 x i64> %leaf8.neg, %leaf7

%subtmp = add <4 x i64> %addtmp1, %leaf9

%addtmp2 = add <4 x i64> %subtmp, %leaf10

%addtmp3 = add <4 x i64> %addtmp2, %xortmp

ret <4 x i64> %addtmp3

}

///////////////////////////////////////////////////////////////////////////////////////////

However, when I run it with the JIT ExecutionEngine, the vector return value does not come back properly. The same JIT setup works fine with a non-vector return type such as int64.

My code is as follows. It always fails with a segmentation fault:

///////////////////////////////////////////////////////////////////////////////////////////
// Define the input/output data type in LLVM function
typedef std::vector<int64_t> VecInt;

auto function = reinterpret_cast<VecInt (*)(VecInt , VecInt, VecInt, VecInt, VecInt, VecInt, VecInt, VecInt)>(TheExecutionEngine->getFunctionAddress(TheFunction->getName().str()));
VecInt result = function(functionCallArgs[0],functionCallArgs[1],functionCallArgs[2],functionCallArgs[3],
functionCallArgs[4],functionCallArgs[5],functionCallArgs[6],functionCallArgs[7]);

std::cout << "result size " << result.size() << "\n";

///////////////////////////////////////////////////////////////////////////////////////////

Can someone tell me whether this is the correct way to retrieve a vector return value? Or is returning a vector type not supported at all?

Thanks,
Jia Yu

Hi Jia

I don’t think this is a problem with the ExecutionEngine. Your problem comes from confusing the “Vector Type” in LLVM IR [1] with the “std::vector” data structure in the C++ STL. While there is no direct relation between the two, you should be able to use a std::vector to provide the input for the <4 x i64> Vector Type by passing the std::vector’s raw data [3].
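To illustrate the distinction, here is a rough sketch (not taken from the original code). A std::vector object is only a handle that owns a separate buffer, whereas a <4 x i64> value is just four 64-bit lanes. The alias `Lanes` and the helper `copyLanes` are hypothetical names introduced for this example.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-in for the four lanes of a <4 x i64> value.
using Lanes = std::array<int64_t, 4>;
static_assert(sizeof(Lanes) == 4 * sizeof(int64_t),
              "four 64-bit lanes, no padding expected");

// Copy the first four elements out of a std::vector's raw storage.
// The std::vector object itself is only a handle; the lane data lives
// in the buffer returned by data().
Lanes copyLanes(const std::vector<int64_t> &v) {
  assert(v.size() >= 4 && "need at least four elements");
  Lanes out{};
  std::memcpy(out.data(), v.data(), sizeof(out));
  return out;
}
```

The point is only that the lane data has to be taken from `data()`; the std::vector object itself cannot stand in for the IR vector.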

However, it would be easier with something like this:

using VecInt = int64_t[4];
VecInt args0 { 0, 1, 2, 3 };

VecInt result = function(args0, …);

Btw.: Note that you may need to set target-feature attributes for your function like so: [3]

Hope it helps.

Cheers,
Stefan

[1] [2] [3] [4]

Hi Stefan,

Thank you very much for answering my question!

I followed your suggestion but the function still cannot return the correct result. I also set target-feature attributes for my function. I am using LLVM 6.0.

It only prints out some random large numbers but the correct answer is supposed to be all 0.

Can you please help me figure out what’s going on here? Any help will be greatly appreciated.

///////////////////////////////////////////////////////////////////////////////////////////

My function prototype definition:

auto vectorDataType = VectorType::get(IntegerType::getIntNTy(TheContext, 64), 4);
std::vector<Type *> vecArguments;
for (Uint64 nodeId = startOfLeaves; nodeId < numNodes; ++nodeId) {
  vecArguments.push_back(vectorDataType);
}
proto = FunctionType::get(vectorDataType, // int64 * 4 = __m256i
                          vecArguments, false);

///////////////////////////////////////////////////////////////////////////////////////////

The generated IR:

define <4 x i64> @tpchq6(<4 x i64> %leaf7, <4 x i64> %leaf8, <4 x i64> %leaf9, <4 x i64> %leaf10, <4 x i64> %leaf11, <4 x i64> %leaf12, <4 x i64> %leaf13, <4 x i64> %leaf14) #0 {
entry:
%addtmp = add <4 x i64> %leaf8, %leaf7
%addtmp1 = add <4 x i64> %addtmp, %leaf9
%addtmp4 = add <4 x i64> %addtmp1, %leaf10
%addtmp2 = add <4 x i64> %addtmp4, %leaf11
%addtmp3 = add <4 x i64> %addtmp2, %leaf12
%addtmp5 = add <4 x i64> %addtmp3, %leaf13
%addtmp6 = add <4 x i64> %addtmp5, %leaf14
ret <4 x i64> %addtmp6
}

///////////////////////////////////////////////////////////////////////////////////////////

My JIT function call:

using VecInt = int64_t[4];

auto function = (int64_t *(*)(
VecInt
, VecInt
, VecInt
, VecInt
, VecInt
, VecInt
, VecInt
, VecInt
))(TheExecutionEngine->getFunctionAddress(TheFunction->getName().str()));

VecInt argsX = {0,0,0,0};
int64_t* result = function(argsX,argsX,argsX,argsX,argsX,argsX,argsX,argsX);

///////////////////////////////////////////////////////////////////////////////////////////
My output result:

422162285262848 562251371602737 843692813695832 422162285262848

It only prints out some random large numbers but the correct answer is supposed to be all 0.

///////////////////////////////////////////////////////////////////////////////////////////

Thanks,
Jia

In x86 ABI terms, a result that is a vector is returned in %xmm0 (or %ymm0/%zmm0 if the size is >128 bits). Integer and pointer scalars are returned via %rax (or some subslice thereof).

The way you’re calling the function expects the value to be in %rax, whereas the callee returns it in a vector register (%ymm0 for a 256-bit vector), so you end up reading whatever happens to be left in a scratch register. What you’ll want to do is either use the gcc/clang vector intrinsics to get a vector type for the function call, or modify the code to read/write the vectors via memory references rather than passing them as arguments/return values.
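A minimal sketch of the second option (memory references) might look like the following. The function `add4` is a hypothetical stand-in written in plain C++ so the calling pattern can be tried without LLVM; in the real setup the address would come from the JIT, and the generated IR would load its operands from the pointers and store the result. The names `Add4Fn` and `callThroughPointer` are likewise assumptions made for this example.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a JIT-compiled function that takes its
// operands and its result slot as pointers instead of vector values.
extern "C" void add4(const int64_t *a, const int64_t *b, int64_t *out) {
  for (int i = 0; i < 4; ++i)
    out[i] = a[i] + b[i];
}

// Pointer type the host would cast the JIT address to. Only pointers
// cross the call boundary, so no vector register convention is involved.
using Add4Fn = void (*)(const int64_t *, const int64_t *, int64_t *);

// Call through the pointer the same way a JIT address would be called.
void callThroughPointer(Add4Fn fn, const int64_t *a, const int64_t *b,
                        int64_t *out) {
  fn(a, b, out);
}
```

With this shape the host and the generated code only have to agree on pointer passing, which avoids the %rax versus vector-register mismatch described above.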

Hi Joshua,

Thanks for your great comment. I created a ConstantVector in the IR, and I can now successfully use AVX intrinsics to retrieve the returned vector data. What remains is how to pass vectors into the LLVM function using intrinsics. Do you have any suggestions? Please forgive me if the question is too naive.

Below are the two IR versions I used. The first one works; the second one doesn’t. This suggests that I am not passing the __m256i values into the LLVM function correctly. Could you please take a look?

I really appreciate your help!

Jia

///////////////////////////////////////////////////////////////////////////////////////////

My function call:

__m256i input = _mm256_set_epi64x(1, 1, 1, 1);

__m256i result = _mm256_load_si256(function(&input,&input,&input,&input,&input,&input,&input,&input));
int64_t r0 = _mm256_extract_epi64(result, 0);
int64_t r1 = _mm256_extract_epi64(result, 1);
int64_t r2 = _mm256_extract_epi64(result, 2);
int64_t r3 = _mm256_extract_epi64(result, 3);

///////////////////////////////////////////////////////////////////////////////////////////
I can retrieve the returned value using the following IR

define <4 x i64> @tpchq6(<4 x i64> %leaf7, <4 x i64> %leaf8, <4 x i64> %leaf9, <4 x i64> %leaf10, <4 x i64> %leaf11, <4 x i64> %leaf12, <4 x i64> %leaf13, <4 x i64> %leaf14) #0 {
entry:
ret <4 x i64> <i64 5, i64 6, i64 7, i64 8>
}

///////////////////////////////////////////////////////////////////////////////////////////
I cannot retrieve the returned value using the following IR

define <4 x i64> @tpchq6(<4 x i64> %leaf7, <4 x i64> %leaf8, <4 x i64> %leaf9, <4 x i64> %leaf10, <4 x i64> %leaf11, <4 x i64> %leaf12, <4 x i64> %leaf13, <4 x i64> %leaf14) #0 {
entry:
ret <4 x i64> %leaf7
}

Thanks,
Jia Yu

You’re passing the address of the vectors into the function. What you should be passing is the vectors themselves.
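As a hedged sketch of what "passing the vectors themselves" means for the function-pointer type: the vector type appears by value in both the parameters and the return type. The example below uses the GCC/Clang `vector_size` extension instead of `__m256i` so that it builds without AVX flags; `v4i64`, `addTwo`, `AddTwoFn` and `callByValue` are hypothetical names, and `addTwo` stands in for the JIT-compiled function (which takes eight arguments in this thread).

```cpp
#include <cassert>
#include <cstdint>

// 32-byte vector of four int64_t lanes (GCC/Clang vector extension),
// the same size as __m256i.
typedef int64_t v4i64 __attribute__((vector_size(32)));

// Hypothetical stand-in for the JIT-compiled function.
v4i64 addTwo(v4i64 a, v4i64 b) { return a + b; }

// Vectors appear by value in the pointer type: no '*' on the
// parameters and no '*' on the return type.
using AddTwoFn = v4i64 (*)(v4i64, v4i64);

// Call through the pointer and receive the vector directly.
v4i64 callByValue(AddTwoFn fn, v4i64 a, v4i64 b) { return fn(a, b); }
```

When calling real JIT output, the host code presumably also has to be compiled with the same vector features as the generated function (for example with AVX2 enabled), since that determines whether 256-bit values travel in %ymm registers.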

Hi Joshua,

Thanks for your reply!

I did try passing the vectors directly to the function, but that gives me a segmentation fault. Do you know what the reason might be?

__m256i result = _mm256_load_si256(function(input,input,input,input,input,input,input,input));

Thanks,
Jia