Masked vector intrinsics and name mangling

Hi,

The proposed masked vector intrinsics are overloaded: one intrinsic ID covers multiple types.
After name mangling, a call will look like this:

%res = call <16 x i32> @llvm.masked.load.v16i32.p0i32.v16i32.i32.v16i1(i32* %addr, <16 x i32> %passthru, i32 4, <16 x i1> %mask)

With 6 types and 3 vector sizes, that makes 18 names for one operation.
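
As a rough illustration of where the long name comes from, the Python sketch below rebuilds the suffix from the overloaded types in the call above. It is a simplified model, not LLVM's implementation: the helper names (`mangle_type`, `mangled_name`) and the tuple encoding of types are assumptions made for this example, and only integer, vector, and pointer types are handled.

```python
# Simplified model of how an overloaded intrinsic name gets its suffix.
# Types use a hypothetical tuple encoding, for illustration only:
#   ("int", bits), ("vec", count, elem_type), ("ptr", addrspace, pointee_type)

def mangle_type(ty):
    """Return the name fragment for one type, e.g. v16i32 or p0i32."""
    kind = ty[0]
    if kind == "int":
        return "i%d" % ty[1]
    if kind == "vec":
        return "v%d%s" % (ty[1], mangle_type(ty[2]))
    if kind == "ptr":
        return "p%d%s" % (ty[1], mangle_type(ty[2]))
    raise ValueError("unsupported type kind: %r" % (kind,))

def mangled_name(base, overloaded_types):
    """Append one fragment per overloaded type, separated by dots."""
    return ".".join([base] + [mangle_type(t) for t in overloaded_types])

i1 = ("int", 1)
i32 = ("int", 32)
v16i32 = ("vec", 16, i32)
v16i1 = ("vec", 16, i1)
p0i32 = ("ptr", 0, i32)

# Order: return type, pointer, passthru, alignment, mask.
name = mangled_name("llvm.masked.load", [v16i32, p0i32, v16i32, i32, v16i1])
print(name)
```

Every type declared as an "any" type in the definition contributes one fragment, which is why five overloaded slots produce five suffixes.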

I propose removing name mangling from these intrinsics:
%res = call <16 x i32> @llvm.masked.load(i32* %addr, <16 x i32> %passthru, i32 4, <16 x i1> %mask)

def int_masked_load :
  Intrinsic<[llvm_anyvector_ty],
            [llvm_anyptr_ty, llvm_anyvector_ty, llvm_anyint_ty, llvm_anyvector_ty],
            [IntrReadArgMem, NoNameMangling]>; // NoNameMangling: new property

This would make the IR significantly easier to read and to write by hand.
What do you think?

Thank you.

- Elena

From: "Elena Demikhovsky" <elena.demikhovsky@intel.com>
To: llvmdev@cs.uiuc.edu
Sent: Sunday, October 26, 2014 5:34:46 AM
Subject: [LLVMdev] Masked vector intrinsics and name mangling

We already have this kind of situation for @llvm.memcpy and friends, and while it can make the IR look verbose at times, we have reasonable interfaces for creating and manipulating these at the C++ level, so I don't think it is worthwhile to further complicate the system.

-Hal

Hal, thank you for your opinion.
I was just confused when I saw such a long name: "llvm.masked.load.v16i32.p0i32.v16i32.i32.v16i1".
If we keep a short name, we take a step towards an instruction-like form.

- Elena

From: "Elena Demikhovsky" <elena.demikhovsky@intel.com>
To: "Hal Finkel" <hfinkel@anl.gov>
Cc: llvmdev@cs.uiuc.edu
Sent: Sunday, October 26, 2014 10:17:49 AM
Subject: RE: [LLVMdev] Masked vector intrinsics and name mangling

I completely understand; I just don't think it matters all that much, and the logic necessary to handle it would just become a source of bugs (and thus a distraction). You don't need to worry about the exact mangled name when working with the intrinsic at the IR level. Let's get the intrinsic in first; we can always shorten the name later in a backward-compatible fashion if we decide it is worthwhile. One could argue that the mangling is really unnecessary on all of the intrinsics, and maybe that is an improvement worth making, but I think we should deal with it as a separate matter.

Thanks again,
Hal

I agree with Hal on both points here. It's worth discussing how to remove mangling for all intrinsics, but that should be a separate discussion.

+1 for the proposed intrinsics

Philip

Thank you. I'll reopen the question once I have the intrinsics completed.

- Elena

I think we can actually handle this with only a minor improvement to the existing intrinsic code. For example, fabs is defined as

def int_fabs : Intrinsic<[llvm_anyfloat_ty], [LLVMMatchType<0>]>;

and so you can just write one type, not two: @llvm.fabs.f64(double %Val)

Similarly, you gave this definition

def int_masked_load :
  Intrinsic<[llvm_anyvector_ty],
            [llvm_anyptr_ty, llvm_anyvector_ty, llvm_anyint_ty, llvm_anyvector_ty],
            [IntrReadArgMem, NoNameMangling]>; // NoNameMangling: new property

but we could add some new types (and force the alignment to be i32, which matches memcpy) to make it

def int_masked_load :
  Intrinsic<[llvm_anyvector_ty],
            [LLVMPointerToType<0>, LLVMMatchType<0>, llvm_i32_ty, LLVMVectorWithSameWidth<0, llvm_i1_ty>],
            [IntrReadArgMem]>;

Now LLVM can infer everything from the return type and so the whole name here is

%res = call <16 x i32> @llvm.masked.load.v16i32(<16 x i32>* %addr, <16 x i32> %passthru, i32 4, <16 x i1> %mask)

Note that I’ve defined ‘LLVMPointerToType<0>’ here, which would make the first argument a vector pointer. If you prefer the scalar pointer as you originally gave it, then I’m sure we can define something even more complicated like ‘LLVMPointerToType<LLVMGetScalarType<0>>’.
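
To make the inference concrete, here is a small Python sketch of the idea: when only the return type is an "any" type, every argument type can be derived from it, and only that one type appears in the name. This is a toy model under stated assumptions, not LLVM code. The slot encoding, the `resolve` helper, and the choice of address space 0 for the derived pointer are all hypothetical, and `pointer_to` and `same_width` merely stand in for the proposed LLVMPointerToType and LLVMVectorWithSameWidth.

```python
# Toy model: derive argument types from the single overloaded (return) type.
# Types use a hypothetical tuple encoding:
#   ("int", bits), ("vec", count, elem_type), ("ptr", addrspace, pointee_type)

def mangle_type(ty):
    kind = ty[0]
    if kind == "int":
        return "i%d" % ty[1]
    if kind == "vec":
        return "v%d%s" % (ty[1], mangle_type(ty[2]))
    if kind == "ptr":
        return "p%d%s" % (ty[1], mangle_type(ty[2]))
    raise ValueError("unsupported type kind: %r" % (kind,))

def resolve(slot, overloaded):
    """Return the concrete type for one parameter slot."""
    kind = slot[0]
    if kind == "fixed":        # stands in for a fixed type such as llvm_i32_ty
        return slot[1]
    if kind == "match":        # stands in for LLVMMatchType<N>
        return overloaded[slot[1]]
    if kind == "pointer_to":   # stands in for the proposed LLVMPointerToType<N>
        return ("ptr", 0, overloaded[slot[1]])  # address space 0 is assumed
    if kind == "same_width":   # stands in for LLVMVectorWithSameWidth<N, elem>
        return ("vec", overloaded[slot[1]][1], slot[2])
    raise ValueError("unknown slot kind: %r" % (kind,))

i1 = ("int", 1)
i32 = ("int", 32)
v16i32 = ("vec", 16, i32)

overloaded = [v16i32]  # only the return type is overloaded
params = [("pointer_to", 0), ("match", 0), ("fixed", i32), ("same_width", 0, i1)]

arg_types = [resolve(slot, overloaded) for slot in params]
name = ".".join(["llvm.masked.load"] + [mangle_type(t) for t in overloaded])
print(name, [mangle_type(t) for t in arg_types])
```

In this model the derived slots never add a suffix, so the name carries just the one fragment for the return type.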

Thanks,
Pete

Thank you, I’ll take it.