About implementing a new intrinsic

Hi,

I want to implement a new intrinsic in LLVM that will denote a
parallel section within a function. I followed the documentation for
extending LLVM (http://llvm.org/docs/ExtendingLLVM.html), but
something about the working mechanism is not clear to me.

1. Why do we have to add support for the C backend? Is this only
necessary to transform the LLVM assembly (bytecode) into C code by
using the library APIs?

2. Is the support that we add for the specific target the one that
takes part in generating the target-specific (native) code from the
LLVM bytecode?

3. Can I introduce an intrinsic that is actually a call to my function
that implements the logic? I suppose it is possible, but unfortunately
I couldn't figure it out. For example, in GCC we can write an
intrinsic that translates to C code.

Thanks,
Ferad

Hi,

> I want to implement a new intrinsic in LLVM that will denote a
> parallel section within a function. I followed the documentation for
> extending LLVM (http://llvm.org/docs/ExtendingLLVM.html), but
> something about the working mechanism is not clear to me.

> 1. Why do we have to add support for the C backend?

Because it is an important part of LLVM. For example, bugpoint uses the
CBE as a reference code generator. If you break CBE, you break bugpoint.
If you break bugpoint, we will have a much harder time finding bugs in
LLC, LLI, and JIT. The CBE is just as important a code generator as any
of the targets. For example, it would be a vital component of
bootstrapping LLVM onto a platform that had no C++ compiler.

> Is this only
> necessary to transform the LLVM assembly (bytecode) into C code by
> using the library APIs?

I'm not sure what you mean by "by using the library APIs", but yes,
that's the basic idea. The CBE takes an LLVM Module (in C++ IR form) and
generates C code that will execute the represented program. It is
necessary for your intrinsic to be supported in the CBE.

> 2. Is the support that we add for the specific target the one that
> takes part in generating the target-specific (native) code from the
> LLVM bytecode?

There are two parts to this. First, you must lower the intrinsic from
its IR form (a Function*) into a SelectionDAG node. This is done in
SelectionDAGISel.cpp in the visitIntrinsicCall method. You might need to
change DAGCombiner.cpp and LegalizeDAG.cpp depending on the intrinsic
and its lowering. The second part is to modify the targets or generic
code gen to handle the lowered intrinsic call. This part is target
specific.

> 3. Can I introduce an intrinsic that is actually a call to my function
> that implements the logic? I suppose it is possible, but unfortunately
> I couldn't figure it out. For example, in GCC we can write an
> intrinsic that translates to C code.

As part of PR1297 (http://llvm.org/PR1297) I am about to make this
happen. There are certain kinds of intrinsics that want to have a
function body generated for them if a target or code generator cannot
handle the intrinsic natively. For example, the company I work for has
targets that know how to do a "bit_concat". That is, take two integers
of any width and concatenate them into a longer integer. Most other
targets, however, don't know how to do this natively. In such cases the
intrinsic for "bit_concat" is lowered to an internal function that does
the necessary shift/or implementation.
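
To make the shift/or fallback concrete, here is a minimal C++ sketch of what such a generated body might compute. Only the name "bit_concat" comes from the text above; the signature, the 64-bit container type, and the explicit width parameter are assumptions made for this illustration, not the interface of the actual intrinsic.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: concatenate `hi` and `lo` into one longer integer.
// `lo_width` is the number of significant bits in `lo`. The result holds
// `hi` in the upper bits and `lo` in the lower `lo_width` bits.
// Assumption: the combined value fits in 64 bits.
std::uint64_t bit_concat(std::uint64_t hi, std::uint64_t lo, unsigned lo_width) {
    assert(lo_width > 0 && lo_width < 64);
    // Mask off anything in `lo` above its declared width.
    const std::uint64_t lo_mask = (std::uint64_t{1} << lo_width) - 1;
    // Shift `hi` out of the way, then OR the low part in.
    return (hi << lo_width) | (lo & lo_mask);
}
```

For instance, concatenating the 8-bit values 0xAB and 0xCD with this sketch yields 0xABCD.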

I expect this functionality to be incorporated into LLVM in the next
week or so. Watch for changes to lib/CodeGen/IntrinsicLowering.cpp

Reid.

>> 3. Can I introduce an intrinsic that is actually a call to my function
>> that implements the logic? I suppose it is possible but unfortunately
>> I couldn't figure it out. For example, in GCC we can write an
>> intrinsic that translates to C code.
>
> As part of PR1297 (http://llvm.org/PR1297) I am about to make this
> happen. There are certain kinds of intrinsics that want to have a
> function body generated for them if a target or code generator cannot
> handle the intrinsic natively. For example, the company I work for has

IntrinsicLowering already does this. It lets you lower intrinsics to arbitrary LLVM calls, including calls to external functions.

-Chris

Hi,

> IntrinsicLowering already does this. It lets you lower intrinsics to
> arbitrary LLVM calls, including calls to external functions.

I will try to do that in the IntrinsicLowering class. Could you point me to an
intrinsic implementation that lowers to an LLVM call?

Thanks for the advice and hints,
Ferad

>> 3. Can I introduce an intrinsic that is actually a call to my function
>> that implements the logic? I suppose it is possible but unfortunately
>> I couldn't figure it out. For example, in GCC we can write an
>> intrinsic that translates to C code.
>
> As part of PR1297 (http://llvm.org/PR1297) I am about to make this
> happen. There are certain kinds of intrinsics that want to have a
> function body generated for them if a target or code generator cannot
> handle the intrinsic natively. For example, the company I work for has

> IntrinsicLowering already does this. It lets you lower intrinsics to
> arbitrary LLVM calls, including calls to external functions.

I think that when Ferad said "in GCC we can write an intrinsic that
translates to C code", he meant that the intrinsic would be expanded to
have a body, much as I'm planning to do in PR1297. To my
understanding, IntrinsicLowering doesn't support expansion to a function
with a body. Or am I just missing something?

Perhaps Ferad could explain in a little more detail what he meant?

Hi,

I will try to explain by giving an example.

Let's say that I have an intrinsic: int llvm.myintrinsic(int)
I have a function: int myintrinsic_handler(int)

When

%var = call int %llvm.myintrinsic( int %arg )

is encountered in the code, I want the code generator to put in its place a
call to the function "myintrinsic_handler"
(i.e. %var = call int %myintrinsic_handler( int %arg )),

or, in native assembly (e.g. x86), it would look something like this:
push %arg
call myintrinsic_handler
pop res
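
The rewrite described above can be sketched on a toy representation in C++. Everything here is hypothetical and for illustration only: "ToyCall" and "lower_intrinsics" are made-up names, not LLVM classes or APIs, and a real implementation would operate on LLVM's own call instructions instead of strings.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Toy stand-in for a call instruction; NOT an LLVM class.
struct ToyCall {
    std::string result;    // e.g. "%var"
    std::string callee;    // e.g. "llvm.myintrinsic"
    std::string argument;  // e.g. "%arg"
};

// Replace every call to a known intrinsic with a call to its handler,
// leaving the result and the argument untouched.
// Returns the number of calls that were rewritten.
std::size_t lower_intrinsics(std::vector<ToyCall>& calls,
                             const std::map<std::string, std::string>& handlers) {
    std::size_t rewritten = 0;
    for (ToyCall& call : calls) {
        auto it = handlers.find(call.callee);
        if (it != handlers.end()) {
            assert(!it->second.empty());  // a handler must have a name
            call.callee = it->second;
            ++rewritten;
        }
    }
    return rewritten;
}
```

With a handler table that maps "llvm.myintrinsic" to "myintrinsic_handler", the call from the example keeps its result and argument and only changes its callee.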

bswap expands into a series of shifts and or's, for example. It would be straight-forward to expand it into a libcall if you desired.

-Chris
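
As an illustration of the "series of shifts and or's" mentioned above, a 32-bit byte swap can be written in plain C++ as follows. This is a sketch of the idea only, not the code in IntrinsicLowering.cpp; the function names are made up for this example.

```cpp
#include <cassert>
#include <cstdint>

// Byte-swap a 32-bit value using only masks, shifts, and ORs.
std::uint32_t bswap32(std::uint32_t x) {
    return ((x & 0x000000FFu) << 24) |  // byte 0 moves to byte 3
           ((x & 0x0000FF00u) << 8)  |  // byte 1 moves to byte 2
           ((x & 0x00FF0000u) >> 8)  |  // byte 2 moves to byte 1
           ((x & 0xFF000000u) >> 24);   // byte 3 moves to byte 0
}

// Small self-check: swapping twice must give back the original value.
void check_bswap32() {
    assert(bswap32(0x11223344u) == 0x44332211u);
    assert(bswap32(bswap32(0xDEADBEEFu)) == 0xDEADBEEFu);
}
```

Expanding to a libcall instead would mean emitting a call to an external function with the same behavior in place of the inline shifts.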

>> IntrinsicLowering already does this. It lets you lower intrinsics to
>> arbitrary LLVM calls, including calls to external functions.
>
> I will try to do that in the IntrinsicLowering class. Could you point me to an
> intrinsic implementation that lowers to an LLVM call?

> bswap expands into a series of shifts and or's, for example. It would be
> straight-forward to expand it into a libcall if you desired.

I will take a look at how bswap is implemented. Currently I work on the
1.9 source. Are these features available there, or should I check out
from svn?

Thanks again for your helpful advice,
Ferad

This is exactly what intrinsic lowering does. To answer your other email, LLVM 1.9 also has this related stuff, including bswap:

-Chris

Thanks Chris

What parallelization model are you implementing?

Thanks,

Can you explain what you mean by a parallel section within a function?

--Vikram

>> I want to implement a new intrinsic in llvm that will denote a
>> parallel section within a function.
>
> Can you explain what you mean by a parallel section within a function?

Maybe it's a reference to OMP, http://www.openmp.org/. gcc has added
support for this (GOMP).

Ciao,

Duncan.

Hi,

>> I want to implement a new intrinsic in llvm that will denote a
>> parallel section within a function.
>
> Can you explain what you mean by a parallel section within a function?

I want to see how OpenMP's parallel construct fits into the LLVM architecture
and whether it is easy to implement. GCC is too heavy a platform to work on.

Ferad

When implemented in LLVM, OpenMP will be supported in a very similar way to what GCC does: it will extract out the parallel region (the part that runs on multiple threads) into a self-contained function.

-Chris
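
The outlining idea can be sketched in plain C++. All names below are hypothetical, and the "runtime" is a stand-in: it calls the outlined function once per logical thread, one after another, where a real OpenMP runtime would run each call on its own thread. The sketch shows only the shape of the transformation, under those assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Variables the parallel region needs, gathered into one struct so that
// the outlined function is self-contained.
struct RegionData {
    const std::vector<int>* input;
    std::vector<long>* partial_sums;  // one slot per logical thread
};

// The outlined parallel region: each logical thread sums a strided slice
// of the input and writes the result into its own slot.
void parallel_region_outlined(RegionData* data, std::size_t thread_id,
                              std::size_t num_threads) {
    assert(thread_id < num_threads);
    long sum = 0;
    for (std::size_t i = thread_id; i < data->input->size(); i += num_threads) {
        sum += (*data->input)[i];
    }
    (*data->partial_sums)[thread_id] = sum;
}

// Stand-in for a threading runtime (assumption: sequential execution here,
// so the example has no threading dependency).
long run_region(const std::vector<int>& input, std::size_t num_threads) {
    assert(num_threads > 0);
    std::vector<long> partial_sums(num_threads, 0);
    RegionData data{&input, &partial_sums};
    for (std::size_t t = 0; t < num_threads; ++t) {
        parallel_region_outlined(&data, t, num_threads);
    }
    long total = 0;
    for (long s : partial_sums) total += s;  // combine per-thread results
    return total;
}
```

Because the region only touches the struct it is handed and its own output slot, the calls are independent of one another, which is what makes the extracted function suitable for running on several threads.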

Hi,

>>> I want to implement a new intrinsic in llvm that will denote a
>>> parallel section within a function.
>> Can you explain what you mean by a parallel section within a function?
> I want to see how OpenMP's parallel construct fits into the LLVM architecture
> and whether it is easy to implement. GCC is too heavy a platform to work on.

> When implemented in LLVM, OpenMP will be supported in a very similar way
> to what GCC does: it will extract out the parallel region (the part that
> runs on multiple threads) into a self-contained function.

I agree also. But I just noticed that the LLVM internals are much
easier to manipulate (ok at least it seems to me).

Thanks,
Ferad