Inlining

OK, I wanted to understand function inlining in LLVM but had avoided
going to the effort of finding out if the inlining was really happening.
The advice I got to "use the assembly source, Luke" suggested I go
ahead and investigate inlining for a bit of practice, since (so I
figured) even a monkey with really weak x86-fu could tell whether a
function call was happening or not.

If this monkey can tell, it isn't happening. :-) I'll try to provide
all useful information.

For my null test, I attempted to specify no inlining in a little program
that computes a Very Important Number :-) :

'llc' is an IR-to-assembly compiler; at -O3 it does some pretty neat machine-code and object-file optimizations, but it does not apply high-level optimizations like CSE or inlining. 'opt' is the tool which does IR-to-IR optimization.

John.

A vital clue, but I'm still not getting it:

Try opt -O3.

-Chris

Dustin Laurence wrote:

A vital clue, but I'm still not getting it:

---
gemini:~/Projects/Nil/nil(0)$ make testInline.optdis.ll
llvm-as testInline.ll
opt -always-inline testInline.bc -o testInline.optbc
llvm-dis -f testInline.optbc -o testInline.optdis.ll
rm testInline.bc testInline.optbc
gemini:~/Projects/Nil/nil(0)$ cat testInline.optdis.ll
; ModuleID = 'testInline.optbc'

define linkonce fastcc i32 @foo(i32 %arg) alwaysinline {
  

Try using 'internal' linkage instead of 'linkonce'. If you're sure you really want linkonce then you'd need to use linkonce_odr to get inlining here.

Also, drop the alwaysinline attribute and the '-always-inline' flag. The normal inliner (a.k.a. "opt -inline", which is run as part of "opt -O3") should inline it.

Nick

Hi Dustin,

define linkonce fastcc i32 @foo(i32 %arg) alwaysinline

linkonce implies that the function body may change at link time. Thus it would
be wrong to inline it, since the code being inlined would not be the final code.
Use linkonce_odr to tell the compiler that the function body can be replaced
only by an equivalent function body.

Ciao,

Duncan.

I actually had, but as Nick Lewycky noticed the 'linkonce' linkage
specification was preventing the inlining.

Dustin

That did it, thanks.

Hi Duncan-

Forgive my confusion, but I can't help but notice that LangRef states:

Globals with "linkonce" linkage are merged with other globals of the same name when linkage occurs. This is typically used to implement inline functions, templates, or other code which must be generated in each translation unit that uses it. Unreferenced linkonce globals are allowed to be discarded.

Why would linkonce be used to implement inline functions if it's not safe to inline linkonce functions?

Alastair

Hi Alastair,

I was wrong - linkonce is an exception to the general rule that a "weak" linkage
type prevents inlining unless it is of the "_odr" form.

Ciao,

Duncan.

Except it really did prevent inlining in my test. If I follow, and I
probably don't, what you said matched the behavior of LLVM and the docs
didn't.

Dustin

Hello Dustin,

Always inline is the closest to a preprocessor macro you can get in LLVM Assembly since it doesn't have a preprocessor at all. LLVM does aggressive inlining for functions used only once so those instances don't require specification as alwaysinline.

--Sam

Always inline is the closest to a preprocessor macro you can get in
LLVM Assembly since it doesn't have a preprocessor at all.

Mine does. :-)

...LLVM does
aggressive inlining for functions used only once so those instances
don't require specification as alwaysinline.

What I'm trying to do is understand the practical use cases. Concrete
example: I have some little type accessor and conversion functions that
are typically two or three instructions long because all they really do
is manipulate tag data in the low-order bits of pointers. (I'm not
exactly innovative, am I?) While small, they are called all over the
place for boxing and unboxing language-level objects. In C they would
be explicitly inline. What is the LLVM equivalent?

My guess is the optimizer will always inline such tiny functions no
matter what as it's probably both a space and a time win, so maybe I
need a different example. Suppose they were typically five, or ten, or
twenty, or forty instructions long? Who is responsible for deciding on
the advisability of inlining? The front-end (which in this case is
actually me?)? That would be the equivalent of the C99/C++ 'inline'
compiler hint. Or in LLVM is it better not to give manual compiler
hints about inlining in most cases and let the optimizers decide?

I suppose it's a fuzzy question because I'm fishing for intended usage,
not just semantics.

Dustin
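
For readers who want the C-family picture behind this question, here is a minimal C++ sketch of the kind of tiny tag accessors described above, written the way they "would be explicitly inline" in C. Everything in it is hypothetical: the three-bit tag layout, the `kFixnumTag` value, and all function names are made up for illustration and are not taken from Dustin's interpreter. It also assumes an arithmetic right shift on signed values, which C++20 guarantees and mainstream compilers have long provided.

```cpp
#include <cstdint>

// Hypothetical layout: the low 3 bits of a word hold a type tag.
using Value = std::uintptr_t;

constexpr std::uintptr_t kTagBits   = 3;
constexpr std::uintptr_t kTagMask   = (std::uintptr_t{1} << kTagBits) - 1;
constexpr std::uintptr_t kFixnumTag = 1;  // made-up tag for small integers

// Extract the tag from the low-order bits.
inline std::uintptr_t tagOf(Value v) { return v & kTagMask; }

// Test whether a value carries the fixnum tag.
inline bool isFixnum(Value v) { return tagOf(v) == kFixnumTag; }

// Box: shift the integer up and put the tag in the freed low bits.
inline Value boxFixnum(std::intptr_t n) {
    return (static_cast<std::uintptr_t>(n) << kTagBits) | kFixnumTag;
}

// Unbox: shift the tag back out (assumes arithmetic right shift).
inline std::intptr_t unboxFixnum(Value v) {
    return static_cast<std::intptr_t>(v) >> kTagBits;
}
```

Each accessor is a mask, a shift, or a compare, so a cost-based inliner would normally be expected to inline them without any attribute, provided the linkage allows it. As far as I know, Clang gives C++ `inline` functions like these `linkonce_odr` linkage, which is the form this thread says can be inlined.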

Actually, the inliner doesn't inline linkonce either, because we have:
InlineCost InlineCostAnalyzer::getInlineCost(CallSite CS,
                               SmallPtrSet<const Function *, 16> &NeverInline) {
...
  // Don't inline functions which can be redefined at link-time to mean
  // something else. Don't inline functions marked noinline.
  if (Callee->mayBeOverridden() ||
      Callee->hasFnAttr(Attribute::NoInline) || NeverInline.count(Callee))
    return llvm::InlineCost::getNever();

I improved the langref description of linkonce in r93066.

-Chris
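
To make the check quoted above concrete, here is a minimal C++ sketch of how linkage feeds into the decision. It is a toy model, not LLVM's code: the `Linkage` enum and `inliningPermitted` are hypothetical names, and the set of linkages treated as overridable (plain linkonce and weak, but not their `_odr` forms) is an assumption based on what this thread reports rather than a copy of the real `mayBeOverridden()`.

```cpp
// Hypothetical, simplified stand-in for the linkage kinds discussed in
// this thread; the real LLVM enum has more members.
enum class Linkage { External, Internal, LinkOnce, LinkOnceODR, Weak, WeakODR };

// Assumed model: a body with non-ODR linkonce/weak linkage may be
// replaced at link time by a different definition.
constexpr bool mayBeOverridden(Linkage l) {
    return l == Linkage::LinkOnce || l == Linkage::Weak;
}

// Mirrors the shape of the quoted early exit: an overridable callee or a
// noinline callee is never inlined. An alwaysinline attribute is not
// consulted here, so it cannot rescue a linkonce function.
constexpr bool inliningPermitted(Linkage l, bool noInline) {
    return !(mayBeOverridden(l) || noInline);
}

// The cases from the thread, checked at compile time.
static_assert(!inliningPermitted(Linkage::LinkOnce, false),
              "plain linkonce blocks inlining");
static_assert(inliningPermitted(Linkage::LinkOnceODR, false),
              "linkonce_odr allows inlining");
static_assert(inliningPermitted(Linkage::Internal, false),
              "internal allows inlining");
```

In this model the only ways to get `@foo` inlined are the two Nick suggested: switch to `internal`, or use `linkonce_odr`.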

Hello Dustin,

Alwaysinline is not a hint. It forces something inline that otherwise wouldn't have been, as long as the linkage type permits it. (You just ran into a situation where linkage did not permit it.)

Personally, I don't see the need for a preprocessor in most circumstances. If you need to do type substitution you can use an opaque type. The only reason for conditional compilation is if you'd need to be able to generate inline assembly for the host (which shouldn't ever be absolutely necessary in LLVM except for legacy code).

One thing I wanted to do with the language we're developing is to do a custom template-like function involving always-inlines containing opaque types. It would rest heavily on the type system remaining as it is (assuming it works the way I think it works) and it seems that Chris Lattner wants to change that. Maybe it's a good thing our project is as far behind schedule as it is. I'd better do some experimenting sometime with opaque types and inlines together to see if they work as expected for producing easy macros.

Anyway, sorry for drifting off-topic,

--Sam

Alwaysinline is not a hint. It forces something inline that otherwise
wouldn't have been, as long as the linkage type permits it. (You
just ran into a situation where linkage did not permit it.)

Understood. I am just wondering if one should generally trust the
optimizer, or if it's better to manually insist on inlining functions
that should obviously be inlined.

My guess is that for normal usage you trust the optimizer, and use
alwaysinline for unusual things you know need inlining but that the
optimizer can't figure out (say, inlining an over-large function into a
tight inner loop in your star formation hydrodynamics code)?
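
The hint-versus-force distinction can be sketched from the C++ side. The example below is an illustration under stated assumptions, not a recommendation: `FORCE_INLINE`, `kernelStep`, and `runLoop` are hypothetical names, and `__attribute__((always_inline))` is a GCC/Clang extension (Clang lowers it to the IR `alwaysinline` attribute). On other compilers the macro falls back to a plain `inline` hint.

```cpp
#include <cstddef>

// Force inlining where the extension exists; otherwise only hint.
#if defined(__GNUC__) || defined(__clang__)
#define FORCE_INLINE inline __attribute__((always_inline))
#else
#define FORCE_INLINE inline
#endif

// Stand-in for a function the author knows belongs in the inner loop.
FORCE_INLINE double kernelStep(double x) { return x * 0.5 + 1.0; }

// The hot loop: with the attribute, the call to kernelStep is replaced by
// its body regardless of the inliner's cost estimate.
double runLoop(std::size_t iterations) {
    double acc = 0.0;
    for (std::size_t i = 0; i < iterations; ++i) {
        acc = kernelStep(acc);
    }
    return acc;
}
```

For a body this small the attribute changes nothing in practice, since the ordinary inliner would take it anyway. It only matters for the over-large case described above.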

Personally, I don't see the need for a preprocessor in most
circumstances.

I suspect that's because in spite of my funny questions your brain
refused to believe that I am doing something as deranged as writing a
non-trivial interpreter for a "real" language in raw IR with a text
editor, and so you assumed I was doing something sane instead. :-) I
challenge you to write LLVM IR with only Stone Knives, Bearskins, a text
editor, and llvm-as (and make or anything else you like as long as it
doesn't manipulate the source unless you build the tool starting with
the Stone Knives) as well engineered as I can with preprocessor help (in
this case, simple, brain-dead CPP because it's available and m4 is
simply not to be contemplated). Seriously--I don't think it can be
done, so if you do it then I'd learn a lot.

In essence, I am the manual front-end. I'm not as consistent and
predictable as a normal one, but I like to think I make better dinner
conversation. :-)

...If you need to do type substitution you can use an
opaque type. The only reason for conditional compilation is if you'd
need to be able to generate inline assembly for the host (which
shouldn't ever be absolutely necessary in LLVM except for legacy
code).

Um...*everything* for me is the equivalent of inline IR for you. Note
well that I make absolutely no claims that it is *necessary* in any way,
however. :-)

I admit I don't really understand opaque types yet, but they won't do
what I need down here with my primitive Stone Tools. The single biggest
wins were the simple ability to #include (same reason as in C: I can
have separate source modules whose interfaces are type-checked) and to
define constants to parametrize the code with certain choices (it's
awful nice not to have things like the tagged representation of nil
hard-coded a hundred places in the code, as I found out when I realized
I'd made the wrong choice). Conditional compilation and parametrized
macros are OK but not as vital.

One thing I wanted to do with the language we're developing is to do
a custom template-like function involving always-inlines containing
opaque types.

And will that have the Turing-completeness of C++ templates? :-D My
advice is not to use angle brackets....

Dustin