Selectively Disable Inlining for Functions

Dear All,

I was wondering if there is a standard way of specifying a list of functions that *should not* be inlined by the -inline pass.

I'm currently working with an experimental analysis pass that checks for calls to memory allocation functions; inlining and dead code elimination might make the pass more stable, but we don't want to inline the calls to the memory allocation functions until after our analysis pass is finished.

-- John T.

> I was wondering if there is a standard way of specifying a list of functions that *should not* be inlined by the -inline pass.

Nope, but you could hack something into gccas/gccld if you want. Of course, you can disable inlining completely with the -disable-inlining flag.

> I'm currently working with an experimental analysis pass that checks for calls to memory allocation functions; inlining and dead code elimination might make the pass more stable, but we don't want to inline the calls to the memory allocation functions until after our analysis pass is finished.

The simplest way is to change the heuristic to consider those functions as expensive to inline.
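As a rough illustration of that idea, here is a minimal sketch in C. It is a toy cost model, not LLVM's actual inliner code: `INLINE_THRESHOLD`, `NO_INLINE_NAMES`, `inline_cost`, and `should_inline` are hypothetical names, and the allocator names in the list are only assumptions about what the analysis pass cares about.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical threshold: callees costing more than this are not inlined. */
enum { INLINE_THRESHOLD = 200 };

/* Hypothetical list of functions the analysis pass wants left as calls. */
static const char *const NO_INLINE_NAMES[] = {
    "malloc", "calloc", "realloc", "free"
};

/* Toy cost model: listed functions get a prohibitive cost,
   every other function keeps the cost it was given. */
static int inline_cost(const char *callee, int base_cost) {
    size_t n = sizeof NO_INLINE_NAMES / sizeof NO_INLINE_NAMES[0];
    for (size_t i = 0; i < n; ++i)
        if (strcmp(callee, NO_INLINE_NAMES[i]) == 0)
            return INT_MAX;
    return base_cost;
}

/* The inlining decision only ever looks at the cost. */
static bool should_inline(const char *callee, int base_cost) {
    return inline_cost(callee, base_cost) <= INLINE_THRESHOLD;
}
```

Because the decision goes through the cost alone, the rest of the inliner needs no special case for the listed functions.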

-Chris

Changing the heuristics directly would have to be a custom change (i.e., couldn't be checked in). Is there a way for a client pass or tool to influence the heuristics? If not, does it make sense to add such a mechanism?

--Vikram
http://www.cs.uiuc.edu/~vadve
http://llvm.cs.uiuc.edu/

To be clear, I'll restate my position here, then follow up with more specifics of such a mechanism to Markus' email.

My basic position is that it's a bad thing to let the user have fine-grained control over optimizations at the source level. Being able to say "always force this to be inlined" will make more or less sense as the compiler evolves. In particular, as the compiler gets better at improving code, the decision about what to inline will change, but code that uses these attributes won't (the authors of the code are unlikely to revisit the attributes after they are written).

As one particularly pointed example, code that uses attributes like 'always inline' is typically written and tuned for GCC, often for old versions of it. GCC and LLVM (obviously) have very different optimization capabilities (e.g. LLVM can do interprocedural inlining, dead argument elimination, interprocedural constant prop, etc.), and forcing something to be inlined for GCC has a very different impact than forcing LLVM to inline it.

The meta problem with this is that the code will still *work*, it will just perform more poorly than it should. As such, people are very unlikely to revisit these attributes after they are initially written.

The above is a description of why I think that "always inline" is a bad idea to support. However, I *do* [now] support the notion of "never inline". In contrast with "always inline", never inline sometimes isn't a performance hint: it can be a correctness hint and is far more invariant across compiler versions than always inline is. While it can obviously be abused, I think the chances for its abuse are reduced.

I will respond to Markus' mail with a concrete proposal for how this could be implemented.

-Chris

Hi Markus, please don't take any of these comments about your patch as disrespect: I can tell you worked hard on it, and it is nice work. Hopefully my previous email helps clarify my position about why I think these sorts of things are dangerous, and the ideas below will crystallize a way forward.

Here are the specific problems with your patch that I see:

1. ExplicitInline/ImplicitInline are currently ignored. These fall into
    the class of dangerous information that is hard to use in a
    meaningful way. For example, just because a method is defined in the
    body of a class, it doesn't tell you anything about its potential for
    inlining: huge methods can be there too. I think that using this
    information is extremely dangerous.
2. Using "always inline" is also dangerous, as described in my previous
    mail. However, as a compromise, the new front-end DOES inline "always
    inline" functions itself, which should provide this capability in a way
    that is compatible with GCC. In particular, relying on this allows
    us to honor "always inline" within a file, without honoring it across
    files. In addition these functions will be inlined before any other
    optimization is performed (a good thing for these).
3. As implemented, your 'never inline' option has a significant problem:
    it does not disable IPO of the function. In particular, one reason
    that people want to disable inlining is so that there is a well-known
    entry-point to the function. If the function isn't inlined, but
    arguments are deleted, this capability breaks.

As a specific proposal to provide "never inline" in a way that is low-impact on LLVM and solves #3, I suggest that we add a new llvm.neverinline global array variable (like the llvm.used global). This global would point to all neverinline functions, and would have appending linkage (like llvm.used).

With this design, these functions would be assumed to be used externally (because a global with external visibility points to them) so no IPO could modify their arguments, and no assumptions about the callers could be made. Because the global has appending linkage, the standard LLVM linker will correctly concatenate the list of functions to not inline when it links LLVM modules.

Finally, the inliner would be taught to look for this global, and refuse to inline anything pointed to by it.

By using a global like this, no changes to the .ll, .bc, or in-memory format would be required, and no passes need to be modified to update these attributes as they create and clone functions.
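To make the proposed behaviour concrete, here is a small sketch in C that models it outside of LLVM. Everything in it is hypothetical and for illustration only: `NeverInlineList` stands in for one module's llvm.neverinline array, `link_lists` mimics what appending linkage does when two modules are linked, and `may_inline` is the check the inliner would be taught to make.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for one module's llvm.neverinline array. */
typedef struct {
    const char **names;
    size_t count;
} NeverInlineList;

/* Models appending linkage: the linked list is the concatenation of
   both inputs. Returns an empty list if both inputs are empty or if
   allocation fails; the caller frees the returned names array. */
static NeverInlineList link_lists(NeverInlineList a, NeverInlineList b) {
    NeverInlineList out = { NULL, 0 };
    size_t total = a.count + b.count;
    if (total == 0)
        return out;
    out.names = malloc(total * sizeof *out.names);
    if (out.names == NULL)
        return out;
    if (a.count)
        memcpy(out.names, a.names, a.count * sizeof *a.names);
    if (b.count)
        memcpy(out.names + a.count, b.names, b.count * sizeof *b.names);
    out.count = total;
    return out;
}

/* The inliner's check: refuse anything the list points to. */
static bool may_inline(const NeverInlineList *list, const char *callee) {
    for (size_t i = 0; i < list->count; ++i)
        if (strcmp(list->names[i], callee) == 0)
            return false;
    return true;
}
```

The point of the sketch is that linking needs no knowledge of inlining at all: concatenating the lists is enough for the restriction to survive across modules.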

Does this seem like a reasonable approach? Who wants to implement it? :)

-Chris

To be more explicit, here's an example of what __attribute__((used)) does. Given:

int X __attribute__((used));
int Y;
void foo() __attribute__((used));
void foo() {}
void bar() {}

The C front-end emits this global:

; Attribute used list
%llvm.used = appending global [2 x sbyte*] [
         sbyte* cast (int* %X to sbyte*),
         sbyte* cast (void ()* %foo to sbyte*)
     ]

"llvm.used" is never internalized, so it never goes away. Its presence automatically disables IPO and prevents the functions from being removed as described earlier.

-Chris

I'm also curious to hear what your proposal will look like.
Fine-grained optimization control is something I've looked at somewhat
with my LENS project - essentially you would keep advice like this in
an external file, so that it doesn't change the source and because it
is stored with version metadata, a future version of the compiler
could notice that the advice was intended for an older version, and
issue a warning or ignore it.

Of course there are other practical problems with keeping separate
metadata files - it makes it harder to maintain drop-in compatibility
with build scripts, for instance. Also, in the case of using 'never
inline' for correctness, that probably does belong in the code.

-mike

> Still, my approach makes the inline hint a first-class property of an LLVM function just like the calling convention, including preserving full source code information.

Preserving full source code information isn't a goal of LLVM, at least if you don't count debug information. :)

> Most of the patch is actually boring infrastructure, and you will note that the actual consultation of the hints in InlineSimple.cpp is just a few lines leaving much room for further improvements.

Yup.

> > Here are the specific problems with your patch that I see:
> >
> > 1. ExplicitInline/ImplicitInline are currently ignored. These fall into
> >    the class of dangerous information that is hard to use in a
> >    meaningful way. For example, just because a method is defined in the
> >    body of a class, it doesn't tell you anything about its potential for
> >    inlining: huge methods can be there too. I think that using this
> >    information is extremely dangerous.

> Well, I didn't need to exploit that info for my purposes, that's why it says "FIXME" in the patch.

I understand that. My point above is that there doesn't seem to be a good way to *use* this information. Inline markers are essentially arbitrary: people use them to affect linkage as much as they do to mark things they want actually inlined. Except for the always and never inline cases, I think that structural analysis of the code will always be more useful than "it was defined in a class body" or "the user used inline".

> > 3. As implemented, your 'never inline' option has a significant problem:
> >    it does not disable IPO of the function. In particular, one reason
> >    that people want to disable inlining is so that there is a well-known
> >    entry-point to the function. If the function isn't inlined, but
> >    arguments are deleted, this capability breaks.

> Yes, this is on purpose. But you can use both __attribute__((noinline)) and __attribute__((used)) for disabling IPO.

Fair enough.
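For reference, combining the two attributes looks roughly like the sketch below. It uses the GCC-style attribute syntax (also accepted by Clang); `checked_add` and `sum3` are hypothetical names, and how much IPO a given compiler version still performs on such a function is an assumption to verify rather than something this example demonstrates.

```c
/* GCC/Clang attribute syntax; other compilers may not accept it.
   noinline: ask the compiler not to inline calls to this function.
   used:     ask the compiler to keep the function even if it looks unused. */
__attribute__((noinline, used))
int checked_add(int a, int b) {
    return a + b;
}

/* The calls below are intended to remain real calls to checked_add. */
int sum3(int a, int b, int c) {
    return checked_add(checked_add(a, b), c);
}
```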

> > As a specific proposal to provide "never inline" in a way that is low-impact on LLVM and solves #3, I suggest that we add a new llvm.neverinline global array variable (like the llvm.used global). This global would point to all neverinline functions, and would have appending linkage (like llvm.used).

> This is an implementation issue, but I wonder whether we should use so many of those "magic" arrays instead of proper attributes.

The nice thing about the magic arrays is that they don't require explicit IR support to keep updated, and they do not require memory for functions that don't use them (adding a field to Function would increase the space usage for all Functions).

-Chris