__inline prevents inlining

I had a lot of fun writing Bolin - Does it inline?

I was intentionally messing with the source to see if I could break the optimizer. In round 2 I noticed that adding the __inline keyword stops the call from being inlined, even though it originally was inlined despite the noinline attribute.

__attribute__ ((noinline))
__inline int fn(int v) { return v*0; }
int main(int argc, char *argv[]) { return fn(123); }

The page mentions a few differences between clang and gcc. clang did quite a few optimizations that gcc didn’t do. One case where both failed was inlining a recursive fibonacci function when you pass it 0 and it starts with if(n == 0) return 0. Perhaps it’s difficult to optimize a function that mostly shouldn’t be inlined. If it’s easy to check the first (few) if statements, it may improve a lot of code.

First and foremost, the inline keyword has nothing to do with inlining. It is a way to tell the compiler/linker that it is not an error if the function has multiple definitions in different translation units (the definitions must be semantically identical). Gory details are here and here.
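To make the "multiple definitions" point concrete, here is a minimal sketch of a header that several .cpp files could include. The file name and the clamp_percent function are hypothetical and only serve as illustration.

```cpp
// clamp_percent.h (hypothetical header, assumed to be included by several .cpp files)
#ifndef CLAMP_PERCENT_H
#define CLAMP_PERCENT_H

// Without `inline`, every .cpp file that includes this header would emit its
// own external definition, and the link would fail with a duplicate symbol.
// With `inline`, the identical copies are allowed and merged into one.
inline int clamp_percent(int v) {
    if (v < 0) return 0;
    if (v > 100) return 100;
    return v;
}

#endif
```

Whether any given call to clamp_percent actually gets inlined is still entirely up to the optimizer.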

Clang does not inline functions with the noinline attribute. In your particular case, what happened was interprocedural constant propagation, and the call was then removed as dead code (DCE).
Clang does not propagate the return value into main if you add __inline, because fn may have a different body in a different module. See isDefinitionExact in GlobalValue.h.
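A side-by-side sketch of the two variants, assuming GCC/Clang attribute syntax and compilation as C++. The names fn_exact, fn_inexact, call_exact and call_inexact are hypothetical. Both calls return 0 at run time; the difference described above only shows up in the generated code, which can be compared with something like clang++ -O2 -S. The exact output may vary with compiler version and flags.

```cpp
// Both helper names are hypothetical; only the `__inline` keyword differs.

// Ordinary external definition: this body is the one that will run, so the
// optimizer may conclude that every call returns 0, even though the call
// itself is never inlined.
__attribute__((noinline))
int fn_exact(int v) { return v * 0; }

// Inline definition: another translation unit may supply its own copy and the
// linker keeps just one of them, so the optimizer is more conservative about
// trusting this particular body.
__attribute__((noinline))
__inline int fn_inexact(int v) { return v * 0; }

// Callers standing in for main() from the original snippet.
int call_exact()   { return fn_exact(123); }
int call_inexact() { return fn_inexact(123); }
```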

Language like this is directly contradictory to the community code of conduct. Please try to keep the tone constructive and avoid vulgarity (or initialisms containing vulgarity).

We treat noinline as program semantics and “inline” as a hint (when it comes to inlining), so noinline will always win.

Sorry. Didn’t sound rude to me, but my English is far from good.

Just to clarify so we’re all on the same page: the article I wrote and its title were just for laughs, and I was trying to break the optimizer (I try both always_inline and noinline after that). I was surprised that __inline affected anything. I always thought that having a different body in another translation unit would cause undefined behavior, so it made no sense to me that __inline would prevent any optimizations from happening. gcc doesn’t seem to do ipa-cp when that function is marked inline. I don’t know if clang wants to copy them for consistency, but that is one difference I noticed while writing the article.

I mostly posted so that people who want to see clang results or check for consistency with gcc have an easy page to look at. It’s written for fun, though, not as a comparison.

What I think is a good optimization is… (The rest is a repeat of my OP after the code example) being able to peek at the first few instructions of a complex function to see if it starts with if (param == 0) return 0;. GCC doesn’t do this either. Many functions I write start with null and int range checks so it may prove beneficial.

I think it just does not like recursive functions. Returning immediately from a recursive function would mean the end of a recursion chain, which should happen at some nesting level instead of at the top level. If that happens immediately, this is likely to be a cold path and not worth optimizing.
This particular case of ‘if (param == 0) return 0’ should’ve been handled by the PartialInlining pass, but alas. I don’t know why it did not apply, but I’d suppose it didn’t happen because of low branch probability.
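For reference, the effect partial inlining aims for can be approximated by hand: keep the cheap early-exit check in a small wrapper and move the rest out of line. This is only a sketch of the idea, not what the PartialInlining pass literally emits; fib and fib_cold are hypothetical names, and the attribute syntax assumes GCC or Clang.

```cpp
// Hypothetical names: fib and fib_cold are illustrative, not from the article.

// The expensive recursive part, deliberately kept out of line.
__attribute__((noinline))
int fib_cold(int n) {
    if (n < 2) return n;
    return fib_cold(n - 1) + fib_cold(n - 2);
}

// Thin wrapper holding only the cheap check. Because it is tiny and not
// recursive, it is an easy inlining candidate, and a caller passing a
// constant 0 can then fold the whole call to 0.
static inline int fib(int n) {
    if (n == 0) return 0;
    return fib_cold(n);
}
```

Callers use fib as before; only the n == 0 path avoids the out-of-line call.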