global optimizer precision

Hi all,

I had a look at the interprocedural optimizer. In my opinion the
routine 'GlobalOpt::ProcessInternalGlobal' is a little bit to
conservative. It removes global variables if the only routine using
this variable is main. Typically this condition is valid only for very
few global variables.

Here is a code snippet containing the test before the transformation:
file: llvm\lib\Transforms\IPO\GlobalOpt.cpp

    // If this is a first class global and has only one accessing function
    // and this function is main (which we know is not recursive we can make
    // this global a local variable) we replace the global with a local alloca
    // in this function.
    //
    // NOTE: It doesn't make sense to promote non single-value types since we
    // are just replacing static memory to stack memory.
    //
    // If the global is in different address space, don't bring it to stack.
    if (!GS.HasMultipleAccessingFunctions &&
        GS.AccessingFunction && !GS.HasNonInstructionUser &&
        GV->getType()->getElementType()->isSingleValueType() &&
        GS.AccessingFunction->getName() == "main" &&
        GS.AccessingFunction->hasExternalLinkage() &&
        GV->getType()->getAddressSpace() == 0) {

The comment above the test states the reason for the check for main
which is: main is not recursive.
My proposal is to introduce a routine to check if a function is
recursiv (returning false, only if its not recursiv for
sure). Than one can replace the check for main with the call to the
'isrecursiv'-Function. The implemention of 'isrecursiv' can first
check for main and later improve the precision by inspecting the call
graph. This finally would lead to a more aggressiv interprocedural
optimizer.
Whats your opinion?

kind regards,
Gordon Haak

This would be ok, but it needs to be more general than that. Specifically, this function:

extern void bar();
void foo() {
  ...
  bar();
}

Might be recursive, because 'bar' might call foo. IIRC, GCC mainline has an attribute ("leaf"?) to handle something like this. Building up this infrastructure would be great.

-Chris

That's not the only needed condition; for this to be safe you also
need to know that one of the following holds:
a) The function is only called at most once.
b) The global is constant.
c) The function always overwrites (any part of) the value before
reading it (or that part).
d) Any situations I forgot :).

The reason for this is that local variables don't preserve their value
between calls, so you can't just do this for most (non-recursive)
functions without verifying it's safe for some reason.

main() of course satisfies option (a), so there it's always safe.