[LLVM-Codegen] Query for stack operations during register allocation pass

Dear All,

In PrologEpilogInserter.cpp -> PEI::replaceFrameIndices()

This function currently contains an assert that assumes the stack operations before and after any function call are balanced at the basic-block level.
The condition, checked at the end of each basic block, is as follows:

    // If we have evenly matched pairs of frame setup / destroy instructions,
    // make sure the adjustments come out to zero. If we don't have matched
    // pairs, we can't be sure the missing bit isn't in another basic block
    // due to a custom inserter playing tricks, so just asserting SPAdj==0
    // isn't sufficient. See tMOVCC on Thumb1, for example.
    assert((SPAdjCount || SPAdj == 0) &&
           "Unbalanced call frame setup / destroy pairs?");

In practice, however, we see the stack being balanced at the function level.
My question is: am I missing some fundamental concept in assuming that the stack should be balanced at the function level rather than at the basic-block level?
Is requiring frame balancing at the basic-block level justified?
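
To make the distinction concrete, here is a toy model of the two notions of balance. It is only a sketch and not LLVM code: the Block and Function types, the helper names, and the 4096-byte adjustment are all hypothetical, and the function is assumed to be a straight-line sequence of blocks with no branches.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Toy model, not LLVM code: a basic block is a list of signed SP
// adjustments in bytes (positive = frame setup, negative = frame destroy).
using Block = std::vector<int>;
// Assumption: a function is a straight-line sequence of blocks.
using Function = std::vector<Block>;

// Net SP adjustment contributed by a single block.
int netAdjustment(const Block &block) {
  return std::accumulate(block.begin(), block.end(), 0);
}

// Per-block view: every block must bring SP back to its entry value.
bool balancedPerBlock(const Function &fn) {
  for (const Block &block : fn)
    if (netAdjustment(block) != 0)
      return false;
  return true;
}

// Per-function view: the adjustments only need to cancel overall.
bool balancedPerFunction(const Function &fn) {
  int total = 0;
  for (const Block &block : fn)
    total += netAdjustment(block);
  return total == 0;
}

// Hypothetical example: setup in the first block, destroy in the second.
Function splitSetupAndDestroy() {
  return Function{Block{4096}, Block{-4096}};
}
```

For splitSetupAndDestroy(), balancedPerFunction() returns true while balancedPerBlock() returns false, which is the situation I am asking about.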

This bug is rare, as it occurs only when function arguments are large (on ARM, MIPS, etc.).
For example, on ARM the issue appears when the argument size exceeds the ((1 << 12) - 1) / 2 limit checked in ARMFrameLowering::hasReservedCallFrame().
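
For clarity, the arithmetic behind that limit can be sketched as below. The names kHalfImm12 and exceedsHalfImm12 are hypothetical and not part of LLVM, and the strict greater-than simply mirrors the word "exceeds" above; whether the boundary value itself counts depends on the comparison used in the actual source.

```cpp
#include <cassert>

// Half of the largest 12-bit immediate: (4096 - 1) / 2, truncated.
constexpr unsigned kHalfImm12 = ((1u << 12) - 1) / 2;
static_assert(kHalfImm12 == 2047, "4095 / 2 truncates to 2047");

// Hypothetical helper (not an LLVM API): true when a call frame of the
// given size in bytes is above the limit quoted in this mail.
constexpr bool exceedsHalfImm12(unsigned callFrameBytes) {
  return callFrameBytes > kHalfImm12;
}
```

So, under these assumptions, the limit is 2047 bytes, and a call frame of 4096 bytes would be well past it.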

For reference, I've filed a bug at http://llvm.org/bugs/show_bug.cgi?id=15932, which describes the problem along with a proposed patch.
However, since this would affect all other platforms as well, I wanted your expert opinion on my approach.

PS: I'm awaiting creation of my Phabricator account; for the time being, I'm posting this query on cfe-dev to discuss the solution approach.

Warm Regards,
Naveen