Inlining in exception handling region

Hello,

While investigating C++ benchmarks, I noticed that some small
functions containing a throw statement were not inlined because the
exception handling code increased their inline cost. In the C++ example
below, the inline cost of fCallee is high mainly because of the
instructions for the throw statement; note that the constructor of
MyException is even inlined into fCallee.

Assuming that exception handling code is rarely executed, would it
make sense to prevent CallSites in an exception handling region (e.g., the
constructor of MyException) from being inlined, so that we avoid code
size blow-up in the exception handling region and indirectly give more
inlining opportunities to the small unwinding functions that contain
exception handling code?

class MyBaseException {
  int idx;
  int limit;
  const char* msg;
  public :
    MyBaseException(int i, int l, const char* m);
    ~MyBaseException();
    void handle(const char*m, int i, int l);
};

class MyException : MyBaseException
{
  public:
    MyException(int i, int l, const char* m): MyBaseException(i, l, m) {
      handle(m, i, l);
    }
};

int *Agg;

int fCallee(int idx, int limit) {
  if (idx >= limit)
    throw MyException(idx, limit, "error");
  return Agg[idx];
}

int fCaller(int i, int l) {
  return fCallee(i, l);
}

Thanks,
Jun

I observed about a 6% performance improvement in one of the C++ tests in
SPEC2006 on AArch64 by avoiding inlining of functions in the exception
handling region. I guess this pattern is not really rare, because
programmers often write small methods with a throw statement, like
fCallee() in the sample C++ code.

Thanks,
Jun

I'm interested in this idea, but I want to point out that this is simply one special case of a static heuristic for estimating call frequency. It essentially applies profile-guided-optimization-style logic based on a static heuristic about the hotness of code. I'm not saying this to dismiss it, merely to help frame it in the context of things already being discussed and considered for the inliner.

How is your patch structured w.r.t. exception handling region detection in the caller during inlining? That seems like it will be the critical design point.

Philip

I agree that this could be seen as a special case of inlining in a cold
region. However, I doubt that we need to see this only in the context of
PGO. In general, I think treating exception handling code as cold is
not unreasonable even without relying on profiling.

Regarding detection of the exception handling region, I'm relying on
hints from standard function names such as "__cxa_allocate_exception".
Please let me know if anyone has a better idea.

Thanks,
Jun

I pushed my initial patch for disabling inlining in the exception
handling region.

It tries to find functions that take, as their first argument, memory
allocated specifically for the exception to be thrown; such a function
must be a constructor or method executed in the context of exception
handling. The NoInline attribute is then added to each such CallSite in
the exception handling region.
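
To make the detection rule concrete, here is a minimal sketch on a toy
model. It is not the code from the patch and does not use the LLVM API:
the Call struct, kNotACall, and markThrowPathCalls are hypothetical names
invented for illustration, and the model ignores casts that may sit
between the allocation and the constructor call in real IR.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Marks an argument that is not the result of another call (toy model only).
const int kNotACall = -1;

// Toy stand-in for a call instruction. This is NOT the LLVM API.
struct Call {
    std::string callee;     // name of the called function
    std::vector<int> args;  // per argument: index of the producing call, or kNotACall
    bool noInline = false;  // models the NoInline attribute on the call site

    Call(std::string c, std::vector<int> a)
        : callee(std::move(c)), args(std::move(a)) {}
};

// Marks every call whose first argument is the memory returned by
// __cxa_allocate_exception, i.e. a constructor or method that runs on the
// exception object being thrown. Returns the number of calls marked.
std::size_t markThrowPathCalls(std::vector<Call>& calls) {
    std::size_t marked = 0;
    for (Call& c : calls) {
        // __cxa_throw also receives the exception pointer, but it is a
        // runtime entry point rather than an inlining candidate.
        if (c.callee == "__cxa_throw" || c.args.empty())
            continue;
        int producer = c.args.front();
        if (producer == kNotACall)
            continue;
        assert(producer >= 0 && producer < static_cast<int>(calls.size()));
        if (calls[producer].callee == "__cxa_allocate_exception") {
            c.noInline = true;
            ++marked;
        }
    }
    return marked;
}
```

For the throw in fCallee, the sequence would be a call to
__cxa_allocate_exception, then the MyException constructor taking that
pointer as its first argument, then __cxa_throw; only the constructor
call gets marked.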

Note that this change currently handles only CallSites invoked in a throw
statement. For CallSites in catch blocks, we could similarly add the
NoInline attribute by traversing the blocks reachable from landingpads
until the EH return points. With this change, a 6% performance improvement
was observed in spec2006/xalancbmk.
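
The catch-block extension could look roughly like the sketch below. This
is only an illustration on a toy CFG, not LLVM code: Block and
findEHBlocks are hypothetical names, and what counts as an "EH return
point" is an assumption here (the caller flags those blocks, for example
blocks that resume unwinding or finish the catch).

```cpp
#include <cassert>
#include <cstddef>
#include <queue>
#include <vector>

// Toy stand-in for a basic block. This is NOT the LLVM API.
struct Block {
    bool isLandingPad = false;       // block starts with a landingpad
    bool isEHReturn = false;         // assumed: control leaves EH code after this block
    std::vector<std::size_t> succs;  // indices of successor blocks
};

// Breadth-first walk from every landing pad, stopping at EH return points.
// Result[i] is true if block i belongs to the exception handling region;
// call sites in those blocks would be the candidates for NoInline.
std::vector<bool> findEHBlocks(const std::vector<Block>& cfg) {
    std::vector<bool> inEH(cfg.size(), false);
    std::queue<std::size_t> work;
    for (std::size_t i = 0; i < cfg.size(); ++i) {
        if (cfg[i].isLandingPad) {
            inEH[i] = true;
            work.push(i);
        }
    }
    while (!work.empty()) {
        std::size_t b = work.front();
        work.pop();
        // The region ends here, so do not follow edges back into normal code.
        if (cfg[b].isEHReturn)
            continue;
        for (std::size_t s : cfg[b].succs) {
            assert(s < cfg.size() && "successor index out of range");
            if (!inEH[s]) {
                inEH[s] = true;
                work.push(s);
            }
        }
    }
    return inEH;
}
```

The EH return block itself is counted as part of the region, while its
successors are not, so normal code after a catch stays eligible for
inlining.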

Please find the revision at http://reviews.llvm.org/D12979.
I hope this opens up more discussion about inlining decisions for
exception handling regions.