Hi Philip,
If I'm reading you correctly, you are relying on exception propagation
and handler (destructors for local objects) execution. You have chosen
not to add extra exception logic to LLVM itself, but are relying on the
correctness of exception propagation within the code. (The last two
sentances are intended to be a restatement of what your message said.
If I misunderstood, please correct me.)
It was probably not completely correct to say that we did not extend
the exception propagation in LLVM. In most of cases where malloc or
other C allocation functions are called, we had to add a check for NULL
and throw std::bad_alloc. But these are a kind of straightforward fixes,
that do not require much effort.
Does this mean that you're compiling your build of LLVM with exceptions
enabled? By default, I believe LLVM is built without RTTI or exception
support.
OK, I see. This explains why the destructors in LLVM are not always
prepared to be executed in exception situations. Yes, we build LLVM
with exception support. In principle, we build it with the same options
like the rest of our project.
This is useful to know. Just fair warning, you're probably running a fairly odd configuration compared to the rest of the LLVM community. That might expose "interesting" bugs. (Note: I have no evidence that the exceptions enabled config is actually rare, just my perception watching list traffic.)
Actually, I could hardly imagine that
we could handle OOM situations without error handling.
Agreed, but error handling does not imply exceptions. It might be the easiest way, but it's not the only one.
For the particular cases you mentioned with auto pointers and allocation
in destructors, are these issues also present along existing error
paths? Or for that matter simply examples of bad coding practice? If
so, pushing back selected changes would be welcomed. I'd be happy to
help review.
Yes, there are some examples of bad coding practice. The root problem,
however, is that the destructors in LLVM do a lot of complicated stuff.
Instead of just deleting objects following a strictly hierarchical ownership
structure, the objects are unregistered in various relationships, which
potentially trigger unwinding in quite different locations. Such non-trivial
coding sometimes requires dynamic allocation of new collections, which is
problematic in OOM situations.
This is probably not going to change.
One simple hack to get around this would be to have a reserved allocation set which is released on OOM. This would enable a small number of allocations during unwind without double-OOM. Getting the allocation amount right is a bit challenging, but such schemes can be made to work.
Actually, we did not manage to completely
fix the unwinding of the compiler state. That was one of the reasons to
move all compiler passes to a separate process.
Here is a rough overview of our current set of patches to LLVM3.3. In total,
we have 31 patches to LLVM related to OOM handing. They fix only the
components that we could not outsource to the separate process:
the core IR classes that we use for IR generation, IR bitcode serialization,
and the dynamic code loader.
I'm about to break these down by how likely I believe these changes are to be accepted back if you wanted to push them. Keep in mind that this is only my opinion and that I do not speak for the community as a whole.
* In 8 of the patches we fix the malloc calls to throw bad_alloc.
This is probably not going to be accepted as is.
* 10 patches fix destructors. Some of fixes are about disabling or
rewriting the code triggering dynamic allocations. Some other fixes
disable asserts that check for invariants that are not valid in exception
situation.
These might be accepted on a case by case basis. Depends on the actual change in question.
* 5 patches deal with exceptions in constructors. One kind of problems
results from the fact that if an exception is thrown in the constructor of
an object, the destructor is not called which causes a leak.
Framed as exception propagation, these probably wouldn't be accepted. However, it sounds like a general error return pattern could address the same issue. That might be accepted.
There is also a specific problem that IR objects register themselves with
their parents already in their constructor. And if a constructor fails
afterwards, the parent contains a dangling pointer to its child.
Same as previous.
* 5 patches fix temporary ownership of objects, mostly the situations
when object are created, but not added to the owning collections.
These would probably be accepted.
Note that 31 patches does not mean 31 fixes, because some of them
contain all fixes to a particular file or class.
How willing are you to share your code? If you're willing to put your changes up on github or something, we could go through them and attempt to migrate some of them back for inclusion in base llvm.
Philip