Statepoint GC : relocations on exceptional path

Hi Sanjoy,

First of all, a brief introduction: I’m Dmitry Olshansky, a software engineer at Google. We are currently working on an experimental implementation of the Dart language on top of LLVM. So far we have been following the statepoint GC approach successfully, with some good results.

However, one of the problems that has cropped up is related to an item in the docs: http://llvm.org/docs/Statepoints.html#problem-areas-and-active-work. Namely, “Relocations along exceptional paths are currently broken in ToT.” For us, the problem is that relocations on the exceptional path fail an assertion with a type mismatch: expected “token” but got a landingpad instruction. We thought that you must have faced the same issue with Azul’s LLVM engine, and so might be able to share a couple of pointers.

Hi Dmitry,

> First of all, a brief introduction: I'm Dmitry Olshansky, a software
> engineer at Google. We are currently working on an experimental
> implementation of the Dart language on top of LLVM. So far we have been
> following the statepoint GC approach successfully, with some good results.

That's great to hear! Please do file bugs for things that don't work.

> However, one of the problems that has cropped up is related to an item in
> the docs:
> http://llvm.org/docs/Statepoints.html#problem-areas-and-active-work.
> Namely, "Relocations along exceptional paths are currently broken in ToT."
> For us, the problem is that relocations on the exceptional path fail an
> assertion with a type mismatch: expected "token" but got a landingpad
> instruction. We thought that you must have faced the same issue with
> Azul's LLVM engine, and so might be able to share a couple of pointers.

The fundamental problem remains unsolved, but (as you guessed) we have
a partial solution implemented downstream that works for our specific
use case.

Our exception propagation scheme uses a thread local memory location
to propagate exceptions, and so we don't actually care about the
result of the landingpad instructions. We can thus type the
landingpad result as a token, and use that token to unambiguously tie
gc.relocates in the unwind block to the landingpad in the unwind
block. Since RewriteStatepointsForGC clones landingpads (see
normalizeForInvokeSafepoint) until there is only one landingpad per
invoke, this scheme unambiguously ties each gc.relocate to a
gc.statepoint.
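
The last point, that one landingpad per invoke is what makes the
landingpad token sufficient to identify the statepoint, can be
illustrated with a toy model. This is plain Python, not LLVM code: the
function names, the ".splitN" naming, and the dictionary representation
are all hypothetical, and the real normalizeForInvokeSafepoint works on
basic blocks rather than strings. It is only a sketch of the idea.

```python
def split_shared_landingpads(unwind_of: dict[str, str]) -> dict[str, str]:
    """Toy stand-in for the cloning step: give each invoke its own landingpad.

    `unwind_of` maps an invoke statepoint name to the name of its unwind
    block. The first invoke to use a block keeps it; every later invoke
    gets a fresh copy named "<block>.splitN" (naming is hypothetical and
    assumed not to collide with existing block names).
    """
    times_seen: dict[str, int] = {}
    result: dict[str, str] = {}
    for invoke, pad in unwind_of.items():
        n = times_seen.get(pad, 0)
        times_seen[pad] = n + 1
        result[invoke] = pad if n == 0 else f"{pad}.split{n}"
    return result


def owner_of_landingpad(unwind_of: dict[str, str]) -> dict[str, str]:
    """Invert invoke -> landingpad into landingpad -> invoke.

    A relocate that names a landingpad token can be resolved to a
    statepoint only if this inversion is well defined, so a landingpad
    shared by two invokes is reported as an error.
    """
    owner: dict[str, str] = {}
    for invoke, pad in unwind_of.items():
        if pad in owner:
            raise ValueError(f"{pad} is shared by {owner[pad]} and {invoke}")
        owner[pad] = invoke
    return owner


if __name__ == "__main__":
    # Two invokes share "lpad"; a third has its own unwind block.
    cfg = {"statepoint.a": "lpad", "statepoint.b": "lpad", "statepoint.c": "cleanup"}
    normalized = split_shared_landingpads(cfg)
    print(owner_of_landingpad(normalized))
```

Calling owner_of_landingpad on the unsplit `cfg` raises, which is the
ambiguity the cloning step removes.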

I also noticed some downstream changes in our tree that make LLVM
robust around tokens (e.g. we don't unswitch loops if that will introduce
a PHI of token type), but I'm not sure how much of that is actually
required today. I suspect most of those changes are now upstream in
some form or another.

Of course, our scheme falls apart if you actually care about the
result of the landingpad instructions (and you run into the issues
brought up in [0]) and want to do proper Itanium-style unwinding.
Right now we don't have a solution for this more general usage
pattern.

Let me know if you have more questions!

-- Sanjoy

[0]: http://lists.llvm.org/pipermail/llvm-dev/2016-January/094411.html