RFC: Garbage collection infrastructure


Attached for your review: basic infrastructure for efficient garbage collectors. Only enough information is currently gathered to support the runtime I’m working with, and -print-gc is currently the only consumer of this information.

All collector policies are presently controlled by constants. There are no regressions (on darwin-i686) if the feature is left disabled. If enabled (EnableGC = true in GC.cpp), the JIT breaks because it doesn’t support labels.

This obviously needs to be fixed; the collector needs to be turned on and off by a policy object. I’m not sure where best to attach that object, though.

  1. Attach it to Target/Subtarget?

  2. Attach it to Function, like the calling convention?

  3. Attach a whole-new TargetCollector (TargetRuntime?) to the Module?

#1 is simplest, but I think it is wrong. #2 seems best to me. It would let the inliner merge bare code with managed code, or vice versa, but avoid the risk of silliness like inlining Java into OCaml.
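To make option #2 concrete, here is a minimal sketch of the inlining rule it implies. It is not taken from the patch: FunctionInfo, canInline and mergedCollector are hypothetical names, and a collector is modelled as a plain string where the empty string means bare (unmanaged) code.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a function that carries a collector tag,
// much as a function carries a calling convention.
struct FunctionInfo {
  std::string Name;
  std::string Collector; // empty string = bare code, no collector
};

// Bare code may merge with managed code in either direction, and code
// using the same collector may merge; two different collectors may not.
inline bool canInline(const FunctionInfo &Caller, const FunctionInfo &Callee) {
  return Caller.Collector.empty() || Callee.Collector.empty() ||
         Caller.Collector == Callee.Collector;
}

// After inlining, the caller must be managed if either side was managed.
inline std::string mergedCollector(const FunctionInfo &Caller,
                                   const FunctionInfo &Callee) {
  assert(canInline(Caller, Callee) && "incompatible collectors");
  return Caller.Collector.empty() ? Callee.Collector : Caller.Collector;
}
```

Under this assumed rule, inlining an OCaml function into a bare caller would turn the caller into OCaml-managed code, while a Java callee would be rejected for an OCaml caller.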


Once a decision is made on that, I’ll resurrect the shadow stack GC to ensure I’m maintaining generality. Collectors vary greatly, so I think supporting several is important. The kinds I’d like to support:

[1] no collector
[2] ocaml collector
[3] shadow stack collector
[4] collectors for concurrent systems
[4a] zero overhead: safe points are patched to jump into the runtime
[4b] intrusive: code must check a “stop” flag at each safe point
[5] cooperative collectors, where mutators help with collection
[6] incremental and generational collectors

I see need for the target-independent code to support the following:

• Disable GC? [1*]
• Find safe points in tight loops? [4]
• Find safe points at calls? [4]
• Find safe points after calls? [2*]
• Find safe points before returning? [4]
• Pad safe points with noops to accommodate a patch? To how many bytes? [4a]
• Allow roots in registers? Otherwise force roots into stack slots at safe points [2].
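The questions above could be gathered into a single policy record that the target-independent code consults. The following is only a sketch under that assumption: CollectorPolicy, its field names and the preset functions are hypothetical and do not appear in the patch.

```cpp
#include <cassert>

// Hypothetical policy record; one field per question in the list above.
struct CollectorPolicy {
  bool EnableGC = false;               // false = no collector [1]
  bool SafePointsInLoops = false;      // [4]
  bool SafePointsAtCalls = false;      // [4]
  bool SafePointsAfterCalls = false;   // [2]
  bool SafePointsBeforeReturn = false; // [4]
  unsigned SafePointPadBytes = 0;      // no-op padding for patching [4a]; 0 = none
  bool RootsInRegisters = false;       // false = force roots into stack slots [2]
};

// [1]: everything off.
inline CollectorPolicy noCollectorPolicy() { return CollectorPolicy(); }

// [2]: safe points after calls, roots kept in stack slots.
inline CollectorPolicy ocamlPolicy() {
  CollectorPolicy P;
  P.EnableGC = true;
  P.SafePointsAfterCalls = true;
  return P;
}

// [4a]: safe points padded so the runtime can patch in a jump.
// The caller chooses the pad size; the right value is target-specific.
inline CollectorPolicy patchedSafePointPolicy(unsigned PadBytes) {
  assert(PadBytes > 0 && "patched safe points need room for a patch");
  CollectorPolicy P;
  P.EnableGC = true;
  P.SafePointsInLoops = true;
  P.SafePointsAtCalls = true;
  P.SafePointsBeforeReturn = true;
  P.SafePointPadBytes = PadBytes;
  return P;
}

// True if the code generator has any safe points to find at all.
inline bool hasAnySafePoints(const CollectorPolicy &P) {
  return P.EnableGC &&
         (P.SafePointsInLoops || P.SafePointsAtCalls ||
          P.SafePointsAfterCalls || P.SafePointsBeforeReturn);
}
```

A plain record like this keeps the decisions declarative; whether it should be a struct of flags or a set of virtual queries is an open design choice.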

And additionally provide callbacks to:

• Custom lower gcread/gcwrite intrinsics [5,6]. Otherwise replace with plain load/store [2,3*].
• Custom lower gcroot intrinsics [3*]. Otherwise leave for the code generator [2*].
• Introduce code at safe points [4b].
• Print assembly for per-function metadata.
• Print assembly for whole-module metadata [2*].
• Record collector metadata in a JIT context.
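One way to picture these callbacks is as virtual methods with defaults that match the "otherwise" cases. This is a rough sketch rather than the interface in the patch: the class and method names are hypothetical, only three of the callbacks are shown, and strings stand in for IR so the example is self-contained.

```cpp
#include <cassert>
#include <string>

// Hypothetical callback interface; the defaults implement the
// "otherwise" behaviour described in the list above.
class Collector {
public:
  virtual ~Collector() = default;

  // Default: gcread becomes a plain load [2,3].
  virtual std::string lowerGCRead(const std::string &Ptr) const {
    return "load " + Ptr;
  }
  // Default: gcwrite becomes a plain store [2,3].
  virtual std::string lowerGCWrite(const std::string &Val,
                                   const std::string &Ptr) const {
    return "store " + Val + ", " + Ptr;
  }
  // Default: nothing is introduced at safe points.
  virtual std::string safePointCode() const { return ""; }
};

// Intrusive collector in the style of [4b]: poll a stop flag.
class PollingCollector : public Collector {
public:
  std::string safePointCode() const override {
    return "if (stop_flag) call enter_runtime";
  }
};

// Collector in the style of [5,6] that needs a write barrier.
class BarrierCollector : public Collector {
public:
  std::string lowerGCWrite(const std::string &Val,
                           const std::string &Ptr) const override {
    return "call write_barrier(" + Val + ", " + Ptr + ")";
  }
};

// Target-independent driver: asks the collector how to lower one gcwrite.
inline std::string lowerStoreSite(const Collector &C, const std::string &Val,
                                  const std::string &Ptr) {
  assert(!Ptr.empty() && "gcwrite needs a destination pointer");
  return C.lowerGCWrite(Val, Ptr);
}
```

With defaults in the base class, a collector that only needs one hook overrides just that hook, and the no-op cases cost nothing to describe.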

Luckily, only those marked with [*] are current needs. :)

Thanks for your feedback!


gc.2.patch (39.4 KB)