RFC: Garbage collection infrastructure

LLVMers,

Attached for your review is basic infrastructure for efficient garbage collectors. It currently gathers only enough information to support the runtime I’m working with, and -print-gc is the only consumer of that information.

All collector policies are presently controlled by constants. There are no regressions (on darwin-i686) if the feature is left disabled. If enabled (EnableGC = true in GC.cpp), the JIT breaks because it doesn’t support labels.

This obviously needs to be fixed; the collector needs to be turned on and off by a policy object. I’m not sure where best to attach that object, though.

  1. Attach it to Target/Subtarget?

  2. Attach it to Function, like the calling convention?

  3. Attach a whole-new TargetCollector (TargetRuntime?) to the Module?

#1 is simplest, but I think it is wrong. #2 seems best to me. It would let the inliner merge bare code with managed code, or vice versa, but avoid the risk of silliness like inlining Java into OCaml.
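To make option #2 concrete, here is a minimal sketch of the inlining rule it implies. Everything in it is hypothetical: `Function` is a stand-in struct rather than the real LLVM class, and `Collector`, `canInline`, and `collectorAfterInline` are names invented for illustration. It also assumes that inlining managed code into bare code makes the result managed.

```cpp
#include <string>

// Hypothetical stand-in for a function; only the collector tag matters here.
struct Function {
  std::string Name;
  std::string Collector; // empty means "bare" code with no collector
};

// True if the function is managed by some collector.
bool isManaged(const Function &F) { return !F.Collector.empty(); }

// Inlining is allowed when at least one side is bare, or when both
// sides use the same collector. Mixing two different collectors
// (say, Java into OCaml) is rejected.
bool canInline(const Function &Caller, const Function &Callee) {
  if (!isManaged(Caller) || !isManaged(Callee))
    return true;
  return Caller.Collector == Callee.Collector;
}

// Assumption: after inlining, the caller is managed if either side was.
// Precondition: canInline(Caller, Callee) returned true.
std::string collectorAfterInline(const Function &Caller,
                                 const Function &Callee) {
  return isManaged(Caller) ? Caller.Collector : Callee.Collector;
}
```

With the collector stored per function, the check is a local comparison and needs no module-wide state.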

Opinions?

Once a decision is made on that, I’ll resurrect the shadow stack GC to ensure I’m maintaining generality. Collectors vary greatly, so I think supporting several is important. The kinds I’d like to support:

[1] no collector
[2] OCaml collector
[3] shadow stack collector
[4] collectors for concurrent systems
[4a] zero overhead: safe points are patched to jump into the runtime
[4b] intrusive: code must check a “stop” flag at each safe point
[5] cooperative collectors, where mutators help with collection
[6] incremental and generational collectors

I see need for the target-independent code to support the following:

• Disable GC? [1*]
• Find safe points in tight loops? [4]
• Find safe points at calls? [4]
• Find safe points after calls? [2*]
• Find safe points before returning? [4]
• Pad safe points with no-ops to accommodate a patch? To how many bytes? [4a]
• Allow roots in registers? Otherwise force roots into stack slots at safe points [2].
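As a rough illustration, the questions above could be answered by a plain record of flags on the policy object. This is only a sketch: `CollectorPolicy`, its field names, and the preset functions are all hypothetical and do not come from the patch, and the pad size for [4a] is left as a parameter because the right number of bytes is target-specific.

```cpp
// Hypothetical policy record; every name here is illustrative only.
struct CollectorPolicy {
  bool EnableGC = false;               // Disable GC? [1]
  bool SafePointsInLoops = false;      // safe points in tight loops [4]
  bool SafePointsAtCalls = false;      // safe points at calls [4]
  bool SafePointsAfterCalls = false;   // safe points after calls [2]
  bool SafePointsBeforeReturn = false; // safe points before returning [4]
  unsigned SafePointPadBytes = 0;      // no-op padding for patching [4a]
  bool RootsInRegisters = false;       // false: spill roots to stack slots
};

// [1] No collector: everything stays at its default.
CollectorPolicy noCollector() { return CollectorPolicy(); }

// [2] OCaml-style collector: safe points after calls, roots kept in
// stack slots at safe points.
CollectorPolicy ocamlLike() {
  CollectorPolicy P;
  P.EnableGC = true;
  P.SafePointsAfterCalls = true;
  P.RootsInRegisters = false;
  return P;
}

// [4a] Zero-overhead concurrent collector: safe points in loops, at
// calls, and before returns, each padded so the runtime can patch in
// a jump. PadBytes is supplied by the caller (assumption).
CollectorPolicy patchedConcurrent(unsigned PadBytes) {
  CollectorPolicy P;
  P.EnableGC = true;
  P.SafePointsInLoops = true;
  P.SafePointsAtCalls = true;
  P.SafePointsBeforeReturn = true;
  P.SafePointPadBytes = PadBytes;
  return P;
}
```

Target-independent passes would then consult only these fields, never the identity of the collector.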

And additionally provide callbacks to:

• Custom lower gcread/gcwrite intrinsics [5,6]. Otherwise replace with plain load/store [2,3*].
• Custom lower gcroot intrinsics [3*]. Otherwise leave for the code generator [2*].
• Introduce code at safe points [4b].
• Print assembly for per-function metadata.
• Print assembly for whole-module metadata [2*].
• Record collector metadata in a JIT context.

Luckily, only those marked with [*] are current needs. :)
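The callbacks listed above might be gathered into one interface with do-nothing defaults, so that each collector overrides only what it needs. The sketch below is hypothetical throughout: the class and method names are invented, output goes to a `std::ostream` rather than a real assembly printer, the metadata text is a placeholder, and the JIT-context callback is left out.

```cpp
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical callback interface; none of these names come from the patch.
class CollectorCallbacks {
public:
  virtual ~CollectorCallbacks() {}

  // true: custom lower gcread/gcwrite [5,6].
  // false: replace them with plain load/store [2,3].
  virtual bool customLowerReadWrite() const { return false; }

  // true: custom lower gcroot [3].
  // false: leave roots for the code generator [2].
  virtual bool customLowerRoots() const { return false; }

  // Introduce code at a safe point [4b]. Default: nothing.
  virtual void emitSafePointCode(std::ostream &) {}

  // Print assembly for per-function metadata. Default: nothing.
  virtual void printFunctionMetadata(std::ostream &, const std::string &) {}

  // Print assembly for whole-module metadata [2]. Default: nothing.
  virtual void printModuleMetadata(std::ostream &) {}
};

// Shadow-stack-style collector [3]: only gcroot lowering is customized.
class ShadowStackCallbacks : public CollectorCallbacks {
public:
  bool customLowerRoots() const override { return true; }
};

// OCaml-style collector [2]: only whole-module metadata is emitted.
// The text written here is a placeholder, not a real table format.
class OCamlLikeCallbacks : public CollectorCallbacks {
public:
  void printModuleMetadata(std::ostream &OS) override {
    OS << "# module metadata placeholder\n";
  }
};
```

Defaults that do nothing keep the "no collector" case [1] free of any special handling.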

Thanks for your feedback!

Gordon

gc.2.patch (39.4 KB)