[PATCH] Move ownership of GCStrategy objects to LLVMContext

  • llvmdev in the hopes that a broader audience sees this and chimes in.

At this point, it’s clear Chandler and I disagree on the right approach going forward. What do others think?

There are really two independent questions here. I’ve tried to separate them and provide appropriate context for the discussion so far.

Chandler, if I managed to misrepresent your positions at all, sorry!, and please correct me.
I believe it does make sense to check some GC related properties in the Verifier. My position is that the GC attribute essentially imposes an additional set of restrictions on what it means to have valid IR. Specifically, we should be able to check cheap, local facts about the IR such as: - A GC which uses gc.statepoints does not also use gc.roots and vice versa. - A gc.statepoint/gc.relocate sequence only relocates pointers with the “right” address space. (This applies only to GCs which use address spaces to distinguish pointers.) - If a derived pointer is relocated, it’s base must also be available in the gc.statepoint. (This applies only to relocating GCs.) Otherwise, the base pointer (must?, may?) be (null?, undef?). - There should be no gc.relocates in IR used by a non-relocating GC. - A gc.statepoint whose relocations haven’t been made explicit yet - indicated by a possible future flag on gc.statepoint - isn’t legal for a GC which doesn’t use late rewriting. (See for context.) (The previous list is meant to be illustrated examples. Only some of them actually exist so far, and I make no commitment to ever implement the others since they involve infrastructure and design work which hasn’t happened yet.) To use your previously stated criteria, every one of the above is (or at least should/would be) an assertion in the lowering code or in some pass. I do not believe that the Verifier is the right place to check more complicated GC properties. For example, trying to check that a gc.statepoint relocates all pointer values which might be live over the gc.statepoint is an expensive validation check which does not belong in the Verifier. I have no opinion on what mechanism might be best to represent such checks (yet.) Both the verifyAnalysis, and Lint mechanisms might be reasonable choices. There’s also a gray area in between that I’m really not sure about. For example, the late safepoint placement/rewriting stuff relies on there not being new addrspacecast instructions inserted by the optimizer. (And none in the original IR.) It’s unclear whether having that as a GC specific check in the Verifier is the right approach or not. I think we can address these case by case as they come up. - What should the access model for GCStrategy be? Is it part of the IR? Or should it be treated like an Analysis result? Chandler - Thinking about this overnight, I’ve realized I don’t actually have much of an objection here. I still think the direction you’re pushing is the wrong one, but it doesn’t actually matter that much to my overall goals. Having said that, let me try to convince you why the current (as of change 226311 which started this thread) is better. (For the record, I will respect whatever the community decides and migrate the code in that direction. I submitted the change in question while we debate - with Chandler’s permission - while we debate, but I will update as needed.) My position comes down to two key points: the GCStrategy associated with a given Function is fundamental part of the IR definition, and the complexity we’re talking about for the core IR is a single class with a set of boolean flags* - i.e. pretty minor. * Let’s ignore the performCustomLowering mechanism. I’d like to factor that out, or get rid of it, anyway. The properties of the garbage collector that a given piece of code is being compiled for is a fundamental part of the IR definition. There is no “safe default” to use. If information about the GC is not available, the code will be likely be miscompiled. As a result, a missing GCStrategy (i.e. one whose string key we don’t recognize) is, and should be, a fatal error. (Chandler’s proposal wouldn’t change this last point.) The analogy I’d make would be to data layout information. The DataLayout effects the semantics of the IR, and is thus considered “part of” the Module, rather than something “computed from” a Module. Similarly, if you’re given a gc.statepoint for a relocating GC, it’s not legal to treat it like one for a non-relocating GC. In particular, there are legal optimizations for the later which are illegal for the former. The targeted GC is semantically “part of” the IR. Of course, you could also analogize this to information about the sub-target a particular module is targeting. We clearly don’t treat sub-targets as “part of” the Module. So honestly, it could go either way. I had previously made an argument based on object lifetime. After further reflection, having GCStrategy be a immutable object is fine by me. It does imply we need to get rid of the performCustomLowering mechanism, but that’s something I’d wanted to do anyway. Note that the in tree ShadowStackGC subclass of GCStrategy is not an immutable object today. There may also be out of tree subclasses with assumptions about mutability and lifetime baked in. This is likely the case - if there are actually any users out there other than the ones who’ve already spoken up - since the ShadowStackGC was the logic starting place for a custom GC strategy implementation. Philip

Apologies for any duplicates, I didn’t see this go through to the mailing list.