Beginner GCRoot Questions

Hello,

I’ve spent some time with the LLVM documentation and am beginning to grasp a few things, but I sometimes need very literal statements to actually understand things.

My first question is about StackMaps:

Is it true that llvm_gc_root_chain is an API? I’ve been trying to understand how exactly one accesses this structure, and nowhere in the documentation does it say (in those or similarly explicit words) that this is a public variable that will be present in the final executable. Secondly, if this is true, does it mean that this variable is only accessible from C/C++ land? That is, will I have some sort of main.cpp that is then linked with the llc output?

My second question is about the GCRoot intrinsic and ShadowStack:

I’ve read up about intrinsics, and it seems that if I call LLVMAddFunction(“llvm.*”) … it will be treated in a special way by the code generator?

If I set my GC to ShadowStack and mark all object pointers as GCRoots, will this be 100% correct? Or do I need to spill registers myself in my code gen phase? Or does ShadowStack essentially do that for me?

I guess my question here is: is F.SetGC(“shadow-stack”) + gcroot enough for a precise GC, or do I need to do more work in my frontend so that pointers held in processor registers are identified?

HA

The gc.root documentation is also just flat out painful. If there are specific things you think need to be clarified, point them out. I’ll try to fix them. Now that all of the gc.statepoint pieces have landed, I’m about to start trying to improve the docs and make it clear which bits apply to gc.root and which apply to both implementations.

To generate a custom assembly stack map format using gc.root, you need to provide a GCMetadataPrinter. This can be loaded as a plugin at the command line. As examples, see OcamlGCPrinter and ErlangGCPrinter in lib/CodeGen/AsmPrinter/. For gc.statepoint, custom output formats are not yet supported. Instead, you would parse the standard stack map section format and then encode your own.

Functions in the llvm.* namespace are special, no matter how they are added.

This sounds like a general question about what a “shadow stack” is w.r.t. garbage collection. Rather than trying to answer this myself, I’ll recommend you check the literature. It does a far better job of explaining the topic than I can. To my knowledge, the root tracking is complete in our implementation. Keep in mind, I have little hands-on experience with it. You might also find this perspective useful:

In theory, the shadow stack implementation is enough for a precise collector. I wouldn’t personally trust our implementation that far. It’s been essentially unmaintained for several years. I would not be at all surprised to find out it was buggy.

I’d really suggest you take a look at gc.statepoint. In particular, you might find the PlaceSafepoints and RewriteStatepointsForGC passes that landed recently very useful. This is the infrastructure that I am actively working on. It will definitely get first priority for bug fixes and (informal!) support. Longer term, I plan to merge the two implementations, but the final result is far more likely to look like gc.statepoint (with some tweaks) than the current gc.root mechanism.

Philip

FYI, the first round of documentation fixes landed yesterday. I’m planning more today. Expect some churn, but please do look them over and point out any sources of confusion. Philip
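
On the first question (how a runtime actually reaches llvm_gc_root_chain), a small C sketch may make it more literal. The two struct layouts and the visiting loop follow the shadow-stack description in the LLVM garbage collection docs as I understand them; please check them against the docs for your LLVM version. Everything else is an assumption made for illustration: the names visit_gc_roots, count_root and demo_count_live_roots are hypothetical, the frame is built by hand rather than by compiled code, and llvm_gc_root_chain is defined locally only so the sketch links on its own. In a real program the runtime would declare it extern and pick up the definition from the code compiled with the shadow-stack strategy.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Static, per-function description of a frame's roots. */
struct FrameMap {
  int32_t NumRoots;   /* number of roots in the frame */
  int32_t NumMeta;    /* how many of those roots carry metadata */
  const void *Meta[]; /* metadata for the first NumMeta roots */
};

/* One entry per active function, linked from callee to caller. */
struct StackEntry {
  struct StackEntry *Next;    /* entry of the calling function */
  const struct FrameMap *Map; /* descriptor for this frame */
  void *Roots[];              /* the root slots themselves */
};

/* ASSUMPTION: defined here so the sketch is standalone. A real runtime
   would write `extern struct StackEntry *llvm_gc_root_chain;` instead. */
struct StackEntry *llvm_gc_root_chain = NULL;

typedef void (*RootVisitor)(void **Root, const void *Meta);

/* Walk every frame on the chain and hand each root slot to the visitor. */
void visit_gc_roots(RootVisitor visit) {
  for (struct StackEntry *e = llvm_gc_root_chain; e != NULL; e = e->Next) {
    int32_t i = 0;
    /* Roots with metadata come first... */
    for (; i < e->Map->NumMeta; ++i)
      visit(&e->Roots[i], e->Map->Meta[i]);
    /* ...followed by roots without metadata. */
    for (; i < e->Map->NumRoots; ++i)
      visit(&e->Roots[i], NULL);
  }
}

static int g_live_roots;

/* Hypothetical visitor: counts slots that currently hold a pointer. */
static void count_root(void **Root, const void *Meta) {
  (void)Meta;
  if (*Root != NULL)
    ++g_live_roots;
}

/* Hand-built stand-in for what a compiled function's prologue and
   epilogue would do: push a frame with three root slots, walk the
   chain, then pop the frame again. */
int demo_count_live_roots(void) {
  static int a, b; /* stand-ins for heap objects */
  struct FrameMap *map = malloc(sizeof *map); /* no Meta entries */
  struct StackEntry *entry = malloc(sizeof *entry + 3 * sizeof(void *));
  if (map == NULL || entry == NULL) {
    free(map);
    free(entry);
    return -1;
  }
  map->NumRoots = 3;
  map->NumMeta = 0;
  entry->Map = map;
  entry->Roots[0] = &a;
  entry->Roots[1] = NULL; /* a root slot that is currently empty */
  entry->Roots[2] = &b;

  entry->Next = llvm_gc_root_chain; /* push */
  llvm_gc_root_chain = entry;

  g_live_roots = 0;
  visit_gc_roots(count_root);

  llvm_gc_root_chain = entry->Next; /* pop */
  free(entry);
  free(map);
  return g_live_roots;
}
```

Calling demo_count_live_roots() from a main function walks the one hand-built frame and counts the two non-null slots. A collector would do the same walk but mark or relocate the object behind each slot instead of counting it. Since the chain is an ordinary global symbol, any language that can reference a C symbol at link time should be able to read it, not only C or C++.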