Static constructors: __cxx_global_var_initN vs _GLOBAL__sub_I_XXX

I’m trying to understand more about global constructors generated by clang. As far as I understand it, any compilation unit which requires static constructors will cause clang to generate LLVM IR to append to the array:

@llvm.global_ctors

This is an array of structs with 3 elements: a priority, a function pointer and an unknown (seemingly undocumented) member.

In most cases I only ever see one element in this array (per compilation unit) with name _GLOBAL__sub_I_Filename.cpp.

However in some more complex cases - which I’ve yet to pin down but I think relates to templates - I see multiple members. Those members always seem to look something like:

__cxx_global_var_init89
__cxx_global_var_init90
_GLOBAL__sub_I_Filename.cpp

Furthermore, the last function usually just wraps other functions with names prefixed by __cxx_global_var_init, e.g.

declare i8* @memset(i8*, i32, i64) #0
define internal void @__s3e__GLOBAL__sub_I_Filename.cpp() #0 {
call void @__cxx_global_var_init(), !dbg !32933
call void @__cxx_global_var_init1(), !dbg !32933
call void @__cxx_global_var_init2(), !dbg !32933
ret void
}

I’d just like to understand what the logic is behind the generation of these functions. It looks to me like there’s an implicit constructor priority; however, the priority member of the @llvm.global_ctors array is never used (well, it’s always 65535).

Does anyone know the secrets behind this?

It’s fairly simple. Doing it this way makes it easy to emit the constructors one at a time (the ctors array is an appending global, but it’s still immutable after creation - you can’t add fields to it) and lets the inliner condense the constructors into a single function where appropriate (and allows the same code to be used to generate the static constructor initialisation paths as global initialisation).
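As a source-level illustration of the above (a sketch, not taken from the original poster's code): a translation unit like the following has three namespace-scope variables with dynamic initialisers. With clang I would expect one internal __cxx_global_var_init function per variable plus a single _GLOBAL__sub_I_ wrapper that calls them in definition order, though the exact names and numbering are private and may differ between versions. The helper names record, init_log and observed_init_order are hypothetical.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: a log of which initialisers have run, in order.
// It is a function-local static so it is constructed on first use,
// before any of the globals below try to append to it.
static std::vector<std::string>& init_log() {
    static std::vector<std::string> log;
    return log;
}

// Hypothetical helper: records the variable name, then returns the value.
// It is not constexpr, so each use below forces dynamic initialisation.
static int record(const char* name, int value) {
    init_log().push_back(name);
    return value;
}

// Three globals with dynamic initialisers. Within one translation unit
// they are initialised in order of definition.
int first_value  = record("first_value", 1);
int second_value = record("second_value", 2);
int sum_value    = record("sum_value", first_value + second_value);

// Accessor so the observed order can be inspected after start-up.
const std::vector<std::string>& observed_init_order() {
    return init_log();
}
```

Compiling a file like this with `clang++ -S -emit-llvm` and searching the output for `global_ctors` is a quick way to see what your clang version actually emits.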

The inliner priorities might want tweaking a bit here though, as we’ve seen cases where, after inlining, you end up with truly massive basic blocks that cause the register allocator to spend 5+ minutes at compile time, to save a few nanoseconds of run time.

David

Thanks David. To be more specific with my question, two things I’d like to know are:

  • Is there any significance in the naming _GLOBAL__sub_I_ vs __cxx_global_var_init or is this just an artefact of the way clang generates the code?
  • Can I assume that the order the functions appear in @llvm.global_ctors is always the order that they should be called in (assuming there’s actually any dependencies)?

This is significant to me as I’m performing manipulations on the IR to alter the static ctors and then recompiling (you may shudder should you wish to do so :slight_smile: ).

- Is there any significance in the naming _GLOBAL__sub_I_ vs __cxx_global_var_init or is this just an artefact of the way clang generates the code?

To the best of my knowledge, these are entirely private. I think the _GLOBAL__ names are there to help debuggers spot what is going on.

- Can I assume that the order the functions appear in @llvm.global_ctors is always the order that they should be called in (assuming there's actually any dependencies)?

Modulo the priority, yes, though the guarantees of the priority are very weak (and affected by linking. Yay).
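To make "modulo the priority" concrete, here is a small model of the ordering, written as a sketch rather than as how any loader or runtime actually implements it. CtorEntry, run_in_ctor_order and demo_order are hypothetical names. It assumes lower priority values run first and that entries with equal priority keep their array order, which matches the answer above; as far as I can tell the LangRef leaves the order of equal-priority entries unspecified, so treat that part as an observation about current behaviour, not a guarantee.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical model of one @llvm.global_ctors element.
struct CtorEntry {
    std::uint32_t priority;  // lower values run first; clang's default is 65535
    void (*fn)();            // the constructor function
    const void* associated;  // third field; null when there is no associated global
};

// Records which modelled constructors ran, in order.
static std::vector<int>& call_trace() {
    static std::vector<int> trace;
    return trace;
}

static void ctor_a() { call_trace().push_back(1); }
static void ctor_b() { call_trace().push_back(2); }
static void ctor_c() { call_trace().push_back(3); }

// Runs the entries sorted by priority. std::stable_sort keeps the array
// order for entries of equal priority (an assumption, see the text above).
std::vector<int> run_in_ctor_order(std::vector<CtorEntry> entries) {
    std::stable_sort(entries.begin(), entries.end(),
                     [](const CtorEntry& lhs, const CtorEntry& rhs) {
                         return lhs.priority < rhs.priority;
                     });
    call_trace().clear();
    for (const CtorEntry& entry : entries) {
        entry.fn();
    }
    return call_trace();
}

// Example: ctor_b has an explicit priority of 101, so it runs before the
// two default-priority entries, which then run in array order.
std::vector<int> demo_order() {
    std::vector<CtorEntry> entries = {
        {65535u, &ctor_a, nullptr},
        {101u,   &ctor_b, nullptr},
        {65535u, &ctor_c, nullptr},
    };
    return run_in_ctor_order(entries);
}
```

If you rewrite the array in your IR pass, keeping the relative order of the existing entries is the conservative choice under this model.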

This is significant to me as I'm performing manipulations on the IR to alter the static ctors and then recompiling (you may shudder should you wish to do so :slight_smile: ).

There’s an optimisation pass that tries to turn these things into static initialisers. That’s probably a good place to start looking.

David

Great, thanks for the info David!

It’s documented: http://llvm.org/docs/LangRef.html#the-llvm-global-ctors-global-variable

OK, I apologize, it’s terse. :slight_smile:

The third field is a global whose comdat group the initializer should be in. On Windows, this is the mechanism that prevents double initialization of static data members of class templates. On Linux, it is an optimization that improves startup time and code size.

If the third field is null, then we lump the initialization into the _GLOBAL__sub_I_… function to allow for inlining, etc., as David explained.
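A minimal sketch of the two cases described above, assuming a reasonably recent clang. Registry, next_id and registry_id_for_int are hypothetical names. Going by the explanation in this thread, the static data member of the class template should get its own entry in @llvm.global_ctors with the third field pointing at the variable, while plain_global should be initialised through the lumped _GLOBAL__sub_I_ function. I have not pinned this to a particular clang version, so check the emitted IR before relying on it.

```cpp
#include <cassert>

// Hypothetical helper: hands out increasing ids. The counter is constant
// initialised, so it is safe to call from any dynamic initialiser.
inline int next_id() {
    static int counter = 0;
    return ++counter;
}

template <typename T>
struct Registry {
    static int id;  // static data member of a class template
};

// The definition may be instantiated in several translation units, which is
// why its initialiser needs to be tied to the variable's comdat group.
template <typename T>
int Registry<T>::id = next_id();

// Ordinary global with a dynamic initialiser: no associated comdat, so the
// third field for its initialiser is expected to be null.
int plain_global = next_id();

// Using the member forces Registry<int>::id to be instantiated here.
int registry_id_for_int() {
    return Registry<int>::id;
}
```

Note that C++ does not order the template member's initialisation relative to plain_global, so code should only rely on both being initialised, not on which one runs first.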