Enabling statistics in release builds / static constructors

Analyzing a compiler's behavior and timings is an important tool for tracking code-size and compile-time regressions. One of the tools we would like to use is Statistic variables (llvm/include/ADT/Statistic.h). However, today I cannot enable them in release builds without a performance hit, because the lazy initialization forces a memory fence on every single increment of a statistic variable. This is trivial to fix: do the initialization in the constructor. See D27724, "Statistic: Initialize in the constructor".

I am writing this mail because I expect opposition against introducing static constructors. Looking at previous discussions it is obvious that a lot of static constructor usage is bad as it promotes global state that should better be put into LLVMContext.
However, lazily initializing the Statistic variables makes them in no way less static; we just delay the initialization somewhat, at the price of a heavy performance penalty on usage. I don't see a better alternative at the moment.

- Matthias
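
To make the trade-off in the mail above concrete, here is a minimal sketch of the two variants. It does not reproduce LLVM's actual Statistic class: `Registry`, `registry()`, `LazyStat` and `EagerStat` are hypothetical names, and the acquire load stands in for the memory fence the mail describes.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <mutex>
#include <vector>

// Hypothetical registry standing in for LLVM's internal list of statistics.
struct Registry {
  std::mutex Lock;
  std::vector<const char *> Names;

  void add(const char *Name) {
    assert(Name && "a statistic needs a name");
    std::lock_guard<std::mutex> Guard(Lock);
    Names.push_back(Name);
  }
};

// Function-local static: constructed on first use, so it is safe to call
// from the constructor of another global object.
inline Registry &registry() {
  static Registry R;
  return R;
}

// Lazy variant: registration is deferred to the first increment, so the hot
// path has to check the flag with acquire ordering on every increment.
struct LazyStat {
  const char *Name;
  std::atomic<unsigned> Value{0};
  std::atomic<bool> Registered{false};

  explicit LazyStat(const char *N) : Name(N) {}

  void increment() {
    if (!Registered.load(std::memory_order_acquire))
      registerSelf();
    Value.fetch_add(1, std::memory_order_relaxed);
  }

  void registerSelf() {
    // exchange() ensures that only one thread performs the registration.
    if (!Registered.exchange(true, std::memory_order_acq_rel))
      registry().add(Name);
  }
};

// Eager variant: the constructor registers the counter, so the increment is
// a single relaxed atomic add. For a global object this is a static
// constructor.
struct EagerStat {
  const char *Name;
  std::atomic<unsigned> Value{0};

  explicit EagerStat(const char *N) : Name(N) { registry().add(Name); }

  void increment() { Value.fetch_add(1, std::memory_order_relaxed); }
};
```

In this sketch an `EagerStat` shows up in the registry as soon as it is constructed, while a `LazyStat` only appears after its first `increment()`.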

Given that LLVM has so many auto-registration systems (cl::opt, target registry, pass registry, statistics, I’m sure there’s more), maybe we should spend the time to build an auto-registration system that doesn’t involve static constructors?

It needs custom code for every supported object file format, and is hard to get right when DSOs are involved, but in the long run it’s probably worth fixing this problem once and for all.

I would volunteer to do the work, however this obviously needs some consensus first on how that would look.

I assume you are thinking about creating custom linker sections with lists of init functions, similar to the existing constructor sections but running at a time controlled by LLVM code. While the compiler/linker nerd in me would love doing that, I could see this being very tricky to pull off consistently on all platforms.

We should not forget that there is a portable and proven solution: Just write the code!
So here comes the strawman:

static Statistic NumBlips("blips");
static cl::opt MyOpt("my-cool-option", cl::desc("bla"));
static cl::opt AnotherOpt("bla", cl::desc("foo bar"));
// Note that the constructors of Statistic and cl::opt would be reworked to be purely constexpr, so they would not run any code.

static void init_globals() {
  NumBlips.init();
  MyOpt.init();
  AnotherOpt.init();
}
// Note that the init_globals() function is pretty mechanical so hopefully easy to understand and maintain.

We have to call init_globals() somewhere early:
- Put an init_globals() call into code that runs early.
- We already have a lot of early running functions called initializeXXXPass() which we can use for this
- For the remaining files we probably have to export init_globals() and call it from some common place of the library.

If someone comes up with a working solution for putting initializers into a section, we can later replace the init_globals() function with that, so the refactoring work to split into a constexpr constructor and init() is not wasted either way!

- Matthias
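
The strawman above could look roughly like this as a self-contained sketch. It assumes a stand-in class `Stat` rather than LLVM's real Statistic, and `allStats()`, `value()` and `name()` are hypothetical helpers. cl::opt is left out because reworking it is a separate question.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

class Stat;

// Hypothetical process-wide list of registered statistics.
inline std::vector<Stat *> &allStats() {
  static std::vector<Stat *> List;
  return List;
}

// Stand-in for a reworked Statistic: the constructor is constexpr, so a
// global Stat is constant-initialized and needs no static constructor.
class Stat {
  const char *Name;
  unsigned Value = 0;
  bool Registered = false;

public:
  constexpr explicit Stat(const char *N) : Name(N) {}

  // Explicit registration, to be called once from init_globals().
  void init() {
    assert(!Registered && "init() called twice");
    allStats().push_back(this);
    Registered = true;
  }

  Stat &operator++() {
    // In an asserts build a forgotten init() call is caught right here.
    assert(Registered && "missing init() call for this statistic");
    ++Value;
    return *this;
  }

  unsigned value() const { return Value; }
  const char *name() const { return Name; }
};

static Stat NumBlips("blips");
static Stat NumBlops("blops");

// The mechanical part: one init() line per global in this file.
static void init_globals() {
  NumBlips.init();
  NumBlops.init();
}
```

Nothing is registered until `init_globals()` runs, which is the point of the proposal: the program decides when registration happens.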

The goal of using sections would be to build the array of statistics
without running any code. This is only possible when you have one DSO. Once
you have a second DSO, it will need to run some initialization code to link
the two arrays together.
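
A portable sketch of that last point, under loud assumptions: the per-module arrays below are written by hand, whereas in the section-based scheme the linker would assemble each of them from a custom section. `Counter`, `ModuleCounters`, `linkModule()` and `countAll()` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>

struct Counter {
  const char *Name;
  unsigned Value;
};

// One node per module (DSO). The [Begin, End) range is what a linker section
// would provide without running any code.
struct ModuleCounters {
  Counter *Begin;
  Counter *End;
  ModuleCounters *Next;
};

static ModuleCounters *ModuleList = nullptr;

// The step that cannot be avoided with more than one module: some code has
// to run to chain the module's range into the process-wide list.
void linkModule(ModuleCounters &M) {
  M.Next = ModuleList;
  ModuleList = &M;
}

// Walks every linked module and counts the statistics it contributes.
std::size_t countAll() {
  std::size_t N = 0;
  for (ModuleCounters *M = ModuleList; M; M = M->Next)
    N += static_cast<std::size_t>(M->End - M->Begin);
  return N;
}

// Two hand-written tables stand in for the sections of two DSOs.
static Counter MainCounters[] = {{"blips", 0}, {"blops", 0}};
static Counter PluginCounters[] = {{"plugs", 0}};
static ModuleCounters MainModule = {MainCounters, MainCounters + 2, nullptr};
static ModuleCounters PluginModule = {PluginCounters, PluginCounters + 1,
                                      nullptr};
```

With a single module the range alone would be enough. The `linkModule()` call is the initialization code that a second DSO makes necessary.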

> We should not forget that there is a portable and proven solution: Just write the code!

Yep, that always works, and you can control when it happens so you don't
have to run initializers scattered across your binary at startup.

I agree, and there was a proposal in the past for cl::opt (I believe from Chris B.). The idea would be to have explicit registration (from the pass ctor, for instance), with the storage for the options being held in the context somehow (old memories, not sure about the details).

CC Chris who likely has more information (and possibly pointers).

You probably mean:
http://web.archive.org/http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-August/075886.html [1]
and
http://reviews.llvm.org/D5389

but I fail to see how that helps Statistic variables which are still global.

- Matthias

[1] Someone please give us an easier way to access old mails when all you have is the old link…

Admittedly, the “static void registerOptions()” part of that patch appears to play the same or a similar role as the init_globals() I proposed.

- Matthias

Right, basically there wouldn’t be any global variables; that would apply to statistics as well. The way Chris did it was to use a singleton registry for this.
You could do the same for statistics and have any component that needs a statistic get a reference to it from the registry on initialization (in the pass ctor, for example). You wouldn’t need any other registration code, because unlike cl::opt, statistics don’t need to be registered early.
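
A minimal sketch of that registry idea, assuming nothing about the actual patch: `StatRegistry`, `lookup()` and `BlipCounterPass` are hypothetical names, and the sketch is not thread-safe.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical singleton registry that owns all counters.
class StatRegistry {
  std::map<std::string, unsigned> Counters;
  StatRegistry() = default;

public:
  static StatRegistry &get() {
    static StatRegistry R; // constructed on first use
    return R;
  }

  // Creates the counter on first lookup. References into a std::map stay
  // valid when other elements are inserted later.
  unsigned &lookup(const std::string &Name) { return Counters[Name]; }

  std::size_t size() const { return Counters.size(); }
};

// Hypothetical pass: it fetches its counter in the constructor, so no global
// Statistic variable and no separate registration step is needed.
class BlipCounterPass {
  unsigned &NumBlips;

public:
  BlipCounterPass() : NumBlips(StatRegistry::get().lookup("blips")) {}

  void run(unsigned Blips) { NumBlips += Blips; }
};
```

Two instances of the pass share one counter because both look it up under the same name.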

The intention with cl::opt was to use the singleton as a transitionary measure only. The end goal was to move option storage into the {MC|LLVM}Context.

-Chris

I don’t agree with an ideological “burn and remove all the globals” stance. For one thing, that means heavy rewriting (we would need to move all the Statistic variables into classes somewhere and pass around way more contexts). I am NOT volunteering for that part of the work.

- Matthias

I don’t believe it is any more work (except for the registry infrastructure) than your strawman proposal.

But it is more work: My strawman proposal only needs a very mechanical change of adding one xxx.init(); line for each global. Putting the variables into context classes is more involved: You have to actually decide on the appropriate context class for each of them and you will face lots of small helper functions outside of classes that simply have no appropriate context at hand.

- Matthias

Well, for one, your snippet includes a static void init_globals() { which definitely cannot be “static”, and it is quite unclear to me how you’re gonna trigger the registration for it.

As I wrote, the trick here is to sneak init_globals() into the places where we do pass registration, which should get you a big part of the way. For the remaining cases you have to add a few functions and call them from the init()-like functions that we have all over LLVM anyway because of pass registration.

- Matthias

It is difficult to evaluate without seeing how it looks in practice. Right now I fear it’s not gonna look nice, especially considering that you raised a concern earlier about free functions here and there using statistics. Tracking what’s initialized where and when might become hairy.

That is as easy or as hard to track as pass registration. All of that mostly happens when LLVM starts up and the user calls functions like initializeCore(), initializeCodeGen(), initializeAllTargets(), and all the other existing init functions. Also, a missing init call is obvious to notice and trivial to fix.

- Matthias
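
How the per-file init functions could be collected into one library-level entry point can be sketched as follows. This is only an illustration under assumptions: `Stat`, `initStat()`, the `file_a`/`file_b` namespaces (standing in for two source files) and `initializeMyLibrary()` are hypothetical names, not existing LLVM functions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for a statistic with a constexpr constructor.
struct Stat {
  const char *Name;
  bool Initialized = false;

  constexpr explicit Stat(const char *N) : Name(N) {}
};

// Hypothetical process-wide list of registered statistics.
inline std::vector<Stat *> &allStats() {
  static std::vector<Stat *> List;
  return List;
}

inline void initStat(Stat &S) {
  assert(!S.Initialized && "statistic initialized twice");
  S.Initialized = true;
  allStats().push_back(&S);
}

// Two "files", modelled as namespaces; each exports its own init_globals().
namespace file_a {
Stat NumBlips("blips");
void init_globals() { initStat(NumBlips); }
} // namespace file_a

namespace file_b {
Stat NumBlops("blops");
void init_globals() { initStat(NumBlops); }
} // namespace file_b

// Library-level entry point in the spirit of the existing initializeXXX()
// functions. The function-local static makes repeated calls harmless.
void initializeMyLibrary() {
  static const bool Done = [] {
    file_a::init_globals();
    file_b::init_globals();
    return true;
  }();
  (void)Done;
}
```

A file whose `init_globals()` is missing from `initializeMyLibrary()` would leave its statistics with `Initialized == false`, which is the kind of error an assertion on first use can report.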