[LTO] -time-passes and libLTO

Hi,

We have been investigating an issue when running LTO with our proprietary linker, which links against libLTO dynamically. The issue is that when we pass -time-passes via the lto_codegen_debug_options function in the LTO C API, no time information is produced during compilation. The reason for this is that time information is stored in state owned by a ManagedStatic instance, and is only printed when the state is destroyed. This in turn only happens when ManagedStatics are cleaned up, via the llvm_shutdown function. As we do not link against LLVM (except libLTO dynamically), we have no access to llvm_shutdown, which in turn means we are not able to clean up ManagedStatic instances and thus no timing information is produced.
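The behaviour described above can be modelled in a few lines of standalone C++. This is only a sketch, not LLVM code: `TimingReport`, `LazyRegistry` and `shutdownAll` are hypothetical stand-ins for the timing state, the ManagedStatic machinery and llvm_shutdown respectively. The report is written by a destructor, so nothing appears until something explicitly tears the registry down.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical stand-in for timing state that reports only on destruction.
class TimingReport {
public:
  explicit TimingReport(std::ostream &OS) : OS(OS) {}
  ~TimingReport() { OS << "pass timing: " << TotalMs << " ms\n"; }
  void add(long Ms) { TotalMs += Ms; }

private:
  std::ostream &OS;
  long TotalMs = 0;
};

// Hypothetical registry: the object is created on first use and is destroyed
// only when shutdownAll() is called (assumed to play the role of a shutdown hook).
class LazyRegistry {
public:
  TimingReport &timing(std::ostream &OS) {
    if (!Report)
      Report = new TimingReport(OS);
    return *Report;
  }
  void shutdownAll() {
    delete Report;
    Report = nullptr;
  }

private:
  TimingReport *Report = nullptr; // never freed unless shutdownAll() runs
};

// Returns everything that was printed during the simulated run.
std::string demoShutdownPrintsTimings() {
  std::ostringstream Out;
  LazyRegistry Registry;
  Registry.timing(Out).add(5);
  Registry.timing(Out).add(7);
  assert(Out.str().empty()); // nothing is reported while the state is alive
  Registry.shutdownAll();    // destruction is what triggers the report
  return Out.str();
}
```

A client that can never reach `shutdownAll()` never sees the report, which mirrors the situation of a linker that only has access to the libLTO C interface.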

We have considered a few options and have come up with the following suggestions, and would appreciate some feedback:

  1. Add llvm_shutdown (or rather likely some wrapper function that does the same job) to the C interface of libLTO. This should be called when we are done with the library.

  2. Add a “full-shutdown” command-line option to LLVM - that can be passed via lto_codegen_debug_options - which causes ManagedStatic-owned state to be destroyed on shutdown. This could even be more widely useful outside the LTO case.

  3. Call llvm_shutdown() immediately after compilation as part of the compile function.

  4. None of the above, because there’s a better way that I am unaware of to clean up this state.

I have a marginal preference for 1), but does anybody have any other preferences or suggestions?

Regards,

James

+Teresa, Mehdi

We have considered a few options and have come up with the following suggestions, and would appreciate some feedback:

  1. Add llvm_shutdown (or rather likely some wrapper function that does the same job) to the C interface of libLTO. This should be called when we are done with the library.

This seems pretty reasonable to me. I’m not sure what others think.

  2. Add a “full-shutdown” command-line option to LLVM - that can be passed via lto_codegen_debug_options - which causes ManagedStatic-owned state to be destroyed on shutdown. This could even be more widely useful outside the LTO case.

  3. Call llvm_shutdown() immediately after compilation as part of the compile function.

  4. None of the above, because there’s a better way that I am unaware of to clean up this state.

Perhaps this timing stuff should live in the LLVMContext instead? And then you get timing-per-LLVMContext?

James Henderson <jh7370.2008@my.bristol.ac.uk> writes:

Hi,

We have been investigating an issue when running LTO with our proprietary
linker, which links against libLTO dynamically. The issue is that when we
pass -time-passes via the lto_codegen_debug_options function in the LTO C
API, no time information is produced during compilation. The reason for
this is that time information is stored in state owned by a ManagedStatic
instance, and is only printed when the state is destroyed. This in turn
only happens when ManagedStatics are cleaned up, via the llvm_shutdown
function. As we do not link against LLVM (except libLTO dynamically), we
have no access to llvm_shutdown, which in turn means we are not able to
clean up ManagedStatic instances and thus no timing information is produced.

Using lto_codegen_debug_options is a bit of a hack, but probably
appropriate for -time-passes. We would have the same issue even with a
non-hackish way of passing that option.

We have considered a few options and have come up with the following
suggestions, and would appreciate some feedback:

1) Add llvm_shutdown (or rather likely some wrapper function that does
the same job) to the C interface of libLTO. This should be called when we
are done with the library.

Adding a lto_shutdown sounds reasonable.

2) Add a “full-shutdown” command-line option to LLVM - that can be
passed via lto_codegen_debug_options - which causes ManagedStatic-owned
state to be destroyed on shutdown. This could even be more widely useful
outside the LTO case.

I would probably avoid that as it requires using lto_codegen_debug_options.

3) Call llvm_shutdown() immediately after compilation as part of the
compile function.

That would be a small behavior change, which is normally avoided with
the C API.

4) None of the above, because there’s a better way that I am unaware
of to clean up this state.

I have a marginal preference for 1), but does anybody have any other
preferences or suggestions?

I think adding an lto_shutdown is the best one, but Duncan and/or Mehdi
have more recent experience with the C API.

Cheers,
Rafael

+Teresa, Mehdi

We have considered a few options and have come up with the following
suggestions, and would appreciate some feedback:

1) Add llvm_shutdown (or rather likely some wrapper function that
does the same job) to the C interface of libLTO. This should be called when
we are done with the library.

This seems pretty reasonable to me. I'm not sure what others think.

Not a C API user, but this seems like the best option to me.

2) Add a “full-shutdown” command-line option to LLVM - that can be
passed via lto_codegen_debug_options - which causes ManagedStatic-owned
state to be destroyed on shutdown. This could even be more widely useful
outside the LTO case.

3) Call llvm_shutdown() immediately after compilation as part of the
compile function.

4) None of the above, because there’s a better way that I am unaware
of to clean up this state.

Perhaps this timing stuff should live in the LLVMContext instead? And
then you get timing-per-LLVMContext?

We've talked about something like this so that we get more meaningful
timing info for ThinLTO backends which run in parallel, but haven't had the
bandwidth to address it yet.

Teresa

I would prefer this option. I do not want to add any more stable API surface to the legacy C API, and in any case this is a debugging feature so it does not deserve a stable API.

Peter

+Teresa, Mehdi

We have considered a few options and have come up with the following
suggestions, and would appreciate some feedback:

1) Add llvm_shutdown (or rather likely some wrapper function that
does the same job) to the C interface of libLTO. This should be called when
we are done with the library.

This seems pretty reasonable to me. I'm not sure what others think.

2) Add a “full-shutdown” command-line option to LLVM - that can be
passed via lto_codegen_debug_options - which causes ManagedStatic-owned
state to be destroyed on shutdown. This could even be more widely useful
outside the LTO case.

3) Call llvm_shutdown() immediately after compilation as part of the
compile function.

4) None of the above, because there’s a better way that I am unaware
of to clean up this state.

In the short term, adding a way to print and reset the counters and call it
at the end of the LTO process seems like a good immediate tradeoff to me.
We're kind of doing it with statistics already:
https://llvm.org/svn/llvm-project/llvm/trunk/lib/LTO/ThinLTOCodeGenerator.cpp
(look at the very end of the file)

Perhaps this timing stuff should live in the LLVMContext instead? And
then you get timing-per-LLVMContext?

This last suggestion is my favorite answer by far!
I've been talking with Matthias about doing this for statistics as well:
having a dedicated StatisticManager and TimerGroupManager (straw man names)
that could live in the LLVMContext and have to be set explicitly by the
client of the LLVMContext is what I'd aim for.
This is needed to solve the issue we have with ThinLTO and threading in
general.

It is possible (likely...) that some places are using statistics without
having an LLVMContext, but we should be able to thread through a
StatisticManager somehow.
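As a rough illustration of the per-context idea, here is a standalone C++ sketch. Nothing in it is existing LLVM API: `TimerGroupManager` only borrows the straw-man name from above, and `Context`, `setTimerGroupManager` and `recordPassTime` are invented for the example. The point it tries to show is that when the client installs the manager explicitly, two contexts (for instance two parallel backends) keep separate timings and the client decides when to print them.

```cpp
#include <cassert>
#include <map>
#include <ostream>
#include <sstream>
#include <string>

// Straw-man name from the thread; the interface is an assumption.
class TimerGroupManager {
public:
  void record(const std::string &Pass, long Ms) { Totals[Pass] += Ms; }
  void print(std::ostream &OS) const {
    for (const auto &Entry : Totals)
      OS << Entry.first << ": " << Entry.second << " ms\n";
  }

private:
  std::map<std::string, long> Totals; // ordered by pass name
};

// Hypothetical stand-in for a context object; this is not the real LLVMContext.
class Context {
public:
  void setTimerGroupManager(TimerGroupManager *M) { Timers = M; }
  void recordPassTime(const std::string &Pass, long Ms) {
    if (Timers) // timing is simply skipped when no manager was installed
      Timers->record(Pass, Ms);
  }

private:
  TimerGroupManager *Timers = nullptr; // not owned; the client controls lifetime
};

// Simulates two backends with separate contexts; returns backend A's report.
std::string demoPerContextTimers() {
  TimerGroupManager BackendA, BackendB;
  Context CtxA, CtxB;
  CtxA.setTimerGroupManager(&BackendA);
  CtxB.setTimerGroupManager(&BackendB);

  CtxA.recordPassTime("inline", 3);
  CtxB.recordPassTime("inline", 10);
  CtxA.recordPassTime("gvn", 4);

  std::ostringstream OutB;
  BackendB.print(OutB);
  assert(OutB.str() == "inline: 10 ms\n"); // B sees only its own timings

  std::ostringstream OutA;
  BackendA.print(OutA);
  return OutA.str();
}
```

Because the manager is an ordinary object owned by the client, no global shutdown call is needed to get the output.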

Ok, I’m happy to make a couple of changes based on the above comments. Here’s my proposal:

  1. Modify the timing information classes to provide a method to print and reset the counters other than at destruction time.

  2. Add to the end of LTOCodeGenerator::compileOptimized and ThinLTOCodeGenerator::run a call to the new method from 1).
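The two steps above might look roughly like the following standalone C++ sketch. The names are hypothetical (`PassTimer`, `printAndReset`, `compileStep`) and do not correspond to the actual LLVM timing classes or to the LTO code generators; the sketch only assumes that the report can be flushed on demand and that the destructor stays quiet when there is nothing new to report.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical timing class with an explicit print-and-reset entry point.
class PassTimer {
public:
  explicit PassTimer(std::ostream &OS) : OS(OS) {}
  ~PassTimer() { printAndReset(); } // still reports anything left over

  void add(long Ms) {
    TotalMs += Ms;
    Dirty = true;
  }

  // Step 1: report on demand instead of waiting for destruction.
  void printAndReset() {
    if (!Dirty)
      return; // nothing recorded since the last report
    OS << "total: " << TotalMs << " ms\n";
    TotalMs = 0;
    Dirty = false;
  }

private:
  std::ostream &OS;
  long TotalMs = 0;
  bool Dirty = false;
};

// Step 2: a hypothetical compile step that flushes the timings at its end.
void compileStep(PassTimer &Timer) {
  Timer.add(8); // pretend optimisation time
  Timer.add(2); // pretend code generation time
  Timer.printAndReset();
}

// Runs two compile steps and returns everything that was printed.
std::string demoPrintAndReset() {
  std::ostringstream Out;
  {
    PassTimer Timer(Out);
    compileStep(Timer);
    assert(Out.str() == "total: 10 ms\n"); // reported without any shutdown
    compileStep(Timer);
  } // destructor runs here and prints nothing, since the counters were reset
  return Out.str();
}
```

Resetting after printing keeps a later destruction, or a later compile in the same process, from reporting the same time twice.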

I do think we’ve got the bandwidth to implement the Manager classes at this time, but that does seem like a good idea in the longer term to me.

James

I do think we’ve got the bandwidth to implement the Manager classes at this time, but that does seem like a good idea in the longer term to me.

Whoops, that should have been “don’t think we’ve got the bandwidth”.