JIT, LTO and @llvm.global_ctors: Looking for advise

Hi Everyone,

We are looking for advise regarding the proper use of LTO in
conjunction with just-in time generated code. Our usage scenario goes
as follows.

  1. Our front-end generates an LLVM module.

  2. A small runtime support library is linked-in. The runtime
     library is distributed as bitcode. It is generated using "clang++
     -emit-llvm' and 'llvm-link'. This allows LTO to kick-in and
     functions from the runtime support library become candidates for
     inlining. This is the desired effect that we are looking for.

  3. The resulting LLVM module is compiled and dynamically loaded. We
     are currently using the MCJIT API, but are planning to move to ORC
     very soon.

Our LLVM module linking code goes roughly as follows:

  Linker linker(jittedModule);
  std::unique_ptr<llvm::Module> moduleToLink(
    getLazyIRFileModule(bcFileName, error, context));
  linker.linkInModule(std::move(module),
                      Linker::LinkOnlyNeeded |
                      Linker::InternalizeLinkedSymbol);

Our issue is with the Linker::LinkOnlyNeeded flag. Using it has a huge
positive impact on link and compilation time :-). But, it causes the
@llvm.global_ctors and @llvm.global_dtors references from the
linked-in modules to be discarded :-(. AFAICT, the Linker code assumes
ThinLTO when the LinkOnlyNeeded flags is specified, and full-LTO
otherwise.

To resolve this, we have locally patched
llvm/lib/Linker/LinkModules.cpp with:

  bool ModuleLinker::run() {

    // ....

    if (shouldImportIntrinsicGlobalVariables()) {
      auto addIntrinsicGlobalVariable = [ValuesToLink,
                                         srcM](llvm::StringRef name) {
        if (GlobalValue *GV = SrcM->getGlobalVariable(name)) {
          ValuesToLink.insert(GV);
        }
      };

      // These are added here, because they must not be internalized.
      addIntrinsicGlobalVariable("llvm.used");
      addIntrinsicGlobalVariable("llvm.compiler.used");
      addIntrinsicGlobalVariable("llvm.global_ctors");
      addIntrinsicGlobalVariable("llvm.global_dtors");
    }

    // ...

  }

Questions:

   1. Is attempting to use llvm::Linker appropriate for our usage
      pattern ? Should we directly use llvm::IRMover instead ?

   2. Or, is our patch to ModuleLinker::run() the way to go ? Should
      we submit back a patch along these lines ?

   3. Other suggestions ?

[Note] We are currently using LLVM 4.0.1-rc2.

Cheers,
Benoit

Benoit Belley
Sr Principal Developer
M&E-Product Development Group

MAIN +1 514 393 1616
DIRECT +1 438 448 6304
FAX +1 514 393 0110

Twitter <http://twitter.com/autodesk&gt;
Facebook <https://www.facebook.com/Autodesk&gt;

Autodesk, Inc.
10 Duke Street
Montreal, Quebec, Canada H3C 2L7
www.autodesk.com <http://www.autodesk.com/&gt;

+Lang (for JIT) & Teresa (for LTO/ThinLTO).

Sounds like maybe the LinkOnlyNeeded got reused for a bit more than the original intent & maybe there should be more than one flag here - not sure.

Thanks for the hindsight.

I am currently working on a patch/potential fix which introduces a new
Linker::ImportIntrinsicGlobalVariables flag. The patch includes a unit
test reproducing the problem. Hopefully, that will help getting more
feedback.

Note that it might take a while before I am allowed to upload the patch
since I need approval from Autodesk Legal department.

Cheers,
Benoit

Benoit Belley
Sr Principal Developer
M&E-Product Development Group

MAIN +1 514 393 1616
DIRECT +1 438 448 6304
FAX +1 514 393 0110

Twitter <http://twitter.com/autodesk&gt;
Facebook <https://www.facebook.com/Autodesk&gt;

Autodesk, Inc.
10 Duke Street
Montreal, Quebec, Canada H3C 2L7
www.autodesk.com <http://www.autodesk.com/&gt;

Hi Benoit,

It seems to me that LinkOnlyNeeded failing to link the llvm.* special variables is a bug. I think we should probably just change the behaviour of LinkOnlyNeeded so that it links them.

Note that ThinLTO does not use the Linker class, it uses IRMover directly. That might not be appropriate for your use case, though, because IRMover does not follow references when linking (for example, if your module defines two functions f and g, where f calls g, it will not link g if you only ask for f).

Peter

Thanks Peter, this is very useful feedback.

I did manage to change the behavior of LinkOnlyNeeded to correctly import
all variables with AppendingLinkage. In fact, I discovered that there was
already something fishy. A variable with AppendingLinkage would get
imported correctly from the source module if the destination module
already contained a definition for that variable and wouldn't be imported
otherwiseŠ My local fix ensures that it correctly gets imported in both
cases.

You are right; ThinLTO no longer uses the Linker class. I was able to
remove the useless include of Linker.h from ThinLTOCodeGenerator.cpp.

That being said, Linker.h and LinkModules.cpp still have a few comments
about ThinLTO, namely:

  /// \brief Link \p Src into the composite.
  ///
  /// Passing OverrideSymbols as true will have symbols from Src
  /// shadow those in the Dest.
  /// For ThinLTO function importing/exporting the \p ModuleSummaryIndex
  /// is passed. If \p GlobalsToImport is provided, only the globals that
  /// are part of the set will be imported from the source module.
  ///
  /// Returns true on error.

  // Don't want to append to global_ctors list, for example, when we
  // are importing for ThinLTO, otherwise the global ctors and dtors
  // get executed multiple times for local variables (the latter causing
  // double frees).

    // For ThinLTO we don't import more than what was required.
    // The client has to guarantee that the linkonce will be availabe at
link
    // time (by promoting it to weak for instance).

Are these obsolete comments ? Should these be cleaned-up ? If so, how ?

Thanks again,
Benoit

Hi Benoit,

Thanks Peter, this is very useful feedback.

I did manage to change the behavior of LinkOnlyNeeded to correctly import
all variables with AppendingLinkage. In fact, I discovered that there was
already something fishy. A variable with AppendingLinkage would get
imported correctly from the source module if the destination module
already contained a definition for that variable and wouldn't be imported
otherwiseŠ My local fix ensures that it correctly gets imported in both
cases.

Thanks, this does sound even more like the right fix to me then.

You are right; ThinLTO no longer uses the Linker class. I was able to

remove the useless include of Linker.h from ThinLTOCodeGenerator.cpp.

That being said, Linker.h and LinkModules.cpp still have a few comments
about ThinLTO, namely:

  /// \brief Link \p Src into the composite.
  ///
  /// Passing OverrideSymbols as true will have symbols from Src
  /// shadow those in the Dest.
  /// For ThinLTO function importing/exporting the \p ModuleSummaryIndex
  /// is passed. If \p GlobalsToImport is provided, only the globals that
  /// are part of the set will be imported from the source module.
  ///
  /// Returns true on error.

  // Don't want to append to global_ctors list, for example, when we
  // are importing for ThinLTO, otherwise the global ctors and dtors
  // get executed multiple times for local variables (the latter causing
  // double frees).

    // For ThinLTO we don't import more than what was required.
    // The client has to guarantee that the linkonce will be availabe at
link
    // time (by promoting it to weak for instance).

Are these obsolete comments ? Should these be cleaned-up ? If so, how ?

Are you sure that you are looking at the latest version of the code? I
believe that I removed those comments (and the associated code) in r294015.

Peter

Are these obsolete comments ? Should these be cleaned-up ? If so, how ?

Are you sure that you are looking at the latest version of the code? I believe that I removed those comments (and the associated code) in r294015.

Peter

Excellent! I am currently working on top of the LLVM 4.0.1-rc2 release. I’ll try back porting some of these changes back to the release_40 branch where I need them. I will also test my changes (and regression tests) with the latest versions of llvm.

Benoit