intra vs inter module LTO (A. Ilchinger via llvm-dev)

Message: 2
Date: Mon, 31 Dec 2018 18:55:13 +0100
From: “A. Ilchinger via llvm-dev” <llvm-dev@lists.llvm.org>
To: llvm-dev@lists.llvm.org
Subject: [llvm-dev] intra vs inter module LTO
Message-ID: <0MdFwl-1gvHrS3utf-00ISyE@mail.gmx.com>
Content-Type: text/plain; charset=US-ASCII

I’d like to know if LTO also works in the second step, when multiple .a
libraries and potentially other .o files are linked together to the
final binary. So can LTO also work successfully between modules (being
inter-module)?

LLVM mostly works in terms of an intermediate representation whose on-disk form is called bitcode. If your .o file contains machine code (e.g. ELF + x86) then the information used by link time optimisation has almost all been discarded. The same applies if you group some of these objects into a static library.
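
One quick way to tell which kind of .o file you have is to look at its first four bytes. The sketch below is only illustrative: the function name object_kind is made up, and it assumes raw bitcode (magic bytes 'B' 'C' 0xC0 0xDE) rather than a wrapped or embedded form, which some platforms may use.

```shell
# Classify a file by its leading magic bytes.
# Raw LLVM bitcode starts with 'B' 'C' 0xC0 0xDE; ELF starts with 0x7F 'E' 'L' 'F'.
# object_kind is a hypothetical helper name, not an LLVM tool.
object_kind() {
    # Dump the first four bytes as hex and strip the whitespace od adds.
    magic=$(head -c 4 "$1" | od -An -tx1 | tr -d ' \n')
    case "$magic" in
        4243c0de) echo bitcode ;;
        7f454c46) echo elf ;;
        *)        echo unknown ;;
    esac
}
```

Calling object_kind foo.o on an object compiled with -flto or -emit-llvm should then report bitcode, while a normally compiled object on Linux should report elf.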

You can instead compile to bitcode and leave it as bitcode, e.g. using -emit-llvm. If I understand correctly, -flto essentially does this for you. Bitcode can be joined to other bitcode with llvm-link, run through whatever optimisation pipeline you like with opt, stored in a static archive with llvm-ar, passed back into clang to make an object file, etc.

I don’t know if it’s commonly done. I’ve seen it described internally as ‘poor man’s LTO’, which seemed a bit sad to me as it’s far more flexible. The process I use for compiling projects is roughly:

  • clang -c -emit-llvm file.cpp -o file.bc (once per source file)

  • llvm-link directory/*.bc -o directory.tmp.bc

  • opt directory.tmp.bc -internalize -internalize-public-api-list=foo -globaldce -O3 -o directory.bc

  • recurse

Compile each source file to bitcode, and each directory to a linked & optimised bitcode file. The internalize step effectively renders functions outside the public API list static partway through the process; it is optional. The whole scheme is recursive: eventually the top level is a bitcode file called src.bc, which gets turned into a dynamic library or executable.
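
The per-directory step described above could be scripted roughly as follows. This is a sketch under assumptions, not the actual build used in the message: the run and build_dir helpers and the DRY_RUN variable are made up for illustration. The opt flag spelling follows the message (legacy pass manager); newer LLVM releases expect the -passes= syntax instead.

```shell
# Run a command, or just print it when DRY_RUN=1 (useful for inspecting the plan).
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Compile every .cpp in a directory to bitcode, link the results into one
# module, then internalize everything except the listed public symbols.
# Usage: build_dir <directory> <comma-separated public symbols>
build_dir() {
    dir=$1
    api=$2
    set --    # reuse the positional parameters as the list of .bc files
    for src in "$dir"/*.cpp; do
        bc="${src%.cpp}.bc"
        run clang -c -emit-llvm "$src" -o "$bc"
        set -- "$@" "$bc"
    done
    run llvm-link "$@" -o "$dir.tmp.bc"
    run opt "$dir.tmp.bc" -internalize -internalize-public-api-list="$api" \
        -globaldce -O3 -o "$dir.bc"
}
```

With DRY_RUN=1 set, build_dir src/util foo only prints the planned commands. Recursing upward would mean running llvm-link over the per-directory .bc files in the same way, and finally handing src.bc to clang++ to produce the executable or library.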

And if it is possible, do I have to pay attention to some compiler
flags, linker options, etc., or does it work out of the box with -flto?

But I think all you’re looking for so far is -emit-llvm + use llvm-ar. It’s possible -flto + use llvm-ar will be sufficient, but I haven’t checked exactly what combination of things -flto does (because I like working with the pieces).

Cheers,

Jon

Generally I would recommend using -flto directly, along with a linker such as lld, or gold configured to use the LLVM gold plugin. Both know how to invoke the LTO pipeline, including internalization. As Jon mentions here and Mehdi mentioned in his reply, if you use static .a libraries you will want to use llvm-ar so that bitcode archives are handled automatically.

Another thing to try (especially if the project is large or you want effective incremental builds) is -flto=thin. See https://clang.llvm.org/docs/ThinLTO.html.
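
As a rough illustration of the -flto route with a static library, something like the following should work. It is a sketch, not a tested recipe: the file names (util.cpp, main.cpp, libutil.a, app), the run and lto_build helpers, and the LTO_MODE and DRY_RUN variables are all placeholders, and it assumes lld is installed so that -fuse-ld=lld resolves.

```shell
# Echo the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then echo "$*"; else "$@"; fi
}

# Build libutil.a from a bitcode object and link it into an executable with LTO.
# LTO_MODE selects "full" (regular LTO) or "thin" (ThinLTO).
lto_build() {
    mode=${LTO_MODE:-full}
    # With -flto, the .o files contain bitcode rather than machine code.
    run clang++ -O2 -flto="$mode" -c util.cpp -o util.o
    run clang++ -O2 -flto="$mode" -c main.cpp -o main.o
    # llvm-ar understands bitcode members and writes a symbol table for them.
    run llvm-ar rcs libutil.a util.o
    # The linker runs the LTO pipeline across main.o and the archive members.
    run clang++ -O2 -flto="$mode" -fuse-ld=lld main.o libutil.a -o app
}
```

Setting LTO_MODE=thin switches every step to -flto=thin, and DRY_RUN=1 prints the commands instead of running them.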

Teresa