Does full LTO have a runtime performance advantage over ThinLTO?

Hi all,

I find that linking vmlinux.o takes a very long time (more than 15 minutes) with ‘CONFIG_LTO_CLANG_FULL=y’ (‘-flto’),
while it takes less than 2 minutes with ‘CONFIG_LTO_CLANG_THIN=y’ (‘-flto=thin’).
I’m testing with the Android ACK kernel, which uses clang-r445002. (repo init -u kernel/manifest - Git at Google --depth=1 -b common-android-mainline)

So my question is:
Why doesn’t the kernel use ‘-flto=thin’ to speed up the vmlinux link?
Is it because full LTO has a performance advantage over ThinLTO?

The difference between ThinLTO and full LTO is quite application-specific; sometimes the deciding factor is not performance but binary size, for example.

You may find more information in the ThinLTO paper or in the slides (the recording from the DevMtg is on YouTube as well).

@teresajohnson may have more recent data or feedback on this.

Ultimately that’s a question for the folks working on the kernel :slight_smile:

Thanks for the reply. :slight_smile:
Another question: I read somewhere that ThinLTO was still under development and being tuned.
What’s the status now? Is ThinLTO still an experimental option, or has it already been released?

ThinLTO has not been experimental for a few years at least, and is used quite widely. For example, we at Google use it extensively (and in general don’t use full LTO at all as it doesn’t scale). I know it was released with Xcode back around 2016. There are a number of other companies that use it as well.

Full LTO has a code size advantage over ThinLTO because it can internalize almost everything in a statically linked binary and, as a result, perform more aggressive IR-based code-size optimizations. So some small, very size-sensitive binaries use full LTO for optimal code size.

That being said, I don’t know the current status of the kernel trying ThinLTO.
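
To illustrate the internalization point above, here is a toy sketch in Python. It is not how LLVM implements anything; the symbol names and call edges are hypothetical. It only shows why knowing the whole program helps code size: if every symbol might be referenced from outside, everything must be kept, whereas once symbols are internalized only what is reachable from the entry point needs to survive.

```python
# Toy model only: symbol names and call edges below are made up.
def reachable(call_graph, roots):
    """Return every symbol reachable from the given root symbols."""
    seen = set()
    stack = list(roots)
    while stack:
        sym = stack.pop()
        if sym in seen:
            continue
        seen.add(sym)
        stack.extend(call_graph.get(sym, ()))
    return seen


call_graph = {
    "main": ["parse_args", "run"],
    "run": ["log"],
    "parse_args": [],
    "log": [],
    "debug_dump": ["log"],
    "legacy_api": ["debug_dump"],
}

all_symbols = set(call_graph)

# If any symbol could be called from outside, every symbol is a root.
kept_external = reachable(call_graph, all_symbols)

# After internalization in a static binary, only the entry point is a root.
kept_internalized = reachable(call_graph, ["main"])

removable = sorted(all_symbols - kept_internalized)
print(removable)  # ['debug_dump', 'legacy_api']
```

In a real toolchain the gain also comes from follow-on optimizations on the now-internal functions (inlining, constant propagation and so on), which this sketch does not model.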

Thanks for the reply. :slight_smile:

I don’t see this in my case.
With the Android ACK kernel, ThinLTO produces a smaller image than full LTO:

  • vmlinux.o with full LTO = 1224051096 bytes ≈ 1.22 GB
  • vmlinux.o with ThinLTO = 1161002072 bytes ≈ 1.16 GB

Do you have any recent test data on code size between FullLTO and ThinLTO?
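
For reference, the gap between the two vmlinux.o figures above can be worked out with a few lines of Python. This is just arithmetic on the numbers I posted; keep in mind that vmlinux.o contains more than code (symbol tables, for instance), so it is only a rough indicator of code size.

```python
# Sizes of vmlinux.o as reported above, in bytes.
FULL_LTO_BYTES = 1_224_051_096
THIN_LTO_BYTES = 1_161_002_072


def size_gap(baseline, other):
    """Return (absolute difference in bytes, difference as % of baseline)."""
    diff = baseline - other
    return diff, 100.0 * diff / baseline


diff_bytes, diff_pct = size_gap(FULL_LTO_BYTES, THIN_LTO_BYTES)
print(f"ThinLTO vmlinux.o is {diff_bytes} bytes "
      f"({diff_bytes / 2**20:.1f} MiB, {diff_pct:.1f}%) smaller")
```

So the ThinLTO object is about 60 MiB, or roughly 5%, smaller than the full LTO one in this build.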

Targets that care about code size will build with -Os or even -Oz (in conjunction with -flto=full). If you use -O2 or -O3, you may see different behavior.
(In any case, this is also very codebase-specific.)