I’ve got a customer project that emits an 18 MB .exe and a 9 MB .exe.pdb. It takes about 6 minutes to link on their system and 2 minutes on mine (this is a regular lld -flavor link /opt:lldlto=0). The input is about 61 MB of bitcode files (originally about 2100 object files) with debug info (both PDB and DWARF get emitted) stored in 5 .lib files, compiled with -O0 but with ThinLTO (without ThinLTO it wasn’t faster).
Is this something to be expected?
Can I do something to improve the speed?
This use case is probably dominated by code generation time, so all the expensive work is in LLVM, not LLD. When you link bitcode, the linker does all the codegen work (isel & regalloc), and that work is not naturally parallelized by traditional build systems the way per-file compilation is. Speeding up LLVM codegen is a hard, long-term problem.
In theory, LLD uses ThinLTO by default, so code generation should be parallelized across however many cores you have on your system.
You can also investigate using the LTO cache to make links more incremental, so that when you change a single object file, the code generated for the unchanged object files in previous builds is reused from the cache instead of being regenerated.
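As a rough sketch of what those two knobs look like on the command line, here is a small Python helper that a build script could use to append the relevant lld-link options. The option names (/opt:lldltojobs=, /lldltocache:, /lldltocachepolicy:) are lld-link's own; the helper name, the cache directory, the policy value and the input file names are made up for illustration, and it is worth checking lld-link /? on your version before relying on them. Note that the cache only helps for ThinLTO bitcode.

```python
import os


def thinlto_link_flags(cache_dir, jobs=None, cache_policy=None):
    """Build extra lld-link options for a parallel, cached ThinLTO link.

    Hypothetical helper: only the option spellings come from lld-link.
    """
    if jobs is None:
        # Default to one codegen job per logical core.
        jobs = os.cpu_count() or 1
    flags = [
        f"/opt:lldltojobs={jobs}",    # number of parallel ThinLTO backend jobs
        f"/lldltocache:{cache_dir}",  # directory for cached native objects
    ]
    if cache_policy:
        # Pruning policy, e.g. "cache_size_bytes=2g" (example value only).
        flags.append(f"/lldltocachepolicy:{cache_policy}")
    return flags


if __name__ == "__main__":
    # Input and output names below are placeholders, not from the project.
    cmd = ["lld-link", "/debug", "/out:app.exe", "main.obj", "libs.lib"]
    cmd += thinlto_link_flags("lto.cache", cache_policy="cache_size_bytes=2g")
    print(" ".join(cmd))
```

The first link with a cache directory is no faster, since it has to populate the cache; the benefit only shows up on later links where most bitcode inputs are unchanged.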
Generally speaking, though, if you want fast links, I’d recommend against linking with bitcode. The entire LTO pipeline is set up for optimization, not incremental link performance.