Now that 2016 is almost over, I wanted to look back and summarize the progress we’ve made on LLD this year, as I guess most people who are not following LLD closely don’t know much about its current status. I think I can say that this was a fantastic year for LLD. I’m now pretty sure that LLD is going to be a serious (and better, in my opinion) alternative to the existing GNU linkers, thanks to all the improvements we’ve made this year.
LLD is now able to link most x86-64 userland programs. Together with the FreeBSD project, we are trying to make LLD the system default linker of that operating system, and except for a few tricky programs such as the kernel or a boot loader, the linker works mostly fine. We are still working on long-tail features and bugs, but I’d say that’s just a matter of time. LLD supports x86, x86-64, x32, AArch64, AMDGPU, ARM, PPC64 and MIPS32/64, though completeness varies.
It looks like a few systems, such as CloudABI and Fuchsia, are already using LLD as their system linker. Chromium and Clang/LLVM itself have build options to build them with LLD.
It is hard to argue about the complexity of a program quantitatively, and of course I’m biased, but I believe we have succeeded in keeping the LLD code base clean, easy to read, and easy to extend with new features. It is just 20k lines of modern C++ code, which is much smaller than the GNU linkers.
Even though LLD was fast from day one, it got faster this year, despite gaining a lot of new features. Below is a chart of Clang link time for every commit made to the LLD repository this year. At the beginning of this year, LLD took about 16 seconds to produce a 1.5 GB clang (debug build) executable. Now, it takes about 14.5 seconds on a single core and 8.5 seconds on 20 cores (*1). ld.gold takes about 25 seconds and 20 seconds, respectively, so we’ve widened the gap. You can see the benchmark results here (*2). If long link times are a problem for you, I’d recommend trying LLD.
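To put those timings in perspective, the ratios can be worked out in a few lines of Python. This is only a throwaway sketch: the helper name `speedup` is made up for illustration, and the inputs are the rounded figures quoted above rather than the raw benchmark data.

```python
def speedup(baseline_s, new_s):
    """Return how many times faster new_s is than baseline_s."""
    return baseline_s / new_s

# Rounded link times in seconds, as quoted in the text above.
lld_start_of_year = 16.0
lld_1core, lld_20core = 14.5, 8.5
gold_1core, gold_20core = 25.0, 20.0

# Fraction of LLD's own single-core link time shaved off during the year.
gain_this_year = (lld_start_of_year - lld_1core) / lld_start_of_year

# LLD relative to ld.gold in each configuration.
vs_gold_1core = speedup(gold_1core, lld_1core)
vs_gold_20core = speedup(gold_20core, lld_20core)

print(f"LLD single-core link time reduced by {gain_this_year:.1%} this year")
print(f"LLD vs ld.gold, single core: {vs_gold_1core:.2f}x")
print(f"LLD vs ld.gold, 20 cores:    {vs_gold_20core:.2f}x")
```

With these rounded inputs, LLD’s single-core link time dropped by a bit over 9% during the year, and LLD comes out roughly 1.7x faster than ld.gold on a single core and roughly 2.4x faster on 20 cores.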
Last but not least, a lot of people joined LLD development this year to make LLD better. We are growing as a community, and I’m very happy about that!
(*1) My machine is a 2.8 GHz Ivy Bridge Xeon with 20 physical cores (40 hyper-threading cores). To measure single-thread performance, I pinned the process to (physical and hyper-threading) core 0. To measure multi-thread performance, I pinned it to CPU socket 2, so that the process gets 10 physical cores (20 hyper-threading cores).
(*2) https://docs.google.com/spreadsheets/d/1VvOqiU5JvqlxU7aof8gsbh-yweeNchMgtkamyXrwzrA/edit?usp=sharing. Changes with more than a 1% rise or drop compared to the average of the previous 5 commits are colored in green or red, respectively.
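The highlighting rule in (*2) can be sketched in Python as follows. This is a minimal illustration, not the spreadsheet’s actual formula: the function name `flag_changes`, its parameters, and the sample timings are all made up for the example, and it returns plain "rise"/"drop" labels instead of colors.

```python
from statistics import mean

def flag_changes(times, window=5, threshold=0.01):
    """Label each link time relative to the mean of the previous `window` entries.

    Returns a list with one label per entry: "rise", "drop", or None.
    The first `window` entries are always None (not enough history).
    """
    labels = []
    for i, t in enumerate(times):
        if i < window:
            labels.append(None)
            continue
        baseline = mean(times[i - window:i])
        change = (t - baseline) / baseline
        if change > threshold:
            labels.append("rise")
        elif change < -threshold:
            labels.append("drop")
        else:
            labels.append(None)
    return labels

# Hypothetical per-commit link times in seconds (not taken from the spreadsheet).
sample = [16.0, 16.0, 16.0, 16.0, 16.0, 15.5, 16.0, 16.3]
labels = flag_changes(sample)
print(labels)
```

Comparing against a short trailing average rather than just the previous commit makes the rule less sensitive to a single noisy measurement, at the cost of needing five earlier data points before anything can be flagged.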