Contributing LLD for Mach-O

Hi all,

We’re planning to contribute a new implementation of LLD for Mach-O, using the same design as the COFF and ELF ports. This design has proven to work very well for those ports, and we’re keen to explore it for Mach-O as well. Our work is based on an initial prototype created by Peter Collingbourne and Rui Ueyama.

Our initial commit is up for review at https://reviews.llvm.org/D75382. We’ve intentionally stripped down this initial commit as much as possible to ease reviewing; we’ve kept it to the absolute minimum needed to produce and test a working macOS x86-64 executable that prints “Hello World” via a syscall. We have several short-term follow-ups planned to add important functionality, such as linking against archives, universal binaries, dylibs, and tbd files, performing subsection splitting (atomization), and producing dylibs. The follow-ups should give a good sense of the overall design while still keeping each piece easily reviewable and testable individually. Our end goal is to create a full-featured Mach-O linker, and we’ll be working toward that goal over the next several months (and years, in all likelihood). We’d appreciate feedback and reviews.

Nice!

Your plan sounds great, and it’ll be awesome to finally have a good Mach-O LLD available.

The existing Mach-O port was already unmaintained when the ld64.lld alias
was added (see https://reviews.llvm.org/D38290#882910).

If Jez and the team are committed to maintaining the new Mach-O port and we
think the existing port is a dead end, we may assign the flavor `darwin` to
it (`lld -flavor darwin`) and rename the existing flavor to `darwin-old` or
`darwin-legacy`.
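
To make the flavor proposal concrete, here is a minimal sketch of how a driver might choose a backend from either an explicit `-flavor NAME` argument or the name it was invoked under. This is an illustration only, not LLD's actual driver code: the `Flavor` enum, `parseFlavorName`, and `selectFlavor` are hypothetical names, and `darwin-old` / `darwin-legacy` are just the renames suggested above, not existing flavors.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical flavor list. DarwinOld stands for the renamed legacy port
// proposed above; it is an assumption, not an existing LLD flavor.
enum class Flavor { Invalid, Gnu, WinLink, Darwin, DarwinOld, Wasm };

// Map the argument of `-flavor` to a flavor. Unknown names give Invalid.
Flavor parseFlavorName(const std::string &name) {
  if (name == "gnu") return Flavor::Gnu;
  if (name == "link") return Flavor::WinLink;
  if (name == "darwin") return Flavor::Darwin;
  if (name == "darwin-old" || name == "darwin-legacy") return Flavor::DarwinOld;
  if (name == "wasm") return Flavor::Wasm;
  return Flavor::Invalid;
}

// Choose a flavor for a whole command line (args[0] is the program name).
// An explicit `-flavor NAME` in first position wins; otherwise the basename
// of the program is checked against a few well-known aliases.
Flavor selectFlavor(const std::vector<std::string> &args) {
  if (args.empty()) return Flavor::Invalid;
  if (args.size() >= 3 && args[1] == "-flavor")
    return parseFlavorName(args[2]);

  // Strip any directory prefix from the program name.
  std::string prog = args[0];
  std::size_t slash = prog.find_last_of('/');
  if (slash != std::string::npos) prog = prog.substr(slash + 1);

  if (prog == "ld.lld") return Flavor::Gnu;
  if (prog == "lld-link") return Flavor::WinLink;
  if (prog == "ld64.lld") return Flavor::Darwin;
  if (prog == "wasm-ld") return Flavor::Wasm;
  return Flavor::Invalid;
}
```

With this sketch, `selectFlavor({"lld", "-flavor", "darwin"})` would select the new port and `selectFlavor({"lld", "-flavor", "darwin-old"})` the legacy one, so existing users of the old port would only need to change the flavor name.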

Awesome :slight_smile: Out of curiosity, are there any particular features, characteristics, or benefits you’re hoping for compared to the macOS system linker?

We are committed to maintaining this. I believe there are some people who use the old port right now though, so I'd want us to be a bit more feature complete before we take over the Darwin flavor (e.g. we should at least be able to self-host).

We’re hoping to get some linking speedups, similar to how LLD for ELF and COFF ended up being faster than the existing system linkers, though of course it’s hard to say how ld64’s design and efficiency compare to those of the other system linkers. In particular, we’re excited about the potential speedups from parallelization and from using the LLVM data structures instead of the STL ones.

Thanks for the commitment. I know close to zero about Mach-O, but I'll try following your development.

You may already have known this, but the following two talks are helpful.

https://llvm.org/devmtg/2017-10/#talk16
FOSDEM 2019 - What makes LLD so fast?

I'd seen the first talk but not the second; I'll check that out. Thanks for the pointer.

Hi Shoaib,

Has there been any recent discussion with anyone on Apple’s side? Is there a way forward that results in a single unified open source linker? If not at the start, how would it work if they later want to take maintainership?

Have you thought about a compatibility test suite? If so, I’m curious what the approach will be.

Also, for what it’s worth, there’s a recent fork, based on ld64’s recent source drop, that is optimized for incremental compilation. It also uses non-STL data structures, parallelism, and a disk cache. See https://github.com/michaeleisel/zld

% cat llvm-project/lld/CODE_OWNERS.TXT
...
N: Lang Hames, Nick Kledzik
E: lhames@gmail.com, kledzik@apple.com
D: Mach-O backend
...

Lang resigned as code owner (see https://reviews.llvm.org/D75382, "[lld] Initial commit for new Mach-O backend") and agreed to deleting the existing Mach-O port,
but I think we should also get Nick's confirmation.

We’ve discussed this with Jim Grosbach at Apple (CC’d). My understanding is that at this time, the officially supported linker for Apple’s platforms is ld64. We’d love to collaborate with Apple (and any other interested parties, for that matter) on LLD for Mach-O, and we’d be delighted if it were to become officially supported at some point, but there’s a lot of work to be done first on reaching feature parity with ld64 before that could even be considered :slight_smile: Once we reach feature parity, I can envision several good reasons both for sticking with ld64 and for switching to LLD. On our end, we aim to create a feature-complete Mach-O linker. We also aim for the end product to be compelling enough that a switch could be considered, and we’d be happy to work with Apple on that front if it turns out to be.

What sort of compatibility test suite did you have in mind? We’re adding lit-style unit tests as we add features (as is standard for LLVM), but we didn’t have anything in mind beyond that right now.

We saw zld, and the speedups achieved by it make us hopeful of being able to achieve similar results with LLD (since we’ll also have better parallelization and better data structures in the form of the LLVM ones). (I’d also be curious if zld’s improvements could be contributed back to ld64 so that everyone can take advantage of them, but that’s completely tangential, of course.)

I tried working on the Mach-O layer of LLD a long time ago (I got DTrace user probes working, but since that was just before the LLD redesign and development reboot, I never tried to push the changes), and one thing I found difficult was the complete lack of technical documentation on ld64's passes and flags.

I think that restarting a Mach-O parser from scratch (with ld64 feature parity as a goal) is a very good opportunity to also write a technical reference for the different passes, especially the ones that are Mach-O and Apple specific.
Do you plan something like that (as a page on lld.llvm.org, for example)? It would also be very helpful for potential contributors.