Mach-O support in lld: what are the known issues?

Hello all,

I’m trying to better understand the state of Mach-O support in lld.

The lld docs state that “the linker supports ELF (Unix), PE/COFF (Windows), Mach-O (macOS) and WebAssembly in descending order of completeness.” [1] True to that statement, I found an email on this list from Jan 2018 stating that “MachO support in lld is not really ready for real world usage. It was able to bootstrap itself a couple of years ago, but, it has not really been maintained or further developed since.” [2] And on LLVM Bugzilla, a comment on one bug states “indeed the macOS version seems to be experimental, and not to support LTO at all for the moment.” [3]

I’m curious if anyone has more information on what else Mach-O support in lld is missing. From the above links, I’m aware of a lack of support for LTO. I also encountered the same Clang driver bugs as mentioned on that Bugzilla report. Besides that, I can see a few memory leaks and incorrect links have been reported on Bugzilla as well. [4]

Is there anything else that lld developers might be aware of? What work needs to be done before ld64.lld is considered complete?

Thanks in advance for any information you can send my way! :slight_smile:

  • Brian Gesiak

[1] https://lld.llvm.org
[2] http://lists.llvm.org/pipermail/llvm-dev/2018-January/120216.html
[3] https://bugs.llvm.org/show_bug.cgi?id=32175
[4] https://bugs.llvm.org/buglist.cgi?bug_status=UNCONFIRMED&bug_status=NEW&bug_status=ASSIGNED&bug_status=REOPENED&bug_status=RESOLVED&bug_status=VERIFIED&bug_status=CLOSED&component=MachO&list_id=139976&product=lld&query_format=advanced&resolution=---&resolution=LATER&resolution=REMIND

Are you planning on working on LLD?

Over here in the Zig frontend, we’ve been relying on LLD Mach-O support but at some point we’ll have to either maintain LLD or write a Mach-O linker in zig. So far we’ve been making it work with this hacky patch: https://github.com/ziglang/zig/commit/1ba6e1641a4c5ea1d0d665fe500c9c66d69443a4

If nobody else cares about the LLD Mach-O code, then we’ll be better off doing the work in zig but if others are interested then it may be more beneficial for the community to work together on the LLD codebase.

Brian,

Besides the features you pointed out, I think Xcode introduced a new way of listing dynamic linking symbols, and I believe lld doesn’t support that. There may be long-tail missing features as well. But I don’t think that’s the real issue. I think the real issue is the lack of maintenance and ownership of the Mach-O lld tree. There has been no activity in that tree for years, though we’ve been making efforts to keep it compiling and passing all the existing tests. As an example, I made a last-minute cherry-pick for an LLVM release to fix a bug blocking the Zig language without really understanding what that patch was doing. That’s far from ideal…

I’d be interested in the existence of a high-quality, open-source, portable linker for apple platforms, but not enough to help make that happen.

If I was gonna work on something related to that, I’d probably be inclined to instead add any required features to allow an ELF linker to target a notional darwin-elf target, and to have clang emit darwin-elf object files, and then write a binary converter to convert the emitted ELF binary to a Mach-O binary, so that it can actually run on a platform that exists rather than the platform I’d prefer to exist. But that’s probably just me being crazy, and I’m not going to work on it. :slight_smile:

The list of features exclusive to Apple platforms and Mach-O in the linker is astonishingly long. I started listing the features missing from lld at some point, but stopped midway.
I’m not even sure it is possible to write a darwin-elf that supports all Mach-O features. If someone really needs a portable linker, porting ld64 to other platforms is probably the easier way.

For my purposes, all I need is a linker that is sufficient to turn LLVM-generated Mach-O .o files into executables.

Thanks for the response, Rui!

That’s a good question. There was a big discussion about the design of the new lld (now the current ELF/COFF/wasm linker) versus the atom-based lld a few years ago, when I started working on the new one. At the time no one, including me, was really sure which design was desirable, and I was exploring the design space to find something good. Today, we have three working, high-performance linkers for ELF, COFF and wasm based on the new design, which I think proves the design: it is easy to add new features, easy to understand, and it delivers what users want the most (i.e. speed). Given that, if I were you, I’d try to see if the new lld’s design fits Mach-O. You may need to tweak the design a little bit, but I’d imagine that the difference is not as significant as that between ELF and wasm (which has a different concept of memory address space, mainly for security). I’d also like to get input from Apple engineers.

I don’t remember exactly, but I think one of the main points was that ELF and Mach-O don’t have the same concept of a section. IIRC, the concept of a section in ELF was closer to the atom model than in Mach-O, which uses few sections and puts a lot of things in each one.

And a quick search gave me that: http://llvm.1065342.n5.nabble.com/LLD-improvement-plan-td80788.html#a80871

Both ELF and Mach-O have roughly the same concept of sections, Mach-O just
represents them in a weird way (by putting the delimiters between sections
in the symbol table instead of using actual sections). So I think that all
a new-style linker would need to do is read input files by splitting object
file sections into subsections using the symbol table and allocating
symbols and relocations to the correct subsections. Then linking would
proceed in pretty much the same way as in the other ports of the linker.
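
To make the splitting step concrete, here is a rough Python sketch of the idea, not lld code. The `Section`, `Symbol` and `Subsection` types and both function names are made up for illustration, and the model is heavily simplified: it assumes each symbol is just a name plus an address, and that every distinct symbol address inside a section starts a new subsection.

```python
import bisect
from dataclasses import dataclass
from typing import List


@dataclass
class Symbol:
    name: str
    address: int  # address of the symbol within the object file (simplified)


@dataclass
class Section:
    name: str
    address: int  # start address of the section
    size: int     # size of the section in bytes


@dataclass
class Subsection:
    parent: str         # name of the section this piece was cut from
    offset: int         # offset of the piece from the start of the parent
    size: int           # size of the piece in bytes
    symbols: List[str]  # names of the symbols that point at the piece's start


def split_section(section: Section, symbols: List[Symbol]) -> List[Subsection]:
    """Cut one section into subsections at every symbol address inside it."""
    end = section.address + section.size
    inside = sorted(
        (s for s in symbols if section.address <= s.address < end),
        key=lambda s: s.address,
    )
    # The section start is always a boundary, even if no symbol points at it.
    starts = sorted({section.address} | {s.address for s in inside})
    pieces = []
    for i, start in enumerate(starts):
        stop = starts[i + 1] if i + 1 < len(starts) else end
        names = [s.name for s in inside if s.address == start]
        pieces.append(
            Subsection(section.name, start - section.address, stop - start, names)
        )
    return pieces


def subsection_for_offset(pieces: List[Subsection], offset: int) -> Subsection:
    """Find the subsection containing a section-relative offset,
    e.g. the location a relocation applies to."""
    if not pieces or offset < 0 or offset >= pieces[-1].offset + pieces[-1].size:
        raise ValueError("offset is outside the section")
    starts = [p.offset for p in pieces]
    return pieces[bisect.bisect_right(starts, offset) - 1]
```

With pieces like these, each relocation can be attached to the subsection that contains its offset, after which the pieces could be treated much like input sections in the other ports. A real implementation would also have to handle symbol types, alignment and sections that must not be split, none of which is modeled here.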

Thanks,