Darwin lld linking issue

I’m having a curious issue with variable symbols on OSX.

I have an external variable symbol (currently defined as i8* @_ZTVN6System7TObjectE). If it’s defined in another object in the same code, it’s all fine.

However, if I use a tapi-tbd with symbols: [ _ZTVN6System7TObjectE ] and link against it with lld (darwin-new), the symbol points to a pointer to that value instead.

It almost feels like it goes through an indirect pointer table to reach the real table, but I don’t know enough about OSX’s dylib logic to know where to even start looking.

Does anyone know where I can read more of how this is supposed to work?

It’s relocated as: __DATA_CONST __got 0x1000F9000 pointer 0 bplrtl260 __ZTVN6System7TObjectE

I can work around it with a load, but that won’t work when I use this from within another global variable.

Does Apple’s ld64 behave differently?

From what you’ve described, things seem to be behaving as expected. If the symbol is defined in a TBD, then it has to be dynamically linked, and that means the symbol’s address gets indirected through the GOT. If the symbol is defined in a statically-linked object file, then no indirection is necessary.
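As a rough illustration of that indirection, here is a small C model. It is only a sketch: x_in_dylib, got_slot_for_x and fake_dyld_bind are made-up names for this example and are not anything lld or dyld exposes. The static pointer stands in for a __got slot, and the bind step stands in for what dyld does at load time.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a variable defined in another dylib. */
static int x_in_dylib = 0;

/* Hypothetical stand-in for a __got slot. The static linker cannot know
   the final address, so it leaves the slot empty and records a bind
   entry for it instead. */
static int *got_slot_for_x = NULL;

/* Models dyld processing the bind entry: it writes the real address of
   the symbol into the slot. */
void fake_dyld_bind(void) {
    got_slot_for_x = &x_in_dylib;
}

/* Models a store to a dynamically linked variable: first load the
   pointer out of the GOT slot, then store through that pointer. */
void store_via_got(int value) {
    assert(got_slot_for_x != NULL); /* binding must have happened first */
    *got_slot_for_x = value;
}

/* Reads the variable directly, as the defining dylib itself would. */
int read_x(void) {
    return x_in_dylib;
}
```

Calling fake_dyld_bind() and then store_via_got(15) changes the variable itself, not the slot. A symbol defined in a statically linked object needs neither step, because its address is already known at link time.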

Not sure about ld64. But is there a way to get different behavior here?

My main issue is that I have other variables that use this in their initializer. What I end up with is an indirect pointer instead of the original pointer. For code I can use a load, since I know whether a symbol comes from a dylib or is a regular symbol. Does Mach-O have another reloc type that could work?

Not entirely sure I understand your use case, but the UNSIGNED reloc type might be what you’re looking for.

Yeah, but it already does that. What I mean is this. Given this command line:

llc.exe -filetype=obj -O0 test.ll
lld.exe -flavor darwin test.o importlib-f601441d1b0f0983055325e4c44612db.txt -arch x86_64 -platform_version macOS 12.3 12.3 -dynamic -U _main -U dyld_stub_binder 

test.ll:

target triple = "x86_64-apple-macosx12.3"

%struct.anon = type { i32* }

@x = external global i32, align 4
@y = internal global %struct.anon { i32* @x }, align 8

define dso_local void @_Z1mv() {
  %1 = load i32*, i32** getelementptr inbounds (%struct.anon, %struct.anon* @y, i32 0, i32 0)
  store i32 15, i32* %1
  store i32 15, i32* @x
  ret void
}

importlib-f601441d1b0f0983055325e4c44612db.txt:

--- !tapi-tbd
tbd-version:     4
targets:           [ x86_64-macos ]
uuids:
  - target: x86_64-macos
    value: '00000000-0000-0000-0000-000000000000'
install-name: 'bah.dylib'
exports:
  - targets:           [ x86_64-macos ]
    symbols:         [ _x ]

Dumping the output with:

llvm-objdump.exe a.out  --macho --disassemble-all -r -R --dynamic-reloc --dynamic-syms -t --section=data  --chained-fixups --bind --private-header

a.out:
__Z1mv:
100000590:      55      pushq   %rbp
100000591:      48 89 e5        movq    %rsp, %rbp
100000594:      48 8b 05 6d 2a 00 00    movq    _y(%rip), %rax
10000059b:      c7 00 0f 00 00 00       movl    $15, (%rax)
1000005a1:      48 8b 05 58 1a 00 00    movq    6744(%rip), %rax ## literal pool symbol address: _x
1000005a8:      c7 00 0f 00 00 00       movl    $15, (%rax)
1000005ae:      5d      popq    %rbp
1000005af:      c3      retq

Bind table:
segment  section            address    type       addend dylib            symbol
__DATA_CONST __got              0x100002000 pointer         0 bah              _x
__DATA   __data             0x100003008 pointer         0 bah              _x
__DATA_CONST __got              0x100002008 pointer         0 flat-namespace   dyld_stub_binder

What I read from this is:

movq    6744(%rip), %rax
movl    $15, (%rax)

This stores the address of the __GOT entry for x (the first bind table entry) into %rax, then writes 15 to it, i.e. into the GOT itself, which should probably fail because the GOT is read-only. Is this how it’s supposed to work (in which case I can do a bitcast/load in code that uses _x directly), is it a bug in lld, or is there a fault in my reasoning? I don’t understand the purpose of the GOT in this case.

It stores the value of the __GOT entry for x into %rax. leaq would store the address; movq gets the value at that address.
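The same distinction in C terms, as a minimal sketch (target, slot, like_leaq and like_movq are illustrative names only, with slot playing the role of the GOT entry):

```c
/* The variable the code ultimately wants to write to. */
int target = 0;

/* Plays the role of the GOT entry: a pointer-sized cell that holds the
   address of target. */
int *slot = &target;

/* Roughly what leaq would give you: the address of the cell itself. */
int **like_leaq(void) {
    return &slot;
}

/* Roughly what movq gives you: the value stored in the cell, which is
   the address of target. */
int *like_movq(void) {
    return slot;
}
```

Storing 15 through like_movq() updates target and leaves slot untouched, which matches the movq/movl pair in the disassembly above: the write lands in x, not in the GOT.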

Thanks! You helped me a lot. Turns out this all occurred because of confusion with symbols on my side, and the “right” value happened to be referenced from that symbol. Sorry for the confusion, it was all my own issue.
