We have an LLVM 14 MCJIT implementation that we have been trying to get working on AArch64. During testing we have been seeing sporadic segfaults when memory pressure on the M1 is high. After linking against a debug build, I found that we hit an assertion while relocating a page21 reference to a jump table:
Assertion failed: (isInt<33>(Addend) && "Invalid page reloc value."), function encodeAddend
I found a few JITLink commits in this area, but nothing yet that I could patch into the LLVM 14 MCJIT we currently use.
That being said, I was hoping to patch RuntimeDyld so that it does not use a JT21 when the address is going to be out of range. However, I am completely unfamiliar with how the instructions are chosen, and I have not been able to find where this decision is made in the codebase.
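For readers unfamiliar with the assertion above, here is a rough, self-contained sketch of the range check it performs. An adrp carries a signed 21-bit page count, so the byte distance between the 4 KiB page of the instruction and the page of the target has to fit in a signed 33-bit value (about ±4 GiB). The helper names (`fitsSigned`, `pageDelta`, `adrpCanReach`, `encodeAdrp`) are made up for illustration and are not LLVM APIs; `fitsSigned` only mimics the idea of `llvm::isInt<N>`.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for llvm::isInt<N>: true if v fits in a signed N-bit integer.
template <unsigned N> constexpr bool fitsSigned(std::int64_t v) {
  static_assert(N > 0 && N < 64, "width out of range");
  return v >= -(std::int64_t{1} << (N - 1)) && v < (std::int64_t{1} << (N - 1));
}

// Byte distance from the 4 KiB page containing pc to the page containing target.
constexpr std::int64_t pageDelta(std::uint64_t target, std::uint64_t pc) {
  return static_cast<std::int64_t>((target & ~std::uint64_t{0xFFF}) -
                                   (pc & ~std::uint64_t{0xFFF}));
}

// An adrp at pc can reach target only if the page delta fits in 33 signed bits.
constexpr bool adrpCanReach(std::uint64_t target, std::uint64_t pc) {
  return fitsSigned<33>(pageDelta(target, pc));
}

// Patch the immediate of an adrp instruction word with a page-aligned delta.
// immlo lives in bits 30:29, immhi in bits 23:5 of the instruction.
inline std::uint32_t encodeAdrp(std::uint32_t insn, std::int64_t delta) {
  assert((insn & 0x9F000000u) == 0x90000000u && "expected an adrp instruction");
  assert(fitsSigned<33>(delta) && "page delta out of adrp range");
  const std::uint64_t d = static_cast<std::uint64_t>(delta);
  const std::uint32_t immLo = static_cast<std::uint32_t>((d << 17) & 0x60000000u);
  const std::uint32_t immHi = static_cast<std::uint32_t>((d >> 9) & 0x00FFFFE0u);
  return (insn & 0x9F00001Fu) | immHi | immLo;
}
```

With code at `0x100000000`, a jump table 1 GiB above it passes `adrpCanReach`, while one 5 GiB above it does not, which is the situation in which the debug build asserts.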
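The way MCJIT works is that the LLVM compiler generates a normal native object file (as if you ran "clang -c"), and then MCJIT "links" that file in memory. If the object file contains an adrp instruction, there's no way for MCJIT to fix that: it can only apply the relocation to the instruction that is already there, not swap in a different instruction sequence.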
So there are two approaches you can take here. One approach is to change the "linking" step: you can change your memory manager so it doesn't allocate the relevant sections so far away from each other. The other approach is to change the "compiling" step: you can use a large code model (-mcmodel=large), which tells the compiler that your final executable is going to be larger than 4 GiB, so it generates alternative (slower) code sequences.
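As a rough illustration of the first approach, the sketch below reserves one contiguous block up front and bump-allocates every section out of it, so no two sections can be further apart than the size of the block. `SectionArena` and `byteDistance` are hypothetical names, and a `std::vector` stands in for real page-aligned, executable memory. In an actual MCJIT setup this logic would presumably sit behind a custom `RTDyldMemoryManager` subclass (its `allocateCodeSection` and `allocateDataSection` overrides), which is not shown here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical arena: one contiguous reservation shared by all sections.
class SectionArena {
public:
  explicit SectionArena(std::size_t capacity) : storage_(capacity), offset_(0) {}

  // Returns an aligned pointer inside the arena, or nullptr if it is full.
  std::uint8_t *allocate(std::size_t size, std::size_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0 &&
           "alignment must be a power of two");
    const std::uintptr_t base = reinterpret_cast<std::uintptr_t>(storage_.data());
    const std::uintptr_t cur = base + offset_;
    // Round the current position up to the requested alignment.
    const std::uintptr_t aligned =
        (cur + alignment - 1) & ~(static_cast<std::uintptr_t>(alignment) - 1);
    const std::size_t start = static_cast<std::size_t>(aligned - base);
    if (start > storage_.size() || size > storage_.size() - start)
      return nullptr; // out of reserved space; caller must handle this
    offset_ = start + size;
    return storage_.data() + start;
  }

  std::size_t capacity() const { return storage_.size(); }

private:
  std::vector<std::uint8_t> storage_; // stand-in for mmap'd JIT memory
  std::size_t offset_;                // next free byte within storage_
};

// Signed byte distance between two section addresses.
inline std::int64_t byteDistance(const std::uint8_t *a, const std::uint8_t *b) {
  return static_cast<std::int64_t>(reinterpret_cast<std::uintptr_t>(a)) -
         static_cast<std::int64_t>(reinterpret_cast<std::uintptr_t>(b));
}
```

As long as the arena's capacity stays well under 4 GiB, any adrp in a code section allocated from it can reach a jump table allocated from the same arena. The trade-off is that the reservation size has to be chosen ahead of time and an exhausted arena has to be handled explicitly.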
Your post got me wondering about the large code model on an M1.
I was looking at the test test/CodeGen/AArch64/jump-table.ll.
It looks like, for "aarch64-none-linux-gnu", the adrp instructions are expected to be replaced. However, if I use "aarch64-apple-darwin", the adrp instructions are there for both the small and large code models.
I tried LLVM 17 out of curiosity, and it also leaves the adrp instructions.
Doesn’t -mcmodel=large still switch between direct PC-relative addressing and GOT-indirect addressing? Just because there’s an ADRP doesn’t mean it’s the same ADRP as with the default code model; look at the relocations involved (or the symbol modifiers in the assembly).
The large code model still expects the text segment, which includes read-only data sections like jump tables, to be at most 2 GiB in size, and therefore it is valid to use PC-relative addressing to access jump tables. If you're exceeding that limit, something in your own code has gone horribly wrong.
I do have another ignorant question in this domain that is bothering me.
If I build jump-table.ll with the default code model and aarch64-none-linux-gnu, I get adrp instructions.
Building with the same triple but the large code model, I get all mov instructions. What is the difference between the OSes that allows Darwin to just relocate adrp, yet linux-gnu needs moves?
Darwin is always PIE whereas non-Darwin permits position dependent executables. Though it does seem that -target aarch64-linux-gnu -mcmodel=large -fPIE still produces position-dependent code… GCC instead gives an error that the combination is unsupported, which at least stops it silently doing the wrong thing (though it’ll still be an error at link time so it’s not all bad).