COFF::IMAGE_REL_AMD64_REL32 relocation overflow when compiling for x86_64

Some time ago I posted here regarding a relocation overflow on Windows (among other things), but the issue disappeared and so the thread was abandoned. I’ve started this new thread because a) I didn’t want to necro the old one and b) it felt like its own topic.
I’ve now encountered the issue again and am noting down all the information I can get about it whilst it’s happening.

The issue is that I am hitting a relocation overflow assertion in RuntimeDyldCOFFX86_64.h, in the COFF::IMAGE_REL_AMD64_REL32 case.
However, the other thread left me with the impression that I shouldn’t be getting such relocations when I’m compiling for 64-bit. The only explanation I can think of is that I’m not supposed to get 32-bit relocations in the code I’m building, as opposed to in all the code being loaded.

The LLVM side of the call stack looks like this:

_wassert(const wchar_t * expr, const wchar_t * filename, unsigned int lineno) Line 369 C

llvm::RuntimeDyldCOFFX86_64::resolveRelocation(const llvm::RelocationEntry & RE, unsigned __int64 Value) Line 81 C++

llvm::RuntimeDyldImpl::resolveRelocationList(const llvm::SmallVector<llvm::RelocationEntry,64> & Relocs, unsigned __int64 Value) Line 796 C++

llvm::RuntimeDyldImpl::resolveExternalSymbols() Line 849 C++

llvm::RuntimeDyldImpl::resolveRelocations() Line 95 C++

llvm::RuntimeDyld::resolveRelocations() Line 961 C++

llvm::orc::ObjectLinkingLayer<llvm::orc::DoNothingOnNotifyLoaded>::ConcreteLinkedObjectSet<std::shared_ptr<llvm::SectionMemoryManager>,ClangClasses::LLVMExecutionEngine::LinkingResolver * __ptr64>::Finalize() Line 112 C++

llvm::orc::ObjectLinkingLayer<llvm::orc::DoNothingOnNotifyLoaded>::findSymbolIn::__l19::<lambda>() Line 246 C++

std::_Callable_obj<unsigned __int64 (void),0>::_ApplyX() Line 284 C++

std::_Func_impl<std::_Callable_obj<unsigned __int64 (void),0>,std::allocator<std::_Func_class >,unsigned __int64>::_Do_call() Line 229 C++

std::_Func_class::operator()() Line 316 C++

llvm::orc::JITSymbol::getAddress() Line 62 C++

RelType is 4 (IMAGE_REL_AMD64_REL32).
Value is 139830239098107.
Addend is 0.
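
For reference, a REL32 fixup stores the signed 32-bit distance from the end of the 4-byte displacement field to the target, so the assertion fires whenever that distance does not fit in an int32_t. The following is a minimal sketch of that check, not the actual RuntimeDyld source. The function names are hypothetical, and the address of the fixup is not given above, so any fixup address used with it is an assumption.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Distance a REL32 fixup has to encode: the target address minus the address
// of the byte that follows the 4-byte displacement field.
inline int64_t rel32Delta(uint64_t target, uint64_t fixupAddress) {
  return static_cast<int64_t>(target - (fixupAddress + 4));
}

// True if the distance can be stored in a signed 32-bit field (about +/-2GB).
inline bool fitsInRel32(int64_t delta) {
  return delta >= std::numeric_limits<int32_t>::min() &&
         delta <= std::numeric_limits<int32_t>::max();
}

// Encodes the distance; asserts on overflow, much like the failing check.
inline int32_t encodeRel32(int64_t delta) {
  assert(fitsInRel32(delta) && "relocation overflow");
  return static_cast<int32_t>(delta);
}
```

With Value = 139830239098107 as the target, the distance overflows for any fixup that is more than about 2GB away from that address, which is the situation reported here.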

The symbol that is currently being resolved is _fperrraise. I did some research and it appears that this symbol resides in libcmtd.lib (for me the path is C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\lib\amd64\libcmtd.lib).
The relocation type stated in that library (information gathered from dumpbin) is REL32.

I’m not sure what other information I can gather. Could somebody please help me resolve this?

Many thanks in advance!

Microsoft compilers have for quite a while now assumed that the code you compile is going to be linked into PE images, which are limited to 4GB. So they assume a small memory model and use 32-bit relocations. If at link time it turns out the symbol comes from a DLL, the linker will insert a jump stub / dllimport into the image for you, which can handle larger distances.

So you can’t straightforwardly load code from a static CRT library into the Dyld-hosted process, since the latter assumes a large memory model.

Your choices are:

  1. Dynamically link whatever you’re compiling against the CRT (compile with /MD or /MDd as appropriate)

  2. I think there has been some work on supporting small memory models in Dyld; you could try that out

  3. Implement a jump stub that is “nearby” the code you’ve compiled that can branch to the target (that is, emulate what the linker does)

Thanks Andy, helpful as always!

1 is a possibility, but not ideal for us.

Could you elaborate a little on 3? I don’t really know what a jump stub is, but am guessing it’s a kind of “alternative symbol” which would just act as a middle man to invoke the “real” symbol in the static library.
If that’s the case, I can think of a way to implement it for specific symbols, but not for the more general case.

Yes, that’s what case (3) is…

Currently you have something like:

@foo()

call RIP + 32-bit disp to _fperrraise

That only works if _fperrraise is sufficiently close to the call.

You can leave some space in the .text section that contains @foo. When you load that section, if _fperrraise is too far away, you can create a bit of code in that space which jumps to _fperrraise using its full 64-bit address (whose value you know at that point, so it will be a literal), and call that bit of code from @foo instead. Since the stub is in the same section, it will definitely be reachable.

It should work pretty generally. The jmp from the stub will be transparent, though there might be some trickiness if you need a scratch register. You can compute worst-case how many stubs you might need (note you just need one per target, not one per call site) and leave yourself enough space.
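
To make the idea concrete, here is a minimal sketch of one possible stub encoding on x86-64, assuming a little-endian host. It is not what dyld or the Microsoft linker actually emits, and the function names are hypothetical. It uses `jmp qword ptr [rip+0]` followed by the 8-byte absolute target, which avoids the scratch-register question entirely: the alternative `mov rax, imm64; jmp rax` overwrites rax, a register the callee is free to clobber anyway, which is what "scratch register" refers to.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <cstring>

constexpr std::size_t kStubSize = 14;

// Builds the stub bytes:
//   FF 25 00 00 00 00    jmp qword ptr [rip+0]
//   <8 bytes>            absolute address of the far-away target
// The jmp reads its destination from the 8 bytes that follow the instruction.
inline std::array<uint8_t, kStubSize> makeAbsoluteJumpStub(uint64_t target) {
  std::array<uint8_t, kStubSize> stub = {0xFF, 0x25, 0, 0, 0, 0};
  std::memcpy(stub.data() + 6, &target, sizeof(target)); // little-endian host
  return stub;
}

// Computes the REL32 value that redirects a call's displacement field
// (located at fixupAddress) to a stub placed at stubAddress.
inline int32_t rel32To(uint64_t stubAddress, uint64_t fixupAddress) {
  const int64_t delta = static_cast<int64_t>(stubAddress - (fixupAddress + 4));
  assert(delta >= INT32_MIN && delta <= INT32_MAX && "stub must be nearby");
  return static_cast<int32_t>(delta);
}
```

The loader would copy the 14 bytes into the reserved space, then patch the call in @foo with `rel32To(stubAddress, fixupAddress)` instead of the overflowing distance to the real target. One stub per distinct target is enough, so the reserved space is 14 bytes times the worst-case number of external targets.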

There may already be some support for this in dyld. I haven’t needed it so I haven’t looked that closely.

I just tried going along path (1), but I end up with the same issue for a different symbol again, so it seems that (3) is my only option.
Will take a look around dyld to see if this problem has already been solved anywhere, but if not, I’ll see if I can write a more general solution that handles overflowing 32 bit relocations automatically and upstream it.
I’ll also need to learn some LLVM IR and how to inject this bit of code. Should be a lot of … fun ?

I have no idea what a scratch register is.

Thanks very much for your help again!

(Resync with LLVM dev)

I’m trying to follow up on (3), but am having some issues…
The approach I have taken is writing a module pass, iterating through the module and descending into the instructions.

The problem I’m having is that I can’t seem to find any notion of a relocation at all within the LLVM IR, which, after some thought, makes sense, as nothing has been linked yet.

My question is, at what point in the process should I be making such changes? I can’t seem to find the right place.

Thanks in advance!