Android JIT patch

I've attached a patch which has got JIT compilation working (for me at least!) on Android. It turns out that the problem was a bunch of intrinsic __aeabi* functions which reside in libgcc.a rather than libc.so so are not available unless explicitly linked in, so it's rather similar to the StatSymbols hack.

I moved the StatSymbols code into ExecutionEngine.cpp rather than JITMemoryManager.cpp since it's required for both the JIT and MCJIT, and deleted the code in RTDyldMemoryManager.cpp which did the same thing for fewer functions. They should now be picked up through the sys::DynamicLibrary::AddSymbol mechanism in both cases. The symbols required for ARM/Android are then added by an identical hack just below.

There's one other minor change since setLastModificationAndAccessTime can't be supported on Android; all relevant system calls are missing from the C library. I've therefore changed the code to fail at runtime rather than compile-time here.

James

llvm-android.patch (8.89 KB)

I've attached a patch which has got JIT compilation working (for me at
least!) on Android. It turns out that the problem was a bunch of intrinsic
__aeabi* functions which reside in libgcc.a rather than libc.so so are not
available unless explicitly linked in, so it's rather similar to the
StatSymbols hack.

Hi James,

I may be wrong, but I remember LLVM treating "androideabi" as "aeabi",
which is wrong. This may be the source of your problem, and I think we
should teach Clang/LLVM to understand the Android ABI in its full form.

There's one other minor change since setLastModificationAndAccessTime can't

be supported on Android; all relevant system calls are missing from the C
library. I've therefore changed the code to fail at runtime rather than
compile-time here.

This sounds worse... I'd rather we fixed the behaviour than the cause...

cheers,
--renato

I think __aeabi is ARM EABI not Android EABI, e.g. http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ihi0043d/index.html. The problem is not that the __aeabi functions are missing on Android, they're defined in libgcc.a, the problem is that they don't get linked in unless they're referenced. I don't know how this problem is avoided on ARM/Linux/glibc normally, but something pulls them in there according to my testing.

The way to avoid disabling setLastModificationAndAccessTime is to rewrite it to take a path and use utime/utimes instead (incidentally the setLastModificationAndAccessTime function is only used by llvm-ar). I don't know why it was written the way it is now; i.e. I'm not sure what systems have futimens but not futimes and vice-verse. I would guess that the utime function should be universal though. There's a HAVE_UTIME_H test in Config.h but it doesn't seem to be used anywhere.

James

I think __aeabi is ARM EABI not Android EABI, e.g.
http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ihi0043d/index.html
.

Indeed, this is why I was surprised that Android used it with GCC.

The problem is not that the __aeabi functions are missing on Android,
they're defined in libgcc.a, the problem is that they don't get linked in
unless they're referenced. I don't know how this problem is avoided on
ARM/Linux/glibc normally, but something pulls them in there according to my
testing.

Knowing how Linux avoids it would help deciding which is the best solution.
Prior art is always a good start. :wink:

The way to avoid disabling setLastModificationAndAccessTime is to rewrite

it to take a path and use utime/utimes instead (incidentally the
setLastModificationAndAccessTime function is only used by llvm-ar). I don't
know why it was written the way it is now; i.e. I'm not sure what systems
have futimens but not futimes and vice-verse. I would guess that the utime
function should be universal though. There's a HAVE_UTIME_H test in
Config.h but it doesn't seem to be used anywhere.

I'd guess utime should be used. This looks like a bug in llvm-ar, though.

--renato

I've got a number of problems with this patch.

First, we have plans to pry apart the remaining strands connecting JIT with MCJIT, so I don't want to do anything that reconnects them. That is, I'm against moving things from RTDyldMemoryManager into ExecutionEngine.

Second, unless I'm reading it wrong, this relies on static constructors. That causes headaches and is against the LLVM coding standards.

Third, I don't think sys::DynamicLibrary::AddSymbol() is the right way to go here. I know I have recently suggested using that function, but I was wrong. That function exposes symbols to the entire host program as if they were present in a locally loaded shared library. For some programs that may be acceptable behavior, but it cannot be the default behavior for LLVM. In particular, MCJIT may not even be generating code for local execution.

The default RTDyldMemoryManager implementation selectively exposes symbols to execution engines for which it is used, and MCJIT/RuntimeDyld clients performing remote execution can (and should) override this behavior. We probably also need to fix the addGlobalMapping support for MCJIT.

-Andy

I've got a number of problems with this patch.

First, we have plans to pry apart the remaining strands connecting JIT with MCJIT, so I don't want to do anything that reconnects them. That is, I'm against moving things from RTDyldMemoryManager into ExecutionEngine.

The direction of LLVM is not something I'm in a position to comment on. The problem of locating these functions is common to the JIT and MCJIT though (and even the interpreter for Linux/x86).

Second, unless I'm reading it wrong, this relies on static constructors. That causes headaches and is against the LLVM coding standards.

My bad, I just ripped off the existing implementation. I guess the JITMemoryManager part is destined to go away along with the JIT.

Third, I don't think sys::DynamicLibrary::AddSymbol() is the right way to go here. I know I have recently suggested using that function, but I was wrong. That function exposes symbols to the entire host program as if they were present in a locally loaded shared library. For some programs that may be acceptable behavior, but it cannot be the default behavior for LLVM. In particular, MCJIT may not even be generating code for local execution.

Well, that's a bit trickier. Somewhere there needs to be a hook to find these functions (and make the linker include them). For remote execution I suppose that the user would have to load a shared object supporting this functionality into the target address space and then query it for the address of such functions. I'm assuming that the user isn't expected to solve this problem if they're not using remote execution.

The default RTDyldMemoryManager implementation selectively exposes symbols to execution engines for which it is used, and MCJIT/RuntimeDyld clients performing remote execution can (and should) override this behavior. We probably also need to fix the addGlobalMapping support for MCJIT.

The situation right now is actually more complicated than this: since llvm-config --libs MCJIT includes the LLVMJIT library the JITMemoryManager hack is always there (subject to being dropped by ld?) so these symbols are actually exposed globally.

As I said, I basically just borrowed from the existing code and added the functions which won't work without glibc on ARM. sys::DynamicLibrary::SearchForAddressOfSymbol only seems to be used by stuff under lib/ExecutionEngine, so making that find functions which are expected to be in libc during normal linking doesn't seem unreasonable.

James

The various ExecutionEngine pieces may be the only places within LLVM that are using the DynamicLibrary stuff, but there's no telling what various LLVM consumers may be using it for.

The "stat" symbols shouldn't be exposed the way they are in the old JIT memory manager location. It's not clear to me why that didn't happen inside getPointerToNamedFunction there.

Is there a reason not to just do this kind of handling in RTDyldMemoryManager/DefaultJITMemoryManager::getPointerToNamedFunction()?

-Andy

Well, is the attached version better? I've extended the RTDyldMemoryManager hooks instead to pick up the ARM math functions statically, and left JITMemoryManager alone except for changing the conditional so that it will build with non-glibc libraries.

I've also split the original patch up into two parts, to separate the math function fixes from setLastModificationAndAccessTime. The second patch modifies the setLastModificationAndAccessTime function to use utime (so it takes a filename) rather than just refusing to compile if futimes and futimens are not available,
due to Renato's suggestion. I don't know whether this has external users, but the function didn't exist in 3.3.

James

arm-android.patch (3.62 KB)

no-futimes.patch (3.95 KB)

Thanks for splitting up the patch. I'm not the right person to comment on the modification and access time changes (though it looks alright to me). You might want to re-submit that part on its own as it's likely to be ignored by people who aren't interested in JIT stuff otherwise.

Regarding the memory manager changes, functionally this looks good. I'm not a big fan of macros, and especially not nested macros. It took some mental effort to work through it and verify that it was doing what it was supposed to be doing. I can only imagine that it would be even harder to comprehend for someone who didn't know what it was supposed to be doing.

I wouldn't be opposed to a single macro that wrapped the 'if (Name == "foo") return (uint64_t)&foo;' pattern, which is quite ugly.

What do you think of this?

#define EXPOSE_STATIC_FUNCTION (fn) if (Name == #fn) return (uint64_t)&fn;

And then inside getSymbolAddress:

#if ... conditions ...
  EXPOSE_STATIC_FUNCTION(__aeabi_d2f);
  EXPOSE_STATIC_FUNCTION(__aeabi_d2iz);
  EXPOSE_STATIC_FUNCTION(__aeabi_d2lz);
  etc.
#endif

-Andy

From: llvmdev-bounces@cs.uiuc.edu [mailto:llvmdev-bounces@cs.uiuc.edu]
On Behalf Of Kaylor, Andrew
Subject: Re: [LLVMdev] Android JIT patch

What do you think of this?

#define EXPOSE_STATIC_FUNCTION (fn) if (Name == #fn) return (uint64_t)&fn;

I think you'll need to remove the space between *_FUNCTION and the parenthesis.

- Chuck

Well, the reason I did it that way was that the list of functions is fairly long and has to be written out twice; once to declare the functions and once to do the if(Name==#fn) bit, and I thought nested macros was simpler than having two copies of the list.

Oh, I see now. It turns out that even knowing what the end goal was I misunderstood part of what the macros were doing.

If you could add some comments explaining what the macros are doing then I guess I can live with the patch in this form. I definitely agree that it's better not to have two copies of the list.

Thanks,
Andy

Here's an updated version with more comments.

James

arm-android.patch (4.51 KB)

That looks great. Thanks for seeing this through!

-Andy

Committed as r194832.

Thank you!