Trouble Resolving Objective-C Symbols in lli

Hi there, I'm trying to run trivial Objective-C code that uses the
Foundation framework under Mac OS X in lli. The code compiles and runs
when built with llc, but fails to work in lli.

Nice! This is a great project. Unfortunately, there are some issues here :)

I'm CC'ing Marcel, as he has some experience with dynamic generation of code with the objc runtime library.

SimpleFoundation.m:
----

#import <Foundation/Foundation.h>

int main (int argc, const char * argv[]) {
   NSAutoreleasePool * pool = [[NSAutoreleasePool alloc] init];

   // insert code here...
   NSLog(@"Hello, World!");
   [pool release];
   return 0;
}

----

$ llvm-gcc -c -emit-llvm SimpleFoundation.m -o SimpleFoundation.bc
$ llc SimpleFoundation.bc -o SimpleFoundation.s
$ gcc /System/Library/Frameworks/Foundation.framework/Foundation SimpleFoundation.s -o SimpleFoundation.exe
$ ./SimpleFoundation.exe

2007-07-19 17:42:32.667 SimpleFoundation.exe[2535] Hello, World!

yay :)

$ lli -load=/System/Library/Frameworks/Foundation.framework/Foundation SimpleFoundation.bc

Segmentation fault

$ lli -force-interpreter -load=/System/Library/Frameworks/Foundation.framework/Foundation SimpleFoundation.bc

Could not resolve external global address: .objc_class_name_NSAutoreleasePool
Abort trap

$ nm /System/Library/Frameworks/Foundation.framework/Foundation | grep .objc_class_name_NSAutoreleasePool

00000000 A .objc_class_name_NSAutoreleasePool

Ok, as you figured out, you do need to tell lli explicitly what frameworks to load. Once you have that, you are hitting another problem. Specifically, the JIT::getPointerToNamedFunction method in lib/ExecutionEngine/JIT/Intercept.cpp just does a dlsym on missing symbols. If dlsym returns null, you get the error message.

The problem here is that .objc_class_name_* are special symbols that are used by the objc linker support and they have magic meaning. That in itself isn't a problem; the problem is that they are absolute symbols (which is why nm prints 'A' for the symbol) and their absolute value is 0. When dlsym correctly returns the address of this symbol, it returns a null pointer (which is the address of the symbol), and lli aborts because it treats a null return from dlsym as a failure.

After consulting with our dynamic linker guru, I don't think there is a wonderful clean way to do this. I suggest adding a memcmp to the error path in getPointerToNamedFunction that checks to see if the symbol starts with ".objc_class_name_". If so, getPointerToNamedFunction should return null without aborting. There may be one or two other prefixes you will have to add, for selectors or categories.
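
As a rough illustration of the prefix check suggested above, here is a minimal sketch in plain C. The helper name is_objc_absolute_symbol is hypothetical and is not part of LLVM. The real check would sit in the error path of getPointerToNamedFunction, and further prefixes (for selectors or categories) may be needed.

```c
#include <string.h>

/* Hypothetical helper, not an LLVM API: returns 1 if `name` begins with
 * the ".objc_class_name_" prefix, and 0 otherwise. */
static int is_objc_absolute_symbol(const char *name)
{
    static const char prefix[] = ".objc_class_name_";

    /* sizeof(prefix) - 1 excludes the terminating NUL byte. */
    return strncmp(name, prefix, sizeof(prefix) - 1) == 0;
}
```

With a check like this, the error path could return a null address for matching names instead of aborting, while still reporting every other unresolved symbol.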

Once this is done, your example above should work. However, when you move on to more interesting examples, you'll probably hit other issues. For example, the objc runtime needs to be informed of any new classes that are added to an address space. I think the runtime has API calls that are used to do this, but the LLVM JIT currently doesn't know how to do any of them.

If you're interested in investigating this and adding support to LLVM, it would be greatly appreciated.

-Chris

Hi Chris,

> Once you have that, you are hitting another problem. Specifically,
> the JIT::getPointerToNamedFunction method in
> lib/ExecutionEngine/JIT/Intercept.cpp just does a dlsym on missing
> symbols. If dlsym returns null, you get the error message.
>
> The problem here is that .objc_class_name_* are special symbols that
> are used by the objc linker support and they have magic meaning. This
> itself isn't a problem, the problem is that they are absolute symbols
> (which is why nm prints 'A' for the symbol) and their absolute value
> is 0. When dlsym correctly returns the address of this symbol, it
> returns a null pointer (which is the address of the symbol) and lli
> aborts because it thinks dlsym returned failure.
>
> After consulting with our dynamic linker guru, I don't think there is
> a wonderful clean way to do this.

I could be missing something, but shouldn't the use of dlsym() be:

    char *err;
    void *p;

    if ((err = dlerror())) {
        error("earlier undetected dlerror: %s\n", err);
    }
    p = dlsym(handle, sym);
    if ((err = dlerror())) {
        error("dlsym failed: %s\n", err);
    }

    return p;

The authors of dlsym() realised the return value was overloaded AFAICS.
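
For reference, the fragment above can be fleshed out into a self-contained function. This is only a sketch: the name lookup_symbol and its return convention (0 for success, -1 for failure) are assumptions made for illustration, not anything taken from LLVM or libtool. It clears any stale error, calls dlsym(), and then trusts dlerror() rather than the returned pointer to decide whether the lookup failed.

```c
#include <dlfcn.h>
#include <stdio.h>

/* Hypothetical helper: looks up `name` in `handle`.
 * Returns 0 and stores the address (which may legitimately be NULL)
 * in *out on success; returns -1 if dlerror() reports a failure. */
int lookup_symbol(void *handle, const char *name, void **out)
{
    const char *err;
    void *p;

    (void)dlerror();            /* discard any earlier, unreported error */
    p = dlsym(handle, name);
    err = dlerror();            /* non-NULL only if the lookup failed */
    if (err != NULL) {
        fprintf(stderr, "dlsym failed: %s\n", err);
        return -1;
    }

    *out = p;
    return 0;
}
```

A caller would pass a handle obtained from dlopen() and check the return code, not the stored pointer, to decide whether the symbol exists.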

Cheers,

Ralph.

Hi Ralph,

> Hi Chris,
>
> I could be missing something, but shouldn't the use of dlsym() be
>
>     char *err;
>     void *p;
>
>     if ((err = dlerror())) {
>         error("earlier undetected dlerror: %s\n", err);
>     }
>     p = dlsym(handle, sym);
>     if ((err = dlerror())) {
>         error("dlsym failed: %s\n", err);
>     }
>
>     return p;
>
> The authors of dlsym() realised the return value was overloaded AFAICS.

No, you're not missing anything. The correct way to check for errors is
with dlerror.

Please note that on the "trunk" revision of llvm (soon to be 2.1), and I
think also 2.0, there are 0 calls to dlsym in the Intercept.cpp. They
have been replaced with a call to

  sys::DynamicLibrary::SearchForAddressOfSymbol(NameStr)

Which is part of LLVM's lib/System package. That package implements this
using the libtool "ltdl" library, which presumably gets this right in an
operating system correct way.

One of the issues with this is that on some systems you can't get the
symbols that were linked into the native executable, but only from an
actually loaded shared object.

Reid.

Hi Reid,

> if ((err = dlerror())) {
> error("earlier undetected dlerror: %s\n", err);
> }
> p = dlsym(handle, sym);
> if ((err = dlerror())) {
> error("dlsym failed: %s\n", err);
> }

> No, you're not missing anything. The correct way to check for errors
> is with dlerror.
>
> Please note that on the "trunk" revision of llvm (soon to be 2.1), and
> I think also 2.0, there are 0 calls to dlsym in the Intercept.cpp.
> They have been replaced with a call to
>
>   sys::DynamicLibrary::SearchForAddressOfSymbol(NameStr)
>
> Which is part of LLVM's lib/System package. That package implements
> this using the libtool "ltdl" library, which presumably gets this
> right in an operating system correct way.

Presumably?
http://www.gnu.org/software/libtool/manual.html#index-lt_005fdlsym-167
says:

    Function: lt_ptr lt_dlsym(lt_dlhandle handle, const char *name)

        Return the address in the module handle, where the symbol given
        by the null-terminated string name is loaded. If the symbol
        cannot be found, NULL is returned.

And lt_dlerror() also appears to have the same behaviour as its non-lt_
counterpart, so you'd think you'd have to do the same as above; use
lt_dlerror().

However, it appears things like libtool's

    static lt_ptr
    sys_dl_sym (loader_data, module, symbol)
         lt_user_data loader_data;
         lt_module module;
         const char *symbol;
    {
      lt_ptr address = dlsym (module, symbol);

      if (!address)
        {
          LT_DLMUTEX_SETERROR (DLERROR (SYMBOL_NOT_FOUND));
        }

      return address;
    }

break the ability to detect an undefined symbol versus a symbol with a
value of 0 because it sets the lt_dlerror() whenever dlsym() returns 0.
Perhaps a bug in libtool, I was looking at 1.5.6, but it's a twisty
maze and I could have taken a wrong turn.

It's clear dlsym() gives the required functionality; the layers of
wrapping on top of it present a broken interface.
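
To make the distinction concrete, here is a toy model in C that does not touch the real dynamic linker. Everything in it (toy_table, toy_dlsym, toy_last_error and the two wrapper functions) is hypothetical and exists only to illustrate the argument: a wrapper that treats any null return as "not found" misreports a symbol whose value is 0, while a wrapper that consults the error state does not.

```c
#include <stddef.h>
#include <string.h>

/* Toy stand-in for a loaded module's symbol table (hypothetical). */
struct toy_symbol {
    const char *name;
    void *addr;
};

static int toy_object;  /* gives the ordinary symbol a non-null address */

static const struct toy_symbol toy_table[] = {
    { ".objc_class_name_NSAutoreleasePool", NULL },  /* absolute, value 0 */
    { "some_function", &toy_object },
};

/* Plays the role of the state reported by dlerror(). */
static const char *toy_last_error;

/* Toy dlsym(): returns the address and records an error only on a miss. */
static void *toy_dlsym(const char *name)
{
    size_t i;

    for (i = 0; i < sizeof toy_table / sizeof toy_table[0]; ++i) {
        if (strcmp(toy_table[i].name, name) == 0) {
            toy_last_error = NULL;
            return toy_table[i].addr;
        }
    }
    toy_last_error = "symbol not found";
    return NULL;
}

/* Style of sys_dl_sym: any NULL result is treated as a failure. */
static int null_means_error(const char *name)
{
    return toy_dlsym(name) == NULL;
}

/* Alternative: only the recorded error state decides failure. */
static int error_state_means_error(const char *name)
{
    (void)toy_dlsym(name);
    return toy_last_error != NULL;
}
```

In this model, null_means_error() flags the zero-valued class symbol as missing, whereas error_state_means_error() flags only names that are absent from the table.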

Cheers,

Ralph.

Yep, patches welcome :)

-Chris

Hi Ralph,

> Hi Reid,
>
> > > if ((err = dlerror())) {
> > > error("earlier undetected dlerror: %s\n", err);
> > > }
> > > p = dlsym(handle, sym);
> > > if ((err = dlerror())) {
> > > error("dlsym failed: %s\n", err);
> > > }
> >
> > No, you're not missing anything. The correct way to check for errors
> > is with dlerror.
> >
> > Please note that on the "trunk" revision of llvm (soon to be 2.1), and
> > I think also 2.0, there are 0 calls to dlsym in the Intercept.cpp.
> > They have been replaced with a call to
> >
> > sys::DynamicLibrary::SearchForAddressOfSymbol(NameStr)
> >
> > Which is part of LLVM's lib/System package. That package implements
> > this using the libtool "ltdl" library, which presumably gets this
> > right in an operating system correct way.
>
> Presumably?
> http://www.gnu.org/software/libtool/manual.html#index-lt_005fdlsym-167
> says:
>
>     Function: lt_ptr lt_dlsym(lt_dlhandle handle, const char *name)
>
>         Return the address in the module handle, where the symbol given
>         by the null-terminated string name is loaded. If the symbol
>         cannot be found, NULL is returned.
>
> And lt_dlerror() also appears to have the same behaviour as its non-lt_
> counterpart, so you'd think you'd have to do the same as above; use
> lt_dlerror().
>
> However, it appears things like libtool's
>
>     static lt_ptr
>     sys_dl_sym (loader_data, module, symbol)
>          lt_user_data loader_data;
>          lt_module module;
>          const char *symbol;
>     {
>       lt_ptr address = dlsym (module, symbol);
>
>       if (!address)
>         {
>           LT_DLMUTEX_SETERROR (DLERROR (SYMBOL_NOT_FOUND));
>         }
>
>       return address;
>     }
>
> break the ability to detect an undefined symbol versus a symbol with a
> value of 0 because it sets the lt_dlerror() whenever dlsym() returns 0.
> Perhaps a bug in libtool, I was looking at 1.5.6, but it's a twisty
> maze and I could have taken a wrong turn.

It is a twisty mess. Function pointers in structures? Who'd ever do
THAT? :)

We're currently using 1.5.22, but it doesn't look any different from what
you have above. I'm about to upgrade the support module to use 1.5.24
(latest stable).

However, I don't think the code above is necessarily wrong. If you get
an address back, there's no error (dlerror only reports an error when
dlsym returns 0). The DLERROR macro just calls dlerror. The
LT_DLMUTEX_SETERROR looks like this:

#define LT_DLMUTEX_SETERROR(errormsg) LT_STMT_START { \
        if (lt_dlmutex_seterror_func) \
                (*lt_dlmutex_seterror_func) (errormsg); \
        else lt_dllast_error = (errormsg); } LT_STMT_END

So, it's basically calling a function to report the error or saving the
error. Presumably the user was supposed to set up the
lt_dlmutex_seterror_func or use lt_dlerror to access the error
message.

I think the entire problem is that the lib/System library is not using
the lt_dlerror function properly. For example:

  // Now search the libraries.
  for (std::vector<lt_dlhandle>::iterator I = OpenedHandles.begin(),
       E = OpenedHandles.end(); I != E; ++I) {
    lt_ptr ptr = lt_dlsym(*I, symbolName);
    if (ptr)
      return ptr;
  }

This needs to call lt_dlerror to clear any previous error, then call
lt_dlsym, then call lt_dlerror again to see if there's an error. The lack of
this checking makes it miss the "return 0 without error" case.
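
A search loop along those lines might look like the following sketch. It is written in C against plain dlsym()/dlerror() rather than the lt_ wrappers, as an assumption made so the example stands alone; the function name search_handles and the found out-parameter are hypothetical and are not part of lib/System.

```c
#include <dlfcn.h>
#include <stddef.h>

/* Hypothetical helper: searches each handle in turn for `name`.
 * On success sets *found to 1 and returns the address, which may be NULL
 * for a symbol whose value is 0. Otherwise sets *found to 0 and returns NULL. */
void *search_handles(void *const *handles, size_t count,
                     const char *name, int *found)
{
    size_t i;

    *found = 0;
    for (i = 0; i < count; ++i) {
        void *ptr;

        (void)dlerror();                 /* clear any previous error */
        ptr = dlsym(handles[i], name);
        if (dlerror() == NULL) {         /* no error: the symbol exists */
            *found = 1;
            return ptr;
        }
    }
    return NULL;
}
```

The separate found flag is what lets the caller accept a null address as a valid result instead of falling through to the error path.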

I'll fix this.

Reid.

Chris / Ralph / Others,

> I could be missing something, but shouldn't the use of dlsym() be
> The authors of dlsym() realised the return value was overloaded AFAICS.

> Yep, patches welcome :)
>
> -Chris

I don't recall who originally asked for help with this, but here's a
patch that could fix it. Please try this and let me know if it works. If
so, I'll commit it. The patch isn't complete (it should do something in
the error condition) but it should allow you to find symbols whose
address is 0.

Reid.

Index: DynamicLibrary.cpp

DL.patch (817 Bytes)

Hey Reid,

> I don't recall who originally asked for help with this, but here's a
> patch that could fix it. Please try this and let me know if it works. If
> so, I'll commit it. The patch isn't complete (it should do something in
> the error condition) but it should allow you to find symbols whose
> address is 0.

It was me who was having the issues with Objective-C in the interpreter.

Thank you. I did try the patch, and it doesn't seem to work, at least for my one
program. In JIT mode it still segfaults, and in interpreter mode it cannot resolve
the symbol.

I am at my wits' end with this problem. The dlopen documentation states that if it
explicitly fails, it will return a NULL pointer. However, it doesn't say anything
about succeeding with a NULL pointer. If the simple Objective-C program is still
failing even when checking whether the lt_dlerror() code is set, I see two
possibilities:

a) The error code is being set even when the symbol is found and has a 0 address.
b) The error is caused by some other problem, maybe further upstream, as indicated
by the fact that the JIT interpreter crashes with a segfault, because normal
unresolved symbols will not crash the interpreter even in JIT mode.

I'm using the 2.0 release, and I'm about to get the bleeding-edge SVN version to try.
However, my Mac is quite slow and will take an hour or two to compile the source code.

My knowledge of the Obj-C runtime is quite limited and I am new to the LLVM
architecture, so I'm sorry if I can't be of more assistance code-wise. But I'm
currently reading up on both in my spare time.

Andy.

Hi Andy,

> Hey Reid,
>
> > I don't recall who originally asked for help with this, but here's a
> > patch that could fix it. Please try this and let me know if it works.
> > If so, I'll commit it. The patch isn't complete (it should do
> > something in the error condition) but it should allow you to find
> > symbols whose address is 0.
>
> It was me who was having the issues with Objective-C in the
> interpreter.

Okay, sorry, I didn't remember.

> Thank you, I did try the patch, and it doesn't seem to work, or at
> least for my one program. In JIT mode it still segfaults and in
> interpreter mode it cannot resolve the symbol.

Bummer.

> I am at my wits' end with this problem. The dlopen documentation states
> that if it explicitly fails, it will return a NULL pointer. However, it
> doesn't say anything about succeeding with a NULL pointer. If the simple
> Objective-C program is still failing even when checking whether the
> lt_dlerror() code is set, I see two possibilities:
>
> a) The error code is being set even when the symbol is found and has a
> 0 address

unlikely

> b) The error is caused by some other problem, maybe further upstream,
> indicated by the fact that the JIT interpreter crashes with a segfault.
> Because normal unresolved symbols will not crash the interpreter even
> in JIT mode.

This is probably it. The JIT probably doesn't know what to do with a
symbol whose address is 0, so, just for grins, it tries dereferencing
it.

What exactly are these symbols? How would one differentiate them if they
all have 0 address? What should the JIT do with them?

> I'm using the 2.0 release, and I'm about to get the bleeding edge SVN
> version to try, however my mac is quite slow and will take an hour or
> two to compile the source code.

I don't think much has changed in this area since 2.0, but you could
try.

> My knowledge of the Obj-C runtime is quite limited and I am new to the
> LLVM architecture. So I'm sorry if I can't be of more assistance code
> wise. But I'm currently reading up on both in my spare time.

I don't know the Obj-C runtime at all so that makes you an expert
compared to me :)

Reading up will probably help a lot.

Best Regards,

Reid.

The JIT should just return null for them. They can't be differentiated. Their address is just zero.

-Chris

Hi Reid,

> static lt_ptr
> sys_dl_sym (loader_data, module, symbol)
> lt_user_data loader_data;
> lt_module module;
> const char *symbol;
> {
> lt_ptr address = dlsym (module, symbol);
>
> if (!address)
> {
> LT_DLMUTEX_SETERROR (DLERROR (SYMBOL_NOT_FOUND));
> }
>
> return address;
> }
>
> break the ability to detect an undefined symbol versus a symbol with
> a value of 0 because it sets the lt_dlerror() whenever dlsym()
> returns 0. Perhaps a bug in libtool, I was looking at 1.5.6, but
> it's a twisty maze and I could have taken a wrong turn.

> It is a twisty mess. Function pointers in structures? Who'd ever do
> THAT? :)

If I know the right way to use dlsym() I can probably handle a function
pointer or two. ;)

> However, I don't think the code above is necessarily wrong. If you get
> an address back, there's no error (dlerror only reports an error when
> dlsym returns 0).

If you mean sys_dl_sym() then I think you've taken the wrong twisty
passage. It seems broken to me. It sets its internal error code to
SYMBOL_NOT_FOUND if dlsym() returns 0. It too should do the double
shuffle with dlerror() to correctly detect a missing symbol but doesn't.
I've mailed the libtool guys to see if they concur.

> I think the entire problem is that the lib/System library is not using
> the lt_dlerror function properly. For example:
>
>   // Now search the libraries.
>   for (std::vector<lt_dlhandle>::iterator I = OpenedHandles.begin(),
>        E = OpenedHandles.end(); I != E; ++I) {
>     lt_ptr ptr = lt_dlsym(*I, symbolName);
>     if (ptr)
>       return ptr;
>   }
>
> This needs to call lt_dlerror to clear any previous error, then call
> lt_dlsym, then call lt_dlerror again to see if there's an error. The lack
> of this checking makes it miss the "return 0 without error" case.

I agree LLVM has bugs too, and that it needs to do its own
double-shuffle due to the overloaded API that originated with dlsym().
But at the moment libtool's dlsym() wrapper seems buggy, and since it
sets its internal error code when it shouldn't, LLVM will see that on
the second call to lt_dlerror().

Cheers,

Ralph.