Pending breakpoints to dlsym()ed functions

I'm trying to set up pending breakpoints for sin() and cos(), which are dlsym()ed from libm.so
(sample attached), but an attempt to continue execution just seems to hang the debugger. For example:

(lldb) attach 17043
Process 17043 stopped
* thread #1, name = 't-dlopen', stop reason = signal SIGSTOP
     frame #0: 0x0000000000400728 t-dlopen`main(argc=1, argv=0x00007ffd2b0a00c8) at t-dlopen.c:21
    18 for (a = 0; a < DELAY + argc; a++)
    19 for (b = 0; b < DELAY + argc; b++)
    20 for (c = 0; c < DELAY + argc; c++)
-> 21 z += a + b + c;
    22 while (1)
    23 {
    24 void *handle = dlopen (LIBM_SO, RTLD_LAZY);

Executable module set to "/home/dantipov/tmp/t-dlopen".
Architecture set to: x86_64--linux.
(lldb) breakpoint set -n sin
Breakpoint 1: no locations (pending).
WARNING: Unable to resolve breakpoint to any actual locations.
(lldb) breakpoint set -n cos
Breakpoint 2: no locations (pending).
WARNING: Unable to resolve breakpoint to any actual locations.
(lldb) process continue ;; After this, nothing happens for a long time
Process 17043 resuming
(lldb) process status ;; After this, lldb hangs and has to be killed

I've tried 6.0.0-rc2 as well as 7.0.0 svn trunk 325127, with the same disappointing results.

Dmitry

t-dlopen.c (876 Bytes)
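
The attachment itself is not reproduced in the thread. As a rough illustration of the lookup pattern described above (not the actual t-dlopen.c), a minimal sketch might look like the following. The helper name call_dlsym_function and its signature are made up for this example, and the caller is assumed to pass a library name such as "libm.so.6", which is what glibc systems normally use.

```c
#include <dlfcn.h>
#include <stddef.h>

/* Hypothetical helper, not taken from the attached sample: load a shared
   library, look up a double(double) function by name, call it once and
   store the result. Returns 0 on success, -1 if the library or the symbol
   cannot be found. */
int call_dlsym_function(const char *library, const char *symbol,
                        double arg, double *result)
{
    void *handle = dlopen(library, RTLD_LAZY);
    if (handle == NULL)
        return -1;

    /* dlsym() returns void *; convert it to the expected function type. */
    double (*fn)(double) = (double (*)(double))dlsym(handle, symbol);
    if (fn == NULL) {
        dlclose(handle);
        return -1;
    }

    *result = fn(arg);
    dlclose(handle);
    return 0;
}
```

Because the library is only mapped when dlopen() runs, a breakpoint on "sin" or "cos" set before that point has no location yet, which is why lldb reports it as pending. On older glibc versions the program may need to be linked with -ldl.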

+ eugene as the "most recent person who worked on the DYLD plugin" :smiley:

Hi Dmitry,

I've tried your sample, and I was indeed able to reproduce the
problem. What makes your case special is that "sin" and "cos" are
indirect functions (STT_GNU_IFUNC), so we have to do some extra work
(call the resolver function) to resolve them. Doing that while we're
in the process of loading a module seems to be going south. There seem
to be two things going wrong here which contribute to the overall effect
of "hanging":
1: We resolve the address of the resolver function as 0xfff...,
presumably because the module is not fully initialized yet.
2: Calling that address results in the inferior SEGV-ing, but for some
reason the InferiorCall function does not detect that. (This probably also
has something to do with the "in the middle of module load" context.)

A trivial fix would be to avoid calling an obviously wrong address,
but that's not going to solve your immediate problem (just prevent the
hang). May I suggest you file a bug with this information and we'll
see what we can do about that.

As a workaround, you can try setting the breakpoint on the symbol that
the IFUNC will eventually resolve to (in my case that would be
__sin_avx). Not an ideal solution, but I can't think of anything
better now.
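
For readers unfamiliar with indirect functions, here is a minimal sketch of how an STT_GNU_IFUNC symbol is declared with the GNU toolchain on an ELF target. It is not how libm defines sin() or cos(); the names twice, twice_impl and resolve_twice are invented for the illustration, and a real resolver would normally inspect CPU features before choosing among several implementations.

```c
/* The implementation the resolver selects. glibc's libm has several
   per-CPU variants (the thread mentions __sin_avx); this sketch has
   only one. */
static double twice_impl(double v)
{
    return 2.0 * v;
}

/* Resolver: run by the dynamic linker when it binds the symbol. It
   returns the address of the implementation that callers should use. */
static double (*resolve_twice(void))(double)
{
    return twice_impl;
}

/* 'twice' is emitted as an indirect function (STT_GNU_IFUNC) whose
   address is whatever resolve_twice() returns. */
double twice(double v) __attribute__((ifunc("resolve_twice")));
```

The symbol named twice has no body of its own, so a debugger that wants to break "on twice" has to run the resolver (or otherwise learn its result) to find the real code address. That is the extra step that goes wrong in the hang described above, and it is why breaking on the final implementation directly sidesteps the problem.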

I've changed my sample to dlsym() a regular function instead of an indirect
stub, and got a breakpoint hit, but:

(lldb) attach 16196
Process 16196 stopped
* thread #1, name = 'main', stop reason = signal SIGSTOP
     frame #0: 0x0000000000400798 main`main(argc=1, argv=0x00007ffd6f662668) at main.c:16
    13 for (a = 0; a < DELAY + argc; a++)
    14 for (b = 0; b < DELAY + argc; b++)
    15 for (c = 0; c < DELAY + argc; c++)
-> 16 z += a + b + c;
    17 while (1)
    18 {
    19 void *handle = dlopen ("libfoo.so", RTLD_LAZY);

Executable module set to "/home/dantipov/tmp/t-dl2/main".
Architecture set to: x86_64--linux.
(lldb) breakpoint set -n foo
Breakpoint 1: no locations (pending).
WARNING: Unable to resolve breakpoint to any actual locations.
(lldb) process continue
Process 16196 resuming
1 location added to breakpoint 1
(lldb) error: ld-linux-x86-64.so.2 0x0005d207: adding range [0x14eea-0x14f5a) which has a base that is less than the function's low PC 0x15730. Please file a bug and attach the file at the start of this error message
error: ld-linux-x86-64.so.2 0x0005d207: adding range [0x14f70-0x14f76) which has a base that is less than the function's low PC 0x15730. Please file a bug and attach the file at the start of this error message
error: ld-linux-x86-64.so.2 0x0005d268: adding range [0x14eea-0x14f5a) which has a base that is less than the function's low PC 0x15730. Please file a bug and attach the file at the start of this error message
error: ld-linux-x86-64.so.2 0x0005d268: adding range [0x14f70-0x14f76) which has a base that is less than the function's low PC 0x15730. Please file a bug and attach the file at the start of this error message
Process 16196 stopped
* thread #1, name = 'main', stop reason = breakpoint 1.1
     frame #0: 0x00007f3b1a8536f7 libfoo.so`foo(v=0.00000000000003907985046680551) at libfoo.c:6
    3 double
    4 foo (double v)
    5 {
-> 6 return sin (v) + cos (v);
    7 }

This seems to be another bug, doesn't it?

Dmitry

libfoo.c (73 Bytes)

main.c (761 Bytes)

Makefile (171 Bytes)

Yes, it looks that way, but I cannot reproduce this on my side (which
is not surprising as it involves parsing debug info from your dynamic
linker). I'd need the relevant portions of that file (or just the
whole file) to see what's going on there.

That said, this shouldn't impact you unless you plan to debug the linker itself.

Yes, this is a separate problem: the compiler or linker is producing bad DWARF. The DWARF describes a function whose top-level address range is something like [0x1000-0x2000) but which has a child lexical block with a range like [0x900-0x910). Within a DW_TAG_subprogram, every address range must be contained in its parent's range.

If you have llvm-dwarfdump, you can run "llvm-dwarfdump --verify" to get a list of the errors in the DWARF, which you can then use to file a bug. Try running it on the .o file before it is linked. If the .o file already has DWARF problems, file a bug against the compiler. If the problem only exists in the final executable, file a bug against your linker.
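
To make the containment rule concrete, here is a small sketch of the check. It is not lldb's or llvm-dwarfdump's actual code; struct addr_range and range_contains are hypothetical names, and ranges are treated as half-open [low, high), matching the notation in the error messages.

```c
#include <stdbool.h>
#include <stdint.h>

/* Half-open address range [low, high). */
struct addr_range {
    uint64_t low;
    uint64_t high;
};

/* Returns true when 'child' lies entirely inside 'parent', which is what
   a nested lexical block's range must satisfy relative to its enclosing
   function. */
bool range_contains(struct addr_range parent, struct addr_range child)
{
    return child.low >= parent.low && child.high <= parent.high;
}
```

With a parent of [0x1000-0x2000), a child of [0x900-0x910) fails the check because it starts below the parent's low address. The reported range [0x14eea-0x14f5a) fails in the same way against a function whose low PC is 0x15730.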

Greg