Breakpoint + callback performance ... Can it be faster?

Hello Benjamin, all,

>>I recently started using lldb to write a basic instrumentation tool for
>>tracking the values of variables at various code-points in a program.
I have the same problem: tracing some variables and debugging an application post-mortem. Without knowing about your experience, I started down the same path and ran into the same problem. In my case, inserting an empty callback slows the application down by 100x. This is not acceptable for me, because instead of minutes I get hours of runtime.

Every time you use the "expr" command, we compile a tiny C++ code
snippet, inject it into the target process, and execute it. If you type
"log enable lldb expr" you should be able to follow exactly how that
works. You can use pretty much any C++ construct in the expression,
including declaring variables/types:
(lldb) expr -- char s[]="qwerty"; for(int i=0; i < sizeof s; ++i) printf("%d: %c\n", i, s[i]);
0: q
1: w
2: e
3: r
4: t
5: y
6:
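
The trailing "6:" line in that output is the string's terminating NUL byte: sizeof s counts it, so the loop runs seven times rather than six. A minimal standalone sketch of the same snippet, assuming it is compiled as ordinary C++ outside the debugger (the function name print_chars is made up for illustration):

```cpp
#include <cstddef>
#include <cstdio>

// Same loop as the expression above, wrapped in a function (hypothetical
// name) so it can be compiled and run outside lldb.
std::size_t print_chars() {
    char s[] = "qwerty";  // 6 letters + terminating '\0' = 7 bytes
    std::size_t iterations = 0;
    for (std::size_t i = 0; i < sizeof s; ++i) {
        std::printf("%zu: %c\n", i, s[i]);  // the last pass prints the NUL byte
        ++iterations;
    }
    return iterations;
}
```

Using strlen(s) instead of sizeof s as the loop bound would stop before the terminator.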

So, if your question is "do we support compiling code and running it
in the debugged process", then the answer is yes. If you want
something that would automatically intercept some function to execute
your code while the process is running (some kind of dynamic
instrumentation), then the answer is no. (But I don't see any mention
of that on the gdb page you quoted either).

cheers,
pavel

Thanks Pavel, you are correct. This was the direction I thought to investigate, but I haven't done my homework yet.

Yes, dynamic instrumentation is what I want. Looks like both lldb and gdb do not allow this directly.

As the GDB doc says: “After execution, the compiled code is removed from gdb and any new types or variables you have defined will be deleted.”

Do you know why that is the case? Why can't my generated code persist?

It looks technically possible. For example, you could allocate memory for the generated code on the heap.

GDB does not even allow me to define a new function, but dynamic allocation of data works perfectly fine:

Example:

#include <stdio.h>

char message[1000] = "test message\n";

int main() {
    char *msg = message;
    for (int i = 0; i < 5; i++)
        printf("%s\n", msg);
    return 0;
}

gdb session:

Making the code persist is easy - that's sort of what expr --top-level does.

The tricky part is getting your code to execute. When you evaluate
expressions interactively, we manually modify the registers (PC being
the most important one) to point to the code you want to execute. If
you want it to happen automatically at runtime, you would have to
insert a jump instruction somewhere. The problem is, that can't
usually be done without overwriting a couple of instructions of
original code, which means you then have to somehow simulate the
effects of the overwritten instructions. And there's always a danger
that you will overwrite a jump target and things will blow up when
someone tries to jump there. The way this is normally done is that you
have the compiler insert hooks into your code during compilation, that
you can then intercept if necessary. I am not sure if that would fit
your use case.

pl

Yes, it would fit. I can even insert hooks manually, like:

std::function<void()> instrumentation_hook = []{ /* dear lldb script, please insert your code here */ };

instrumentation_hook(); // run instrumentation, by default does nothing.

Is there an easy way to compile some code in lldb and assign it to instrumentation_hook?

Try this for size:

$ bin/lldb /tmp/a.out
(lldb) target create "/tmp/a.out"
Current executable set to '/tmp/a.out' (x86_64).
(lldb) b main
Breakpoint 1: where = a.out`main + 4 at a.cc:6, address = 0x0000000000400531
(lldb) pr la
Process 22550 launched: '/tmp/a.out' (x86_64)
Process 22550 stopped
* thread #1, name = 'a.out', stop reason = breakpoint 1.1
    frame #0: 0x0000000000400531 a.out`main at a.cc:6
   3 void (* volatile hook)();
   4
   5 int main() {
-> 6 printf("before\n");
   7 if(hook) hook();
   8 printf("after\n");
   9 }
(lldb) expr --top-level -- void HOOK() { (int)printf("in hook\n"); }
(lldb) expr hook = &HOOK
(void (*volatile)()) $0 = 0x00007ffff7ff5030
(lldb) c
Process 22550 resuming
before
in hook
after
Process 22550 exited with status = 0 (0x00000000)
(lldb)
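
For reference, the hook pattern that this session relies on can be sketched as a standalone piece of C++. This is only an illustration under assumptions: the names sample_hook, hook_calls and instrumented_step are hypothetical, and sample_hook stands in for the HOOK function that the session compiles with expr --top-level.

```cpp
#include <cstdio>

// Global hook, empty by default. 'volatile' forces a real load at each
// call site, so a value written by the debugger is actually seen.
void (*volatile hook)() = nullptr;

// Hypothetical counter, only here to make the effect observable.
int hook_calls = 0;

// Stand-in for the HOOK function defined inside the debugger.
void sample_hook() {
    ++hook_calls;
    std::printf("in hook\n");
}

// Instrumentation point: one load and one branch when no hook is set.
void instrumented_step() {
    std::printf("before\n");
    auto h = hook;  // read the volatile pointer once
    if (h) h();
    std::printf("after\n");
}
```

With no hook installed, instrumented_step() only prints "before" and "after"; after hook = &sample_hook, the hook runs in between, which mirrors what the expr hook = &HOOK assignment does from the debugger side.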

That's nice! But how do I enable C++?

(lldb) expr --top-level --
Enter expressions, then terminate with an empty line to evaluate:
1 #include <iostream>
2 void HOOK() { std::cout << "in hook\n"; }
3

error: 'iostream' file not found

Clang just doesn’t currently know where to look for the standard headers. Not sure whether this bug is specific to top-level code or not. I know the expression parser can include common C headers on Darwin. Not sure if any includes work on Linux. Try importing <stdio.h> and see how that goes?

The better way to do what you want is to write a shared library, save it to “/tmp/liba.so” and then load it at runtime:

(lldb) process load /tmp/liba.so

This will load a shared library with all of the top level code you want into your current program, then you can call any functions in that shared library as well using expressions that you need.
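
A minimal sketch of what such a library's source could look like, assuming a file named liba.cc and a build line along the lines of "c++ -shared -fPIC liba.cc -o /tmp/liba.so" (both are assumptions, as is the hook_calls counter). Because the library is built by the normal compiler, standard C++ headers such as <iostream> are available:

```cpp
// liba.cc (hypothetical file name, chosen to match /tmp/liba.so above)
#include <iostream>

extern "C" {

// Hypothetical counter, so the debugger can check how often the hook ran.
int hook_calls = 0;

// Unmangled name, so the symbol is easy to refer to from an expression.
void HOOK() {
    ++hook_calls;
    std::cout << "in hook\n";
}

}  // extern "C"
```

After process load, the hook pointer would presumably be assigned the same way as in the earlier session (expr hook = &HOOK); whether the expression parser resolves the symbol without a cast depends on the debug info available, so treat that step as untested.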

If you feel you still need to use the expression parser, then a few tips on the expression parser when not in top level code mode:

  • C++ lambda functions that don’t capture anything can be cast to function pointer types and used as static callbacks.
  • Any variable you define that is prefixed with a ‘$’ will become a global variable and persist beyond your expression (assign a function callback from a C++ lambda, and you have a callback you can now use).
  • Any type you declare that is prefixed with a ‘$’ will persist beyond that expression and be available for future expressions.
  • Any type returned as the result type of an expression will have its type available
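
The first tip rests on a standard C++ rule: a lambda with an empty capture list converts implicitly to a plain function pointer. A small sketch in ordinary C++, where callback_t, lambda_calls and make_callback are names invented for illustration:

```cpp
// Callback type matching the hook used earlier in the thread.
using callback_t = void (*)();

// Hypothetical counter, only here to make the effect observable.
int lambda_calls = 0;

// A lambda with an empty capture list converts implicitly to a function
// pointer. It may still use globals, because those are not captures.
callback_t make_callback() {
    return [] { ++lambda_calls; };
}
```

A lambda that captures anything (for example [&] or [=] with a local variable) has no such conversion, which is why the tip is limited to non-capturing lambdas.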

Greg

Thank you very much Greg and Pavel for help!

I will probably need another month or so to work on my variable tracing stuff. I don’t see any additional roadblocks yet.