Patches for python bindings

Hi Filipe,
I reviewed the patch for Predicate.h

Looks good, and committed:

Sending include/lldb/Host/Predicate.h
Transmitting file data .
Committed revision 152733.

I will look to the other ones right away.

Enrico Granata
:email: egranata@.com
✆ (408) 972-7683

Hi Filipe,
I have looked at the lldb-disable-python-interpreter.patch.

I am unsure as to why we need to have this finer-grained control over disabling the ScriptInterpreter AND/OR the SWIG-stuff. Would we ever want one without the other?

Also, the patch as-is causes crashes on my setup. Did you notice anything unusual when testing it?

This is what I get when testing the data formatters (which heavily use Python, so make for a good candidate to detect issues)

egranata:test egranata$ ./dotest.py functionalities/data-formatter/
LLDB build dir: /Volumes/work/egranata/lldb.patching/build/Debug
LLDB-131
Path: /Volumes/work/egranata/lldb.patching
URL: https://enrico@llvm.org/svn/llvm-project/lldb/trunk
Repository Root: https://enrico@llvm.org/svn/llvm-project
Repository UUID: 91177308-0d34-0410-b5e6-96231b3b80d8
Revision: 152732
Node Kind: directory
Schedule: normal
Last Changed Author: johnny
Last Changed Rev: 152732
Last Changed Date: 2012-03-14 11:35:38 -0700 (Wed, 14 Mar 2012)

Session logs for test failures/errors/unexpected successes will go into directory ‘2012-03-14-12_20_37’
Command invoked: python ./dotest.py functionalities/data-formatter/
compilers=[‘clang’]

Configuration: arch=x86_64 compiler=clang

Hi Filipe,
I have also looked at lldb-enable-log-callback.patch

As far as I can tell, it depends on SWIG’s threading support for its operation (since otherwise it looks like nothing prevents multiple threads from calling LLDBSwigPythonCallPythonLogOutputCallback).

Is there a way to change the patch so it does not depend on that feature, since apparently it is not working well with the existing LLDB code?

Thanks,

Enrico Granata
:email: egranata@.com
✆ (408) 972-7683

Hi Enrico,

It doesn't necessarily depend on SWIG's thread features.
The thing is: if we're in a multithreading python environment (which is what we have to assume, when there is a python program using lldb's event API), then the callback helper function will lock the GIL before calling python code. If SWIG isn't run with the "-threads" argument, then those calls are no-ops (SWIG_PYTHON_THREADS is undefined and those two symbols will be defined to nothing).

I'm locking the GIL because I'm planning on fixing lldb's ScriptInterpreterPython code to lock the GIL when it needs to. When that happens, these locks will need to be here to make the callback feature continue working.

There's also no problem in having several threads calling LLDBSwigPythonCallPythonLogOutputCallback at the same time: the callback would be the one responsible for any races with output channels, since LLDB doesn't know anything about what will happen to the strings in the callback function.

If you would prefer to leave the threading issue to a later date, I can send a commit without those two lines, too.

Regards,

  Filipe

Hi Enrico,

That was my problem. I'm using thread-enabled SWIG bindings for an app (actually an app plugin) I'm writing using lldb's Python bindings.

But for ScriptInterpreterPython to work, SWIG can't have threads enabled yet (I want to correct that in the future), since it doesn't manage Python's GIL like it should.

With this patch an lldb user can use the python bindings and lldb's asynchronous mode (without the Python script interpreter). Otherwise, threads would get locked up because a python thread was waiting for events without releasing the GIL, and other Python threads couldn't execute any Python code.

Attached is a new patch, that doesn't enable SWIG's thread stuff. That way, the code is already split and allows asynchronous lldb usage in Python when disabling the script interpreter but not the bindings.

In the future, when the script interpreter's bugs are fixed, we would be able to add the "-threads" flags to SWIG (and maybe enable/disable if desired).

Thanks,

  Filipe

lldb-disable-python-interpreter-no-threads.patch (11.4 KB)

I forgot to mention that I just tested the difference with this last patch (without the -threads arg to SWIG) and the tests had the same result as the trunk version.

Regards,

  Filipe

Hi

Replies inlined.

Enrico Granata
:email: egranata@.com
✆ (four oh eight) 862-7683

Hi Enrico,

It doesn’t necessarily depend on SWIG’s thread features.
The thing is: if we’re in a multithreading python environment (which is what we have to assume, when there is a python program using lldb’s event API), then the callback helper function will lock the GIL before calling python code. If SWIG isn’t run with the “-threads” argument, then those calls are no-ops (SWIG_PYTHON_THREADS is undefined and those two symbols will be defined to nothing).

It is correct to assume multithreading indeed.

I’m locking the GIL because I’m planning on fixing lldb’s ScriptInterpreterPython code to lock the GIL when it needs to. When that happens, these locks will need to be here to make the callback feature continue working.

We are currently using our own Locker implementation. If you find any bugs with it, please feel free to let us know/fix them and send in a patch for review.
If there are good reasons to move from what we are doing to the GIL locking, and nothing get broken in the conversion, then again send in a patch :slight_smile:

There’s also no problem in having several threads calling LLDBSwigPythonCallPythonLogOutputCallback at the same time: the callback would be the one responsible for any races with output channels, since LLDB doesn’t know anything about what will happen to the strings in the callback function.

It is true. You could arrange for LLDBSwigPythonCallPythonLogOutputCallback to manage mutual exclusion.
My question is how are you going to make sure that the locking is synchronized with what ScriptInterpreterPython does, so that don’t end up having a log-output callback and something else entirely be running into Python together, both believing they hold an exclusive lock?

If you would prefer to leave the threading issue to a later date, I can send a commit without those two lines, too.

I would definitely address the threading issue in a separate and comprehensive patch. Moving to the GIL locking might be the right thing to do, but having most of the code use the Locker class and a few places the GIL locking seems a good call for trouble!

Hi Enrico,

Well, for now I can't, using the public API, lock the ScriptPythonIntepreter lock. I will change that class and make it use only the Python-blessed GIL locking mechanism, instead of the home-made lock. Otherwise the Python bindings will only serve for synchronous debugging, since other Python threads can never lock it.

Regards,

  Filipe

Hi Filipe,
replies inlined.

Enrico Granata
:email: egranata@.com
✆ (four oh eight) 862-7683

Hi Enrico,

Well, for now I can’t, using the public API, lock the ScriptPythonIntepreter lock.

True.

I will change that class and make it use only the Python-blessed GIL locking mechanism, instead of the home-made lock. Otherwise the Python bindings will only serve for synchronous debugging, since other Python threads can never lock it.

This might be the right thing to do, or it might not.

Having multiple locking mechanisms and some code uses one and some uses another is a no-no, since that amounts to having no locking at all :slight_smile: We need to pick one way. If the GIL approach has merits (other than being The Dogmatic Right Way), then I guess one might implement the Locker class to use Python’s GIL macros.

You would rewrite:

bool
DoAcquireLock ();

bool
DoFreeLock ();
to use the GIL macros. The rest of the code should be unaffected.

However, in the future we could need to keep track of additional information for the lock. Currently, all we store is a thread-id, but being able to store additional information (e.g. which method acquired the lock) might become important. If you do a major rewrite of the locking code, you should keep this in mind as a requirement.

An alternative approach would be for your log callback function to take a struct as a baton, and have it contain the actual PyObject* wrapping the function plus a ScriptInterpreterPython*. Then you would have your SWIG-binding call routed through the ScriptInterpreter (much like it happens for data formatters today). This would enable you to merge seamlessly in the way locks are currently managed. Even better, we now have a ScriptInterpreterObject class that efficiently does the right thing when it comes to ref-counting scripted objects. If you feel the need to pass objects from Python to C++ and back, consider wrapping the PyObject*. All you need to call is ScriptInterpreterPython::MakeScriptObject and you get a C++ shared pointer to the object in return.

Hi Enrico,

Replies inline.

Regards,

  Filipe

Hi Filipe,
replies inlined.

Enrico Granata
:email: egranata@.com
✆ (four oh eight) 862-7683

> Hi Enrico,
>
> Well, for now I can't, using the public API, lock the ScriptPythonIntepreter lock.

True.
> I will change that class and make it use only the Python-blessed GIL locking mechanism, instead of the home-made lock. Otherwise the Python bindings will only serve for synchronous debugging, since other Python threads can never lock it.

This might be the right thing to do, or it might not.

Having multiple locking mechanisms and some code uses one and some uses another is a no-no, since that amounts to having no locking at all :slight_smile: We need to pick one way. If the GIL approach has merits (other than being The Dogmatic Right Way), then I guess one might implement the Locker class to use Python's GIL macros.

Well, if you want to call Python code… Then you have to use the GIL. Unless you only let the user use a small python dialect (with no threading support, etc). Due to the way the Python interpreter works, You have to have the GIL locked when interpreting Python bytecode. If you want to support Python bindings, you have to choose between using the GIL or only having sync mode (without any traces of the embedded Python interpreter).

You would rewrite:
bool
DoAcquireLock ();

bool
DoFreeLock ();
to use the GIL macros. The rest of the code should be unaffected.

Yes, that was how I was thinking about doing it. Possibly also keeping the existing ScriptInterpreter lock (at least for a while). Since it is only used when lldb interprets Python, it shouldn't even do any difference (because Python already forces every thread to lock before executing Python code). Other Python threads wouldn't stall in the middle of lldb-land because of that since every SWIG call (when the -threads argument is passed to SWIG) will unlock the GIL when leaving Python-land and lock it immediately after performing the C++ call.

However, in the future we could need to keep track of additional information for the lock. Currently, all we store is a thread-id, but being able to store additional information (e.g. which method acquired the lock) might become important. If you do a major rewrite of the locking code, you should keep this in mind as a requirement.

An alternative approach would be for your log callback function to take a struct as a baton, and have it contain the actual PyObject* wrapping the function plus a ScriptInterpreterPython*. Then you would have your SWIG-binding call routed through the ScriptInterpreter (much like it happens for data formatters today). This would enable you to merge seamlessly in the way locks are currently managed. Even better, we now have a ScriptInterpreterObject class that efficiently does the right thing when it comes to ref-counting scripted objects. If you feel the need to pass objects from Python to C++ and back, consider wrapping the PyObject*. All you need to call is ScriptInterpreterPython::MakeScriptObject and you get a C++ shared pointer to the object in return.

With the current API there is no way to access any ScriptInterpreter classes. Unless you want to expose them to the API, which can make some sense but would have to be thought through. Using this latest strategy would entail to a lot more than just changing how the lock works, since the Python API wrappers should only use the SB* API stuff.
It would also be farther from a simple bridge from Python to the C++ API and would be closer to a dedicated Python API. Would that be desired?