Can Process hold a TargetSP instead of a Target&?

We’ve been seeing races during shutdown of inferiors for months, and I finally tracked it down to the fact that Process holds a Target&. When an inferior is exiting on Windows, we will get a notification of this and we try to do various cleanup related with the target. But there are times where the Target gets deleted even when the Process is still around, due to some interactions between our debug loop and the timing of when certain debug events that get sent by the operating system.

As a result, the race leads to us getting one of the notifications from the OS and us trying to access the target, which is stored by reference leading to a crash.

It seems like a purely mechanical change to make Process hold a TargetSP instead of a Target&. I’ve already started down this patch locally, but I want to make sure there are no objections or concerns before I continue down this path, since it’s kind of mundane work.

I think it should be a weak_ptr if anything. Target already holds a
shared_ptr of the process, and you will get pointer loops otherwise.

Couldn't you make sure the debug thread exits (and processes all
events) before the Target gets deleted (e.g. shut it down in
Process::Finalize() or somewhere...)? If there is currently an
invariant that Process should never outlive its Target (which would
seem to be implied by the fact it holds a Target&), I would prefer if
it can be preserved.


We already do this in DoDestroy(), but it looks like for whatever reason DoDestroy is not getting called on us even though the Target is being destroyed. Or maybe it is and our DoDestroy is getting into some edge condition that doesn’t cleanup correctly. But it’s hard to debug because it only happens from the test suite, and only when the system is under heavy load (i.e. you run the entire test suite, and not just one test).

In the future I had planned to make an option for dotest that would allow lldb to write full logs of every run during the test suite, so we could see the sequence of events that are happening, but it’s a bigger task.

A weak_ptr would work just as well and avoid the problem you describe, so I’ll wait and see if anyone has an issue with that.

A std::weak_ptr is necessary because Target contains a shared pointer to the process and if you have the process have a shared pointer to the target then neither will ever destruct.

I have no problem doing this. The main question is who actually has this strong reference to the process? This seems like the real bug to me. The _only_ code that should hold onto a ProcessSP is:
1 - the Target itself to ensure the process lives
2 - local ProcessSP variables that need to do something temporarily with a process instance

No code should have ProcessSP has a member variable other than lldb_private::Target, even though we have many places that do this. I will fix this ASAP. All code needs to lock the weak pointer and check the shared pointer before using it.

So if the problem persists after the fix that I will do removing all strong process references, we will need to fix that, but I would assume the problem will go away. Any external references to a process via lldb::SBProcess have a weak pointer already, or SBThread or SBFrame both have a "lldb::ExecutionContextRefSP m_opaque_sp;" which points to a class that contains weak pointer to the process.

Windows plugin holds a strong reference to itself. It calls shared_from_this() when the process is launched, and it releases this strong reference after a process exits. It does this because the debug loop happens in a different thread, and it wanted to ensure that the process plugin cannot be killed before we’ve cleaned up gracefully.

Should this also be changed to a weak_ptr?

I would change it to a std::weak_ptr if possible. That way if the plug-in goes away, you will try to lock the weak pointer and notice the shared pointer is NULL and know the process is gone and you can proceed to shut down this extra thread.

Sounds good. I already have that working in a local patch, but I will wait for the fix you described earlier to go in first, and handle the merge and check in the weak_ptr change.

Sounds good. I already have that working in a local patch, but I will wait for the fix you described earlier to go in first, and handle the merge and check in the weak_ptr change.

This was in yesterday:

Author: gclayton
New Revision: 246488

Stop objects from keeping a strong reference to the process when they should have a weak reference.