Continue misfiring after expression evaluation

Hi all,

I’ve been debugging a problem on Linux where, if you hit a breakpoint, then evaluate an expression that requires JITing, then continue, LLDB will break at the line you were already on. I don’t know if this problem is specific to Linux, but that’s where I’m debugging it. This problem appears in the ‘lang/c/setvalues/TestSetValues.py’ test case, but can also be reproduced manually.

What I’m seeing is that when the ‘continue’ handling calls Thread::SetupForResume() the current Thread reports its stop reason as ‘eStopReasonPlanComplete’ because the ThreadPlan for evaluating the expression is still on the completed plans stack. As a result, the ThreadPlanStepOverBreakpoint plan doesn’t get queued. If I add calls to ‘m_completed_plan_stack.clear()’ and ‘m_discarded_plan_stack.clear()’ at the top of the ‘if (GetResumeState() == eStateSuspended)’ block in Thread::SetupForResume() then everything works as expected.

Obviously, that feels like a pretty risky thing to do, at best. It seems like the expression command handling should have done something to clear the completed plan stack when it was finished, but as of yet I haven’t found a place where that is appropriate.

Can anyone give me guidance on this issue?

Thanks,

Andy

I don't see this happening on Mac OS X. That test succeeds, and I don't see the behavior you describe.

You should not need to clear the completed plan stack. The way it is supposed to work is as follows:

Running functions is always done by Process::RunThreadPlan. After the ThreadPlanCallFunction gets done running in that function, we do this bit of code:

        // Restore the thread state if we are going to discard the plan execution.
        
        if (return_value == eExecutionCompleted || discard_on_error)
        {
            thread_plan_sp->RestoreThreadState();
        }
        
That should set the StopInfo that was squirreled away in Thread::CheckpointThreadState back as the current StopInfo for the thread. Is that getting called? If not, why not? If it is getting called, why isn't it succeeding in resetting the thread's StopInfo back to the old reason?

Note, up till a couple of weeks ago this work was done in the DoTakedown method of ThreadPlanCallFunction, but I moved resetting the StopInfo out of that function (which gets called while the stop event for the function call is still being processed) to here, because otherwise it might trigger the original StopInfo's PerformAction while you are in the middle of handling the ThreadPlanCallFunction's execution, which isn't right.

Hope that helps.

Jim

Hi Jim,

I am working with an old version of the code, so things were in a bit different state for me. I was seeing Thread::RestoreThreadStateFromCheckpoint() being called. The problem is that when Thread::SetupForResume() calls Thread::GetStopInfo(), the latter does this:

    ThreadPlanSP plan_sp (GetCompletedPlan());
    if (plan_sp && plan_sp->PlanSucceeded())
        return StopInfo::CreateStopReasonWithPlan (plan_sp, GetReturnValueObject());
    else
    {
        ProcessSP process_sp (GetProcess());
        if (process_sp
            && m_actual_stop_info_sp
            && m_actual_stop_info_sp->IsValid()
            && m_thread_stop_reason_stop_id == process_sp->GetStopID())
            return m_actual_stop_info_sp;
        else
            return GetPrivateStopReason ();
    }

Since the completed thread plan from the expression command is still around, the stop info comes from there. The thread's m_actual_stop_info_sp seems to be correct, but the code doesn't get to the point where it would use it.

I'll update to the latest code and see if this is still happening, but I wanted to tell you what I knew while I was still in a known state. I'll let you know what happens with the new code.

-Andy

I updated to the latest code and I'm still seeing the same behavior.

That is, thread_plan_sp->RestoreThreadState() is being called, and it does restore the stop info for the thread to the old stop info, but when Thread::SetupForResume() calls Thread::GetStopInfo() that function creates stop info based on the completed thread plan stack rather than using m_actual_stop_info_sp.

Any suggestions? I'd be curious to know why this doesn't happen on Mac OS X because it looks like it would.

-Andy

Yeah, after taking a quick look at the code last night, I'm curious too.

I am in the middle of something right now, but as soon as I get a chance, I'll step through it and see what's going on.

Thinking about it a little bit more, I can't see any reason why Process::RunThreadPlan should not clear the evidence of its work from the plan stack. It should only do this if the plan succeeds, or if it is going to unwind on error. If you're going to stop because you hit a breakpoint in the middle of a function call, you don't want to change the plan stack state (just like you don't want to restore the old stop info). And I think it would be wrong to erase the whole completed plan stack; it would be better to mark the position on entry, and restore it to the state it had when you entered RunThreadPlan.
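
The "mark the position on entry, restore on exit" idea could look roughly like the sketch below. This is a toy model only, not LLDB code: CompletedPlanStack, CompletedPlanCheckpoint, Keep() and RunToyThreadPlan are hypothetical names made up for illustration, and the completed plan stack is reduced to a vector of strings.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-in for a thread's completed plan stack; not LLDB's real type.
using CompletedPlanStack = std::vector<std::string>;

// Hypothetical RAII helper: remembers how deep the completed plan stack was
// on entry and, unless told to keep the new entries, trims it back on exit.
class CompletedPlanCheckpoint {
public:
    explicit CompletedPlanCheckpoint(CompletedPlanStack &stack)
        : m_stack(stack), m_mark(stack.size()) {}

    // Call this when execution stopped partway (e.g. at a breakpoint inside
    // the called function) and the plan stack must be left alone.
    void Keep() { m_restore = false; }

    ~CompletedPlanCheckpoint() {
        if (!m_restore)
            return;
        // Plans completed before entry are untouched; only later ones go.
        assert(m_stack.size() >= m_mark);
        m_stack.resize(m_mark);
    }

private:
    CompletedPlanStack &m_stack;
    std::size_t m_mark;
    bool m_restore = true;
};

// Simulates running a function-call plan. `completed_ok` models
// "the plan succeeded, or we are going to unwind on error".
void RunToyThreadPlan(CompletedPlanStack &completed, bool completed_ok) {
    CompletedPlanCheckpoint checkpoint(completed);
    completed.push_back("call function plan");
    if (!completed_ok)
        checkpoint.Keep();
}
```

With a stack that already holds one earlier plan, a successful run leaves only that earlier plan behind, while an interrupted run leaves the call function plan on top, which matches the two cases described above.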

Jim

Daniel and I saw the same behaviour as Andy on Mac recently though we reproduced it differently. If I recall what was happening on Mac correctly, the expression to be evaluated didn't require MCJIT/executing in the target process (for instance, printing a register value - expr pc) but on Linux did require MCJIT/running in the target. However, evaluating a function call expression did reproduce the problem on Mac since this did go through the MCJIT route. Could you try this to see if it reproduces the problem?

Thanks,
Matt

I don't see this fail, even for function calls:

(lldb) b s -n main
Breakpoint 1: 3 locations.
(lldb) run
Process 96310 launched: '/Users/jingham/Projects/Sketch/build/Debug/Sketch.app/Contents/MacOS/Sketch' (x86_64)
Process 96310 stopped
* thread #1: tid = 0x1c03, function: main , stop reason = breakpoint 1.1
    frame #0: 0x000000010001a74e Sketch`main at SKTMain.m:11
   8
   9
   10 int main(int argc, const char *argv[]) {
-> 11 NSLog (@"Added for testing rebuilds.");
   12 const char *names[20];
   13 for (int i = 0; i < argc; i++)
   14 {
(lldb) expr (int) printf ("Some text here.\n")
(int) $0 = 16
Some text here.
(lldb) log enable lldb step
(lldb) c
<lldb.driver.main-thread> Pushing plan: "Single stepping past breakpoint site 3 at 0x10001a74e", tid = 0x1c03.
<lldb.driver.main-thread> ThreadPlanCallFunction(0x7f8eb4189540): DoTakedown called as no-op for thread 0x1c03, m_valid: 1 complete: 1.

<lldb.driver.main-thread> WillResume Thread #1: tid = 0x1c03, pc = 0x10001a74e, sp = 0x7fff5fbff4a0, fp = 0x7fff5fbff5e0, plan = 'Step over breakpoint trap', state = stepping, stop others = 1
<lldb.process.internal-state(pid=96310)> Current Plan for thread 1 (0x1c03): Step over breakpoint trap being asked whether we should report run.
Process 96310 resuming
<lldb.process.internal-state(pid=96310)>
<lldb.process.internal-state(pid=96310)> ThreadList::ShouldStop: 1 threads
<lldb.process.internal-state(pid=96310)> Thread::ShouldStop for tid = 0x1c03, pc = 0x000000010001a751
<lldb.process.internal-state(pid=96310)> ^^^^^^^^ Thread::ShouldStop Begin ^^^^^^^^
<lldb.process.internal-state(pid=96310)> Plan stack initial state:
  Plan Stack for thread #1: tid = 0x1c03, stack_size = 2
    Element 1: Single stepping past breakpoint site 3 at 0x10001a74e
    Element 0: Base thread plan.

<lldb.process.internal-state(pid=96310)> Plan Step over breakpoint trap explains stop, auto-continue 1.
<lldb.process.internal-state(pid=96310)> Plan Step over breakpoint trap should stop: 0.
<lldb.process.internal-state(pid=96310)> Completed step over breakpoint plan.
<lldb.process.internal-state(pid=96310)> Popping plan: "Step over breakpoint trap", tid = 0x1c03.
<lldb.process.internal-state(pid=96310)> Plan stack final state:
  Plan Stack for thread #1: tid = 0x1c03, stack_size = 1
    Element 0: Base thread plan.
  Completed Plan Stack: 1 elements.
    Element 0: Single stepping past breakpoint site 3 at 0x10001a74e

The call to printf definitely ran code in the target, and yet continue pushed the "step over breakpoint" plan rather than stopping with another breakpoint hit (which is what I presume you are seeing?)

Jim

Yeah, stopping again at the same breakpoint is what I'm seeing.

I'm attaching a log I just created with the latest (as of last night) code -- expr-continue.log.

I'm also attaching much more verbose logs I created a couple of days ago, one with the expression command that exhibits the failure and one without the expression command that shows what happened when the continue worked correctly.

Let me know if you spot anything that might tell you what the problem is. Meanwhile, I'll experiment with your earlier suggestions about having the thread plan clean up just its part of the completed thread plan stack (which I'm thinking needs to be done in Thread::CheckpointThreadState and Thread::RestoreThreadStateFromCheckpoint?).

-Andy

expr-continue.log (4.77 KB)

failing.log (28.7 KB)

working.log (21.7 KB)

You're right; I just tried it on trunk and it seems to work correctly on Mac. It definitely used to reproduce not too long ago (within the past few weeks), but it seems to be fine now...

Alright, this feels a lot better.

After further investigation, I learned that Thread::GetStopInfo() only gets the stop info from completed thread plans that aren't marked private. Looking around at some examples in the code, it seemed that the thread plans involved in the expression evaluation should have been marked private, at least before the Process object started executing them. I verified that Process::RunThreadPlan was saving and restoring the ThreadPlan's 'private' state correctly but that my ThreadPlanCallFunction was entering with 'private' set to false.

I was able to track this thread plan back to InferiorCallMmap in InferiorCallPOSIX.cpp. I'm guessing that Mac OS X doesn't use that for whatever reason.

The attached patch fixes the problem I was investigating and it seems like a reasonable change. Can you review this, Jim?

Thanks,
Andy

make-mmap-thread-plan-private.patch (1.08 KB)

Ah, right, that seems like another good way to do this (wonder who thought of it and then forgot it...)

I can't see a reason why function calling thread plans should ever be public. Even if you stop in the middle of one (because you called a function to break in it and step around) when you continue to finish the call you don't want the call function plan to be the reason you stopped back in normal code.

So a better fix would be to set ThreadPlanCallFunction plans to private in their constructor.
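
To make the effect of the private flag concrete, here is a small toy model. It is a simplified sketch, not LLDB's implementation: ToyPlan, ToyCallFunctionPlan, ToyThread and NeedsStepOverBreakpoint are hypothetical names, and the lookup simply assumes that the most recent non-private completed plan decides the stop reason.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

enum class ToyStopReason { Breakpoint, PlanComplete };

// Toy stand-in for a thread plan; only the fields needed here.
struct ToyPlan {
    std::string name;
    bool is_private;
    ToyPlan(std::string n, bool priv) : name(std::move(n)), is_private(priv) {}
    virtual ~ToyPlan() = default;
};

// Models the suggested fix: a function-call plan is private from construction.
struct ToyCallFunctionPlan : ToyPlan {
    ToyCallFunctionPlan() : ToyPlan("call function", /*priv=*/true) {}
};

struct ToyThread {
    std::vector<std::shared_ptr<ToyPlan>> completed_plans;
    ToyStopReason actual_stop_reason = ToyStopReason::Breakpoint;

    // Most recent completed plan that is not private, or null if none.
    // (Assumption: a simplification of the real lookup.)
    std::shared_ptr<ToyPlan> GetCompletedPlan() const {
        for (auto it = completed_plans.rbegin(); it != completed_plans.rend(); ++it)
            if (!(*it)->is_private)
                return *it;
        return nullptr;
    }

    // A visible completed plan wins over the thread's actual stop reason.
    ToyStopReason GetStopReason() const {
        return GetCompletedPlan() ? ToyStopReason::PlanComplete
                                  : actual_stop_reason;
    }

    // On resume, a step-over-breakpoint plan is only queued when the thread
    // reports that it is stopped at a breakpoint.
    bool NeedsStepOverBreakpoint() const {
        return GetStopReason() == ToyStopReason::Breakpoint;
    }
};
```

In this model, a thread whose completed stack holds a public call-function plan reports PlanComplete and skips the step-over plan, which corresponds to the repeated breakpoint stop; with ToyCallFunctionPlan the plan is invisible to the lookup and the original breakpoint stop reason is used.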

BTW, we added an "allocate memory" packet to debugserver, so we don't need to use InferiorCallMmap. That's why we didn't see this on OSX.

Jim

I have run into the same problem with duplicated 'breakpoint stops' when using expression evaluation on Linux. When setting the thread plans to private in InferiorCallPOSIX.cpp as Andrew suggested, the issue does go away.

Thanks Andrew!

Pierre

The solution Jim suggested below also fixes the problem. I've committed this as r169618.

-Andy

Excellent, thanks!

Jim