LLDB Process Attach Failed When Waiting

I am attempting to use: process attach --name "<name>" --waitfor

The problem is that it immediately returns error: attach failed: lost connection.

Is this something I can troubleshoot further? Or is this not supported on Linux?

Try enabling logging with:

(lldb) log enable -f /tmp/packets.txt gdb-remote packets

And then try the attach. It might be a lldb-server issue?

It might be a lldb-server issue?

How would one go about determining this?

(lldb) log enable -f /tmp/packates.txt gdb-remote packets
(lldb) process attach --name "<name>" --waitfor
error: attach failed: lost connection

Attached the log output. Maybe it'll make more sense to you than me.

That escalated quickly. So I've read through that log. I found a $qVAttachOrWaitSupported#38 packet whose response was $#00. Perhaps I'm reading too much into it, but is it reporting that this is not supported?

Does lldb-server need to be compiled in a certain way to add support for attach wait?


lldb-server claims it doesn’t support it:

lldb < 48> send packet: $vAttachWait;6c616e677365727665722d7377696674#d6
lldb < 4> read packet: $#00

Returning the empty packet ("$#00") means it is not supported, but we really should give a better error message than this.
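For reference, the framing seen in the log can be reproduced in a few lines. This is an illustrative Python sketch, not LLDB code, and the helper names frame_packet and is_unsupported are made up for this example. It assumes only the gdb-remote convention that a packet is $<payload>#<checksum>, where the checksum is the sum of the payload bytes modulo 256, written as two hex digits.

```python
def frame_packet(payload: str) -> str:
    """Wrap a payload as $<payload>#<two-hex-digit checksum>."""
    checksum = sum(payload.encode("ascii")) % 256
    return f"${payload}#{checksum:02x}"


def is_unsupported(reply: str) -> bool:
    """An empty payload ("$#00") is how a stub reports an unknown packet."""
    return reply == frame_packet("")


# The process name travels hex-encoded after "vAttachWait;".
name = "langserver-swift"
packet = frame_packet("vAttachWait;" + name.encode("ascii").hex())
print(packet)
print(is_unsupported("$#00"))
```

The printed packet should match the 48-byte vAttachWait packet in the log above, and the same helper reproduces the #38 checksum on qVAttachOrWaitSupported. A client-side check like is_unsupported is roughly where a clearer "not supported" error could be produced instead of "lost connection".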

It isn’t that hard to implement. It would be easy to copy the code that is in debugserver in DNB.cpp in the DNBProcessAttachWait function. Most of it is probably posix style stuff.

So two issues here:
1 - we need a better error message when vAttachWait returns unsupported
2 - fix lldb-server to support vAttachWait

Greg

Well at least there is a little good news. I understood way more of those logs than I thought I did.

So two issues here:

1 - we need a better error message when vAttachWait returns unsupported

2 - fix lldb-server to support vAttachWait

A few questions.

What is the protocol here? Can/should I report the issues at https://bugs.llvm.org/ ? Or is that something that a project member needs to do? Assuming that I can, would this be two separate issues or one “large” issue?

Unless this is something someone else might implement better or faster than me, is it something I could tackle, considering I have zero LLDB development experience?

What is the protocol here? Can/should I report the issues at
https://bugs.llvm.org/ ? Or is that something that a project member needs to
do? Assuming that I can, would this be two separate issues or one "large"
issue?

You can open a bug on Bugzilla (you can request an account via e-mail).

Unless this is something someone else might implement better or faster
than me, is it something I could tackle, considering I have zero LLDB
development experience?

You can try and see how far you get (and ask questions, either here or
on IRC at oftc.net/#lldb).

I didn’t realize this functionality was missing in lldb-server. I can take a stab at implementing it and see what I can do. Stay tuned.

Greg,

Is there anything I can do to help you implement or test this feature? Obviously I’m willing to roll up my sleeves and work on this myself too if you’ve become busier than you expected. That happens to us all and I completely understand.

Not being able to debug in this manner is blocking me from adding Linux support to a Swift program. As you might imagine, I’m a little antsy to get past this so I can start writing some code.

I will be too busy this week to get to this, so please do have a stab at it.

Basically the flow that debugserver follows is:
1 - get a list of all processes whose basename matches and remember those pids so we don’t try to attach to them, since we are waiting for a new process to show up
2 - poll the OS for the process list, wait for a new pid to show up with the basename, and attach to the first one that matches whose pid isn’t in the list from step #1
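The two steps above can be sketched as follows. This is illustrative Python rather than debugserver or lldb-server code; the function names, the max_polls parameter, and the use of /proc/<pid>/comm to find processes by name are all assumptions made for the example (Linux only). Note that the kernel truncates comm to 15 characters, so the comparison truncates the requested name as well.

```python
import os
import time


def pids_by_name(name, proc_root="/proc"):
    """Return the set of pids whose command name matches (Linux /proc only)."""
    pids = set()
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "comm")) as f:
                comm = f.read().strip()
        except OSError:
            continue  # the process exited between listdir() and open()
        if comm == name[:15]:  # comm holds at most 15 characters
            pids.add(int(entry))
    return pids


def wait_for_new_process(name, list_pids=pids_by_name, interval=0.001,
                         max_polls=None):
    """Step 1: remember existing pids. Step 2: poll until a new one appears."""
    known = list_pids(name)
    polls = 0
    while max_polls is None or polls < max_polls:
        new = list_pids(name) - known
        if new:
            return min(new)  # the caller would attach to this pid
        time.sleep(interval)  # avoid spinning a CPU between polls
        polls += 1
    return None  # gave up after max_polls polls
```

Calling wait_for_new_process("langserver-swift") would block until a matching process starts. The list_pids parameter exists only so the loop can be exercised with a fake process list.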

On MacOSX we don’t have a way to be notified when new processes are spawned; not sure about other Unix variants. If it is possible, we would want to change to:
1 - sign up to be notified about new processes
2 - as each process gets launched, check for a match and attach as quickly as possible

Hope this helps and I look forward to seeing your patch!

Greg

So I’ve found a capability on Linux to be notified about new processes. I have an example of listening for these new processes running on my machine now [1] and I can see when my desired user process spawns. Though documentation on the API is scarce. It also requires that the executable or user have the CAP_NET_ADMIN capability to run.

It’s possible that the addition of this requirement is a non-starter. Though I’m not 100% sure. Do you have any thoughts before I pursue this further?

[1] http://bazaar.launchpad.net/~kees/+junk/cn_proc/files/3

I doubt anybody will be installing lldb with elevated privileges, and
I would be wary of recommending that. Theoretically, we could
implement that approach and use it in case the user happens to have
that privilege, but I don't believe it's worth it. I think we should
just stick to the polling approach. You should be able to get a list
of running processes with `Host::FindProcesses`.

I think I might be a little lost. I built lldb in debug mode and I am running lldb in an lldb debugger (very meta).

Earlier in the thread you said “we need a better error message when vAttachWait returns unsupported”. I have found where the error message, e.g., “error: attach failed: lost connection”, is constructed. The "attach failed: " comes from here [1] and the “lost connection” comes from here [2].

What do you mean by “vAttachWait”? Am I missing something obvious?

It seems like you are expecting lldb-server to be the place where the fix will be implemented. Though in the debugger I’m seeing the method Target::Attach() [3] as the place where the connection attempt is made and fails.

[1] https://github.com/apple/swift-lldb/blob/a8c149f75a8cba674bead048cd9c80ddc8166a8a/source/Commands/CommandObjectProcess.cpp#L518

[2] https://github.com/apple/swift-lldb/blob/a8c149f75a8cba674bead048cd9c80ddc8166a8a/source/Target/Target.cpp#L3444-L3445

[3] https://github.com/apple/swift-lldb/blob/a8c149f75a8cba674bead048cd9c80ddc8166a8a/source/Target/Target.cpp#L3374

Except for Windows and FreeBSD, lldb uses a server program to do the actual debugging: either debugserver on Darwin or lldb-server elsewhere. The interface to these servers (and to the in-process debugging on Windows and FreeBSD) is abstracted behind Process plugins, which the generic code uses. Target::Attach is in generic code, so it won't know anything about the actual method used to attach, wait for attach, or whatever. That will be dispatched to the currently active Process plugin.

ProcessGDBRemote is the plugin that is used on Linux and Darwin. It is the code that actually talks to debugserver or lldb-server from within lldb. And that's the code in lldb that sends the vAttachWait gdb-remote protocol packet to instruct the above-mentioned servers to implement attach-wait. That request works on Darwin because debugserver handles the vAttachWait packet, but doesn't work on Linux because lldb-server doesn't currently respond to the vAttachWait packet. So all you should need to do is teach lldb-server to handle the vAttachWait packet.
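To make "teach the server to handle the packet" concrete, here is a toy model of a packet dispatcher. It is illustrative Python, not lldb-server's actual structure: the PacketServer class, its methods, and the placeholder reply string are all hypothetical. The only behavior taken from the thread is that a packet with no handler gets an empty reply.

```python
class PacketServer:
    """Toy gdb-remote style dispatcher (hypothetical, for illustration)."""

    def __init__(self):
        self.handlers = {}

    def register(self, prefix, handler):
        self.handlers[prefix] = handler

    def dispatch(self, payload):
        for prefix, handler in self.handlers.items():
            if payload.startswith(prefix):
                return handler(payload)
        return ""  # empty reply: the packet is not supported


server = PacketServer()
request = "vAttachWait;6c616e677365727665722d7377696674"

# Before a handler exists, the client sees the empty "unsupported" reply.
print(repr(server.dispatch(request)))

# Registering a handler is the "fix". The reply here is a placeholder; a real
# handler would wait for the process, attach, and report the result.
server.register("vAttachWait;", lambda payload: "<placeholder reply>")
print(repr(server.dispatch(request)))
```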

Jim

Thank you that was a huge help. I'm making some progress now.

Though I wonder, is there any documentation of the GDB packet format? The reason I ask is that I'm trying to figure out how to get the process name from vAttach.

I've looked at how debugserver does it, namely the method GetProcessNameFrom_vAttach [1], and I cannot seem to find the equivalent method in StringExtractorGDBRemote or StringExtractor (though I'll admit that it might be obvious and I'm just missing it). If that's the case I imagine I'll have to implement it and I would like to at least understand the packet format to do that.

[1] https://github.com/apple/swift-lldb/blob/a8c149f75a8cba674bead048cd9c80ddc8166a8a/tools/debugserver/source/RNBRemote.cpp#L3585

There is some documentation on the packets in docs/lldb-gdb-remote.txt
in the lldb repo, though for this particular packet it does not say
much about the encoding (patches welcome).

That said, it looks like the format is just that we send the process
name hex-encoded. For that you can simply use
StringExtractor.GetHexBytes. There should be plenty of examples doing
hex {en|de}coding around the codebase.
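As an illustration of that encoding, the sketch below decodes the process name from the packet captured earlier in the thread. It is plain Python, not the StringExtractor API, and the function name is made up for this example. It assumes the payload is exactly "vAttachWait;" followed by the hex-encoded name.

```python
def process_name_from_vattachwait(payload: str) -> str:
    """Decode the hex-encoded process name that follows "vAttachWait;"."""
    prefix = "vAttachWait;"
    if not payload.startswith(prefix):
        raise ValueError("not a vAttachWait packet")
    # Two hex digits per byte of the process name.
    return bytes.fromhex(payload[len(prefix):]).decode("utf-8")


print(process_name_from_vattachwait(
    "vAttachWait;6c616e677365727665722d7377696674"))
```

This prints langserver-swift, the name passed to process attach in the original report.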

Thank you. That got me moving again. I will also see if there is anything I could contribute to lldb-gdb-remote.txt regarding the format of this packet.

Does anyone have any specific suggestions on how I might go about debugging/testing this? Specifically, turning on the logging in the gdb-server and also launching lldb-server separately from lldb so that I can actually attach a debugger to it.

My workflow right now is to start the gdb-server like so:

$ lldb-server gdb-server --log-file /tmp/gdb-server.txt *:1234

Then connect the main lldb process like this:

$ lldb
(lldb) platform select remote-gdb-server
(lldb) platform connect connect://localhost:1234

As far as I can tell, that seems to be working: I get a message in gdb-server "lldb-server-local_buildConnection established."

Unfortunately, when I issue my command,

(lldb) process attach --name "langserver-swift" --waitfor
error: attach failed: invalid host:port specification: 'localhost'

The log file remains zero length.

I presume this means I am incorrectly launching my gdb-server both with respect to the logging and how I tell lldb to connect to it. Though it is not clear to me what I should do differently. The commands I've got I pieced together from an article on lldb.llvm.org titled "Remote debugging with LLDB" [1].

[1] https://lldb.llvm.org/remote.html

There are other options available as well, but for this particular
scenario, I'd go with the following:
- start debugging the client, set a breakpoint just before it sends
the vAttach packet
- when the client reaches this breakpoint, attach a debugger to the
server (it will already be running at this point)
- let the client send the packet
- step through the server as it processes it

That's if you want an interactive debug session. If you just want to
see the logs, it should be sufficient to set LLDB_DEBUGSERVER_LOG_FILE
and LLDB_SERVER_LOG_CHANNELS environment variables before starting
lldb.

Huzzah! I have a working proof of concept.

A few more questions (I hope I'm not wearing out my welcome):

1. There seems to be a "sleep" capability in the DNBProcessAttachWait method [1]. I'm not exactly sure how this "sleep" function works. When I use it the sleep seems to be a no-op. Is there another function that should be used for Linux?

2. DNBProcessAttachWait seems to have a timeout capability [2]. As far as I can tell, the argument timeout_abstime is hard-coded as NULL, rendering that timer/timeout dead code. Should I replicate that in the Linux code as well?

3. Are there, or are there expected to be, tests for this stuff?

[1] https://github.com/llvm-mirror/lldb/blob/b64ab3f60c8faa43841af88b10dbcbfd073a82ea/tools/debugserver/source/DNB.cpp#L743
[2] https://github.com/llvm-mirror/lldb/blob/b64ab3f60c8faa43841af88b10dbcbfd073a82ea/tools/debugserver/source/DNB.cpp#L720-L729

Huzzah! I have a working proof of concept.

A few more questions (I hope I'm not wearing out my welcome):

We are very happy to see this feature coming along, so no worries. Ask as many questions as you need!

1. There seems to be a "sleep" capability in the DNBProcessAttachWait method [1]. I'm not exactly sure how this "sleep" function works. When I use it the sleep seems to be a no-op. Is there another function that should be used for Linux?

Sleep is used to make the current thread sleep a little between polls for processes by name. Without the sleep, we would light up a CPU with really quick polling, so we should use usleep(), which takes a sleep amount in microseconds, to avoid pegging the CPU at 100% while waiting.

2. DNBProcessAttachWait seems to have a timeout capability [2]. As far as I can tell, the argument timeout_abstime is hard-coded as NULL, rendering that timer/timeout dead code. Should I replicate that in the Linux code as well?

No need for now.

3. Are there, or are there expected to be, tests for this stuff?

Yes! There are many lldb-server tests already. Pavel should be able to direct you to how and where to make these tests.