Odd error involving python subprocess.Popen and gdb-remote hangup on Windows

I have a python script that automatically launches a simulator and connects to it with gdb-remote. Everything works fine on Linux. But when I issue the “kill” command on Windows, LLDB crashes.

This only happens if I launch the simulator (or any external program) using python. I’m using subprocess.Popen, but I’ve also tried os.spawnl. I’ve traced the problem down to reading errno in Error::SetErrorToErrno(). In this case, errno is 0, so no error is reported, and the -1 that recv returns is used as a buffer size and LLDB crashes. If I don’t launch a program using Popen, errno is 2, and everything is handled correctly.

Stepping into the errno access, GetLastError() is correctly set to WSAECONNRESET, but ptd->_terrno, which errno is set to, is 0. This seems like a Visual Studio 2012 runtime bug.

I think maybe we shouldn’t rely on errno on Windows, but call GetLastError() and convert their error numbers to POSIX error values.

Ted

Indeed, relying on errno on Windows is not the correct solution for many reasons. Actually I don’t think converting their error numbers to POSIX error values is a good solution either. I’m of the philosophy that if you’re on Windows you should be writing windows code. Recently I added eErrorTypeWin32 as a category to lldb::Error. When you create an error with that category, you can directly pass it the result of GetLastError(). Unfortunately, that’s literally all I did. Planned for the future would be an implementation of Success() and Failure() that returns the right thing when type == eErrorTypeWin32, and calling FormatMessage() with the error code so that the message is set automatically. If you would like to post patches toward making lldb::Error better handle the case when type == eErrorTypeWin32, that would be very welcome.

BTW, where is this particular call to Error::SetErrorToErrno() that this is coming from?

We might also need the ability to create lldb specific errors and be able to have the windows and posix errors be converted to an internal LLDB enumeration. A few examples:

- bad argument
- bad/invalid file handle
- out of memory
- end of file

We don't tend to look for specific errors in many places, but sometimes we do check for certain things. And it might be nice to be able to do:

Error error = some_object->PlatformSpecificCall();

and then do:

if (error.IsEOF())
{
    ...
}

And the code within error would take care of the details even though it might be a posix or windows error...

Most of the places it is nice to just let the error string speak for itself (Error::AsCString(...)), but sometimes we do need to know what went wrong by checking for an specific error code.

Greg

We might also need the ability to create lldb specific errors and be able
to have the windows and posix errors be converted to an internal LLDB
enumeration. A few examples:

- bad argument
- bad/invalid file handle
- out of memory
- end of file

This sounds a lot like std::error_code / std::errc.

I did the kill command, which gets down to ProcessGDBRemote::DoDestroy(), which sends a “k” packet by calling GDBRemoteCommunicationClient::SendPacketAndWaitForResponse(), which goes through a couple layers to GDBRemoteCommunication::WaitForPacketWithTimeoutMicroSecondsNoLock(). That calls Communication::Read(), which calls ConnectionFileDescriptor::Read(). That calls recv(), which returns -1, and then calls Error::SetErrorToErrno(). ConnectionFileDescriptor::Read() then parses the error, assuming the error value is a POSIX error.

Adding a call to GetLastError() and converting certain error codes to POSIX errors solved my crash issue. Inside #ifdef _WIN32, of course, because I don’t want to break my Linux version J.

That reminds me of another (minor) bug. If the call to SendPacketAndWaitForResponse() doesn’t return PacketResult::Success, LLDB prints an error:

exit_string.assign(“failed to send the k packet”);

But the RSP standard says the server doesn’t have to send a response to the “k” packet. Hanging up is legal. If the return value is PacketResult::ErrorDisconnected, it shouldn’t print the error.

Ted

Ahh yes, ConnectionFileDescriptor. I have that on my Todo list, but a real fix is going to be a very large undertaking. Assuming you upstream your fix, please include me on the review, because I’m already pretty familiar with this problem.