Critique of Linux DoReadMemory implementation

Folks,

Is there a good reason why the DoReadMemory function of
Linux/ProcessMonitor.cpp is implemented using multiple calls of

ptrace(PTRACE_PEEKDATA, ...) ?

An easier, and less CPU-intensive way is to read the memory using the proc
filesystem. The inferior's memory will be available in the file

/proc/<pid>/mem

int fd = open("/proc/<pid>/mem", O_RDONLY);
ssize_t bytes = read(fd, buf, count);

The read-of-procfs just seems more succinct to me, so I wondered what the
rationale was for not using this.

Discussion welcomed,
Matt

Member of the CSR plc group of companies. CSR plc registered in England and Wales, registered number 4187346, registered office Churchill House, Cambridge Business Park, Cowley Road, Cambridge, CB4 0WZ, United Kingdom
More information can be found at www.csr.com. Keep up to date with CSR on our technical blog, www.csr.com/blog, CSR people blog, www.csr.com/people, YouTube, www.youtube.com/user/CSRplc, Facebook, www.facebook.com/pages/CSR/191038434253534, or follow us on Twitter at www.twitter.com/CSR_plc.
New for 2014, you can now access the wide range of products powered by aptX at www.aptx.com.

Hi,

In any case ptrace(PTRACE_PEEKDATA, ) should be left as fallback, at least for attaching.
Your debugger might CAP_SYS_PTRACE, that would allow it read memory of an process running with different UID, and it won’t be possible with /proc//mem.
It is also very popular for LSM’s or hardening patches floating around to disable access to process memory via proc at all.
So I see that more as optimization rather than replacement.

Cheers,
/Piotr

Currently, because of memory caching, lldb_private::Process::ReadMemory() will break up memory reads into 512 byte requests to the pure virtual lldb_private::Process::DoReadMemory() function and cache each 512 byte block of data that comes back. We should modify the caller of DoReadMemory to call it with a _multiple_ of 512 byte reads so we can do large memory reads all at once, yet still maintain the caching. The biggest issue is some process subclasses, like ProcessGDBRemote, are often communicating with remote GDB servers that can't read that much at a time. Sometimes the remote GDB servers have fixed size buffers that allow for packets to be up to a fix size in the responses, so any change we do would need to probably ask the lldb_private::Process subclass what its max memory read size is with a new pure virtual lldb_private::Process function:

class Process {

virtual uint64_t GetMaxMemoryReadByteSize() = 0;

};

As for using ptrace versus using file mapping, I would stick with ptrace unless you have a valid reason to do so. Using seek + read on a file descriptor seems like a hack that is kinda cool, but as someone else already mentioned, when you are debugging a process of another user, you might run into permissions problems. We know ptrace always works, so I would say "don't fix what isn't broken" unless you find some serious performance issues with ptrace vs the file mapped proc/<pid>/mem approach. Another issue you might run into is threading issues with the file position on the "fd" returned from 'open("/proc/<pid>/mem", O_RDONLY);' If one thread tries to read from address 0x1000, and another reads from 0x2000, you can run into issues. Of course you can use pread, but again, why switch from ptace()?

Greg

Greg Clayton wrote:

As for using ptrace versus using file mapping, I would stick with ptrace unless you have a valid reason to do so. Using seek + read on a file descriptor seems like a hack that is kinda cool, but as someone else already mentioned, when you are debugging a process of another user, you might run into permissions problems. We know ptrace always works, so I would say "don't fix what isn't broken" unless you find some serious performance issues with ptrace vs the file mapped proc/<pid>/mem approach. Another issue you might run into is threading issues with the file position on the "fd" returned from 'open("/proc/<pid>/mem", O_RDONLY);' If one thread tries to read from address 0x1000, and another reads from 0x2000, you can run into issues. Of course you can use pread, but again, why switch from ptace()?

Fair enough. Totally agree with the engineering principle "if it ain't broke, don't fix".

I guess this debate is best reserved until the day comes when block reads are noticeably slow, and profiling points to a finger of suspicion towards ptrace.

FWIW I'd imagine file access from a tracer onto a tracee's proc file to be the same as ptracing, regarding permissions. But don't quote me on that!

thanks
Matt

Member of the CSR plc group of companies. CSR plc registered in England and Wales, registered number 4187346, registered office Churchill House, Cambridge Business Park, Cowley Road, Cambridge, CB4 0WZ, United Kingdom
More information can be found at www.csr.com. Keep up to date with CSR on our technical blog, www.csr.com/blog, CSR people blog, www.csr.com/people, YouTube, www.youtube.com/user/CSRplc, Facebook, www.facebook.com/pages/CSR/191038434253534, or follow us on Twitter at www.twitter.com/CSR_plc.
New for 2014, you can now access the wide range of products powered by aptX at www.aptx.com.