There’s a longstanding problem with AArch64 watchpoints (possibly on other targets too, but I see it on this target in particular). Say you watch 4 bytes, 0x100c - 0x100f, and something does a 16-byte STP write to 0x1000. The FAR register has the value 0x1000, the start of the write, and lldb doesn’t correctly associate the watchpoint hit with our watchpoint at 0x100c; it won’t disable the watchpoint, instruction step, re-enable the watchpoint and report the changed value (for a ‘write’ watchpoint). For this, I’d need to add some target-specific WatchpointAddressVagueness, where a watchpoint exception address within some byte range of a known watchpoint is attributed to that watchpoint. I can’t remember if you can write an entire 128-bit NEON register to memory in a single instruction, but that would probably be the correct size on this target.
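To make the attribution idea concrete, here is a minimal Python sketch, not lldb code. The names `Watchpoint` and `find_watchpoint_for_fault` are made up for illustration, and it assumes the reported fault address is the start of the access and that no single access is larger than `max_access_size` bytes.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Watchpoint:
    """Hypothetical record of what the user asked to watch."""
    addr: int  # first watched byte
    size: int  # number of watched bytes


def find_watchpoint_for_fault(
    fault_addr: int,
    watchpoints: Sequence[Watchpoint],
    max_access_size: int,
) -> Optional[Watchpoint]:
    """Attribute a fault address to a watchpoint, allowing for "vagueness".

    Assumption: fault_addr is the start of the access, and an access is at
    most max_access_size bytes, so it can only reach a watched byte if it
    starts no more than (max_access_size - 1) bytes before the watched range.
    """
    for wp in watchpoints:
        earliest_start = wp.addr - (max_access_size - 1)
        if earliest_start <= fault_addr < wp.addr + wp.size:
            return wp
    return None


# The example from the post: watch 0x100c - 0x100f, 16-byte write at 0x1000.
wp = Watchpoint(addr=0x100C, size=4)
hit = find_watchpoint_for_fault(0x1000, [wp], max_access_size=16)
print(hit)
```

The value passed for `max_access_size` is exactly the open question: 16 matches the STP example here, but the right per-target number is an assumption until someone checks the largest single store the target can do.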
Second related topic: I’m switching debugserver from using Byte Address Select (BAS) watchpoints on AArch64 (which can watch any bytes within an aligned doubleword) to using MASK watchpoints, which can watch power-of-2 regions of memory from 8 bytes to 2GB. Now I’ve got the problem that an exception within that power-of-2 region may or may not be touching my watched region. E.g. I watch 8 bytes, 0x100c - 0x1013. This requires an 8-byte watchpoint at 0x1008 and an 8-byte watchpoint at 0x1010. If something writes to or accesses 0x1008, my mask watchpoint will be hit, and now lldb needs to (1) understand that this is associated with this watchpoint, and (2) decide whether to notify the user about it or not.
(An aside: I have to use two watchpoints for this unaligned region, or a single 32-byte watchpoint at 0x1000. You can come up with unaligned buffers that quickly require quite large mask watchpoints to cover with one watchpoint, e.g. 24 bytes at 0x10f0 would need a 512-byte watchpoint at 0x1000, if I did the bits correctly just now. Or I can do it with a 16-byte watchpoint at 0x10f0 and an 8-byte watchpoint at 0x1100.)
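The bit arithmetic in the aside is easy to get wrong by hand, so here is a small Python sketch for checking it. Both function names are hypothetical, and the only hardware facts assumed are the ones stated above: a mask watchpoint covers a naturally aligned power-of-2 region of at least 8 bytes. The 2GB upper limit and the number of available watchpoint registers are not modeled.

```python
from typing import List, Tuple


def single_mask_watchpoint(addr: int, size: int, min_size: int = 8) -> Tuple[int, int]:
    """Smallest aligned power-of-2 region (base, length) covering [addr, addr+size)."""
    last = addr + size - 1
    length = min_size
    # Grow the region until the first and last byte fall in the same aligned block.
    while (addr & ~(length - 1)) != (last & ~(length - 1)):
        length *= 2
    return addr & ~(length - 1), length


def split_into_mask_watchpoints(addr: int, size: int, min_size: int = 8) -> List[Tuple[int, int]]:
    """Cover [addr, addr+size) with several aligned power-of-2 regions.

    Greedy: round the range out to min_size alignment, then repeatedly take
    the largest aligned block that starts at the current address and fits.
    """
    start = addr & ~(min_size - 1)
    end = (addr + size + min_size - 1) & ~(min_size - 1)
    pieces = []
    while start < end:
        length = min_size
        while length * 2 <= end - start and start % (length * 2) == 0:
            length *= 2
        pieces.append((start, length))
        start += length
    return pieces


for a, n in [(0x100C, 8), (0x10F0, 24)]:
    base, length = single_mask_watchpoint(a, n)
    parts = split_into_mask_watchpoints(a, n)
    print(f"{n} bytes at {a:#x}: one region {length} bytes at {base:#x}; "
          f"split: {[(hex(b), l) for b, l in parts]}")
```

The split version over-watches far fewer bytes but uses more hardware watchpoint registers, which is the trade-off between the two options.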
At first, I thought “well, a write watchpoint which doesn’t change the watched value is a private stop that we don’t tell the user about, right?” but that is not correct. lldb shows every write to that memory even if the value in the memory is unchanged. If the behavior were “only report a public stop when the watched memory value has changed”, then I could sweep away the “watchpoint hits” which are actually accesses to the region I’m mask watching but which don’t touch the bytes I’m actually watching. Or possibly, if I know the target’s maximum watchpoint vagueness value from paragraph 1, and the FAR exception address is within the actually-watched region but further from my watched byte range than the watchpoint vagueness, I can continue silently.
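Roughly, the two filters in this paragraph could combine like the Python sketch below. This is only an illustration of the policy, not how lldb works today: `StopAction`, `classify_write_hit` and the `stop_on_unmodified_write` flag are hypothetical names, and it again assumes the fault address is the start of an access of at most `max_access_size` bytes.

```python
from enum import Enum


class StopAction(Enum):
    REPORT = "report"                 # public stop, tell the user
    CONTINUE_SILENTLY = "continue"    # private stop, resume the process


def classify_write_hit(
    fault_addr: int,
    watched_addr: int,
    watched_size: int,
    max_access_size: int,
    old_bytes: bytes,
    new_bytes: bytes,
    stop_on_unmodified_write: bool,
) -> StopAction:
    """Decide what to do with a mask-watchpoint hit for a 'write' watchpoint."""
    # Filter 1: the access starts too far from the user's bytes to reach them.
    earliest_start = watched_addr - (max_access_size - 1)
    if not (earliest_start <= fault_addr < watched_addr + watched_size):
        return StopAction.CONTINUE_SILENTLY
    # A changed value is always worth reporting.
    if old_bytes != new_bytes:
        return StopAction.REPORT
    # Filter 2: value unchanged; we cannot tell a same-value write to the
    # watched bytes from a nearby access, so fall back to the setting.
    return StopAction.REPORT if stop_on_unmodified_write else StopAction.CONTINUE_SILENTLY


# Watching 8 bytes at 0x100c; something writes at 0x1008 without changing them.
print(classify_write_hit(0x1008, 0x100C, 8, 16, b"\x00" * 8, b"\x00" * 8, False))
```

The last branch is where information is lost: with only the fault address and the before/after bytes, a same-value write to the watched bytes looks identical to a neighboring access.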
I’ve got the mask watchpoints working in debugserver and am trying to figure out how best to handle these follow-on issues from using this mechanism on AArch64 cleanly, and wanted to see if there are strong opinions. I think we have:
- I want a target watchpoint exception address vagueness value to be available: the max byte range before a watched memory region within which an access can still touch my watched region, and so should be associated with the watchpoint.
- A setting for whether a write watchpoint should silently continue if the watched memory is unmodified? I’m probably going to default this to “silently continue” on Darwin when we’re using mask watchpoints. It doesn’t help read watchpoints, but I think those are a lot less common. I’m sure there are use cases where people want to stop on any write, whether it wrote the same value or not (and I’d prefer preserving that behavior, but with mask watchpoints I’m struggling for any way to disambiguate between a modification of the actual memory region versus the mask address range).
- I wonder if debugserver should report the start address & length of memory that is actually being watched. Z2-4 only return “OK”, “”, or “Exx” according to the gdb remote serial protocol docs; I wonder about adding something after the “OK”. Or possibly a feature request that could be made at the beginning of the debug session, where lldb asks debugserver to report this in watchpoint set results. Knowing the actual memory region that had to be watched to cover the user’s request would help lldb in associating the exception address with the watchpoint.
I haven’t implemented any of the above yet, but I wanted to reach out and see if anyone has opinions or thoughts about these things. The first one (a large write can touch a watched memory region but report an exception address before the watched region, which lldb doesn’t correctly associate with the watchpoint) is a longstanding issue. Usually someone is watching a field of an object, they memset the object contents to zero, and the memset implementation does it as a series of large writes. (E.g. Discourse popped up a window for “similar issues” and it shows one of these: 51927 – lldb misses AArch64 stp hardware watchpoint on certain hardware (Neoverse N1).)
Despite this post being about all the drawbacks and complications of mask watchpoints, I am very interested in taking advantage of this hardware feature to allow users to watch larger objects; I think it will improve the feature for lldb.
I think @clayborg @DavidSpickett @labath @jingham might have feedback on these issues, tagging so they see it.