GDB can show register fields generated from XML, and I want to add that to LLDB.
(lldb) register read cpsr
cpsr = 0x60001000
| N | Z | C | V | TCO | DIT | UAO | PAN | SS | IL | SSBS | BTYPE | D | A | I | F | nRW | EL | SP |
| 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
(formatting very much TBC, suggestions welcome!)
The above output is what I have so far and it also works connected to a gdbserver.
(Commits · DavidSpickett/llvm-project · GitHub)
The purpose of this RFC is to get your feedback on how this feature could be added to lldb in a way that is user friendly.
What’s the benefit?
The biggest benefit is that users don’t have to leave lldb to work out what a register means.
For me this usually involves asking python to do some bit shifting that I worked out from the Arm architecture manual. It gets in the way of “what floating point mode am I in” or “what flags are set so I know what this next branch will do”.
That applies to experienced users and beginners. Seeing Arm flags really helps learning Arm assembly for example, or for people new to Arm who are porting software.
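To make the bit shifting concrete, here is a minimal Python sketch of the manual decoding this feature would replace, using the cpsr value from the example output at the top. The helper name `extract_field` is made up for illustration, and the bit positions (N=31, Z=30, C=29, V=28, SSBS=12) are assumptions taken from my reading of the Arm architecture manual, so check them against the manual before relying on them.

```python
def extract_field(value: int, start: int, end: int) -> int:
    """Return bits start..end (inclusive, start <= end) of value."""
    width = end - start + 1
    return (value >> start) & ((1 << width) - 1)


cpsr = 0x60001000

# Assumed bit positions for the condition flags (hypothetical table,
# not something lldb provides today).
NZCV_BITS = (("N", 31), ("Z", 30), ("C", 29), ("V", 28))

flags = {name: extract_field(cpsr, bit, bit) for name, bit in NZCV_BITS}
print(flags)  # {'N': 0, 'Z': 1, 'C': 1, 'V': 0}

# Single-bit SSBS field, assumed to live at bit 12.
ssbs = extract_field(cpsr, 12, 12)
print(ssbs)  # 1
```

None of this is hard, but it has to be redone for every register and every field, which is exactly the work the debugger could do instead.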
The information is readily available in the manuals and, as mentioned, is already present in GDB. If we end up using the official documentation’s descriptions there are some IP issues, but that’s a problem for later. Showing field names and values is well-trodden ground.
Now for the selfish reason. I work on Arm enablement and a bunch of features lately have boiled down to new mode bits in various registers. I’d really like to have register fields so I can call these features “supported” in lldb.
Should we follow GDB on this?
GDB’s target XML is documented here:
We already use/generate some XML in this format but just don’t include register info. Following GDB gives us the information from gdbserver and its implementations (e.g. qemu) for free.
Extending that format wouldn’t be too difficult if we needed to, with some coordination with the GDB community. For example descriptions and field enums would be great additions.
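As a rough illustration of what a flags description looks like and how little is needed to consume it, here is a Python sketch that parses a hand-written snippet in the style of GDB's target description format (a `flags` type containing `field` elements with `start` and `end` bit positions, referenced by a `reg` through its `type` attribute). The feature name, the `cpsr_flags` id and the `parse_flags` helper are all assumptions for the example, and the field list is deliberately incomplete. This is not what lldb's parser looks like.

```python
import xml.etree.ElementTree as ET

# Hand-written example XML; the names and the (partial) field list are
# assumptions for illustration only.
TARGET_XML = """
<feature name="org.example.hypothetical">
  <flags id="cpsr_flags" size="4">
    <field name="SP" start="0" end="0"/>
    <field name="EL" start="2" end="3"/>
    <field name="nRW" start="4" end="4"/>
    <field name="SSBS" start="12" end="12"/>
    <field name="V" start="28" end="28"/>
    <field name="C" start="29" end="29"/>
    <field name="Z" start="30" end="30"/>
    <field name="N" start="31" end="31"/>
  </flags>
  <reg name="cpsr" bitsize="32" type="cpsr_flags"/>
</feature>
"""


def parse_flags(xml_text: str) -> dict:
    """Map register name -> list of (field name, start bit, end bit)."""
    root = ET.fromstring(xml_text.strip())
    flag_sets = {}
    for flags in root.iter("flags"):
        flag_sets[flags.get("id")] = [
            (f.get("name"), int(f.get("start")), int(f.get("end")))
            for f in flags.iter("field")
        ]
    # Only keep registers whose type refers to a flags set we parsed.
    return {
        reg.get("name"): flag_sets[reg.get("type")]
        for reg in root.iter("reg")
        if reg.get("type") in flag_sets
    }


registers = parse_flags(TARGET_XML)
print(registers["cpsr"][1])  # ('EL', 2, 3)
```

Descriptions and field enums, if we added them, would be extra attributes or child elements on top of this, which is why coordinating with the GDB community matters.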
When should we show register fields?
My guess is only when specific registers are asked for:
- register read foo → can show fields
- register read foo bar → can show fields
- register read --all → does not show fields
We could add compact formatting for that last command. Something like GDB does:
cpsr = 0x00001000 [SSBS, nRW, EL=0]
In general I don’t want to turn some command into “spam” by printing all this when it’s not appropriate.
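One possible rule for the compact form, sketched in Python: print a single-bit field by name only when it is set, and print a multi-bit field as name=value. This is an assumption about what a sensible rule might be, not a statement of GDB's exact behaviour, and the `format_compact` helper and the partial field list are hypothetical.

```python
def format_compact(value: int, fields) -> str:
    """Format fields given as (name, start bit, end bit) on one line."""
    parts = []
    for name, start, end in fields:
        width = end - start + 1
        field_value = (value >> start) & ((1 << width) - 1)
        if width == 1:
            # Single-bit flags only appear when set.
            if field_value:
                parts.append(name)
        else:
            # Multi-bit fields always appear, with their value.
            parts.append(f"{name}={field_value}")
    return "[" + ", ".join(parts) + "]"


# Partial, assumed layout for illustration (most significant first).
FIELDS = [
    ("N", 31, 31), ("Z", 30, 30), ("C", 29, 29), ("V", 28, 28),
    ("SSBS", 12, 12), ("EL", 2, 3),
]

compact = format_compact(0x60001000, FIELDS)
print(f"cpsr = 0x{0x60001000:08x} {compact}")
# cpsr = 0x60001000 [Z, C, SSBS, EL=0]
```

Whether cleared single-bit fields should be hidden or shown is one of the formatting questions still open.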
Where does the XML info come from?
- gdbserver (when you do lldb → gdbserver)
- handcrafted XML for testing ProcessGDBRemote
- XML in lldb-server that we make by reading the architecture manuals
In the future I would like to investigate using Arm’s register XML (Exploration Tools – Arm Developer) to generate this information. There are IP issues here but it would be the most complete set we could use (perhaps some setting to point to it).
What else can we do with this information?
I would like to add a “register info” command that shows:
- the layout of the register
- its size
- its name and alternate names (and anything like “this is known to be an argument register”)
- registers that it invalidates (e.g. w0 on AArch64 invalidates x0 because they overlap)
- …and any other useful stuff we can derive.
This is a good discovery tool and would help anyone writing C code over in another editor as they can see where fields start and end (again no need to pull up the manual).
The main gap here is the lack of descriptions. For example for “cpsr”, which is actually what the architecture calls “spsr”, you would want to see “Holds the saved process state when an exception is taken to EL1.”.
This may not be possible due to IP issues given that the architecture manual is Arm’s property. There are perhaps ways to blend Arm’s register XML with ours either at build or runtime (obligatory not a lawyer, would consult one if I intended to try this).
The far future of this stuff is being able to set the fields directly. Either from the commands or from scripts. Easier said than done of course but once the information is accessible it’s a possibility.
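The bit manipulation behind setting a field is simple once the layout is known. Here is a minimal Python sketch, where `set_field` is a hypothetical helper rather than any existing lldb API and the SSBS and EL positions are the same assumed ones as before (bit 12 and bits 2 to 3). The hard parts are everything around it: writing the register back, invalidating overlapping registers, and deciding which fields are writable at all.

```python
def set_field(value: int, start: int, end: int, new: int) -> int:
    """Return value with bits start..end (inclusive) replaced by new."""
    width = end - start + 1
    if new < 0 or new >> width:
        raise ValueError(f"{new} does not fit in a {width}-bit field")
    mask = ((1 << width) - 1) << start
    # Clear the old field, then OR in the new value at the right offset.
    return (value & ~mask) | (new << start)


cpsr = 0x60001000

# Clear the single-bit SSBS field (assumed to be bit 12).
cleared = set_field(cpsr, 12, 12, 0)
print(hex(cleared))  # 0x60000000

# Write 1 to the two-bit EL field (assumed to be bits 2..3).
el1 = set_field(cpsr, 2, 3, 1)
print(hex(el1))  # 0x60001004
```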
What’s my initial goal?
To get lldb and lldb-server to display XML-described “flags” and to describe the most common AArch64 control registers. So far I have done this for one register, “cpsr”.
What are the other pitfalls?
How do we generate accurate XML? If, for example, a bit was added in armv8.7-a, how can we tell you are using a v8.7-a device? Or do we just show all bits defined in the latest architecture? In other words, if you saw the name of that bit shown, would you assume your device had that feature?
GDB says no, you shouldn’t make that assumption. Showing every defined bit certainly makes life easier and avoids situations where userspace simply cannot tell if a feature is enabled without running code. I tend to agree for the same reasons.
Which registers should have this information? Arm’s register files include every single system register and clearly 99% of the time we don’t want that (though kernel debugging would be interesting for that). I propose sticking to the common control registers and pseudo registers an OS like Linux provides.
(and anyone would be able to add new registers if they disagree about what is “interesting” enough)
Should this information be generated on a per-OS level, or start from some core set and be augmented? For example macOS and Linux probably both provide a floating point control register, but only Linux is going to show the memory tagging control pseudo register.
I think the latter though I’m not sure what the code will look like yet.
Is this Arm only?
No, I just happen to be working on Arm enablement. The XML content is the target specific bit, the processing is all generic. After my changes you should be able to connect lldb to, for example, a PowerPC gdbserver and see any info it sends.