RegisterContextPOSIX_i386

We’re currently using PTRACE_GETREGS in ProcessMonitor which (as has been pointed out) always returns the 64-bit register structure if called from a 64-bit debugger even if the target is 32-bit. This is why the RegisterContextPOSIX code is based on host::arch and tries to do the 64 <-> 32 bit register conversion jig.

In Linux 2.6.34, PTRACE_GETREGSET support was committed, which allows us to get correctly sized register set information. Checkin information is down below. (I also wrote a test ptrace program and it does appear to work.) We’re already using PTRACE_GETREGSET elsewhere, so I think this requirement should be fine, and switching to it should give us the correct 32-bit registers for a 32-bit debuggee with a 64-bit debugger:

    ptrace(PTRACE_GETREGSET, child, NT_PRSTATUS, &regs_vec);
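A slightly fuller sketch of that call, for illustration only. It assumes Linux on x86, an already-stopped tracee, and that the kernel reports the number of bytes it wrote back through `iov_len` (68 bytes for the i386 general-purpose block, 216 for x86_64). `ReadGprs`, `ClassifyGprSize` and `InferiorBits` are hypothetical names, not LLDB APIs.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <elf.h>         // NT_PRSTATUS
#include <sys/ptrace.h>  // ptrace, PTRACE_GETREGSET
#include <sys/types.h>   // pid_t
#include <sys/uio.h>     // struct iovec

// Sizes of the Linux general-purpose register blocks:
// i386 has 17 32-bit slots, x86_64 has 27 64-bit slots.
constexpr std::size_t kGprSizeI386 = 17 * sizeof(std::uint32_t);    // 68 bytes
constexpr std::size_t kGprSizeX86_64 = 27 * sizeof(std::uint64_t);  // 216 bytes

enum class InferiorBits { k32, k64, kUnknown };

// Classify the inferior from the number of bytes the kernel wrote.
InferiorBits ClassifyGprSize(std::size_t bytes_written) {
  if (bytes_written == kGprSizeI386) return InferiorBits::k32;
  if (bytes_written == kGprSizeX86_64) return InferiorBits::k64;
  return InferiorBits::kUnknown;
}

// Hypothetical helper: fetch the GPRs of a stopped tracee into buf.
// Returns false if ptrace fails; on success *bits_out reports the layout.
bool ReadGprs(pid_t tid, void *buf, std::size_t buf_size,
              InferiorBits *bits_out) {
  struct iovec regs_vec;
  regs_vec.iov_base = buf;
  regs_vec.iov_len = buf_size;  // in: capacity, out: bytes actually written
  // The regset type travels in the "addr" argument, so pass it pointer-sized.
  void *regset = reinterpret_cast<void *>(
      static_cast<std::uintptr_t>(NT_PRSTATUS));
  if (ptrace(PTRACE_GETREGSET, tid, regset, &regs_vec) == -1)
    return false;
  *bits_out = ClassifyGprSize(regs_vec.iov_len);
  return true;
}
```

Passing a buffer large enough for the 64-bit layout and then inspecting `iov_len` lets one code path serve both kinds of debuggee.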

The RegisterContextPOSIX_i386 code is fairly sparse right now. There is no core file or watchpoint support, etc.

And now my question. :) Does this plan make sense?

  1. Copy all the x64 register context posix files over to i386. (I.e., RegisterContextPOSIX_x86_64.* → RegisterContextPOSIX_i386.*, etc.)
  2. Remove the 32-bit register conversion code in the 64-bit code.
  3. Remove the 64-bit registers, etc. from the 32-bit code.
  4. Modify POSIXThread::GetRegisterContext() to check the debuggee architecture instead of the host.

Any feedback or pointers before I start tackling this would be great. Thanks!
-Mike
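Step 4 of the plan above could look roughly like the following. This is only a sketch of the dispatch idea: the class names and the `Arch` enum are hypothetical stand-ins, not the real POSIXThread or RegisterContext types.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical stand-ins for the real classes; only the dispatch matters.
enum class Arch { i386, x86_64 };

struct RegisterContext {
  virtual ~RegisterContext() = default;
  virtual std::string Name() const = 0;
};

struct RegisterContextI386 : RegisterContext {
  std::string Name() const override { return "i386"; }
};

struct RegisterContextX86_64 : RegisterContext {
  std::string Name() const override { return "x86_64"; }
};

// Choose the context from the debuggee's architecture.
// The host architecture is deliberately not consulted.
std::unique_ptr<RegisterContext> MakeRegisterContext(Arch target_arch) {
  switch (target_arch) {
  case Arch::i386:
    return std::make_unique<RegisterContextI386>();
  case Arch::x86_64:
    return std::make_unique<RegisterContextX86_64>();
  }
  return nullptr;  // not reached for valid enum values
}
```

With this shape, a 64-bit debugger attached to a 32-bit inferior gets the i386 context, which is the behaviour the plan is after.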

FYI, the x86_64 build of LLDB will have limited success with i386 inferiors, but it’s a pretty good starting point (i.e. test/functionalities/registers passes). In this case, ptrace calls populate the 64-bit register set, and the RegisterContext_x86_64 class uses offsetof to associate the i386 register set with the least-significant bytes of the associated 64-bit registers. However, this isn’t correct because “The DWARF and GCC register numbers need to use the i386 register numbering schemes otherwise all info parsed from EH frame and DWARF will be incorrect when they don't match up. – Greg Clayton”.

The i386 build of LLDB should use RegisterContext_i386. Similarly, a future remote i386 target should use RegisterContext_i386. However, this class is just stubbed in.

Sounds correct to me. Just be sure to convert the correct register numbers for the 32 bit stuff (which you probably already have for 32 bit on 64 bit machines). The biggest issue to watch out for is the register numbers for ESP and EBP. They are reversed between the DWARF and GCC register numbers.
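To make the ESP/EBP point concrete, here is a small sketch of a per-register numbering table. The DWARF column follows the System V i386 numbering (esp = 4, ebp = 5); the GCC column assumes the swapped eh_frame numbering described above (ebp = 4, esp = 5). The table layout and `ConvertToInternal` are hypothetical, and the table index simply stands in for an internal register number.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

constexpr std::uint32_t kInvalidRegNum = UINT32_MAX;

enum class RegKind { GCC, DWARF };

struct I386RegNums {
  const char *name;
  std::uint32_t gcc;    // eh_frame numbering (assumed: ebp = 4, esp = 5)
  std::uint32_t dwarf;  // System V i386 DWARF numbering (esp = 4, ebp = 5)
};

// The index into this table doubles as the internal register number here.
constexpr I386RegNums kI386Gprs[] = {
    {"eax", 0, 0}, {"ecx", 1, 1}, {"edx", 2, 2}, {"ebx", 3, 3},
    {"esp", 5, 4}, {"ebp", 4, 5}, {"esi", 6, 6}, {"edi", 7, 7},
};

// Map a GCC or DWARF register number to the internal table index.
std::uint32_t ConvertToInternal(RegKind kind, std::uint32_t num) {
  std::uint32_t index = 0;
  for (const I386RegNums &reg : kI386Gprs) {
    if ((kind == RegKind::GCC ? reg.gcc : reg.dwarf) == num) return index;
    ++index;
  }
  return kInvalidRegNum;
}
```

The same raw number 4 resolves to a different register depending on which numbering scheme the section being parsed uses, which is why the two columns cannot be shared.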

Ditto on the "sounds good", thanks for taking this on.

One point to consider is if there is scope for some common code between architectures. Note that the list of architectures will only grow (e.g. x32). A future point is to keep POSIX-isms nicely contained. When considering platform-independent remote debugging that is consistent with native-local debugging, we'll want code consistent between platforms to live in one place.

Cheers,

- Ashok

Here is a first pass at this:

http://llvm-reviews.chandlerc.com/D1798

It passes all the 64-bit linux tests, although there is still a bit of cleanup I need to do.

  • FreeBSD is most likely busted… (I’ll contact Ed about working on this when it’s a bit more nailed down.)
  • I need to check and fix dwarf / gdb constant values.
  • I don’t like the ConvertRegisterKindToRegisterNumber() routines.
  • Also don’t like the RegisterContextPOSIXProcessMonitor_x86_64::ReadRegister() / WriteRegister() routines.
  • Need to implement RegisterContextPOSIX_i386* so 32-bit LLDB will fully work. (This may come in a second checkin).
  • Would love to have xmm00, xmm01, etc. type aliases for mmx, sse, and avx registers.

If anyone has any general feedback on how any of this looks and/or ideas on more cleanup, please fire away.

Thanks much.
-Mike

Looks good.

More comments below...

> - I don't like the ConvertRegisterKindToRegisterNumber() routines.

Do you not like how they are implemented in the register context classes for posix/linux/freebsd, or are you questioning their need?

They are needed for parsing DWARF and EH frame information and any other object file sections that contain register numbers in them. We can probably automate the ConvertRegisterKindToRegisterNumber() up into the RegisterContext base class so that it uses the RegisterInfo data to populate lookup tables, but then we might need a finalize call to let the base class know that it is ok to go ahead and compute the lookup tables.
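A rough sketch of what that base-class automation might look like, assuming a simplified `RegInfo` with one number per register kind. None of these names or signatures are the real RegisterContext or RegisterInfo API; the point is only the "add registers, then finalize, then look up" ordering.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>
#include <vector>

constexpr std::uint32_t kInvalidRegNum = UINT32_MAX;

enum RegKind { eKindGCC = 0, eKindDWARF, eKindGDB, kNumKinds };

// Simplified, hypothetical register description: one number per kind.
struct RegInfo {
  const char *name;
  std::uint32_t kinds[kNumKinds];
};

class RegisterContextBase {
public:
  void AddRegister(const RegInfo &info) {
    assert(!m_finalized && "registers must be added before Finalize()");
    m_infos.push_back(info);
  }

  // Tells the base class the register list is complete so it can
  // compute one lookup table per register kind.
  void Finalize() {
    for (std::uint32_t idx = 0; idx < m_infos.size(); ++idx)
      for (int kind = 0; kind < kNumKinds; ++kind)
        if (m_infos[idx].kinds[kind] != kInvalidRegNum)
          m_lookup[kind][m_infos[idx].kinds[kind]] = idx;
    m_finalized = true;
  }

  // Returns the internal index, or kInvalidRegNum if unknown or not finalized.
  std::uint32_t ConvertRegisterKindToRegisterNumber(RegKind kind,
                                                    std::uint32_t num) const {
    if (!m_finalized) return kInvalidRegNum;
    auto it = m_lookup[kind].find(num);
    return it == m_lookup[kind].end() ? kInvalidRegNum : it->second;
  }

private:
  std::vector<RegInfo> m_infos;
  std::unordered_map<std::uint32_t, std::uint32_t> m_lookup[kNumKinds];
  bool m_finalized = false;
};
```

Subclasses would then only supply register data, and the per-architecture switch statements would no longer be needed.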

> - Also don't like the RegisterContextPOSIXProcessMonitor_x86_64::ReadRegister() / WriteRegister() routines.

Again, do you not like how they are implemented, or would you rather see them go away? I tried to add flexibility to the register contexts so you can read/write all registers at once, or read/write single registers since things like GDB remote might support one, the other, or both. Many JTAG debuggers also read/write registers individually and it can impose quite a performance penalty to force reading/writing all registers at once (all GPRs, all FPUs, etc).
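The trade-off can be illustrated with a toy backend that counts how many register values cross the wire. Everything here is hypothetical (`FakeBackend`, `RegisterReader`); it only sketches the idea of preferring a single-register read when the transport supports it and falling back to one cached bulk read otherwise.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical transport that counts transferred register values.
struct FakeBackend {
  std::vector<std::uint64_t> regs;
  bool supports_single;
  std::size_t values_transferred = 0;

  bool ReadOne(std::size_t idx, std::uint64_t *out) {
    if (!supports_single || idx >= regs.size()) return false;
    *out = regs[idx];
    values_transferred += 1;
    return true;
  }

  std::vector<std::uint64_t> ReadAll() {
    values_transferred += regs.size();
    return regs;
  }
};

class RegisterReader {
public:
  explicit RegisterReader(FakeBackend &backend) : m_backend(backend) {}

  // Prefer a single-register read; otherwise do one bulk read and cache it.
  // Cache invalidation (e.g. when the thread resumes) is left out of this sketch.
  bool Read(std::size_t idx, std::uint64_t *out) {
    if (m_backend.ReadOne(idx, out)) return true;
    if (m_cache.empty()) m_cache = m_backend.ReadAll();
    if (idx >= m_cache.size()) return false;
    *out = m_cache[idx];
    return true;
  }

private:
  FakeBackend &m_backend;
  std::vector<std::uint64_t> m_cache;
};
```

On a backend with cheap individual reads, fetching one register moves one value; on a bulk-only backend the first read pays for the whole set and later reads are served from the cache.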

> - Need to implement RegisterContextPOSIX_i386* so 32-bit LLDB will fully work. (This may come in a second checkin).
> - Would love to have xmm00, xmm01, etc. type aliases for mmx, sse, and avx registers.

Is this more than filling out the "alt_name" field? Or is this more like the "eax" register that is part of "rax"?

> - I don't like the ConvertRegisterKindToRegisterNumber() routines.

> Do you not like how they are implemented in the register context classes for posix/linux/freebsd, or are you questioning their need?
>
> They are needed for parsing DWARF and EH frame information and any other object file sections that contain register numbers in them. We can probably automate the ConvertRegisterKindToRegisterNumber() up into the RegisterContext base class so that it uses the RegisterInfo data to populate lookup tables, but then we might need a finalize call to let the base class know that it is ok to go ahead and compute the lookup tables.

I didn't like how they were implemented with the architecture switch
statement and then two fairly large individual register name case
statements. I fixed that in "diff 3 & 4" up on the chandlerc site.

I'm glad I said that though - your description of how they are utilized is
quite useful. Ok if I put that in as a code comment above those routines?

> - Also don't like the RegisterContextPOSIXProcessMonitor_x86_64::ReadRegister() / WriteRegister() routines.
>
> Again, do you not like how they are implemented, or would you rather see them go away? I tried to add flexibility to the register contexts so you can read/write all registers at once, or read/write single registers since things like GDB remote might support one, the other, or both. Many JTAG debuggers also read/write registers individually and it can impose quite a performance penalty to force reading/writing all registers at once (all GPRs, all FPUs, etc).

This is good information also.

> - Need to implement RegisterContextPOSIX_i386* so 32-bit LLDB will fully work. (This may come in a second checkin).
> - Would love to have xmm00, xmm01, etc. type aliases for mmx, sse, and avx registers.
>
> Is this more than filling out the "alt_name" field? Or is this more like the "eax" register that is part of "rax"?

It would be more eax part of rax type thing. A more general question is how
do folks look at these SSE and AVX registers on lldb? On gdb, we get
something like this by default. I haven't found an easy way to view these
registers like that with lldb - I'm probably missing something though.

(gdb) p $xmm0
$1 = {
  v4_float = {9.14767638e-41, 0, 0, 0},
  v2_double = {3.2252605360516574e-319, 0},
  v16_int8 = {0, -1, 0 <repeats 14 times>},
  v8_int16 = {-256, 0, 0, 0, 0, 0, 0, 0},
  v4_int32 = {65280, 0, 0, 0},
  v2_int64 = {65280, 0},
  uint128 = 65280
}

gdb also does this for the flag registers:

eflags 0x202 [ IF ]
mxcsr 0x1f80 [ IM DM ZM OM UM PM ]

Which can be quite useful.
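For reference, that flag display amounts to a table of bit positions and names. The sketch below is a minimal illustration, not gdb's or LLDB's code: `FlagBit` and `DecodeFlags` are made-up names, and only single-bit fields are listed (multi-bit fields such as IOPL in EFLAGS and RC in MXCSR are left out).

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

struct FlagBit {
  unsigned bit;
  const char *name;
};

// Single-bit EFLAGS fields.
const std::vector<FlagBit> kEflagsBits = {
    {0, "CF"}, {2, "PF"}, {4, "AF"},  {6, "ZF"},  {7, "SF"},
    {8, "TF"}, {9, "IF"}, {10, "DF"}, {11, "OF"},
};

// MXCSR exception flags, DAZ, exception masks and FZ.
const std::vector<FlagBit> kMxcsrBits = {
    {0, "IE"},  {1, "DE"},  {2, "ZE"},  {3, "OE"},  {4, "UE"},
    {5, "PE"},  {6, "DAZ"}, {7, "IM"},  {8, "DM"},  {9, "ZM"},
    {10, "OM"}, {11, "UM"}, {12, "PM"}, {15, "FZ"},
};

// Render the set bits as "[ NAME NAME ]", lowest bit first.
std::string DecodeFlags(std::uint32_t value, const std::vector<FlagBit> &bits) {
  std::string out = "[ ";
  for (const FlagBit &flag : bits) {
    if (value & (1u << flag.bit)) {
      out += flag.name;
      out += ' ';
    }
  }
  out += ']';
  return out;
}
```

Feeding it the two values from the gdb output above, 0x202 and 0x1f80, yields `[ IF ]` and `[ IM DM ZM OM UM PM ]` respectively.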

I'm having trouble getting FreeBSD to build at the moment and am working
with Ed on that. As soon as I get things verified there, I'll check this in.

Thanks for the help Greg.
-Mike

> I'm glad I said that though - your description of how they are utilized is quite useful. Ok if I put that in as a code comment above those routines?

Sure thing.

> It would be more eax part of rax type thing. A more general question is how do folks look at these SSE and AVX registers on lldb? On gdb, we get something like this by default. I haven't found an easy way to view these registers like that with lldb - I'm probably missing something though.

The way I see this happening on LLDB is to be allowed to give a snippet of code that describes the register as a type. We then take this and parse it with the expression parser and then display the variable using this type.

So we could have a code snippet like:

'''
struct XMMValue {
    union float32 { float floats[4]; };
    union float64 { double doubles[2]; };
    ...
};
'''

Then the qRegisterInfo packets could specify the type of the register with: "display-type:XMMValue;". The code snippet above could also contain enumerations and other types to make the display of registers more clear. Something for the ARM CPSR register could be:

'''
enum Mode {
    User = 0x10,
    FIQ = 0x11,
    IRQ = 0x12,
    Supervisor = 0x13,
    Abort = 0x17,
    Undefined = 0x1b,
    System = 0x1f
};

struct CPSR {
    ...
    Mode mode;
};
'''
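As a hedged illustration of what such a CPSR type would let a debugger show, the sketch below decodes a raw value with explicit masks rather than bitfields (bitfield layout is implementation-defined). The mode values are the ones from the snippet above; the N/Z/C/V flags sit in bits 31..28 and the mode in bits 4..0. `DecodeCPSR` and `ModeName` are hypothetical helpers.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

enum Mode : std::uint32_t {
  User = 0x10,
  FIQ = 0x11,
  IRQ = 0x12,
  Supervisor = 0x13,
  Abort = 0x17,
  Undefined = 0x1b,
  System = 0x1f
};

struct CPSR {
  bool n, z, c, v;  // condition flags, CPSR bits 31..28
  Mode mode;        // processor mode, CPSR bits 4..0
};

// Split a raw CPSR value into the fields a formatter would display.
CPSR DecodeCPSR(std::uint32_t raw) {
  CPSR out;
  out.n = (raw >> 31) & 1u;
  out.z = (raw >> 30) & 1u;
  out.c = (raw >> 29) & 1u;
  out.v = (raw >> 28) & 1u;
  out.mode = static_cast<Mode>(raw & 0x1fu);
  return out;
}

const char *ModeName(Mode mode) {
  switch (mode) {
  case User: return "User";
  case FIQ: return "FIQ";
  case IRQ: return "IRQ";
  case Supervisor: return "Supervisor";
  case Abort: return "Abort";
  case Undefined: return "Undefined";
  case System: return "System";
  }
  return "unknown";  // mode encodings not listed in the enum
}
```

A register formatter built on a type like this could print the mode by name instead of leaving the reader to decode the low five bits by hand.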

So the above code snippets could be supplied by the register context plug-ins and parsed into the target's scratch AST and then used for registers. They should probably be hidden in a namespace or something to avoid collisions with the user code.
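A sketch of how such a namespaced type could reproduce the gdb-style views of an XMM register. The namespace name and the `As<T>()` accessor are invented for illustration, the views are read with `memcpy` to stay within C++ aliasing rules, and the expected values assume a little-endian host.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

namespace lldb_register_types {  // hypothetical name, chosen to avoid user-code collisions

struct XMMValue {
  std::uint8_t bytes[16];

  // Read element `index` of the register viewed as an array of T.
  // The caller must keep index below 16 / sizeof(T).
  template <typename T> T As(std::size_t index) const {
    static_assert(16 % sizeof(T) == 0, "T must evenly divide 128 bits");
    T value;
    std::memcpy(&value, bytes + index * sizeof(T), sizeof(T));
    return value;
  }
};

}  // namespace lldb_register_types
```

With only byte 1 set to 0xff, the int8, int16, int32 and int64 views of the first elements come out as -1, -256, 65280 and 65280, matching the gdb output quoted earlier in the thread.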

To state my "code snippet" idea a bit more clearly:

RegisterContext subclasses should be able to evaluate expressions using the target's scratch AST context to define types that can be used to display register contents. The expressions can include enumerations, struct definitions, unions, bitfields and anything else to help display a variable more clearly. This avoids using any fancy XML or other descriptive languages and will allow the developers to use languages they are familiar with to make the formatting for registers. The work that would need to be done:
- Add code to the target to parse and keep persistent types around that can be used for formatting by parsing code snippets.
- Allow registers to specify a type by name that can then be used from the above code snippets.
- Modify the register ValueObjects to be able to display this type data correctly.

I like that idea, since it will save the redundant effort when
implementing RegisterContext for a new architecture.