Rosetta debugserver details?

movax-13h · April 23, 2024, 5:36am

Hi everyone,

I was just wondering if anyone could point me to details about the debugserver built to run when debugging an x64 app on arm64?

I saw Jason post a tidbit here** and went to poke around the lldb source tree to see if I could get some pointers for a small hobby project of my own, but, it looks like there isn’t much referring to either Rosetta, or “running in translation”. There was some in DNB.cpp and debugserver.cpp, but, seemed to be mostly checks to launch the special debugserver. I’m going to do a more thorough review, but, wanted to ask here before I did.

My interest is largely for a toy debugger I’m working on. I recently upgraded to a new M2 Air, and was going to set about updating the project to work on arm64, but, thought I’d first check to see if I could do x86_64 development on here (and just break out the ancient Intel laptop for validation.)

Anyway, long story short, it doesn’t seem to work at all when running under Rosetta 2; specifically, writing memory to insert the INT3 trap instruction has no effect. All the calls appear successful, but when executing the program, none of the traps seem to have actually been written (though reading back the memory does show them).

I did notice that I never got the Task Permission pop-up (or don’t remember seeing it) you get when trying to get a task from a PID for the first time.

I also noted that if I emit an INT3 with inline assembly when building the inferior, that does get hit. So possibly it’s a JIT issue in the Rosetta translation layer? Or maybe my vm write isn’t being committed somehow for some reason?

I have a minimal repro, if that would help. If so, I could put it up on GitHub, or possibly zip it here? Let me know, and I’ll pop back in to provide that.

** How does LLDB filter out Rosetta threads in x86_64 targets on Apple Silicon? - #3 by jasonmolenda

movax-13h · April 25, 2024, 1:47am

Added the repro to a repository below, in case anyone gets a chance to take a peek.

https://github.com/michael-dwan/for-questions

in the /rosetta-debugserver sub directory. Includes a README with steps, but, should be able to:

git clone git@github.com:michael-dwan/for-questions.git
arch -x86_64 zsh
cd for-questions/rosetta-debugserver
mkdir build && cd build
cmake .. && make
objdump -d inferior/inferior # To get an instruction address
./rosetta-repro ADDRESS

Assumes you have a cert named lldb_codesign already in your Keychain (figure most people here do?)

jasonmolenda · April 25, 2024, 2:36am

I’m not completely clear on what you’ve done with the rosetta-debugserver example. It looks like you’re attaching to a process running under Rosetta and manually inserting a 0xcc break instruction in memory? Rosetta presents a fiction that an x86_64 process is running on an arm64 Mac, but it is not actually executing x86_64 instructions. If there was an asm("int $3") in the original program, that would be faithfully emulated by (probably something like) brk #1 in arm64 assembly when it goes to be executed. But trying to modify the instruction stream externally like this will not work.
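For context, the approach the repro takes is the usual software-breakpoint technique on a native x86_64 process: save the original byte at the target address, overwrite it with 0xcc, and put the original byte back when the breakpoint is removed. Below is a minimal sketch of that bookkeeping, assuming a plain bytearray standing in for the inferior's memory. The class and method names are hypothetical, and a real stub on Darwin would go through the Mach VM calls instead. Under Rosetta this patching has no effect on execution, for the reason described above.

```python
INT3 = 0xCC  # x86 one-byte breakpoint instruction


class SoftwareBreakpoints:
    """Hypothetical bookkeeping for byte-patched breakpoints.

    `memory` is a local bytearray standing in for inferior memory
    (an assumption for illustration; no process is touched here).
    """

    def __init__(self, memory: bytearray, base: int) -> None:
        self.memory = memory
        self.base = base      # address of memory[0]
        self.saved = {}       # address -> original byte

    def insert(self, addr: int) -> None:
        if addr in self.saved:
            return            # already patched, keep the first saved byte
        off = addr - self.base
        self.saved[addr] = self.memory[off]
        self.memory[off] = INT3

    def remove(self, addr: int) -> None:
        off = addr - self.base
        self.memory[off] = self.saved.pop(addr)


# push rbp; mov rbp, rsp; pop rbp; ret
code = bytearray.fromhex("554889e55dc3")
bps = SoftwareBreakpoints(code, base=0x1000)
bps.insert(0x1001)
patched = bytes(code)
bps.remove(0x1001)
restored = bytes(code)
```

After the insert, the first byte of the mov instruction has been replaced by 0xcc; after the remove, the buffer matches the original bytes again.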

As you noted, there is a special debugserver for Rosetta processes, and it knows how to read the Rosetta internal data structures to present a view of the process AS IF it was actually executing x86_64 instructions and had x86_64 registers, but of course none of that is true. If you’d like to manipulate a process running under Rosetta, I would do it via the Rosetta debugserver, which knows how to do things like insert a breakpoint in a process.

jasonmolenda · April 25, 2024, 2:44am

If you’d like to control the rosetta debugserver programmatically, it implements the standard gdb remote serial protocol (same as debugserver) – you can see the packets by doing log enable gdb-remote packets before you start debugging, and you’ll see the packets for inserting/removing breakpoints, instruction stepping, etc. All of the important ones are industry standard, and there’s documentation on the lldb extensions (which are mostly optional, for performance). Outside of the rosetta debugserver, the code running inside Rosetta isn’t inspectable/controllable.
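As a rough illustration of what those packets look like on the wire, here is a minimal sketch of gdb remote serial protocol framing in Python. A packet is "$", the payload, "#", and a two-digit hex checksum (the sum of the payload bytes modulo 256). The Z0 and z0 packets insert and remove a software breakpoint. The function names are made up for this example, and it skips acks, escaping and run-length encoding, so treat it as a sketch rather than a working client.

```python
def rsp_checksum(payload: bytes) -> int:
    """Checksum used by the gdb remote protocol: byte sum modulo 256."""
    return sum(payload) % 256


def rsp_frame(payload: str) -> bytes:
    """Wrap a payload as $<payload>#<two hex digits>.

    Assumes the payload contains no characters that need escaping.
    """
    data = payload.encode("ascii")
    return b"$" + data + b"#" + f"{rsp_checksum(data):02x}".encode("ascii")


def rsp_unframe(packet: bytes) -> str:
    """Validate the framing and checksum, then return the payload."""
    if len(packet) < 4 or not packet.startswith(b"$") or packet[-3:-2] != b"#":
        raise ValueError("malformed packet")
    data = packet[1:-3]
    if rsp_checksum(data) != int(packet[-2:], 16):
        raise ValueError("checksum mismatch")
    return data.decode("ascii")


def insert_breakpoint(addr: int, kind: int = 1) -> bytes:
    """Z0,<addr>,<kind>: insert a software breakpoint (kind 1 on x86_64)."""
    return rsp_frame(f"Z0,{addr:x},{kind}")


def remove_breakpoint(addr: int, kind: int = 1) -> bytes:
    """z0,<addr>,<kind>: remove a software breakpoint."""
    return rsp_frame(f"z0,{addr:x},{kind}")


if __name__ == "__main__":
    # The address is an arbitrary example value.
    print(insert_breakpoint(0x100003F80).decode("ascii"))
    print(remove_breakpoint(0x100003F80).decode("ascii"))
```

Comparing the output against what log enable gdb-remote packets prints is an easy way to check the framing.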

movax-13h · April 25, 2024, 4:12am

Thanks @jasonmolenda,

Sorry for the confusion on that. You’re right about what the repro was trying to do, though.

I guess I thought maybe there was some additional JIT involved while running under Rosetta for the Rosetta debugserver, since I didn’t see much in the lldb source…

My main goal is to learn debugger internals. I actually have my own “debugserver” that works on Intel Macs (and soon Apple Silicon), so, controlling the Rosetta debugserver isn’t especially interesting in that sense, though, I have written GDB RSP clients/servers as part of my learning. What I was really hoping for was to implement my own rosetta debugserver so that I could do the majority of my work on my M2 Air, and just use my Intel Mac for testing completed features (make sure they run on real Intel hardware).

I think my questions in this case are (if I could trouble you a little longer):

I’m guessing that lldb’s special debugserver is not open source, is that correct?
Would there be any open documents or source that might give some pointers to the underlying details? Or maybe this is just a dead end?

Thank you for taking the time to help.

jasonmolenda · April 25, 2024, 4:25am

Ah, I understand what you’re doing. It’s really impressive to write a separate gdb RSP stub from scratch on Darwin; congratulations on getting that working!

The internals of Rosetta are not publicly documented, their debugserver has to know all of the internal implementation details of how the instruction set is emulated to present the fake view that it’s x86_64. I don’t think you can work on writing a debug stub that debugs x86_64 processes on an Apple silicon system. I believe the team that implemented the Rosetta debugserver started with the same public debugserver sources, and then added the internals for Rosetta on top of it - I don’t know anything about the details, but it’s pretty different from the standard debugserver.

(I was surprised you knew all the mach calls to insert that int $3 in a process in your example program, that’s not something many people know how to do, it makes a lot more sense now.)

The x86_64 / aarch64 specific parts of the debug stub are not a lot of code, if it can work with the goals of your project, you might want to port your stub to support arm64 debugging and debug natively on Apple silicon computers, I think it will be a simpler way to go, and have a longer lifespan of usability.
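To give a sense of how small the architecture-specific part can be, here is a sketch of the kind of per-architecture table a stub might keep: the trap opcode to write, and how far to adjust the reported PC after the trap fires. The names are hypothetical. The values reflect the usual conventions (0xcc on x86_64, where the PC is reported one byte past the trap, and brk #0 on arm64, where the PC stays on the trap instruction), so check them against your own stub before relying on them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrapInfo:
    opcode: bytes    # bytes written over the original instruction
    pc_adjust: int   # added to the reported PC to get the breakpoint address


# Hypothetical table; the keys are just labels chosen for this example.
TRAPS = {
    # int3: one byte, PC is reported after the trap instruction.
    "x86_64": TrapInfo(opcode=bytes([0xCC]), pc_adjust=-1),
    # brk #0: 0xd4200000 stored little-endian, PC stays on the trap.
    "arm64": TrapInfo(opcode=(0xD4200000).to_bytes(4, "little"), pc_adjust=0),
}


def breakpoint_address(arch: str, reported_pc: int) -> int:
    """Map the PC seen after a trap back to the patched address."""
    return reported_pc + TRAPS[arch].pc_adjust
```

Most of the remaining per-architecture work in a stub tends to be the register layout, which is not shown here.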

movax-13h · April 25, 2024, 4:55am

Thanks Jason!

> That’s really impressive to write a separate gdb RSP stub from scratch on Darwin, congratulations on getting that working!

It definitely helped that most of the RSP stuff is more or less documented. It was definitely the Mach kernel stuff that was hardest to dredge up. It did help the XNU source was open, and in hindsight I would have saved a lot of time reviewing lldb sooner. Spelunking the old lldb list helped, too, already learned lots from old posts of yours, and Greg Clayton especially.

Hopefully I’ll be able to parlay the learning into helping out the lldb project and community here.

> The internals of Rosetta are not publicly documented, their debugserver has to know all of the internal implementation details of how the instruction set is emulated to present the fake view that it’s x86_64. I don’t think you can work on writing a debug stub that debugs x86_64 processes on an Apple silicon system. I believe the team that implemented the Rosetta debugserver started with the same public debugserver sources, and then added the internals for Rosetta on top of it - I don’t know anything about the details, but it’s pretty different from the standard debugserver.

Ah, thanks for confirming (and sharing the details you’re able to). Makes sense, and it sounds like I’d spend more time reversing than I’d save switching between my two laptops, and that is very generously assuming I’d have any success trying it.

> (I was surprised you knew all the mach calls to insert that int $3 in a process in your example program, that’s not something many people know how to do, it makes a lot more sense now.)

The mach stuff was definitely tough to find info on. And the code signing, if I hadn’t tripped on the scripts in the lldb repo, I’d probably still be trying to turn a pid into a task.

> The x86_64 / aarch64 specific parts of the debug stub are not a lot of code, if it can work with the goals of your project, you might want to port your stub to support arm64 debugging and debug natively on Apple silicon computers, I think it will be a simpler way to go, and have a longer lifespan of usability.

Yeah, 100% agree here. The overarching plan is to bring everything up on arm64; maintaining the Intel side is mostly an exercise in managing multiple architectures in the code base after that.

Thanks again for explaining everything so thoroughly. I know it’s a little off-topic, so, I really appreciate you taking that time =)
