Cannot use system debugserver for testing

Hi Stefan,

Since the commit
“[CMake] Always build debugserver on Darwin and allow tests to use the system’s one”
I cannot use the system debugserver for testing.
I receive the following error message from lldb when I execute “ninja check-lldb”:

runCmd: run
runCmd failed!
error: process exited with status -1 (Error 1)

I do set up “-DLLDB_USE_SYSTEM_DEBUGSERVER=ON” with cmake so I see

-- LLDB tests use out-of-tree debugserver: /Library/Developer/CommandLineTools/Library/PrivateFrameworks/LLDB.framework/Resources/debugserver

Also, I have inspected the following test output

Command invoked: /usr/bin/python /Users/egbomrt/llvm2/git/llvm/tools/lldb/test/dotest.py -q --arch=x86_64 -s /Users/egbomrt/llvm2/build/release_assert/lldb-test-traces --build-dir /Users/egbomrt/llvm2/build/release_assert/lldb-test-build.noindex -S nm -u CXXFLAGS -u CFLAGS --executable /Users/egbomrt/llvm2/build/release_assert/./bin/lldb --dsymutil /Users/egbomrt/llvm2/build/release_assert/./bin/dsymutil --filecheck /Users/egbomrt/llvm2/build/release_assert/./bin/FileCheck -C /Users/egbomrt/llvm2/build/release_assert/bin/clang --codesign-identity - --out-of-tree-debugserver --arch x86_64 -t --env TERM=vt100 -p TestCModules.py --results-port 49931 -S nm --inferior -p TestCModules.py /Users/egbomrt/llvm2/git/llvm/tools/lldb/packages/Python/lldbsuite/test/lang/c/modules --event-add-entries worker_index=0:int
1 out of 736 test suites processed - TestCModules.py

so it seems like the argument for --out-of-tree-debugserver is missing…

Could you please advise?

Thank you,
Gabor

This might not be related to the debugserver, I just realized that I get
“error: process exited with status -1 (Error 1)”

even with the simplest main.c.
This may be some kind of security issue on macOS…
Though I’ve checked and I have SIP disabled and I have executed “sudo /usr/sbin/DevToolsSecurity --enable”.

Actually, it is embarrassing (perhaps for macOS and not for me) that after a reboot the problem is gone.
Perhaps after “sudo /usr/sbin/DevToolsSecurity --enable” a reboot is required, but I could not find anything official about that.

I think this was because of a change in LLVM that broke codesigning of debugserver: https://reviews.llvm.org/D64965

Hi Gábor, I am sorry this caused an issue for you. Good that apparently it’s resolved now.

Did you reconfigure an existing build tree? Your observations would make sense in this context, because the change affects CMake cached variables. This is unfortunate, but cannot always be avoided. If this happens again (or to anyone else), a clean build is a good first step.
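For anyone who wants to check whether a stale cached value is involved before wiping the build tree, here is a minimal sketch that lists the LLDB-related entries of a CMakeCache.txt. It relies only on the standard cache line format NAME:TYPE=VALUE; the function name read_cache_entries and the "LLDB_" prefix filter are illustrative assumptions, not part of the LLDB build system.

```python
def read_cache_entries(cache_text, prefix="LLDB_"):
    """Return {name: value} for CMake cache entries whose name starts with prefix.

    Hypothetical helper: it only understands the NAME:TYPE=VALUE line format
    and skips the '//' help lines and '#' comment lines found in CMakeCache.txt.
    """
    entries = {}
    for line in cache_text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", "//")):
            continue
        name_and_type, sep, value = line.partition("=")
        if not sep:
            continue  # not a cache entry
        name = name_and_type.split(":", 1)[0]
        if name.startswith(prefix):
            entries[name] = value
    return entries
```

Feeding it the contents of the build directory's CMakeCache.txt shows, for example, whether LLDB_USE_SYSTEM_DEBUGSERVER is still cached with the value from an earlier configure run.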

Best,
Stefan

I am still struggling with this issue. I have now decided to work with the codesigned version of debugserver, because I got an error when I tried to use the system debugserver.
So I’ve run scripts/macos-setup-codesign.sh

After a reboot and fresh build (I have removed the CMakeCache.txt and the whole build dir) I have the debugserver signed:

$ codesign -dvvvv ~/llvm2/build/release_assert/bin/debugserver
Executable=/Users/egbomrt/llvm2/build/release_assert/bin/debugserver
Identifier=com.apple.debugserver
Format=Mach-O thin (x86_64)
CodeDirectory v=20100 size=38534 flags=0x0(none) hashes=1197+5 location=embedded
VersionPlatform=1
VersionMin=658944
VersionSDK=658944
Hash type=sha256 size=32
CandidateCDHash sha256=7b475cfa7127c84281ceb206093d13dd464dad74
Hash choices=sha256
Page size=4096
CDHash=7b475cfa7127c84281ceb206093d13dd464dad74
Signature size=1611
Authority=lldb_codesign
Signed Time=2019. Jul 22. 15:26:29
Info.plist entries=6
TeamIdentifier=not set
Sealed Resources=none
Internal requirements count=1 size=100
$

So far so good.
But then when I try to use lldb I have permission problems:

egbomrt@msmarple ~/llvm2/build/release_assert $ ./bin/lldb /bin/ls
(lldb) target create "/bin/ls"
Current executable set to '/bin/ls' (x86_64).
(lldb) r
**error: process exited with status -1 (Error 1)**
(lldb) ^D
egbomrt@msmarple ~/llvm2/build/release_assert $

However, as root I can use lldb:

egbomrt@msmarple ~/llvm2/build/release_assert $ sudo ./bin/lldb /bin/ls
(lldb) target create "/bin/ls"
Current executable set to '/bin/ls' (x86_64).
(lldb) r
Process 28052 launched: '/bin/ls' (x86_64)
.ninja_deps compile_commands.json
.ninja_log docs
CMakeCache.txt examples
CMakeDoxyfile.in include
...
Process 28052 exited with status = 0 (0x00000000)
(lldb) ^D
egbomrt@msmarple ~/llvm2/build/release_assert $

Is it possible to codesign in a way that a regular user can run the built debugserver? Or what else could be the reason behind this permission problem?

Thanks,
Gabor

egbomrt@msmarple ~/llvm2/build/release_assert $ ./bin/lldb /bin/ls
(lldb) target create “/bin/ls”
Current executable set to ‘/bin/ls’ (x86_64).
(lldb) r
error: process exited with status -1 (Error 1)

I don’t think this is related to debugserver codesigning. If you really need to debug system binaries, you may need to disable SIP.

Well, SIP is turned off and I experience the same with a binary I just built:

egbomrt@msmarple ~/llvm2/build/release_assert $ csrutil status
System Integrity Protection status: disabled.
egbomrt@msmarple ~/llvm2/build/release_assert $ ./bin/lldb ~/a.out
(lldb) target create "/Users/egbomrt/a.out"
Current executable set to '/Users/egbomrt/a.out' (x86_64).
(lldb) r
error: process exited with status -1 (Error 1)
(lldb) ^D
egbomrt@msmarple ~/llvm2/build/release_assert $ ls -la ~/a.out
-rwxr-xr-x 1 egbomrt admin 8736 Júl 22 16:16 /Users/egbomrt/a.out
egbomrt@msmarple ~/llvm2/build/release_assert $

Interesting. Is there any extra info dumped to the log (e.g. log enable lldb default)

Yes, here it is.

egbomrt@msmarple ~/llvm2/build/release_assert $ ./bin/lldb ~/a.out
(lldb) target create “/Users/egbomrt/a.out”
Current executable set to ‘/Users/egbomrt/a.out’ (x86_64).
(lldb) log enable lldb default
(lldb) r
Processing command: r
HandleCommand, cmd_obj : ‘process launch’
HandleCommand, (revised) command_string: ‘process launch -X true --’
HandleCommand, wants_raw_input:‘False’
HandleCommand, command line after removing command name(s): ‘-X true --’
Target::Launch() called for /Users/egbomrt/a.out
Target::Launch the process instance doesn’t currently exist.
have platform=true, platform_sp->IsHost()=true, default_to_use_pty=true
at least one of stdin/stdout/stderr was not set, evaluating default handling
target stdin=‘(empty)’, target stdout=‘(empty)’, stderr=‘(empty)’
Generating a pty to use for stdin/out/err
Target::Launch asking the platform to debug the process
Host::StartMonitoringChildProcess (callback, pid=94887, monitor_signals=0) source = 0x7f9bb923ec10

::waitpid (pid = 94887, &status, 0) => pid = 94887, status = 0x00000000 (EXITED), signal = 0, exit_status = 0
Host::StartMonitoringChildProcess (callback, pid=94888, monitor_signals=0) source = 0x7f9bb9218180

Went to stop the private state thread, but it was already invalid.
Process::SetPublicState (state = attaching, restarted = 0)
Host::StartMonitoringChildProcess (callback, pid=94889, monitor_signals=0) source = 0x7f9bb9243bd0

thread created
Process::AttachCompletionHandler::AttachCompletionHandler process=0x7f9bb8803c18, exec_count=0
Process::ControlPrivateStateThread (signal = 4)
Sending control event of type: 4.
thread created
Process::RunPrivateStateThread (arg = 0x7f9bb8803c18, pid = 94888) thread starting…
timeout = , event_sp)…
Process::RunPrivateStateThread (arg = 0x7f9bb8803c18, pid = 94888) got a control event: 4
timeout = , event_sp)…
timeout =
timeout = , event_sp)…
thread created
Process::SetExitStatus (status=-1 (0xffffffff), description=“Error 1”)
Process::SetPrivateState (exited)
Process::SetPrivateState (exited) stop_id = 1
Process::AttachCompletionHandler::PerformAction called with state exited (10)
Ran next event action, result was 2.
Process::ShouldBroadcastEvent (0x7f9bb92423b0) => new state: exited, last broadcast state: exited - YES
Process::HandlePrivateEvent (pid = 94888) broadcasting new state exited (old state attaching) to hijacked
Process::RunPrivateStateThread (arg = 0x7f9bb8803c18, pid = 94888) about to exit with internal state exited…
Process::RunPrivateStateThread (arg = 0x7f9bb8803c18, pid = 94888) thread exiting…
Process::SetPublicState (state = exited, restarted = 0)
timeout = , event_sp) => exited
HandleCommand, command did not succeed
error: process exited with status -1 (Error 1)
(lldb) ::waitpid (pid = 94889, &status, 0) => pid = 94889, status = 0x00000000 (EXITED), signal = 0, exit_status = 0
(lldb)

Some more info: I’ve stopped lldb (with the system lldb) at the point where “Process::SetExitStatus (status=-1 (0xffffffff), description=“Error 1”)” is logged, and gathered some more information.
Is it possible that some packet types are not implemented?

  * frame #0: 0x00007fff599452c6 libsystem_kernel.dylib`__pthread_kill + 10
    frame #1: 0x00007fff59a00bf1 libsystem_pthread.dylib`pthread_kill + 284
    frame #2: 0x00007fff598af6a6 libsystem_c.dylib`abort + 127
    frame #3: 0x00007fff5987820d libsystem_c.dylib`__assert_rtn + 324
    frame #4: 0x000000010b393539 liblldb.10.0.99svn.dylib`lldb_private::Process::SetExitStatus(this=0x00007fcfaa470218, status=-1, cstr="Error 1") at Process.cpp:1149:3
    frame #5: 0x000000010bbd1792 liblldb.10.0.99svn.dylib`lldb_private::process_gdb_remote::ProcessGDBRemote::AsyncThread(arg=0x00007fcfaa470218) at ProcessGDBRemote.cpp:3877:28
    frame #6: 0x000000010b08f87b liblldb.10.0.99svn.dylib`lldb_private::HostNativeThreadBase::ThreadCreateTrampoline(arg=0x00007fcfab218740) at HostNativeThreadBase.cpp:69:10
    frame #7: 0x0000000112be72fd liblldb.10.0.99svn.dylib`lldb_private::HostThreadMacOSX::ThreadCreateTrampoline(arg=0x00007fcfab218740) at HostThreadMacOSX.mm:68:10
    frame #8: 0x00007fff599fe2eb libsystem_pthread.dylib`_pthread_body + 126
    frame #9: 0x00007fff59a01249 libsystem_pthread.dylib`_pthread_start + 66
    frame #10: 0x00007fff599fd40d libsystem_pthread.dylib`thread_start + 13
(lldb) frame select 5
frame #5: 0x000000010bbd1792 liblldb.10.0.99svn.dylib`lldb_private::process_gdb_remote::ProcessGDBRemote::AsyncThread(arg=0x00007fcfaa470218) at ProcessGDBRemote.cpp:3877:28
   3874 "System Integrity Protection");
   3875 } else if (::strstr(continue_cstr, "vAttach") != nullptr &&
   3876 response.GetStatus().Fail()) {
-> 3877 process->SetExitStatus(-1, response.GetStatus().AsCString());
   3878 } else {
   3879 process->SetExitStatus(-1, "lost connection");
   3880 }
(lldb) p response.GetError()
(uint8_t) $0 = '\x01'
(lldb) p response.GetServerPacketType()
(StringExtractorGDBRemote::ServerPacketType) $1 = eServerPacketType_unimplemented
(lldb) p response.GetResponseType()
(StringExtractorGDBRemote::ResponseType) $2 = eError
(lldb) p response.IsUnsupportedResponse()
(bool) $3 = false
(lldb) p response.GetStatus()
(lldb_private::Status) $4 = (m_code = 1, m_type = eErrorTypeGeneric, m_string = "Error 1")

Thanks,
Gabor

I think the system logs (in Console.app) would tell us more. Search for debugserver and you should find attach failures. Then remove the filter and look at what happens right before that. There should be a log from taskgated or authd that is a little more explicit about what’s failing.

Fred

Ok, I’ve done that, here are the logs which happen before the first debugserver error:

error 18:33:20.236506 +0200 taskgated cannot open file at line 42270 of [95fbac39ba]
error 18:33:20.236526 +0200 taskgated os_unix.c:42270: (2) open(/var/db/DetachedSignatures) - No such file or directory
default 18:33:20.236586 +0200 taskgated MacOS error: -67062
error 18:33:20.246771 +0200 taskgated cannot open file at line 42270 of [95fbac39ba]
error 18:33:20.246787 +0200 taskgated os_unix.c:42270: (2) open(/var/db/DetachedSignatures) - No such file or directory
default 18:33:20.246841 +0200 taskgated MacOS error: -67062
default 18:33:20.260319 +0200 debugserver debugserver will use os_log for internal logging.
default 18:33:20.260491 +0200 debugserver debugserver-@(#)PROGRAM:LLDB PROJECT:lldb-360.99.0
for x86_64.
default 18:33:20.260615 +0200 debugserver Got a connection, waiting for process information for launching or attaching.
default 18:33:20.264541 +0200 trustd cert[0]: AnchorTrusted =(leaf)[force]> 0
default 18:33:20.272256 +0200 trustd cert[2]: AnchorTrusted =(leaf)[force]> 0
default 18:33:20.276567 +0200 trustd cert[2]: AnchorTrusted =(leaf)[force]> 0
default 18:33:20.278680 +0200 authd UNIX error exception: 3
error 18:33:20.279462 +0200 authd process: PID 27648 failed to create code ref 100003
error 18:33:20.280017 +0200 authd Fatal: interaction not allowed (session has no ui access) (engine 3727)
default 18:33:20.280031 +0200 authd Failed to authorize right ‘system.privilege.taskport’ by client ‘/usr/libexec/taskgated’ [254] for authorization created by ‘/usr/libexec/taskgated’ [27648] (3,1) (-60007) (engine 3727)
error 18:33:20.280092 +0200 authd copy_rights: authorization failed
error 18:33:20.280442 +0200 debugserver error: MachTask::TaskPortForProcessID task_for_pid failed: ::task_for_pid ( target_tport = 0x0103, pid = 27646, &task ) => err = 0x00000005 ((os/kern) failure)

error 18:33:20.280017 +0200 authd Fatal: interaction not allowed (session has no ui access) (engine 3727)

This gave me a hint, so I used VPN to get a GUI session, and a GUI window popped up asking me to authenticate lldb. After that I could run lldb as a normal user. Hurray!
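In case it helps others triage the same failure, here is a minimal sketch that picks the relevant entries out of a saved system log. The three patterns are taken from the log excerpt in this thread; the function name classify_log and the labels are illustrative assumptions, and other macOS versions may word these messages differently.

```python
import re

# Signatures copied from the log excerpt in this thread (assumption: the
# wording is stable enough to match on; it may differ between macOS versions).
PATTERNS = {
    "no_ui_session": re.compile(r"authd\s+Fatal: interaction not allowed"),
    # '.' around the right name matches either a straight or a curly quote.
    "taskport_denied": re.compile(
        r"authd\s+Failed to authorize right .system\.privilege\.taskport."
    ),
    "task_for_pid_failed": re.compile(
        r"debugserver\s+error: .*task_for_pid failed"
    ),
}


def classify_log(log_text):
    """Return {label: [matching lines]} for the failure signatures above."""
    hits = {label: [] for label in PATTERNS}
    for line in log_text.splitlines():
        for label, pattern in PATTERNS.items():
            if pattern.search(line):
                hits[label].append(line)
    return hits
```

If "no_ui_session" and "task_for_pid_failed" both have hits, the attach was most likely refused because the authorization prompt could not be shown, rather than because of a bad signature on debugserver.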

BUT through ssh I still cannot run lldb as a normal user.
I’ve seen you have

@@@ Setup @@@
Unlocking keychain /Users/buildslave/Library/Keychains/lldb.keychain-db ... OK
+ echo @@@@@@
@@@@@@

at your build bot at greenlab.

So I tried “security -v unlock-keychain /Library/Keychains/System.keychain” but that did not work, I believe because scripts/macos-setup-codesign.sh did not ask for a password for the keychain (it asked for pw because of sudo).
Is this the way to work if I don’t have a GUI (I must work via SSH, and this ought to be part of a CI)?
Should I recreate the keychain with a pw somehow?

Thanks

error 18:33:20.280017 +0200 authd Fatal: interaction not allowed (session has no ui access) (engine 3727)

This gave me a hint, so I used VPN to have a gui and got a gui window popped up to authenticate lldb. And then I could run lldb as a normal user. Hurray!

BUT through ssh I still cannot run lldb as a normal user.

This is by design, debugging as a normal user requires a graphical session. I don’t remember in which macOS version this became a requirement, but maybe our bots are running an older version?

I’m honestly not sure what the exact constraints are, but I know that I have to start a tmux session in a graphical context to be able to run the test suite remotely on my machine.

Fred

I believe you can run a shell command as root:

$ sudo /usr/sbin/DevToolsSecurity --enable

Then you should be able to debug after that, even on ssh connections.

Greg

$ sudo /usr/sbin/DevToolsSecurity --enable
Unfortunately, this did not help.

Anyway, I’ve found the solution, but it took a while.
So, the given scripts/macos-setup-codesign.sh adds a certificate to the system keychain /Library/Keychains/System.keychain with a TrustRoot policy. This explains why I could execute the built lldb (debugserver) only as root.
So, I tried with a TrustAsRoot policy, but that did not work: security add-trusted-cert failed, complaining about the parameters.

Finally, I tried with debugsign from ZORG (https://github.com/llvm/llvm-zorg/tree/master/codesign/debugsign), which proved to be the approach to follow if we have only SSH access. (Big kudos to Endre for pointing that out!)
Here is what I did, step by step (in case anyone ends up in a similar situation in the future):

  • Cloned ZORG from GitHub and followed the instructions in the readme.
  • Executed debugsign check, then debugsign --unsafe setup.
  • Created a new keychain as described in debugsign’s readme. This resulted in the file ~/Library/Keychains/lldb.keychain-db.
  • Created the .p12 file as described in the readme.
  • Then ran debugsign import.
  • I had to remove the existing “lldb_codesign” from /Library/Keychains/System.keychain with sudo security delete-certificate -c "lldb_codesign" /Library/Keychains/System.keychain in order to use the new one from lldb.keychain-db.
  • Reboot. I noticed that even though the old cert was removed by delete-certificate, the system still used it, because macOS caches the certificates, as stated here: https://lldb.llvm.org/resources/build.html#code-signing-on-macos .
  • Lastly and most importantly, after the reboot, we have to unlock the new keychain: security unlock-keychain -p lldb_codesign ~/Library/Keychains/lldb.keychain-db . According to http://green.lab.llvm.org/green/view/LLDB/job/lldb-cmake it is best to unlock the keychain before every build! Normally, if we have a GUI, a popup window asks for authentication to use the keychain. But since we use ssh, unlock-keychain is the tool to do the same.
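The last step is the one that has to be repeated in CI, so here is a minimal sketch of a wrapper that unlocks the keychain and then runs the tests. The security unlock-keychain and ninja check-lldb invocations are the ones used in this thread; the function names build_commands and run_all, the dry-run default, and the use of ninja -C with a build directory are illustrative assumptions. Note that passing the password with -p makes it visible in the process list.

```python
import os
import shlex
import subprocess


def build_commands(keychain="~/Library/Keychains/lldb.keychain-db",
                   password="lldb_codesign",
                   build_dir="."):
    """Return the argv lists to run, in order (hypothetical helper)."""
    keychain_path = os.path.expanduser(keychain)
    return [
        # Unlock the keychain that holds the lldb_codesign identity.
        ["security", "unlock-keychain", "-p", password, keychain_path],
        # Then run the LLDB test suite from the given build directory.
        ["ninja", "-C", build_dir, "check-lldb"],
    ]


def run_all(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    printed = []
    for argv in commands:
        line = " ".join(shlex.quote(arg) for arg in argv)
        printed.append(line)
        print(line)
        if not dry_run:
            # check=True stops the sequence at the first failing command.
            subprocess.run(argv, check=True)
    return printed
```

Calling run_all(build_commands(build_dir="...")) only prints the commands; pass dry_run=False on the macOS machine to actually execute them.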

Cheers,
Gabor