lldb test suite on macOS 10.13 (High Sierra)

Hello lldb-dev,

We've just updated our mac buildbot to 10.13.1 (from 10.10.x), and
we're having trouble with the lldb test suite. All of the tests are
failing with the following error:

/Users/lldb_build/lldbSlave/buildDir/llvm/tools/lldb/packages/Python/lldbsuite/test/crashinfo.c:15:10:
fatal error: 'Python/Python.h' file not found
#include <Python/Python.h>
         ^~~~~~~~~~~~~~~~~
1 error generated.
Traceback (most recent call last):
  File "/Users/lldb_build/lldbSlave/buildDir/scripts/../llvm/tools/lldb/test/dotest.py",
line 7, in <module>
    lldbsuite.test.run_suite()
  File "/Users/lldb_build/lldbSlave/buildDir/llvm/tools/lldb/packages/Python/lldbsuite/test/dotest.py",
line 1120, in run_suite
    configuration.setupCrashInfoHook()
  File "/Users/lldb_build/lldbSlave/buildDir/llvm/tools/lldb/packages/Python/lldbsuite/test/configuration.py",
line 51, in setupCrashInfoHook
    raise Exception('command failed: "{}"'.format(cmd))
Exception: command failed: "SDKROOT= xcrun clang
/Users/lldb_build/lldbSlave/buildDir/llvm/tools/lldb/packages/Python/lldbsuite/test/crashinfo.c
-o /Users/lldb_build/lldbSlave/buildDir/llvm/tools/lldb/packages/Python/lldbsuite/test/crashinfo.so
-framework Python -Xlinker -dylib"

It seems that this is happening because the buildbot is missing the
/System/Library/Frameworks/Python.framework/Headers symlink (this link
is present on my mac machine, which is still on 10.12). The rest of
the framework seems to be there (e.g. the file
/System/Library/Frameworks/Python.framework/Versions/Current/include/python2.7/Python.h
is present), just this symlink is missing. I cannot even create it
manually as System Integrity Protection will not let me do that.
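
For anyone wanting to repeat this check on another machine, here is a minimal sketch (Python 3) of the diagnosis described above. The helper name framework_header_status is made up for illustration and is not part of lldb; the default path is the one quoted in this message.

```python
import os

# Framework location quoted in the message above.
SYSTEM_PYTHON_FRAMEWORK = "/System/Library/Frameworks/Python.framework"


def framework_header_status(framework_dir=SYSTEM_PYTHON_FRAMEWORK):
    """Return (has_headers_entry, python_h_paths) for a framework directory.

    has_headers_entry: True if <framework>/Headers exists (normally a symlink).
    python_h_paths: sorted list of every Python.h found under <framework>/Versions.
    """
    has_headers_entry = os.path.exists(os.path.join(framework_dir, "Headers"))
    found = []
    # os.walk does not follow symlinks by default, so Versions/Current
    # (a symlink) does not produce duplicate entries.
    for root, _dirs, files in os.walk(os.path.join(framework_dir, "Versions")):
        if "Python.h" in files:
            found.append(os.path.join(root, "Python.h"))
    return has_headers_entry, sorted(found)


if __name__ == "__main__":
    has_headers, headers = framework_header_status()
    print("Headers entry present:", has_headers)
    for path in headers:
        print("found:", path)
```

On a machine in the state described here, one would expect the first value to be False while the list still contains the versioned Python.h.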

Do you have any idea what went wrong?

thanks,
pl

PS: If it helps anything, the version reported by clang is:
Apple LLVM version 9.0.0 (clang-900.0.38)
Target: x86_64-apple-darwin17.2.0
Thread model: posix
InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin

On a somewhat tangential note:

Is anyone actually using this crashinfo hook? It looks like this could
be useful in the old dotest days, when we were running all of the
tests in a single python process, but now with the parallel test
runner spawning a new process for each test file and with piping test
results through a socket, it seems much less useful. If no one is using
that functionality, maybe we could "fix" the problem by deleting it.

Xcode and general development on macOS have moved from using headers in the base OS (/System/Library/Frameworks…) for building programs to using SDKs to contain all the header files. The installed tools will know how to find the correct SDK.

On a clean install of 10.13 there are no headers anywhere in the /System/Library/Frameworks frameworks; I’m a little surprised you can build anything w/o an Xcode install. There’s a “command line tools” package in the Developer tools that puts some stuff back in /usr, but it has had some issues. Mostly, it overwrites the /usr/bin/clang etc. tools that Xcode uses (which are just shims that find the correct version of the tools in the xcode-selected Xcode) with fixed versions of the tools. That can lead to subtle errors if you ever try to use Xcode, so I don’t suggest that.

I think you need to put an Xcode install on your bot.

Jim

Thanks for the reply, Jim.

As far as I can tell, we already have Xcode on that machine (I only
have shell access there).
$ xcodebuild -version
Xcode 9.1
Build version 9B55
$ xcode-select -version
xcode-select version 2349.

BTW, this is the list of Python.h files on that machine:
/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/System/Library/Frameworks/Python.framework/Versions/2.7/include/python2.7/Python.h
/System/Library/Frameworks/Python.framework/Versions/2.7/include/python2.7/Python.h
/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/System/Library/Frameworks/Python.framework/Versions/2.7/include/python2.7/Python.h

I guess you were referring to the third one. That one seems to have
the "Headers" symlink and everything, but for some reason clang is not
picking it up. Do I need to run some fancy xcode-select or set some
environment variables?

I guess the reason I am able to compile lldb just fine is that cmake
is smart enough to find the right python framework. I'll try comparing
the cmake command lines with the one we run from dotest.

An additional issue that may complicate things is that this was not a
clean install, but an upgrade.

The build should be finding the version in the SDK within Xcode. I do have the CommandLineTools directory on my system, but it doesn’t have an SDK directory in it. I wonder if that is causing the problem?

One thing to check, do:

$ xcrun -find clang

Does that find clang in the DeveloperTools directory? If so try:

sudo xcode-select --switch /Applications/Xcode.app

Then everything should point to the toolchain & SDKs in Xcode. If that doesn’t help, you might try moving the DeveloperTools aside and see if things work then. Xcode should not need that to work.
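
The check above can be scripted. What follows is only a sketch (Python 3): it assumes the directory in question is /Library/Developer/CommandLineTools (the location that shows up in the Python.h listing earlier in the thread) and that an Xcode toolchain path contains ".app/Contents/Developer/". The function names are hypothetical.

```python
import subprocess

# Assumed locations; adjust if the machine is laid out differently.
CLT_PREFIX = "/Library/Developer/CommandLineTools/"
XCODE_MARKER = ".app/Contents/Developer/"


def find_clang():
    """Ask xcrun which clang it would run (same command as in the mail)."""
    out = subprocess.check_output(["xcrun", "-find", "clang"])
    return out.decode("utf-8").strip()


def toolchain_origin(clang_path):
    """Classify a clang path as 'command-line-tools', 'xcode' or 'unknown'."""
    if clang_path.startswith(CLT_PREFIX):
        return "command-line-tools"
    if XCODE_MARKER in clang_path:
        return "xcode"
    return "unknown"


if __name__ == "__main__":
    path = find_clang()
    print(path, "->", toolchain_origin(path))
```

If this reports "command-line-tools", the xcode-select step above is the thing to try first.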

Jim

Right, after learning way more than I ever wanted to know about xcrun,
I think I see the issue. When running with empty SDKROOT variable,
xcrun sets SDKROOT to "/" when running clang:
$ SDKROOT= xcrun --no-cache --log --verbose clang
/Users/lldb_build/lldbSlave/buildDir/llvm/tools/lldb/packages/Python/lldbsuite/test/crashinfo.c
-o /Users/lldb_build/lldbSlave/buildDir/llvm/tools/lldb/packages/Python/lldbsuite/test/crashinfo.so
-framework Python -Xlinker -dylib -v
xcrun: note: PATH = '/usr/bin:/bin'
xcrun: note: SDKROOT = '/'
                                         ^^^^^ WRONG. there are no
python headers there

.....

env SDKROOT=/ /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/clang
/Users/lldb_build/lldbSlave/buildDir/llvm/tools/lldb/packages/Python/lldbsuite/test/crashinfo.c
-o /Users/lldb_build/lldbSlave/buildDir/llvm/tools/lldb/packages/Python/lldbsuite/test/crashinfo.so
-framework Python -Xlinker -dylib -v

...

#include "..." search starts here:
#include <...> search starts here:
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/lib/clang/9.0.0/include
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/include
/usr/include
/System/Library/Frameworks (framework directory)
/Library/Frameworks (framework directory)
     ^^^^ clang gets the include path wrong

End of search list.
/Users/lldb_build/lldbSlave/buildDir/llvm/tools/lldb/packages/Python/lldbsuite/test/crashinfo.c:15:10:
fatal error: 'Python/Python.h' file not found
#include <Python/Python.h>
         ^~~~~~~~~~~~~~~~~
1 error generated.
^^^^^ and errors out

On the other hand, if I invoke xcrun with SDKROOT=macosx, everything
works just fine:

$ SDKROOT=macosx xcrun --no-cache --log --verbose clang
/Users/lldb_build/lldbSlave/buildDir/llvm/tools/lldb/packages/Python/lldbsuite/test/crashinfo.c
-o /Users/lldb_build/lldbSlave/buildDir/llvm/tools/lldb/packages/Python/lldbsuite/test/crashinfo.so
-framework Python -Xlinker -dylib -v
xcrun: note: looking up SDK with
'/Applications/Xcode.app/Contents/Developer/usr/bin/xcodebuild -sdk
macosx -version Path'
xcrun: note: PATH = '/usr/bin:/bin'
xcrun: note: SDKROOT = 'macosx'
xcrun: note: TOOLCHAINS = ''
xcrun: note: DEVELOPER_DIR = '/Applications/Xcode.app/Contents/Developer'
xcrun: note: XCODE_DEVELOPER_USR_PATH = ''
xcrun: note: xcrun_db =
'/var/folders/bt/tws6gynx0ws1cc4ss53_pvqm0000gq/T/xcrun_db'
xcrun: note: lookup resolved to:
'/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.13.sdk'
xcrun: note: PATH = '/usr/bin:/bin'
xcrun: note: SDKROOT =
'/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.13.sdk'
                    ^^^^^^ correct. a valid SDK is located there

...

env SDKROOT=/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.13.sdk
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/clang
/Users/lldb_build/lldbSlave/buildDir/llvm/tools/lldb/packages/Python/lldbsuite/test/crashinfo.c
-o /Users/lldb_build/lldbSlave/buildDir/llvm/tools/lldb/packages/Python/lldbsuite/test/crashinfo.so
-framework Python -Xlinker -dylib -v

...

#include "..." search starts here:
#include <...> search starts here:
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/lib/clang/9.0.0/include
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/include
/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.13.sdk/usr/include
/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.13.sdk/System/Library/Frameworks
(framework directory)

  ^^^^^ include path is good and compilation succeeds

I've checked this on 10.12 and the xcrun behavior is the same there (with
the difference that 10.12 does contain headers in
/System/Library/Frameworks).
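
To compare the two cases without reading the logs by eye, the "search starts here" block printed by clang -v can be parsed. This is a rough sketch (Python 3) with hypothetical helper names; it assumes the unwrapped clang output, where "(framework directory)" sits on the same line as the path, and treats any path containing ".sdk/" as coming from an SDK.

```python
def parse_search_dirs(clang_verbose_output):
    """Extract the <...> include search list from `clang -v` output.

    Returns a list of (path, is_framework_dir) tuples in search order.
    """
    suffix = "(framework directory)"
    dirs = []
    in_list = False
    for line in clang_verbose_output.splitlines():
        text = line.strip()
        if text.startswith("#include <...> search starts here:"):
            in_list = True
            continue
        if text.startswith("End of search list."):
            break
        if in_list and text.startswith("/"):
            is_framework = text.endswith(suffix)
            path = text[: -len(suffix)].strip() if is_framework else text
            dirs.append((path, is_framework))
    return dirs


def uses_sdk(dirs):
    """True if any search directory lives inside an SDK (heuristic)."""
    return any(".sdk/" in path for path, _ in dirs)
```

Fed the failing log above, uses_sdk would return False, since every entry points at the base system or the toolchain rather than at MacOSX10.13.sdk.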

It seems we could fix this by changing the invocation in dotest to
explicitly request SDKROOT=macosx. As far as I can tell, this would
only break if someone tried to run dotest with iOS as host, but I
don't think that's ever going to be supported(?)

What do you think?
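
Roughly, the change would amount to the following. This is a sketch only (Python 3): the command string is modelled on the one in the exception message at the top of the thread, and crashinfo_build_cmd is a made-up helper, not the actual code in configuration.py.

```python
from shlex import quote


def crashinfo_build_cmd(src, dylib, sdk="macosx"):
    """Build the shell command that compiles the crashinfo hook.

    sdk="macosx" makes xcrun resolve a real SDK; sdk="" reproduces the
    current "SDKROOT= xcrun clang ..." form.
    """
    return (
        "SDKROOT={sdk} xcrun clang {src} -o {out} "
        "-framework Python -Xlinker -dylib"
    ).format(sdk=sdk, src=quote(src), out=quote(dylib))


if __name__ == "__main__":
    print(crashinfo_build_cmd("crashinfo.c", "crashinfo.so"))
```

Keeping the SDK name as a parameter would leave room for a different host SDK later, should that ever be needed.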

Pavel, I happened to hit this.
I'm not sure how you worked around it, as I tried to export
SDKROOT=macosx but that didn't work for me.
Do you have a patch or a series of commands I can run?

Thanks,

In test/testcases/configuration.py:47, change "SDKROOT=" to "SDKROOT=macosx".

I did not put this out for review (yet) mainly because I was not sure
whether this affects only our buildbot (which was updated from some
ancient version straight to 10.13, so it could be something introduced
by the upgrade).

However, if you are running into this as well (and I understand you
have a fresh macbook :D), then I think we should just really fix it.

I also just hit this and apparently this is an intentional behavior of xcrun.

Note that this only affects systems that have the so-called command line tools installed (this is what you get when you install the command line tools without installing Xcode).

When the command line tools are installed *and* xcrun is run without explicitly asking for an sdk, it will add /usr/local/include to the search path instead of adding the -isysroot /Applications/Xcode.app/.../MacOSX10.13.sdk that we want here. This explains why Pavel's workaround works.

I'm not yet sure whether requiring the macosx SDK in this file is always the right thing to do here or if there is a better solution.
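
One possible alternative, purely as an untested sketch (Python 3): ask xcrun for the SDK path and pass it to clang explicitly with -isysroot, rather than relying on SDKROOT. The function names are hypothetical, and the run parameter exists only so the lookup can be stubbed out.

```python
import subprocess


def macosx_sdk_path(run=subprocess.check_output):
    """Return the macOS SDK path reported by `xcrun --sdk macosx --show-sdk-path`."""
    out = run(["xcrun", "--sdk", "macosx", "--show-sdk-path"])
    return out.decode("utf-8").strip()


def clang_argv(src, dylib, sdk_path):
    """Argument list that builds the crashinfo hook against an explicit SDK."""
    return [
        "xcrun", "clang",
        "-isysroot", sdk_path,  # point clang at the SDK headers and frameworks
        src, "-o", dylib,
        "-framework", "Python", "-Xlinker", "-dylib",
    ]


if __name__ == "__main__":
    argv = clang_argv("crashinfo.c", "crashinfo.so", macosx_sdk_path())
    print(" ".join(argv))
```

This still hard-codes the macosx SDK, so it shares the limitation of the SDKROOT=macosx workaround; it only makes the dependency visible on the command line.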

-- adrian

Setting SDKROOT=macosx is not ideal, but I think it should be fine. This
is building host code, so the only case where this would be wrong is
if someone tried to run dotest on an iOS (WatchOS, ...) host, which I
think you guys don't do.

TBH, I would even consider removing the "crash hook" altogether. Is
anyone using this functionality on your side? The feature sounds like
it would be useful in the old dotest days, when all tests were run
sequentially in a single process, but now we run pretty much every
test in its own process, so it doesn't look like it should be a
problem figuring out what the test was doing when it crashed.

Jason probably knows about the crash hook.

The crash hook is needed because ReportCrash on macOS knows how to dig up a crash log line for each shared library that is currently loaded in a process when it generates a crash report. There is a setting we can enable so that the expression being run is logged:

(lldb) settings set target.display-expression-in-crashlogs 1

These crash hooks, if we are talking about the host-level call that sets the crash log lines and not something else, have been invaluable over the years, so I would venture to say we don't want to take them out. Not sure if any other system does anything with these. If Apple is the only one, we can conditionally compile them in only for Apple targets if needed and use a macro to set them that does nothing on non-Apple builds.

Greg

I wasn't familiar with that setting, but I think we are talking about
something different here, as this is code that only runs as part of
the test suite (test/testcases/crashinfo.c). The main reason I suggested removal
is because it is bolted onto the test suite in quite a crude fashion.
The right solution would be to generate the crashinfo.so as a part of
the regular build (in which case we wouldn't have the current problem
where the crashinfo hook fails in some configurations even though the
rest of the lldb builds just fine).

This is actually using the same mechanism that Greg was describing, but through a separate route.

With the crashinfo mechanism in place, if a test crashes the crash log will show whatever test was being run at the time of the crash. This was pretty useful when we were running the testsuite in one process, since then it was pretty hard to figure out where the test crashed. But now that the runner is in a separate process, it doesn't go down when a test crashes, so you can tell at least which test directory caused the crash. If we want to make this better, we could do something more platform-independent to track which test in a particular directory is the current test, and then report that when we see a crash. But I don't think it is worth much effort keeping the current mechanism alive.
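
As an illustration of what "more platform independent" tracking could mean, here is a minimal sketch (Python 3), not something that exists in dotest. The worker records the current test id in a marker file before each test and removes it afterwards, so the runner can read the file if the worker dies. The class name and file layout are invented for the example.

```python
import os


class CurrentTestMarker(object):
    """Record the currently running test in a file a parent process can read."""

    def __init__(self, path):
        self.path = path

    def enter(self, test_id):
        """Call before a test starts; flushes to disk so a crash cannot lose it."""
        with open(self.path, "w") as f:
            f.write(test_id)
            f.flush()
            os.fsync(f.fileno())

    def leave(self):
        """Call after a test finishes normally."""
        if os.path.exists(self.path):
            os.remove(self.path)

    def read(self):
        """Return the recorded test id, or None if no test was in flight."""
        if not os.path.exists(self.path):
            return None
        with open(self.path) as f:
            return f.read()
```

A runner would call read() only after noticing that a worker exited abnormally; a leftover marker then names the test that was in flight.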

Jim

I agree with that. I have created D41871 to delete the hook.

Also, note that the current test runner will already report the full
test case name if we have a crash (except if the crash happens after
the relevant tearDown method finishes, but in that case, the crashinfo
hook would not help either).