ptxas fatal : Cannot take address of function '__pthread_key_create'

Hi,
I am trying to build an application that can do OpenMP offloading on a POWER8 + P100 system using the latest LLVM/Clang toolchain (the OpenMP
runtime is also the latest).

The build error is:

[ 3%] Building CXX object sli/CMakeFiles/sli_lib.dir/arraydatum.cc.o
cd /gpfs/work/pcp0/pcp0151/build/nest-clang-upstream/sli && /gpfs/work/pcp0/pcp0151/opt/llvm+clang-upstream/bin/clang++ -Dsli_lib_EXPORTS -I/bgsys/drivers/ppcfloor/comm/gcc/include -I/gpfs/software/opt/gsl/2.4/include -I/gpfs/homeb/pcp0/pcp0151/projects/nest-simulator/libnestutil -I/gpfs/work/pcp0/pcp0151/build/nest-clang-upstream/libnestutil -std=c++11 -O2 -Wall -fopenmp -fopenmp-targets=nvptx64-nvidia-cuda --cuda-path=/gpfs/software/opt/cuda/9.2.88 -fPIC -o CMakeFiles/sli_lib.dir/arraydatum.cc.o -c /gpfs/homeb/pcp0/pcp0151/projects/nest-simulator/sli/arraydatum.cc
ptxas fatal : Cannot take address of function '__pthread_key_create'
clang-8: error: ptxas command failed with exit code 255 (use -v to see invocation)
make[2]: *** [sli/CMakeFiles/sli_lib.dir/arraydatum.cc.o] Error 255
make[2]: Leaving directory '/gpfs/work/pcp0/pcp0151/build/nest-clang-upstream'
make[1]: *** [sli/CMakeFiles/sli_lib.dir/all] Error 2
make[1]: Leaving directory '/gpfs/work/pcp0/pcp0151/build/nest-clang-upstream'
make: *** [all] Error 2

The CUDA toolkit is 9.2.88. Is it possible that the error is fixed in the
latest toolkit?

Thanks,
Itaru.

Hi Alexey,
If the function is not explicitly used in the code-base in question, how do I
find which function/module requires it?

Maybe you are using some thread-local storage? Can you share the code? Anything else will be guesswork at best...

Jonas

Jonas,
No problem. It is at:

https://github.com/ikitayama/nest-simulator

… and especially this file

https://github.com/ikitayama/nest-simulator/blob/master/sli/arraydatum.cc

is the first file on which the build with llvm/clang-8.0.0 stops.

I can't reproduce the problem on x86, at least on this file on branch master and for-upstream. Maybe it's specific to Power?

Can you have a look where this function is called? Extending the compile command for that specific file by "-S -emit-llvm" and posting the result would be helpful.

Jonas

CMake compiler check fails:

Determining if files mach-o/dyld.h exist failed with the following output:
Change Dir: /gpfs/work/pcp0/pcp0151/build/nest-clang-upstream/CMakeFiles/CMakeTmp

Run Build Command:"/usr/bin/gmake" "cmTC_1486a/fast"
/usr/bin/gmake -f CMakeFiles/cmTC_1486a.dir/build.make CMakeFiles/cmTC_1486a.dir/build
gmake[1]: Entering directory '/gpfs/work/pcp0/pcp0151/build/nest-clang-upstream/CMakeFiles/CMakeTmp'
Building C object CMakeFiles/cmTC_1486a.dir/CheckIncludeFiles.c.o
/gpfs/work/pcp0/pcp0151/opt/llvm+clang-upstream/bin/clang -o CMakeFiles/cmTC_1486a.dir/CheckIncludeFiles.c.o -c /gpfs/work/pcp0/pcp0151/build/nest-clang-upstream/CMakeFiles/CMakeTmp/CheckIncludeFiles.c
/gpfs/work/pcp0/pcp0151/build/nest-clang-upstream/CMakeFiles/CMakeTmp/CheckIncludeFiles.c:2:10: fatal error: 'mach-o/dyld.h' file not found
#include <mach-o/dyld.h>
         ^~~~~~~~~~~~~~~
1 error generated.
gmake[1]: *** [CMakeFiles/cmTC_1486a.dir/CheckIncludeFiles.c.o] Error 1
gmake[1]: Leaving directory '/gpfs/work/pcp0/pcp0151/build/nest-clang-upstream/CMakeFiles/CMakeTmp'
gmake: *** [cmTC_1486a/fast] Error 2

Source:
/* */
#include <mach-o/dyld.h>

int main(void){return 0;}

Determining if files mach/mach.h exist failed with the following output:
Change Dir: /gpfs/work/pcp0/pcp0151/build/nest-clang-upstream/CMakeFiles/CMakeTmp

Run Build Command:"/usr/bin/gmake" "cmTC_a088b/fast"
/usr/bin/gmake -f CMakeFiles/cmTC_a088b.dir/build.make CMakeFiles/cmTC_a088b.dir/build
gmake[1]: Entering directory '/gpfs/work/pcp0/pcp0151/build/nest-clang-upstream/CMakeFiles/CMakeTmp'
Building C object CMakeFiles/cmTC_a088b.dir/CheckIncludeFiles.c.o
/gpfs/work/pcp0/pcp0151/opt/llvm+clang-upstream/bin/clang -o CMakeFiles/cmTC_a088b.dir/CheckIncludeFiles.c.o -c /gpfs/work/pcp0/pcp0151/build/nest-clang-upstream/CMakeFiles/CMakeTmp/CheckIncludeFiles.c
/gpfs/work/pcp0/pcp0151/build/nest-clang-upstream/CMakeFiles/CMakeTmp/CheckIncludeFiles.c:2:10: fatal error: 'mach/mach.h' file not found
#include <mach/mach.h>
         ^~~~~~~~~~~~~
1 error generated.
gmake[1]: *** [CMakeFiles/cmTC_a088b.dir/CheckIncludeFiles.c.o] Error 1
gmake[1]: Leaving directory '/gpfs/work/pcp0/pcp0151/build/nest-clang-upstream/CMakeFiles/CMakeTmp'

I meant when it tries to compile arraydatum.cc, sorry:

[ 3%] Building CXX object sli/CMakeFiles/sli_lib.dir/arraydatum.cc.o
cd /gpfs/work/pcp0/pcp0151/build/nest-clang-upstream/sli &&
/gpfs/work/pcp0/pcp0151/opt/llvm+clang-upstream/bin/clang++
-Dsli_lib_EXPORTS -I/bgsys/drivers/ppcfloor/comm/gcc/include
-I/gpfs/software/opt/gsl/2.4/include
-I/gpfs/homeb/pcp0/pcp0151/projects/nest-simulator/libnestutil
-I/gpfs/work/pcp0/pcp0151/build/nest-clang-upstream/libnestutil
-std=c++11 -O2 -Wall -fopenmp -fopenmp-targets=nvptx64-nvidia-cuda
--cuda-path=/gpfs/software/opt/cuda/9.2.88 -fPIC -o
CMakeFiles/sli_lib.dir/arraydatum.cc.o -c
/gpfs/homeb/pcp0/pcp0151/projects/nest-simulator/sli/arraydatum.cc

You can copy this command and add some flags to see where this function is called.

It finishes silently if `-S -emit-llvm` is also supplied to the driver.

Sure, because Clang only outputs LLVM IR and ptxas is not invoked :wink: but what does the output file look like?
If you didn't remove the -o flag, there will now be text in arraydatum.cc.o which is what I'm looking for: That's the internal representation during compilation and easier to read than PTX code.

Sorry, I'm obviously doing a very bad job in explaining how to find the problem :frowning:

I have attached it to this email.

arraydatum.cc.o (1.27 MB)

Okay, so according to the IR the reference to __pthread_key_create comes from
std::basic_string<char, std::char_traits<char>, std::allocator<char> >::_Rep::_S_empty_rep()
which is (transitively) called from
Datum::list(std::basic_ostream<char, std::char_traits<char> >&, std::basic_string<char, std::char_traits<char>, std::allocator<char> >, int) const.

The "problem" is that
1. arraydatum.cc has two explicit instantiations of lockPTRDatum (inheriting Datum through TypedDatum) which are wrongly emitted for the GPU, and
2. lockPTRDatum needs a vtable and all virtual member functions (including Datum::list) end up on the device.
Funnily enough, I submitted a bug just yesterday: https://bugs.llvm.org/show_bug.cgi?id=38823
Looks like you don't even need a target region to make Clang emit the explicit instantiation...
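
To make the two points above concrete, here is a minimal C++ sketch. It is not the actual NEST code: the class names mirror the ones mentioned in this thread, but the single template parameter, the member bodies, and the `describe` helper are simplified assumptions made for illustration. It only shows the host-side language mechanics, namely that an explicit instantiation definition forces the compiler to emit every member of the class, including virtual functions that take a `std::string`, even though nothing in a target region calls them.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical stand-in for the NEST base class named in the thread.
struct Datum {
  virtual ~Datum() = default;
  // Virtual member taking std::string by value, similar in shape to
  // Datum::list(std::ostream&, std::string, int) const from the IR.
  virtual void list(std::ostream& out, std::string prefix, int level) const {
    out << prefix << "Datum@" << level;
  }
};

// Simplified assumption: the real TypedDatum has different parameters.
template <typename T>
struct TypedDatum : Datum {};

// Simplified assumption: the real lockPTRDatum has different parameters.
template <typename T>
struct lockPTRDatum : TypedDatum<T> {
  void list(std::ostream& out, std::string prefix, int level) const override {
    out << prefix << "lockPTRDatum@" << level;
  }
};

// Explicit instantiation definition: the compiler must emit all members of
// lockPTRDatum<int> in this translation unit, typically together with the
// vtable, whether or not anything calls them.
template struct lockPTRDatum<int>;

// Host-side helper (hypothetical) that goes through the virtual call.
inline std::string describe(const Datum& d) {
  std::ostringstream out;
  d.list(out, "-> ", 0);
  return out.str();
}
```

On the host, `describe(lockPTRDatum<int>{})` returns `"-> lockPTRDatum@0"`. If such an instantiation is also emitted for the device, as described in the bug report above, the virtual members and whatever they pull in from the standard library end up in the PTX handed to ptxas.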

Jonas