Hi,
I am trying to build an application that uses OpenMP offloading on a POWER8 + P100 system with the latest LLVM/Clang toolchain (the OpenMP runtime
is also the latest).
The build error is:
[ 3%] Building CXX object sli/CMakeFiles/sli_lib.dir/arraydatum.cc.o
cd /gpfs/work/pcp0/pcp0151/build/nest-clang-upstream/sli && /gpfs/work/pcp0/pcp0151/opt/llvm+clang-upstream/bin/clang++ -Dsli_lib_EXPORTS -I/bgsys/drivers/ppcfloor/comm/gcc/include -I/gpfs/software/opt/gsl/2.4/include -I/gpfs/homeb/pcp0/pcp0151/projects/nest-simulator/libnestutil -I/gpfs/work/pcp0/pcp0151/build/nest-clang-upstream/libnestutil -std=c++11 -O2 -Wall -fopenmp -fopenmp-targets=nvptx64-nvidia-cuda --cuda-path=/gpfs/software/opt/cuda/9.2.88 -fPIC -o CMakeFiles/sli_lib.dir/arraydatum.cc.o -c /gpfs/homeb/pcp0/pcp0151/projects/nest-simulator/sli/arraydatum.cc
ptxas fatal : Cannot take address of function ‘__pthread_key_create’
clang-8: error: ptxas command failed with exit code 255 (use -v to see invocation)
make[2]: *** [sli/CMakeFiles/sli_lib.dir/arraydatum.cc.o] Error 255
make[2]: Leaving directory '/gpfs/work/pcp0/pcp0151/build/nest-clang-upstream'
make[1]: *** [sli/CMakeFiles/sli_lib.dir/all] Error 2
make[1]: Leaving directory '/gpfs/work/pcp0/pcp0151/build/nest-clang-upstream'
make: *** [all] Error 2
The CUDA toolkit is 9.2.88. Is it possible that the error is fixed in the
latest toolkit?
I can't reproduce the problem on x86, at least not with this file on the master and for-upstream branches. Maybe it's specific to Power?
Can you have a look at where this function is called? Adding "-S -emit-llvm" to the compile command for that specific file and posting the result would be helpful.
Determining if files mach-o/dyld.h exist failed with the following output:
Change Dir: /gpfs/work/pcp0/pcp0151/build/nest-clang-upstream/CMakeFiles/CMakeTmp
Determining if files mach/mach.h exist failed with the following output:
Change Dir: /gpfs/work/pcp0/pcp0151/build/nest-clang-upstream/CMakeFiles/CMakeTmp
Run Build Command:"/usr/bin/gmake" "cmTC_a088b/fast"
/usr/bin/gmake -f CMakeFiles/cmTC_a088b.dir/build.make CMakeFiles/cmTC_a088b.dir/build
gmake[1]: Entering directory '/gpfs/work/pcp0/pcp0151/build/nest-clang-upstream/CMakeFiles/CMakeTmp'
Building C object CMakeFiles/cmTC_a088b.dir/CheckIncludeFiles.c.o
/gpfs/work/pcp0/pcp0151/opt/llvm+clang-upstream/bin/clang -o CMakeFiles/cmTC_a088b.dir/CheckIncludeFiles.c.o -c /gpfs/work/pcp0/pcp0151/build/nest-clang-upstream/CMakeFiles/CMakeTmp/CheckIncludeFiles.c
/gpfs/work/pcp0/pcp0151/build/nest-clang-upstream/CMakeFiles/CMakeTmp/CheckIncludeFiles.c:2:10: fatal error: 'mach/mach.h' file not found
#include <mach/mach.h>
         ^~~~~~~~~~~~~
1 error generated.
gmake[1]: *** [CMakeFiles/cmTC_a088b.dir/CheckIncludeFiles.c.o] Error 1
gmake[1]: Leaving directory '/gpfs/work/pcp0/pcp0151/build/nest-clang-upstream/CMakeFiles/CMakeTmp'
Sure, because Clang then only outputs LLVM IR and ptxas is not invoked. But what does the output file look like?
If you didn't remove the -o flag, arraydatum.cc.o will now contain text, which is what I'm looking for: it is the internal representation used during compilation and is easier to read than PTX code.
Sorry, I'm obviously doing a very bad job of explaining how to find the problem.
Okay, so according to the IR the reference to pthread_key_create comes from
std::basic_string<char, std::char_traits<char>, std::allocator<char> >::_Rep::_S_empty_rep()
which is (transitively) called from
Datum::list(std::basic_ostream<char, std::char_traits<char> >&, std::basic_string<char, std::char_traits<char>, std::allocator<char> >, int) const.
The "problem" is that
1. arraydatum.cc has two explicit instantiations of lockPTRDatum (which inherits from Datum through TypedDatum) that are wrongly emitted for the GPU, and
2. lockPTRDatum needs a vtable and all virtual member functions (including Datum::list) end up on the device.
Funnily enough, I submitted a bug report just yesterday: https://bugs.llvm.org/show_bug.cgi?id=38823
Looks like you don't even need a target region to make Clang emit the explicit instantiation...
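For illustration, here is a minimal sketch of the pattern described above. It is not taken from the NEST sources: Base, Wrapper and render are hypothetical stand-ins for Datum, lockPTRDatum and a caller. The shape is the same, though: a class template with a virtual member that uses std::string and std::ostream, an explicit instantiation of it, and no target region anywhere in the file.

```cpp
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical stand-in for Datum: a polymorphic base whose virtual
// member takes a std::ostream and a std::string by value.
struct Base {
    virtual ~Base() = default;
    virtual void list(std::ostream& out, std::string prefix, int indent) const {
        out << std::string(indent, ' ') << prefix << "Base\n";
    }
};

// Hypothetical stand-in for lockPTRDatum: a class template deriving
// from the polymorphic base and overriding the virtual member.
template <typename T>
struct Wrapper : Base {
    T value{};
    void list(std::ostream& out, std::string prefix, int indent) const override {
        out << std::string(indent, ' ') << prefix << value << '\n';
    }
};

// Explicit instantiation definition: this forces the compiler to emit
// every member of Wrapper<int>, and with them the vtable. Note that
// there is no "#pragma omp target" anywhere in this file.
template struct Wrapper<int>;

// Host-side helper that calls the virtual member through the base.
std::string render(const Base& b) {
    std::ostringstream os;
    b.list(os, "value: ", 2);
    return os.str();
}
```

Compiling a file like this with the -fopenmp -fopenmp-targets=nvptx64-nvidia-cuda flags from the failing command plus -S -emit-llvm should show whether Wrapper<int>::list ends up in the device IR. Whether this exact snippet also triggers the ptxas error will depend on the Clang version and the standard library in use, so treat it as a starting point for a reproducer rather than a confirmed one.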