Are you confident that /home/kli/clang-install/lib is the same on all of the nodes used by the MPI program?
And that it contains the same version of libomp.so everywhere?
Perhaps you should also set an environment variable to have the OpenMP runtime print its version, something like this:
$ KMP_VERSION=1 ./a.out
LLVM OMP version: 5.0.20140926
LLVM OMP library type: performance
LLVM OMP link type: dynamic
LLVM OMP build time: no_timestamp
LLVM OMP build compiler: Clang 12.0
LLVM OMP alternative compiler support: yes
LLVM OMP API version: 5.0 (201611)
LLVM OMP dynamic error checking: no
LLVM OMP thread affinity support: no
I don’t think that is the case. There is only one task “-np 1” on one node. Both ‘./a.out’ and ‘mpirun -np 1 ./a.out’ are issued on the same node which has the same library in /home/kli/clang-install/lib. That is puzzling me!
It really looks as if you’re getting two different versions of the runtime, though, so having the runtime tell you its properties is still likely to be useful. If nothing else, it may show that you’re not propagating environment variables as you might have hoped (if the MPI version doesn’t print anything!)
I figured out how to make it work: I need to preload libarcher.so. I don’t understand why that cannot happen automatically in the “mpirun … ./a.out” case.
I tried both commands with LD_DEBUG. It seems that somehow the
libarcher.so cannot be found to resolve ompt_start_tool.
libomp calls ompt_start_tool directly (by name) in
In a typical execution, this will find the implementation in libomp:
If I did not miss something, implementation and call should be in the
same ifdef branch, i.e., both active or not.
Reasoning: This explicit call by name is necessary to catch the case of
a static tool compiled in the application (linker will prefer
ompt_start_tool from the static tool). Such static version might not be
found by dlsym.
Later, libomp implicitly assumes "libarcher to be the last entry in
OMPT_TOOL_LIBRARIES", dlopens libarcher and dlsyms ompt_start_tool:
Therefore you see libarcher in the LD_DEBUG output.
This is how it looks for me:
$ LD_DEBUG=bindings ./a.out 2>&1 | grep ompt_start_tool
1568: binding file libomp.so [0] to libomp.so [0]: normal symbol `ompt_start_tool' [VERSION]
1568: binding file libarcher.so [0] to libarcher.so [0]: normal symbol `ompt_start_tool'
$ LD_DEBUG=bindings LD_LIBRARY_PATH=/home/kli/clang-install/lib ./a.out 2>&1 | grep ompt_start_tool
9105: binding file /home/kli/clang-install/lib/libomp.so [0] to /home/kli/clang-install/lib/libomp.so [0]: normal symbol `ompt_start_tool' [VERSION]
9105: /home/kli/clang-install/lib/libomp.so: error: symbol lookup error: undefined symbol: ompt_start_tool (fatal)
Here the execution complains about the missing symbol but does not
abort. It is unclear to me why the execution does not abort in this case.
9105: binding file /home/kli/clang-install/lib/libarcher.so [0] to /home/kli/clang-install/lib/libarcher.so [0]: normal symbol `ompt_start_tool'
$ LD_DEBUG=bindings LD_LIBRARY_PATH=/home/kli/clang-install/lib mpirun -np 1 ./a.out 2>&1 | grep ompt_start_tool
27652: binding file /home/kli/clang-install/lib/libomp.so [0] to /home/kli/clang-install/lib/libomp.so [0]: normal symbol `ompt_start_tool' [VERSION]
27652: /home/kli/clang-install/lib/libomp.so: error: symbol lookup error: undefined symbol: ompt_start_tool (fatal)
Same message as above, but now the execution aborts:
./a.out: symbol lookup error: /home/kli/clang-install/lib/libomp.so: undefined symbol: ompt_start_tool
Since the dlopen for libarcher happens after the explicit call to
ompt_start_tool, and the execution aborts at that call, the binding
message for libarcher is necessarily missing here.
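To compare such traces quickly, the two signals discussed here (the failed direct lookup and the later libarcher binding) can be pulled out of the LD_DEBUG output with a small filter. This is only a sketch: classify_bindings is a hypothetical helper name, and the patterns are copied from the messages quoted in this thread, so other loader versions may word them differently.

```shell
# Hypothetical helper: reads LD_DEBUG=bindings output on stdin and reports
# (a) whether the direct lookup of ompt_start_tool failed and
# (b) whether a binding inside libarcher.so was recorded.
# Usage, with the commands from this thread:
#   LD_DEBUG=bindings mpirun -np 1 ./a.out 2>&1 | classify_bindings
classify_bindings() {
    awk '
        /undefined symbol: ompt_start_tool/ { missing = 1 }
        /binding file .*libarcher\.so/      { archer = 1 }
        END {
            print "direct call unresolved: " (missing ? "yes" : "no")
            print "libarcher bound: " (archer ? "yes" : "no")
        }'
}
```

A trace that reports "direct call unresolved: yes" together with "libarcher bound: no" would match the aborting mpirun run above.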
Does your libomp contain an implementation of ompt_start_tool?
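One way to answer that is to list the dynamic symbols that the library defines. The sketch below assumes GNU nm from binutils is available; lists_symbol is a hypothetical helper name, and the library path is the one mentioned earlier in this thread.

```shell
# Hypothetical helper: reads `nm -D --defined-only` output on stdin and
# succeeds if the symbol named in $1 is listed (any @VERSION suffix is ignored).
lists_symbol() {
    awk -v sym="$1" '
        { name = $NF; sub(/@.*/, "", name); if (name == sym) found = 1 }
        END { exit !found }'
}

lib=/home/kli/clang-install/lib/libomp.so   # path taken from this thread
if nm -D --defined-only "$lib" 2>/dev/null | lists_symbol ompt_start_tool; then
    echo "ompt_start_tool is defined in $lib"
else
    echo "ompt_start_tool is NOT defined in $lib (or the file could not be read)"
fi
```

If the symbol is not defined there, the direct call in libomp can only be resolved by something else that is already loaded, which would be consistent with preloading libarcher.so making the error go away.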