Background
While investigating an issue I have had for years, I found that CommandLine does not allow more than one library linked against the same libLLVM version to be loaded into a process. Instead it aborts. The comment above the abort seems to indicate that this is unrecoverable.
llvm/lib/Support/CommandLine.cpp
// Fail hard if there were errors. These are strictly unrecoverable and
// indicate serious issues such as conflicting option names or an
// incorrectly
// linked LLVM distribution.
if (HadErrors)
report_fatal_error("inconsistency in registered CommandLine options");
So I tested that hypothesis by toying around and posted the result in [⚙ D141019 Support/CommandLine: replace argument mapping error with a warning]. It does not hold. The problem with this revision is that it now allows libLLVM to be loaded any number of times, including different versions, and it also allows duplicate options. So this is not a solution.
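To make the failure mode concrete, here is a small self-contained model of it. This is not LLVM code: `ToyRegistry`, `registerOption` and the library names are hypothetical. The only idea borrowed from CommandLine.cpp is that option names live in a map and that a second insert of the same name sets an error flag, which the real code then turns into a fatal error.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a process-wide option table; not an LLVM type.
struct ToyRegistry {
  std::map<std::string, std::string> Options; // option name -> registering library
  std::vector<std::string> Diagnostics;
  bool HadErrors = false;

  // Returns true if Name was new. A second registration of the same name is
  // recorded as an error, mirroring the "registered more than once" message.
  bool registerOption(const std::string &Name, const std::string &Library) {
    assert(!Name.empty() && "option needs a name");
    bool Inserted = Options.insert({Name, Library}).second;
    if (!Inserted) {
      Diagnostics.push_back("Option '" + Name + "' registered more than once!");
      HadErrors = true;
    }
    return Inserted;
  }
};

// Registers option "h" on behalf of two hypothetical libraries and reports
// whether the registry would have to fail hard afterwards.
bool secondRegistrationFails() {
  ToyRegistry R;
  R.registerOption("h", "libA");
  R.registerOption("h", "libB");
  return R.HadErrors;
}
```

In this model the second registrant of `h` is what sets `HadErrors`. The revision mentioned above amounts to not acting on that flag.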
Related issues
- cl::opt + LLVM_BUILD_LLVM_DYLIB is completely broken · Issue #23326 · llvm/llvm-project · GitHub
- Inconsistency in commandline options with multiple OpenCL vendor libraries installed · Issue #29935 · llvm/llvm-project · GitHub
Related topics
The related topics mostly seem to concern mixing static and dynamic LLVM libraries or mixing versions. This topic is about linking (indirectly) against the same libLLVM library multiple times.
- Opt: Option registered more than once!
- Can something be done with the "inconsistency in registered CommandLine options" error
- LLVM as a shared library
- Flang and LLVM_BUILD_LLVM_DYLIB=ON
Problem instances
Mesa with Clover and RocM installed
The problem can be reproduced by installing mesa with the Clover OpenCL implementation (-Dgallium-opencl=icd) and rocm-opencl-runtime at the same time. Both link to libLLVM, so clinfo will abort reproducibly on such systems.
LLVM is compiled with:
-DBUILD_SHARED_LIBS=OFF
-DLLVM_BUILD_LLVM_DYLIB=ON
-DLLVM_LINK_LLVM_DYLIB=ON
Clover linkage:
$ ldd /usr/lib64/libMesaOpenCL.so.1.0.0
…
libLLVM-15.so => /usr/lib/llvm/15/lib64/libLLVM-15.so
…
RocM linkage:
$ ldd /usr/lib64/libamd_comgr.so.2.4.0
…
libLLVM-15.so => /usr/lib/llvm/15/lib64/libLLVM-15.so
…
Without revision:
$ clinfo
mesa: CommandLine Error: Option 'h' registered more than once!
LLVM ERROR: inconsistency in registered CommandLine options
Aborted
With revision:
$ clinfo
warning: mesa: CommandLine Error: Option 'h' registered more than once!
Number of platforms 2
Platform Name Clover
Platform Vendor Mesa
Platform Version OpenCL 1.1 Mesa 22.2.3
Platform Profile FULL_PROFILE
Platform Extensions cl_khr_icd
Platform Extensions function suffix MESA
Platform Name AMD Accelerated Parallel Processing
Platform Vendor Advanced Micro Devices, Inc.
Platform Version OpenCL 2.1 AMD-APP.dbg (3486.0)
Platform Profile FULL_PROFILE
Platform Extensions cl_khr_icd cl_amd_event_callback
Platform Extensions function suffix AMD
Platform Host timer resolution 1ns
…
With the revision also darktable-cltest works:
[dt_get_sysresource_level] switched to 1 as `default'
total mem: 257576MB
mipmap cache: 32197MB
available mem: 128788MB
singlebuff: 2012MB
OpenCL tune mem: OFF
OpenCL pinned: OFF
[opencl_init] opencl related configuration options:
[opencl_init] opencl: ON
[opencl_init] opencl_scheduling_profile: 'very fast GPU'
[opencl_init] opencl_library: 'default path'
[opencl_init] opencl_device_priority: '*/!0,*/*/*'
[opencl_init] opencl_mandatory_timeout: 400
[opencl_init] opencl_synch_cache: false
[opencl_init] opencl library 'libOpenCL' found on your system and loaded
warning: mesa: CommandLine Error: Option 'h' registered more than once!
[opencl_init] found 2 platforms
[opencl_init] found 2 devices
[dt_opencl_device_init]
DEVICE: 0: 'AMD Radeon RX 5500 XT (navi14, LLVM 15.0.6, DRM 3.48, 6.0.16-gentoo)', NEW
CANONICAL NAME: amdradeonrx5500xtnavi14llvm1506drm3486016gentoo
PLATFORM NAME & VENDOR: Clover, Mesa
DRIVER VERSION: 22.2.3
DEVICE VERSION: OpenCL 1.1 Mesa 22.2.3
DEVICE_TYPE: GPU
*** The OpenCL driver doesn't provide image support. See also 'clinfo' output ***
[dt_opencl_device_init]
DEVICE: 1: 'gfx1012:xnack-'
CANONICAL NAME: gfx1012xnack
PLATFORM NAME & VENDOR: AMD Accelerated Parallel Processing, Advanced Micro Devices, Inc.
DRIVER VERSION: 3486.0 (HSA1.1,LC)
DEVICE VERSION: OpenCL 2.0
DEVICE_TYPE: GPU
GLOBAL MEM SIZE: 8176 MB
MAX MEM ALLOC: 6950 MB
MAX IMAGE SIZE: 16384 x 16384
MAX WORK GROUP SIZE: 256
MAX WORK ITEM DIMENSIONS: 3
MAX WORK ITEM SIZES: [ 1024 1024 1024 ]
ASYNC PIXELPIPE: NO
PINNED MEMORY TRANSFER: NO
MEMORY TUNING: NO
FORCED HEADROOM: 400
AVOID ATOMICS: NO
MICRO NAP: 250
ROUNDUP WIDTH: 16
ROUNDUP HEIGHT: 16
CHECK EVENT HANDLES: 128
PERFORMANCE: 2.107083
DEFAULT DEVICE: NO
KERNEL DIRECTORY: /usr/share/darktable/kernels
CL COMPILER OPTION: -cl-fast-relaxed-math
KERNEL LOADING TIME: 0.0195 sec
[opencl_init] OpenCL successfully initialized.
[opencl_init] here are the internal numbers and names of OpenCL devices available to darktable:
[opencl_init] 0 'gfx1012:xnack-'
[opencl_init] FINALLY: opencl is AVAILABLE on this system.
[opencl_init] initial status of opencl enabled flag is ON.
[opencl_init] set scheduling profile for very fast GPU.
[dt_opencl_update_priorities] these are your device priorities:
[dt_opencl_update_priorities] image preview export thumbs preview2
[dt_opencl_update_priorities] 0 0 0 0 0
[dt_opencl_update_priorities] show if opencl use is mandatory for a given pixelpipe:
[dt_opencl_update_priorities] image preview export thumbs preview2
[dt_opencl_update_priorities] 1 1 1 1 1
[opencl_synchronization_timeout] synchronization timeout set to 0
I chose Clover and RocM because this is the easiest setup to reproduce, and you most likely do not need an amdgpu device for it. Note that compiling only one of the two libraries with the patched LLVM results in no error or warning at all. Both have to be compiled either with or without the patch/revision.
mpv utilizing hardware decoding and OpenCL
Using mpv with hardware decoding on a system with only RocM installed also reproduces the issue. mpv is compiled with vapoursynth support and uses the SVP plugin. [1] SVP can run with and without OpenCL, so OpenCL must be activated via “Application settings”‣“GPU acceleration”‣“Your OpenCL device”.
mpv config must include:
no-resume-playback
input-ipc-server=/tmp/mpvsocket
hr-seek-framedrop=no
hwdec=vaapi-copy
hwdec-codecs=all
For Intel this is not reproducible, because their hardware decoding implementation does not link against libLLVM. Nouveau links against libLLVM with VA, but does not offer an OpenCL implementation. rusticl will link against it, so once it matures, the issue should also be reproducible with Intel GPUs.
Linkage mesa radeonsi vaapi driver
$ ldd /usr/lib64/va/drivers/radeonsi_drv_video.so
…
libLLVM-15.so => /usr/lib/llvm/15/lib64/libLLVM-15.so
…
Linkage RocM comgr
$ ldd /usr/lib64/libamd_comgr.so.2.4.0
…
libLLVM-15.so => /usr/lib/llvm/15/lib64/libLLVM-15.so
mpv execution without revision:
(+) Video --vid=1 (*) (vp9 3840x2160 23.976fps)
(+) Audio --aid=1 --alang=eng (*) (opus 2ch 48000Hz)
File tags:
Uploader: LLVM
Channel_URL: https://www.youtube.com/channel/UCv2_41bSAa5Y_8BacJUZfjQ
Using hardware decoding (vaapi-copy).
AO: [alsa] 48000Hz stereo 2ch float
VO: [gpu] 3840x2160 nv12
[xrandr] output DisplayPort-0 mode=5120x1440 old rate=96.04 refresh rates = 119.97 + 120.05 96.04* 72.00 60.00 50.00 48.01 100.00 60.00 59.98 30.00 25.00 24.00 23.98 scale = 1
[xrandr] container fps is 23.976024627686Hz, for output DisplayPort-0 mode 5120x1440 the best fitting display fps rate is 96.04Hz
[autoconvert] Converting nv12 -> yuv420p
mesa: CommandLine Error: Option 'h' registered more than once!
LLVM ERROR: inconsistency in registered CommandLine options
Aborted
mpv execution with revision:
$ mpv https://www.youtube.com/watch?v=VbFqA9rvxPs
(+) Video --vid=1 (*) (vp9 3840x2160 23.976fps)
(+) Audio --aid=1 --alang=eng (*) (opus 2ch 48000Hz)
File tags:
Uploader: LLVM
Channel_URL: https://www.youtube.com/channel/UCv2_41bSAa5Y_8BacJUZfjQ
Using hardware decoding (vaapi-copy).
AO: [alsa] 48000Hz stereo 2ch float
VO: [gpu] 3840x2160 nv12
[xrandr] output DisplayPort-0 mode=5120x1440 old rate=60.00 refresh rates = 119.97 + 120.05 96.04 72.00 60.00 50.00 48.01 100.00 60.00* 59.98 30.00 25.00 24.00 23.98 scale = 1
[xrandr] container fps is 23.976024627686Hz, for output DisplayPort-0 mode 5120x1440 the best fitting display fps rate is 96.04Hz
[autoconvert] Converting nv12 -> yuv420p
warning: mesa: CommandLine Error: Option 'h' registered more than once!
VO: [gpu] 3840x2160 yuv420p
AV: 00:00:03 / 00:45:44 (0%) A-V: 0.016 DS: 1.980/3 Dropped: 5 Cache: 5.9s/3MB
Exiting... (Quit)
[xrandr] switching output DisplayPort-0 that was set for replay to mode 5120x1440 at 96.04Hz and 1.0 scaling back to mode 5120x1440 with refresh rate 60.00Hz and 1 scaling
Proposal
In my understanding the behaviour so far has been:
1. Abort when linking dynamically/statically against different LLVM versions.
2. Abort when linking dynamically/statically against the same LLVM version.
3. Abort if a statically linked libLLVM is mixed with a dynamically linked one. This follows from 2 and might be desirable to keep.
I’d like to propose that 2 only produces a warning in the dynamic case, but I’m not sure how to achieve this. The linked revision so far just allows all cases, which will likely lead to broken software.
To prevent 1, a version option should be registered before any other CommandLine call. I could use a hint as to where this could be emitted, since I’m not familiar with the LLVM project. Maybe there is a base module that calls CommandLine with the argument string.
Afterwards the check could be something like this:
// Defined somewhere at build time and included by CommandLine.h.
// (LLVM already ships LLVM_VERSION_STRING in llvm/Config/llvm-config.h,
// which might serve the same purpose.)
#define LLVM_VERSION "some version string"

// CommandLine.cpp
#include "llvm/Support/WithColor.h"

class CommandLineParser {
public:
  //...
  // Set once a "version" option matching our own version has been registered.
  bool SameVersion = false;
  //...
  void addOption(Option *O, SubCommand *SC) {
    bool HadErrors = false;
    if (O->hasArgStr()) {
      if (O->ArgStr == "version" && O->ValueStr == LLVM_VERSION)
        SameVersion = true;

      // If it's a DefaultOption, check to make sure it isn't already there.
      if (O->isDefaultOption() &&
          SC->OptionsMap.find(O->ArgStr) != SC->OptionsMap.end())
        return;

      // Add argument to the argument map!
      if (!SC->OptionsMap.insert(std::make_pair(O->ArgStr, O)).second) {
        if (SameVersion) {
          // Same libLLVM version seen: downgrade the duplicate to a warning.
          WithColor::warning() << ProgramName << ": CommandLine Error: Option '"
                               << O->ArgStr << "' registered more than once!\n";
        } else {
          errs() << ProgramName << ": CommandLine Error: Option '" << O->ArgStr
                 << "' registered more than once!\n";
          HadErrors = true;
        }
      }
    }
    // ...
  }
};
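The behaviour of the `SameVersion` flag is easier to poke at in a toy that compiles without LLVM. The following is a minimal sketch under that assumption: `ToyOption` and `ToyParser` are hypothetical names, the version strings are made up, and diagnostics are collected in vectors instead of being printed. The flag logic follows the snippet above as closely as I could manage.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical toy types; none of these names exist in LLVM.
struct ToyOption {
  std::string ArgStr;   // option name, e.g. "h"
  std::string ValueStr; // only meaningful for the "version" marker option
};

class ToyParser {
public:
  explicit ToyParser(std::string Version) : OwnVersion(std::move(Version)) {}

  // Returns false if the registration has to be treated as a hard error.
  bool addOption(const ToyOption &O) {
    assert(!O.ArgStr.empty() && "option needs a name");
    // A "version" marker carrying our own version flips the flag for good.
    if (O.ArgStr == "version" && O.ValueStr == OwnVersion)
      SameVersion = true;
    if (Options.insert({O.ArgStr, O}).second)
      return true; // first registration, nothing to report
    std::string Msg = "Option '" + O.ArgStr + "' registered more than once!";
    if (SameVersion) {
      Warnings.push_back(Msg); // duplicate tolerated, only warned about
      return true;
    }
    Errors.push_back(Msg); // no matching version marker seen: hard error
    return false;
  }

  std::vector<std::string> Warnings, Errors;

private:
  std::string OwnVersion;
  bool SameVersion = false;
  std::map<std::string, ToyOption> Options;
};
```

Registering `{"version", "15.0.6"}` on a `ToyParser("15.0.6")` before any duplicate turns later duplicates into warnings, while a parser that only ever saw `{"version", "14.0.0"}` still rejects the second `h`. Because the flag is never reset, every duplicate after the first matching marker is accepted.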
It is still less than ideal, because it allows duplicate options as soon as a second library with the same version has been encountered. While at it, two refinements come to mind:
- Print a version mismatch when it occurs, including the versions involved. Older versions that do not include a version string could simply be reported as <16.x.
- Print the names/symbols, or whatever information is available at runtime, of the two or more libraries that load libLLVM. This could help distributions, developers and end users deal with such issues.
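The first refinement could look roughly like the sketch below. It assumes that each library's version is available as a plain string and that an empty string stands for a library without a version marker. Both helper names are hypothetical and the message wording is only a suggestion.

```cpp
#include <cassert>
#include <string>

// Hypothetical helpers, not part of LLVM. An empty string models a library
// built from an LLVM that predates the version marker.
std::string describeVersion(const std::string &V) {
  return V.empty() ? std::string("<16.x") : V;
}

// Returns an empty string when both versions are known and equal, otherwise
// a one-line diagnostic naming both sides.
std::string versionMismatchMessage(const std::string &First,
                                   const std::string &Second) {
  if (!First.empty() && First == Second)
    return "";
  return "LLVM version mismatch: " + describeVersion(First) + " vs. " +
         describeVersion(Second);
}

// Usage example with made-up version strings.
void versionMismatchExample() {
  assert(versionMismatchMessage("16.0.0", "16.0.0").empty());
  assert(versionMismatchMessage("16.0.0", "") ==
         "LLVM version mismatch: 16.0.0 vs. <16.x");
}
```

Two libraries without a marker are reported as a mismatch as well, since their equality cannot be established from the strings alone.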
So, all in all, I wanted to point out the issue. I can’t provide a fully working solution here, so I’d appreciate any hints on that matter. If someone steps up to solve this problem on .* own, I would not mind. After all, I’m just an end user investigating a software issue, and while I have a C, a bash and an embarrassing C# hat lying around, I unfortunately lack the ones for C++ and Python.
Looking forward to your thoughts, excluding those about why I dare to break stuff. I already have those on my mind.