A broad goal of many of the LLVM binary tools, such as llvm-objcopy and llvm-objdump, is to provide an alternative to the GNU equivalent, and as such, these tools have been developed to be command-line compatible. One tool where this hasn’t been the case up to now is llvm-readobj (aka llvm-readelf).
There was some discussion in https://reviews.llvm.org/D33872 about the purpose of llvm-readobj, so I’d like to ask the community’s opinion. What is the purpose of llvm-readobj? Is it purely intended as an aid to testing? Should it be aiming to be GNU compatible, like most of the rest of the LLVM tools?
The main issue I discovered with GNU compatibility is that a few llvm-readobj command-line flags are interpreted differently from the same flags in GNU readelf:
-s means dump symbols in GNU readelf, but dump sections in llvm-readobj
-t means dump section details in GNU readelf, but dump symbols in llvm-readobj
-a means dump all in GNU readelf, but dump ARM attributes in llvm-readobj
There are also several missing aliases and some missing features, but we can implement those with no negative impact on the users of llvm-readobj, so I won’t discuss those here.
Also of relevance here is how several letters following a single dash are handled. My understanding of GNU’s behaviour is that each letter is treated as a separate option, whereas in llvm-readobj the whole string is treated as one single option (e.g. ‘readelf -abc’ would be equivalent to ‘readelf -a -b -c’, but ‘llvm-readobj -abc’ is equivalent to ‘llvm-readobj --abc’). This is at least partly related to the cl::opt/libOption issues discussed in http://lists.llvm.org/pipermail/llvm-dev/2018-October/127328.html.
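To make the difference concrete, here is a minimal Python sketch of the two interpretations of a single-dash argument. The function names are hypothetical, and this is not how either tool is actually implemented; it only models the behaviour described above and ignores short options that take a value.

```python
def expand_gnu_style(arg):
    """Model GNU-style grouping: '-abc' becomes '-a', '-b', '-c'."""
    if arg.startswith("-") and not arg.startswith("--") and len(arg) > 2:
        return ["-" + letter for letter in arg[1:]]
    return [arg]


def expand_llvm_style(arg):
    """Model the llvm-readobj behaviour described above: '-abc' is '--abc'."""
    if arg.startswith("-") and not arg.startswith("--") and len(arg) > 2:
        return ["--" + arg[1:]]
    return [arg]


print(expand_gnu_style("-abc"))   # ['-a', '-b', '-c']
print(expand_llvm_style("-abc"))  # ['--abc']
```

Single-letter options such as -s pass through both functions unchanged, so the two models only diverge when several letters follow one dash.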
I’d like to propose that we fix the three switches above such that they match GNU readelf’s interpretation, and that we change short-option handling similarly. This would inevitably result in some test churn (there are approximately 200 tests between core LLVM and lld that would need updating), but it is manageable. More of an issue is that anyone who has already started using llvm-readobj would suddenly find the switches changing on them. On the other hand, I think the benefit for those used to GNU readelf outweighs the cost.
If we decide against just taking the plunge and changing the meanings, we could do a few different things to mitigate the impact of changing over, listed roughly in my order of preference:
1) For the next release, add a deprecation warning, saying that the switches’ meanings will be changed in a following release, and then fix it after the next release has been created, along with release notes documenting the change.
2) Provide a “--gnu-mode” or similar switch that changes the meaning of the command-line switches above to match GNU readelf. This again provides an opt-in, but also allows downstream ports to enable it by default, should they wish.
3) Change the meaning of the switches only for llvm-readelf, and not for llvm-readobj. This is similar to the behaviour of --elf-output-style: it is GNU for llvm-readelf, and LLVM for llvm-readobj, but it does have essentially the same potential for disrupting users as 1).
4) Provide a third user-facing driver (e.g. “llvm-gnu-readelf”) that provides a different CLI to the others. This makes it an opt-in feature, by using a different executable.
5) Just accept this divergence, although I personally would prefer not to, as this has the potential to confuse users migrating from GNU tools to LLVM tools.
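As a rough illustration of the two opt-in approaches above (a GNU-mode switch, and changing the meanings only for llvm-readelf), the following Python sketch selects a flag table from the invoked tool name or from the presence of the switch. The spelling `--gnu-mode`, the function name, and the table layout are assumptions made for illustration only; the flag meanings are the ones listed earlier in this post.

```python
import os

# Flag meanings as described in the list of conflicting switches above.
GNU_MEANINGS = {"-s": "symbols", "-t": "section details", "-a": "all"}
LLVM_MEANINGS = {"-s": "sections", "-t": "symbols", "-a": "ARM attributes"}


def select_meanings(argv0, args):
    """Pick the flag table for this invocation (hypothetical helper).

    GNU meanings apply when the tool is invoked as llvm-readelf, or when
    the assumed --gnu-mode switch is present; otherwise the existing
    llvm-readobj meanings are kept.
    """
    tool = os.path.basename(argv0)
    if tool == "llvm-readelf" or "--gnu-mode" in args:
        return GNU_MEANINGS
    return LLVM_MEANINGS


print(select_meanings("/usr/bin/llvm-readelf", ["-s"])["-s"])  # symbols
print(select_meanings("/usr/bin/llvm-readobj", ["-s"])["-s"])  # sections
```

Under either approach, existing llvm-readobj invocations keep their current meaning unless the user opts in, which is the main attraction over changing the defaults outright.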