[RFC] llvm-dwarfdump's command line interface

I would like to grow llvm-dwarfdump to become a drop-in replacement for the dwarfdump utility that is currently shipping on Darwin. (You can search the web for "darwin dwarfdump manpage" to see the currently supported feature set.) Doing this means implementing the missing features, such as the ability to print only subsets of DIEs, looking up DIEs by name or address, and the option to produce more diff-friendly output. I'm fairly certain that these additional features will be beneficial on all LLVM-supported platforms.
To turn it into a drop-in replacement on Darwin, I will also need to reorganize the command line interface a bit. In particular (and this is pretty much the only difference):

$ llvm-dwarfdump --debug-dump=info
$ llvm-dwarfdump --debug-dump=apple-objc

becomes

$ dwarfdump --debug-info
$ dwarfdump --apple-objc

respectively.
My question is, how attached are users on other platforms to the current command line interface? I could easily create a separate command line parser for Darwin that mimics Darwin dwarfdump (like llvm-objdump does), or we could just change the command line interface for llvm-dwarfdump. I know that there is also a dwarfdump utility on Linux (based on libdwarf?) that has an entirely different command line interface from both llvm-dwarfdump and Darwin dwarfdump.
Do people see value in keeping the llvm-dwarfdump command line interface or would changing it to the above format be acceptable?

thanks for your input!
Adrian

> I would like to grow llvm-dwarfdump to become a drop-in replacement for the dwarfdump utility that is currently shipping on Darwin. (You can search the web for “darwin dwarfdump manpage” to see the currently supported feature set.)

For anyone looking: http://www.manpagez.com/man/1/dwarfdump/

> Doing this means implementing the missing features, such as the ability to print only subsets of DIEs, looking up DIEs by name or address, and the option to produce more diff-friendly output. I’m fairly certain that these additional features will be beneficial on all LLVM-supported platforms.
> To turn it into a drop-in replacement on Darwin, I will also need to reorganize the command line interface a bit. In particular (and this is pretty much the only difference):
>
> $ llvm-dwarfdump --debug-dump=info
> $ llvm-dwarfdump --debug-dump=apple-objc
>
> becomes
>
> $ dwarfdump --debug-info
> $ dwarfdump --apple-objc
>
> respectively.
> My question is, how attached are users on other platforms to the current command line interface?

I’m not especially attached - though I imagine it’s pretty cheap to support both (though I don’t personally mind if you want to migrate from one to the other - will just take a bit to relearn the muscle memory).

One other thing: If we’re moving towards a point where llvm-dwarfdump is not just a tool for LLVM developers but a shipping product, might be worth being a bit more rigorous about testing for it (historically sometimes dwarfdump functionality hasn’t been tested - committed along with the LLVM functionality it was implemented to test - or the only testing is with checked in object files, which are a bit hard to maintain). Either looking at the DWARF YAML support and maybe fleshing it out a bit/making it more usable, or maybe assembly based tests? Not sure.

> I’m not especially attached - though I imagine it’s pretty cheap to support both (though I don’t personally mind if you want to migrate from one to the other - will just take a bit to relearn the muscle memory).

If we’re looking for the path of least resistance, we could even support both variants as aliases at the same time, since they don’t conflict.
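
As a rough illustration of why the two spellings can coexist, here is a minimal Python sketch in which both forms fill the same list of requested sections. It is not how llvm-dwarfdump parses its options (the real tool is C++ built on LLVM's own command-line library); the program name `dwarfdump-sketch`, the `NEW_STYLE_FLAGS` table, and the helper functions are hypothetical, and only the two flag pairs quoted in the proposal are modelled.

```python
import argparse

# Old-style section name -> new-style flag. Only the two pairs mentioned in
# the proposal are listed; the table itself is an illustrative assumption.
NEW_STYLE_FLAGS = {
    "info": "--debug-info",
    "apple-objc": "--apple-objc",
}


def build_parser():
    """Build a parser that accepts both spellings (hypothetical sketch)."""
    parser = argparse.ArgumentParser(prog="dwarfdump-sketch", allow_abbrev=False)
    # Old spelling: --debug-dump=<section>, may be given more than once.
    parser.add_argument("--debug-dump", dest="sections", action="append",
                        choices=sorted(NEW_STYLE_FLAGS), default=[])
    # New spelling: one flag per section, appending to the same list.
    for section, flag in NEW_STYLE_FLAGS.items():
        parser.add_argument(flag, dest="sections", action="append_const",
                            const=section)
    parser.add_argument("files", nargs="*")
    return parser


def requested_sections(argv):
    """Return the sections requested on the command line, in order."""
    return build_parser().parse_args(argv).sections
```

Because the old form always goes through `--debug-dump=` and the new form uses distinct flag names, neither spelling shadows the other, which is the sense in which the variants "don't conflict".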

> One other thing: If we’re moving towards a point where llvm-dwarfdump is not just a tool for LLVM developers but a shipping product, might be worth being a bit more rigorous about testing for it (historically sometimes dwarfdump functionality hasn’t been tested - committed along with the LLVM functionality it was implemented to test - or the only testing is with checked in object files, which are a bit hard to maintain).

Fully agreed, and indeed with all recent patches that went into llvm-dwarfdump we are already moving in that direction.

> Either looking at the DWARF YAML support and maybe fleshing it out a bit/making it more usable, or maybe assembly based tests? Not sure.

I’m trying to figure this out right now by looking at https://reviews.llvm.org/D36993

– adrian

Sony delivers the GNU objdump and a proprietary utility which have DWARF-dumping features. So in that sense, we don’t mind what you do with llvm-dwarfdump. Being able to extract subtrees sounds like a cool feature, I will say; Katya was just asking me about that yesterday.

We have a longer term ideal (I won’t call it a plan, although it might be considered an intention) to deliver the LLVM utilities instead, so more robust testing along the lines Dave suggested would be beneficial.

–paulr

> Fully agreed, and indeed with all recent patches that went into llvm-dwarfdump we are already moving in that direction.

There was some work this way in llvm-objdump as well :)

-eric

+1 for the shorter options.

In my opinion assembly-based testing is the way forward. We used this
in lld and it went a long way.
YAML, I think, is fine for simulating what MC can't emit (e.g. broken
object files).
YAML, IMHO, introduces an obfuscation layer (at least for me, but maybe
I spent too much time looking at object files).
Also, we found issues with YAML when reducing testcases with
obj2yaml/yaml2obj (in particular, the mapping is not isomorphic &
loses interesting information).

Thanks,

> In my opinion assembly-based testing is the way forward. We used this
> in lld and it went a long way.

I think there’s a few differences here:
* lld isn’t testing for invalid object input, so assembly’s sufficient
* lld’s mostly testing low-level object constructs, so assembly’s already intended to be a fairly terse/legible/maintainable textual form for object files. This doesn’t seem to scale well to DWARF, which is a format within a format, with no native dense textual representation.

For example, when writing assembly you don’t need to write the section headers/tables/etc - but when writing DWARF in assembly you have to create the headers, the abbreviations, etc, all manually.

So having a higher level dense textual DWARF representation would be more analogous to how lld tests are written in assembly.

> YAML I think it’s fine to simulate what MC can’t emit (e.g. broken
> object files).
> YAML IMHO, introduces an obfuscation layer (at least for me, but maybe
> I spent too much time looking at object files).
> Also, we found out issues with YAML when reducing testcases with
> obj2yaml/yaml2obj (in particular, the mapping is not isomorphic &
> loses interesting information).

Certainly ought to be fixed.

- Dave

>> In my opinion assembly-based testing is the way forward. We used this
>> in lld and it went a long way.
>
> I think there's a few differences here:
> * lld isn't testing for invalid object input, so assembly's sufficient
> * lld's mostly testing low-level object constructs, so assembly's already
>   intended to be a fairly terse/legible/maintainable textual form for object
>   files. This doesn't seem to scale well to DWARF, which is a format within a
>   format, with no native dense textual representation.
>
> For example, when writing assembly you don't need to write the section
> headers/tables/etc - but when writing DWARF in assembly you have to create
> the headers, the abbreviations, etc, all manually.
>
> So having a higher level dense textual DWARF representation would be more
> analogous to how lld tests are written in assembly.

I'm not opposed to using a higher-level representation, assuming it's
not too verbose.

>> YAML I think it's fine to simulate what MC can't emit (e.g. broken
>> object files).
>> YAML IMHO, introduces an obfuscation layer (at least for me, but maybe
>> I spent too much time looking at object files).
>> Also, we found out issues with YAML when reducing testcases with
>> obj2yaml/yaml2obj (in particular, the mapping is not isomorphic &
>> loses interesting information).
>
> Certainly ought to be fixed.

It is getting fixed, one bug at a time :)