RFC: Improving our DWARF (and ELF) emission testing capabilities

Hi All,

While working on some recent patches for x32 support, I ran into an
unpleasant limitation the LLVM ecosystem has with testing DWARF
emission. We currently have several approaches, none of which is
great:

1. llvm-dwarfdump: the best approach when it works. But unfortunately
lib/DebugInfo supports only a (small) subset of DWARF. Tricky sections
like debug_frame aren't supported.
2. Relying on assembly directive emission (e.g. .cfi_*), which is
cumbersome and misses a lot of things like actual DWARF encoding.
3. Using elf-dump and examining the raw binary dumps. This makes tests
nearly unmaintainable.

The latter is also why IMHO our ELF emission in general isn't well
tested. elf-dump is just too rudimentary and relies on simple (=dumb)
binary contents dumps.

The long-term solution for DWARF would be to enhance lib/DebugInfo to
the point where it can handle all interesting DWARF sections. But this
is a lofty goal, since DWARF parsing is notoriously hard and this
would require a large investment of time and effort. And in the
meantime, we just don't write good enough tests (and enough of them)
for this very important feature.

Therefore, as an interim stage, I propose to adopt some external tool
that parses DWARF and emits decoded textual dumps which makes tests
easy to write.

Concretely, I have a pure Python library named pyelftools
(https://bitbucket.org/eliben/pyelftools) which provides comprehensive
ELF and DWARF parsing capabilities and has a dumper that's fully
compatible with the readelf command. Using pyelftools would allow us
to immediately improve the quality of our tests, and as lib/DebugInfo
matures llvm-dwarfdump can gradually replace the dumper without
changing the actual tests.

pyelftools is relatively widely used, so it's well tested; all it
requires is Python 2.6 or higher, and its code is in the public
domain. So it can live in tools/ or test/Scripts or wherever and be
distributed with LLVM. I actively maintain it, and adapting it to
LLVM's purposes should be relatively easy. As a bonus, it has a much
smarter ELF parser & dumper that can replace the ad-hoc elf-dump. It
has also been successfully adapted in the past to read DWARF from
Mach-O files, if that's required.

Eli

+ other debug info people (Eric & Paul)

Hi All,

While working on some recent patches for x32 support, I ran into an
unpleasant limitation the LLVM ecosystem has with testing DWARF
emission. We currently have several approaches, none of which is
great:

1. llvm-dwarfdump: the best approach when it works. But unfortunately
lib/DebugInfo supports only a (small) subset of DWARF. Tricky sections
like debug_frame aren't supported.

Ideally I'd like to see support added whenever a code change is made
to a feature - so long as we hold ourselves to a "test new changes"
policy, that can gate/encourage the necessary feature support in
llvm-dwarfdump.

Since no one's likely to go back & write a bunch of regression tests
for all the existing code it seems premature to add new features to
llvm-dwarfdump before there's a use-case. It does sometimes mean bug
fixes appear to be costly because they include adding the missing test
infrastructure support, but that's essentially where the cost is
anyway.

2. Relying on assembly directive emission (e.g. .cfi_*), which is
cumbersome and misses a lot of things like actual DWARF encoding.

I'm not sure what you mean by "actual DWARF encoding" here.
(disclaimer: I've only recently started dabbling with debug info, so I
may be missing obvious things)

3. Using elf-dump and examining the raw binary dumps. This makes tests
nearly unmaintainable.

The latter is also why IMHO our ELF emission in general isn't well
tested. elf-dump is just too rudimentary and relies on simple (=dumb)
binary contents dumps.

The long-term solution for DWARF would be to enhance lib/DebugInfo to
the point where it can handle all interesting DWARF sections. But this
is a lofty goal, since DWARF parsing is notoriously hard and this
would require a large investment of time and effort. And in the
meantime, we just don't write good enough tests (and enough of them)
for this very important feature.

Are there particular recent commits you've been concerned about the
test quality of? I've been trying to keep an eye on this but, again,
don't necessarily fully understand the ramifications of some changes.

Therefore, as an interim stage, I propose to adopt some external tool
that parses DWARF and emits decoded textual dumps which makes tests
easy to write.

Concretely, I have a pure Python library named pyelftools
(https://bitbucket.org/eliben/pyelftools) which provides comprehensive
ELF and DWARF parsing capabilities and has a dumper that's fully
compatible with the readelf command. Using pyelftools would allow us
to immediately improve the quality of our tests, and as lib/DebugInfo
matures llvm-dwarfdump can gradually replace the dumper without
changing the actual tests.

I would be a little hesitant about test execution performance if it
involved invoking new Python processes for each debug info test. But
numbers could convince me. Beyond that I can't rationally claim any
particular need to support llvm-dwarfdump as the tool of choice over
any 3rd party tool.

1. llvm-dwarfdump: the best approach when it works. But unfortunately
lib/DebugInfo supports only a (small) subset of DWARF. Tricky sections
like debug_frame aren't supported.

Ideally I'd like to see support added whenever a code change is made
to a feature - so long as we hold ourselves to a "test new changes"
policy, that can gate/encourage the necessary feature support in
llvm-dwarfdump.

Since no one's likely to go back & write a bunch of regression tests
for all the existing code it seems premature to add new features to
llvm-dwarfdump before there's a use-case. It does sometimes mean bug
fixes appear to be costly because they include adding the missing test
infrastructure support, but that's essentially where the cost is
anyway.

See test/MC/ELF/cfi-register.s for a test I consider unmaintainable,
since it just matches raw elf-dump output and requires manual decoding
of the data for every change and addition. When tests are too hard to
write, fewer tests get written.

2. Relying on assembly directive emission (e.g. .cfi_*), which is
cumbersome and misses a lot of things like actual DWARF encoding.

I'm not sure what you mean by "actual DWARF encoding" here.
(disclaimer: I've only recently started dabbling with debug info, so I
may be missing obvious things)

I mean that it doesn't test the whole pipeline: there's quite a bit
of DWARF-related functionality in MC, and a test that only matches
directives in the ASM output leaves that code unexercised.
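
To make the CFI case concrete: the encoded form packs the primary call
frame instructions into the top two bits of a byte, which is precisely
what a directive-matching test never sees. A rough decoding sketch
(hand-rolled for illustration, simplified, and not taken from MC or
any existing dumper):

```python
def uleb128(data, pos):
    """Decode one unsigned LEB128 value; return (value, new_pos)."""
    result = shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7f) << shift
        shift += 7
        if not byte & 0x80:
            return result, pos

def decode_cfa(data):
    """Decode primary (high-two-bit) DWARF call frame opcodes to text."""
    pos, out = 0, []
    while pos < len(data):
        byte = data[pos]
        pos += 1
        hi, lo = byte >> 6, byte & 0x3f
        if hi == 1:                       # DW_CFA_advance_loc: delta in low bits
            out.append("DW_CFA_advance_loc: %d" % lo)
        elif hi == 2:                     # DW_CFA_offset: reg in low bits, ULEB offset
            off, pos = uleb128(data, pos)
            out.append("DW_CFA_offset: r%d, factored offset %d" % (lo, off))
        elif hi == 3:                     # DW_CFA_restore: reg in low bits
            out.append("DW_CFA_restore: r%d" % lo)
        else:
            out.append("extended opcode 0x%02x" % byte)
    return out

# 0x44 = advance_loc 4; 0x9d 0x02 = offset r29, factored offset 2
assert decode_cfa(bytes([0x44, 0x9d, 0x02])) == [
    "DW_CFA_advance_loc: 4", "DW_CFA_offset: r29, factored offset 2"]
```

A directive-level test only ever sees the .cfi_* lines; the byte
packing above is what actually lands in the object file.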

3. Using elf-dump and examining the raw binary dumps. This makes tests
nearly unmaintainable.

The latter is also why IMHO our ELF emission in general isn't well
tested. elf-dump is just too rudimentary and relies on simple (=dumb)
binary contents dumps.

The long-term solution for DWARF would be to enhance lib/DebugInfo to
the point where it can handle all interesting DWARF sections. But this
is a lofty goal, since DWARF parsing is notoriously hard and this
would require a large investment of time and effort. And in the
meantime, we just don't write good enough tests (and enough of them)
for this very important feature.

Are there particular recent commits you've been concerned about the
test quality of? I've been trying to keep an eye on this but, again,
don't necessarily fully understand the ramifications of some changes.

See basically every test employing elf-dump for non-trivial things.

Therefore, as an interim stage, I propose to adopt some external tool
that parses DWARF and emits decoded textual dumps which makes tests
easy to write.

Concretely, I have a pure Python library named pyelftools
(https://bitbucket.org/eliben/pyelftools) which provides comprehensive
ELF and DWARF parsing capabilities and has a dumper that's fully
compatible with the readelf command. Using pyelftools would allow us
to immediately improve the quality of our tests, and as lib/DebugInfo
matures llvm-dwarfdump can gradually replace the dumper without
changing the actual tests.

I would be a little hesitant about test execution performance if it
involved invoking new Python processes for each debug info test. But
numbers could convince me. Beyond that I can't rationally claim any
particular need to support llvm-dwarfdump as the tool of choice over
any 3rd party tool.

This is already done with elf-dump (a Python script), which is used
for a lot of tests for lack of better options.

Eli

> I would be a little hesitant about test execution performance if it
> involved invoking new Python processes for each debug info test. But
> numbers could convince me. Beyond that I can't rationally claim any
> particular need to support llvm-dwarfdump as the tool of choice over
> any 3rd party tool.

This is already done with elf-dump (a Python script), which is used
for a lot of tests for lack of better options.

Those tests should be done with llvm-objdump these days. If they
can't be, then llvm-objdump should be extended.

-eric

I'm fine with this as long as llvm-dwarfdump gets maintained.

The only problem is that LLVM does not require Python 2.6; I think the
min version is still 2.4. Although I would love to move to 2.6 :stuck_out_tongue:

- Michael Spencer

1. llvm-dwarfdump: the best approach when it works. But unfortunately
lib/DebugInfo supports only a (small) subset of DWARF. Tricky sections
like debug_frame aren't supported.
2. Relying on assembly directive emission (e.g. .cfi_*), which is
cumbersome and misses a lot of things like actual DWARF encoding.
3. Using elf-dump and examining the raw binary dumps. This makes tests
nearly unmaintainable.

The latter is also why IMHO our ELF emission in general isn't well
tested. elf-dump is just too rudimentary and relies on simple (=dumb)
binary contents dumps.

The long-term solution for DWARF would be to enhance lib/DebugInfo to
the point where it can handle all interesting DWARF sections. But this
is a lofty goal, since DWARF parsing is notoriously hard and this
would require a large investment of time and effort. And in the
meantime, we just don't write good enough tests (and enough of them)
for this very important feature.

Therefore, as an interim stage, I propose to adopt some external tool
that parses DWARF and emits decoded textual dumps which makes tests
easy to write.

Concretely, I have a pure Python library named pyelftools
(https://bitbucket.org/eliben/pyelftools) which provides comprehensive
ELF and DWARF parsing capabilities and has a dumper that's fully
compatible with the readelf command. Using pyelftools would allow us
to immediately improve the quality of our tests, and as lib/DebugInfo
matures llvm-dwarfdump can gradually replace the dumper without
changing the actual tests.

pyelftools is relatively widely used, so it's well tested; all it
requires is Python 2.6 or higher, and its code is in the public
domain. So it can live in tools/ or test/Scripts or wherever and be
distributed with LLVM. I actively maintain it, and adapting it to
LLVM's purposes should be relatively easy. As a bonus, it has a much
smarter ELF parser & dumper that can replace the ad-hoc elf-dump. It
has also been successfully adapted in the past to read DWARF from
Mach-O files, if that's required.

Eli

I'm fine with this as long as llvm-dwarfdump gets maintained.

I agree, and as I said in the original email, in the long term I
believe llvm-dwarfdump is the correct solution.

The only problem is that LLVM does not require Python 2.6; I think the
min version is still 2.4. Although I would love to move to 2.6 :stuck_out_tongue:

Was this not covered by a previous discussion? I had the feeling it
was decided that 2.6 was OK to require, since it's simple to install
on platforms that don't ship it by default.

Eli

> I'm fine with this as long as llvm-dwarfdump gets maintained.
>

I agree, and as I said in the original email, in the long term I
believe llvm-dwarfdump is the correct solution.

The problem is that if no one is working on testing these sorts of things
with llvm-dwarfdump then it won't be maintained for this purpose. See
elf-dump and people not expanding/fixing bugs in llvm-objdump and using
that for tests.

-eric

> I'm fine with this as long as llvm-dwarfdump gets maintained.
>

I agree, and as I said in the original email, in the long term I
believe llvm-dwarfdump is the correct solution.

The problem is that if no one is working on testing these sorts of things
with llvm-dwarfdump then it won't be maintained for this purpose.

See
elf-dump and people not expanding/fixing bugs in llvm-objdump and using that
for tests.

Can you clarify/elaborate on this last sentence?

Eli

Sure. People are updating, modifying and adding new tests that use elf-dump
and not updating, modifying or fixing llvm-objdump to test the same thing.

-eric

So, using llvm-dwarfdump (and, therefore, lib/DebugInfo) for testing leads to the following:
if you extend debug info emitted by Clang/LLVM, you also have to fix lib/DebugInfo to support
these extensions (looks like Eric was doing this in his DWARF5-related changes). While this is
tiring and certainly slows down development, it also helps keep the tools “in sync” in some sense.

As a “user” of lib/DebugInfo I find it pretty useful, and it would be a pity if it couldn’t parse,
or lacked important features of, the code produced by LLVM itself.

Speaking as a relative newbie, I don't see any way to know that elf-dump
is deprecated in favor of llvm-objdump; neither of them is mentioned
anywhere on the website. Nor is llvm-dwarfdump, for that matter.
D'you think there could be some mention of these on the Command Guide
or Testing Infrastructure Guide pages?
http://llvm.org/docs/CommandGuide/index.html
http://llvm.org/docs/TestingGuide.html
(Yeah, yeah, I know, "patches welcome.")

--paulr

> I'm fine with this as long as llvm-dwarfdump gets maintained.

I agree, and as I said in the original email, in the long term I
believe llvm-dwarfdump is the correct solution.

The problem is that if no one is working on testing these sorts of things
with llvm-dwarfdump then it won't be maintained for this purpose.

See
elf-dump and people not expanding/fixing bugs in llvm-objdump and using that
for tests.

Can you clarify/elaborate on this last sentence?

Sure. People are updating, modifying and adding new tests that use
elf-dump and not updating, modifying or fixing llvm-objdump to test the
same thing.

Speaking as a relative newbie, I don't see any way to know that elf-dump
is deprecated in favor of llvm-objdump; neither of them is mentioned
anywhere on the website. Nor is llvm-dwarfdump, for that matter.

By the very nature of elf-dump's output, I would expect anyone (even a
newbie) to look for an alternative (examples in other tests) or at
least ask on the mailing list and/or IRC. Moreover, hopefully folks
knowledgeable about this area would react to a patch that does
something that could be done better.

D'you think there could be some mention of these on the Command Guide
or Testing Infrastructure Guide pages?

Yes!

http://llvm.org/docs/CommandGuide/index.html
http://llvm.org/docs/TestingGuide.html
(Yeah, yeah, I know, "patches welcome.")

And, I will add, "appreciated".

Eli

2. Relying on assembly directive emission (e.g. .cfi_*), which is
cumbersome and misses a lot of things like actual DWARF encoding.

I'm not sure what you mean by "actual DWARF encoding" here.
(disclaimer: I've only recently started dabbling with debug info, so I
may be missing obvious things)

I mean that it doesn't test the whole pipeline: there's quite a bit
of DWARF-related functionality in MC, and a test that only matches
directives in the ASM output leaves that code unexercised.

Hmmm. "Proper" testing would exercise each component involved, as well
as possibly longer paths that maybe are not exactly the sum of the parts.
Debug info changes are quite likely to involve most or all of:
- Clang's C/C++ to IR (metadata)
- LLVM's IR to assembler source
- assembler source to object file
- LLVM's IR to object file (which partly bypasses or can be different
  from the previous two paths)
Properly speaking they should each get their own tests.
Not to mention a unit-test (or debuginfo-test) to exercise the complete
Clang -> object (-> debugger) sequence.

I try to be good about this, but as a developer I find that sort of
thing tedious. Which mostly proves that I suck at QA, and have to depend
on reviewers to keep me on the straight and narrow. This works to the
extent that those reviewers are willing to be critical of my efforts,
and insist on adequate (instead of minimal) testing. But testing is
an art unto itself, and most developers aren't good at it.

--paulr

Hi All,

While working on some recent patches for x32 support, I ran into an
unpleasant limitation the LLVM ecosystem has with testing DWARF
emission. We currently have several approaches, none of which is
great:

1. llvm-dwarfdump: the best approach when it works. But unfortunately
lib/DebugInfo supports only a (small) subset of DWARF. Tricky sections
like debug_frame aren't supported.

Could you point out what you mean?
In particular, what parts you think it does not support (since you say
it supports a small subset).
What do you want out of debug_frame, past simple parsing?
Anything else requires real evaluation.

I ask because I wrote a DWARF reader that Google uses internally and
that was later open sourced and contributed to Google Breakpad
(see google-breakpad - Crash reporting - Monorail,
in particular dwarf2reader.cc).

2. Relying on assembly directive emission (e.g. .cfi_*), which is
cumbersome and misses a lot of things like actual DWARF encoding.

Err, .cfi_* is used because the encoding is tricky to get right, and
assemblers are better at optimizing it.
However, I'll point out that breakpad also has a CFI assembler
(google-breakpad - Crash reporting - Monorail)

The long-term solution for DWARF would be to enhance lib/DebugInfo to
the point where it can handle all interesting DWARF sections. But this
is a lofty goal, since DWARF parsing is notoriously hard and this
would require a large investment of time and effort.

???
Having written about 6 DWARF parsers, I strongly disagree it is either
notoriously hard or a large investment of time and effort. People
have written DWARF parsers on the weekend. One of the reasons DWARF
is popular is because it is relatively simple to *parse*, even though
semantic extraction is more difficult.
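
To illustrate the point, LEB128 - the variable-length integer encoding
DWARF uses throughout - takes only a few lines to decode. A sketch
written for this mail (the 624485 example is the one the DWARF spec
itself uses):

```python
def decode_uleb128(data, pos=0):
    """Unsigned LEB128: 7 data bits per byte, high bit = continuation."""
    result = shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7f) << shift
        shift += 7
        if not byte & 0x80:
            return result, pos

def decode_sleb128(data, pos=0):
    """Signed LEB128: same scheme, then sign-extend from the last byte."""
    result = shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7f) << shift
        shift += 7
        if not byte & 0x80:
            if byte & 0x40:          # sign bit of the final byte is set
                result -= 1 << shift
            return result, pos

assert decode_uleb128(bytes([0xe5, 0x8e, 0x26])) == (624485, 3)
assert decode_sleb128(bytes([0x7f])) == (-1, 1)
```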

In any case, I mention the above project (google-breakpad) because I'd
be more than happy to get that DWARF-related code relicensed to the
LLVM license if someone wanted it.

Hi All,

While working on some recent patches for x32 support, I ran into an
unpleasant limitation the LLVM ecosystem has with testing DWARF
emission. We currently have several approaches, none of which is
great:

1. llvm-dwarfdump: the best approach when it works. But unfortunately
lib/DebugInfo supports only a (small) subset of DWARF. Tricky sections
like debug_frame aren't supported.

Could you point out what you mean?
In particular, what parts you think it does not support (since you say
it supports a small subset).
What do you want out of debug_frame, past simple parsing?
Anything else requires real evaluation.

As I said, it doesn't support debug_frame, as one relevant example.

I ask because I wrote a DWARF reader that Google uses internally and
that was later open sourced and contributed to Google Breakpad
(see google-breakpad - Crash reporting - Monorail,
in particular dwarf2reader.cc).

2. Relying on assembly directive emission (e.g. .cfi_*), which is
cumbersome and misses a lot of things like actual DWARF encoding.

Err, .cfi_* is used because the encoding is tricky to get right, and
assemblers are better at optimizing it.
However, I'll point out that breakpad also has a CFI assembler
(google-breakpad - Crash reporting - Monorail)

The long-term solution for DWARF would be to enhance lib/DebugInfo to
the point where it can handle all interesting DWARF sections. But this
is a lofty goal, since DWARF parsing is notoriously hard and this
would require a large investment of time and effort.

???
Having written about 6 DWARF parsers, I strongly disagree it is either
notoriously hard or a large investment of time and effort. People
have written DWARF parsers on the weekend. One of the reasons DWARF
is popular is because it is relatively simple to *parse*, even though
semantic extraction is more difficult.

I do mean semantic extraction which provides a representation that's
meaningful to a user and hence can be effectively compared in a test.
But really, I gave up arguing on this topic a few messages (and heated
IRC discussions) ago. RFC retracted.

In any case, I mention the above project (google-breakpad) because I'd
be more than happy to get that DWARF-related code relicensed to the
LLVM license if someone wanted it.

This is utterly impossible because your code does not start variables
with a capital! Seriously though, I would expect this to be
challenging, since lib/DebugInfo already has quite a bit of parsing
infrastructure in place, and I'm not sure how easy it would be to
merge with a completely different parser.

Eli

2. Relying on assembly directive emission (e.g. .cfi_*), which is
cumbersome and misses a lot of things like actual DWARF encoding.

I'm not sure what you mean by "actual DWARF encoding" here.
(disclaimer: I've only recently started dabbling with debug info, so I
may be missing obvious things)

I mean that it doesn't test the whole pipeline: there's quite a bit
of DWARF-related functionality in MC, and a test that only matches
directives in the ASM output leaves that code unexercised.

Hmmm. "Proper" testing would exercise each component involved, as well
as possibly longer paths that maybe are not exactly the sum of the parts.
Debug info changes are quite likely to involve most or all of:
- Clang's C/C++ to IR (metadata)

Simple enough: changes to Clang require tests in Clang (OK, so there's
no mechanism in place to avoid tests in Clang testing optimization,
for example, but we try to be good about it)

- LLVM's IR to assembler source
- assembler source to object file
- LLVM's IR to object file (which partly bypasses or can be different
  from the previous two paths)

& these generally go in LLVM - yeah, they could be separate, but I'd
expect the assembler and object file emission to be tested separately
already - the benefit of testing a particular IR->object path
separately from the corresponding IR->assembly path is probably not
worth it. If we could test against the precursor to those outputs then
we might get the advantage of only having the right tests fail for the
right reasons (debug info tests wouldn't fail when we break the
assembler/object emission).

Properly speaking they should each get their own tests.
Not to mention a unit-test (or debuginfo-test) to exercise the complete
Clang -> object (-> debugger) sequence.

Just because it makes me twitch (though I admit debating test taxonomy
terminology verges on a religious topic): these tests are the
antithesis of unit tests. Taking source code, compiling it with
clang/LLVM, loading it in a debugger and interacting with the debugger
is a scenario test.

A unit test would be API level, say building IR by calling Clang APIs
& then passing it into DebugInfo generation & watching the MI calls
that resulted (preferably stubbing them out in some way).

I try to be good about this, but as a developer I find that sort of
thing tedious. Which mostly proves that I suck at QA, and have to depend
on reviewers to keep me on the straight and narrow. This works to the
extent that those reviewers are willing to be critical of my efforts,
and insist on adequate (instead of minimal) testing. But testing is
an art unto itself, and most developers aren't good at it.

We do what we can (because we must).

- David

2. Relying on assembly directive emission (e.g. .cfi_*), which is
cumbersome and misses a lot of things like actual DWARF encoding.

I'm not sure what you mean by "actual DWARF encoding" here.
(disclaimer: I've only recently started dabbling with debug info, so I
may be missing obvious things)

I mean that it doesn't test the whole pipeline: there's quite a bit
of DWARF-related functionality in MC, and a test that only matches
directives in the ASM output leaves that code unexercised.

Hmmm. "Proper" testing would exercise each component involved, as well
as possibly longer paths that maybe are not exactly the sum of the parts.
Debug info changes are quite likely to involve most or all of:
- Clang's C/C++ to IR (metadata)

Simple enough: changes to Clang require tests in Clang (OK, so there's
no mechanism in place to avoid tests in Clang testing optimization,
for example, but we try to be good about it)

- LLVM's IR to assembler source
- assembler source to object file
- LLVM's IR to object file (which partly bypasses or can be different
  from the previous two paths)

& these generally go in LLVM - yeah, they could be separate, but I'd
expect the assembler and object file emission to be tested separately
already - the benefit of testing a particular IR->object path
separately from the corresponding IR->assembly path is probably not
worth it.

I cite PR13303/PR14524, where asm and direct-object output differ.
This came up early in my LLVM career and has doubtless poisoned my
outlook for life....

In many cases I think the same test _source_ can be used to check both
asm and object, with appropriate RUN lines, and whether you want to
count that as the same or separate depends on how you like to game the
counts. What matters to me is both paths get tested.

If we could test against the precursor to those outputs then we might
get the advantage of only having the right tests fail for the right
reasons (debug info tests wouldn't fail when we break the
assembler/object emission).

Properly speaking they should each get their own tests.
Not to mention a unit-test (or debuginfo-test) to exercise the complete
Clang -> object (-> debugger) sequence.

Just because it makes me twitch (though I admit debating test taxonomy
terminology verges on a religious topic): these tests are the
antithesis of unit tests. Taking source code, compiling it with
clang/LLVM, loading it in a debugger and interacting with the debugger
is a scenario test.

A unit test would be API level, say building IR by calling Clang APIs
& then passing it into DebugInfo generation & watching the MI calls
that resulted (preferably stubbing them out in some way).

Not sure why I said unit test there...
I'd think of compiling .cpp->.o as a Clang/LLVM integration test, while
I'd think of running the debugger on the object as a system test
(because gdb is not part of what this community delivers).
I also think I'd spectacularly fail the CSSP test. :slight_smile:

I try to be good about this, but as a developer I find that sort of
thing tedious. Which mostly proves that I suck at QA, and have to depend
on reviewers to keep me on the straight and narrow. This works to the
extent that those reviewers are willing to be critical of my efforts,
and insist on adequate (instead of minimal) testing. But testing is
an art unto itself, and most developers aren't good at it.

We do what we can (because we must).

Amen, brother.
--paulr

2. Relying on assembly directive emission (e.g. .cfi_*), which is
cumbersome and misses a lot of things like actual DWARF encoding.

I'm not sure what you mean by "actual DWARF encoding" here.
(disclaimer: I've only recently started dabbling with debug info, so I
may be missing obvious things)

I mean that it doesn't test the whole pipeline: there's quite a bit
of DWARF-related functionality in MC, and a test that only matches
directives in the ASM output leaves that code unexercised.

Hmmm. "Proper" testing would exercise each component involved, as well
as possibly longer paths that maybe are not exactly the sum of the parts.
Debug info changes are quite likely to involve most or all of:
- Clang's C/C++ to IR (metadata)

Simple enough: changes to Clang require tests in Clang (OK, so there's
no mechanism in place to avoid tests in Clang testing optimization,
for example, but we try to be good about it)

- LLVM's IR to assembler source
- assembler source to object file
- LLVM's IR to object file (which partly bypasses or can be different
  from the previous two paths)

& these generally go in LLVM - yeah, they could be separate, but I'd
expect the assembler and object file emission to be tested separately
already - the benefit of testing a particular IR->object path
separately from the corresponding IR->assembly path is probably not
worth it.

I cite PR13303/PR14524, where asm and direct-object output differ.
This came up early in my LLVM career and has doubtless poisoned my
outlook for life....

In many cases I think the same test _source_ can be used to check both
asm and object, with appropriate RUN lines, and whether you want to
count that as the same or separate depends on how you like to game the
counts. What matters to me is both paths get tested.

Sure - I'd just rather see these separated into object emission/asm
tests if possible, rather than littering the other test cases with two
modes each. I assume the code is sufficiently factored that testing
this way would be generally reliable (i.e. I hope we can hit the same
code paths that produce .byte from debug info as would produce it
anywhere else in the backend, and just test that once directly).

Hi All,

While working on some recent patches for x32 support, I ran into an
unpleasant limitation the LLVM ecosystem has with testing DWARF
emission. We currently have several approaches, none of which is
great:

1. llvm-dwarfdump: the best approach when it works. But unfortunately
lib/DebugInfo supports only a (small) subset of DWARF. Tricky sections
like debug_frame aren't supported.
2. Relying on assembly directive emission (e.g. .cfi_*), which is
cumbersome and misses a lot of things like actual DWARF encoding.
3. Using elf-dump and examining the raw binary dumps. This makes tests
nearly unmaintainable.

The latter is also why IMHO our ELF emission in general isn't well
tested. elf-dump is just too rudimentary and relies on simple (=dumb)
binary contents dumps.

The long-term solution for DWARF would be to enhance lib/DebugInfo to
the point where it can handle all interesting DWARF sections. But this
is a lofty goal, since DWARF parsing is notoriously hard and this
would require a large investment of time and effort. And in the
meantime, we just don't write good enough tests (and enough of them)
for this very important feature.

I'm pretty sure I got Benjamin K. started on lib/DebugInfo. :slight_smile:
There were two primary motivations for lib/DebugInfo: 1) to add source
debug info capability to the llvm disassembler, and 2) to migrate
LLDB's DWARF parsing to LLVM (to ease sharing). I suspect that
migration wasn't quite complete (and/or LLDB's DWARF parsing has since
improved). Anyway, IMO the best step forward is to continue migrating
LLDB's DWARF parsing library over and make it fully featured.

Evan

2. Relying on assembly directive emission (e.g. .cfi_*), which is
cumbersome and misses a lot of things like actual DWARF encoding.

I'm not sure what you mean by "actual DWARF encoding" here.
(disclaimer: I've only recently started dabbling with debug info, so I
may be missing obvious things)

I mean that it doesn't test the whole pipeline: there's quite a bit
of DWARF-related functionality in MC, and a test that only matches
directives in the ASM output leaves that code unexercised.

Hmmm. "Proper" testing would exercise each component involved, as well
as possibly longer paths that maybe are not exactly the sum of the parts.
Debug info changes are quite likely to involve most or all of:
- Clang's C/C++ to IR (metadata)

Simple enough: changes to Clang require tests in Clang (OK, so there's
no mechanism in place to avoid tests in Clang testing optimization,
for example, but we try to be good about it)

- LLVM's IR to assembler source
- assembler source to object file
- LLVM's IR to object file (which partly bypasses or can be different
  from the previous two paths)

& these generally go in LLVM - yeah, they could be separate, but I'd
expect the assembler and object file emission to be tested separately
already - the only benefit to testing particular IR->object &
separately testing particular IR->assembly is probably not worthwhile.

I cite PR13303/PR14524, where asm and direct-object output differ.
This came up early in my LLVM career and has doubtless poisoned my
outlook for life....

In many cases I think the same test _source_ can be used to check both
asm and object, with appropriate RUN lines, and whether you want to
count that as the same or separate depends on how you like to game the
counts. What matters to me is both paths get tested.

Sure - I'd just rather see these separated into object emission/asm
tests if possible, rather than littering the other test cases with two
modes each. I assume the code is sufficiently factored that testing
this way would be generally reliable (i.e. I hope we can hit the same
code paths that produce .byte from debug info as would produce it
anywhere else in the backend, and just test that once directly).

That works for sections where we do actually produce .byte directives.
.debug_info, .debug_str, .debug_macinfo, and more are like this.

But for .debug_line (the topic of the PRs I cited) and for CFI (which
was what the OP specifically mentioned, if we can remember that far back)
that's not how it works. The paths get pretty different internally, and
demonstrably can emit different object files.

So, at least in those cases, tests for (a) LLVM producing the correct
assembler directives, (b) LLVM producing the correct binary encoding,
and (c) the assembler producing the correct binary encoding, are all
appropriate and valuable. Exactly how they get packaged is less
important than that the tests all exist.
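
To spell out one way the paths can diverge: a .debug_line producer
folds each (address, line) advance into a single "special opcode" when
it fits, and the folding depends on header parameters the producer
chose. A sketch of the computation from the DWARF spec (the parameter
values below are ones producers commonly emit, assumed here purely for
illustration):

```python
# Commonly emitted .debug_line header parameters (an assumption for
# this sketch; real consumers must read them from the unit header).
OPCODE_BASE, LINE_BASE, LINE_RANGE = 13, -5, 14

def special_opcode(addr_advance, line_advance):
    """Fold an (address, line) advance into one special opcode, or
    return None if it's out of range and needs standard opcodes."""
    if not (LINE_BASE <= line_advance < LINE_BASE + LINE_RANGE):
        return None
    op = (line_advance - LINE_BASE) + (LINE_RANGE * addr_advance) + OPCODE_BASE
    return op if op <= 255 else None

# Advance the address by 2 units and the line by +1 in a single byte:
assert special_opcode(2, 1) == 47
# A line jump of +100 can't be a special opcode at all:
assert special_opcode(0, 100) is None
```

Whether a given advance becomes a special opcode or a sequence of
standard opcodes (DW_LNS_advance_pc etc.) is a producer decision -
which is why checking the assembler path doesn't automatically
validate the direct-object path.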

--paulr

P.S. as long as I'm replying anyway....

I also think I'd spectacularly fail the CSSP test. :slight_smile:

one reason being I can't even spell CSDP correctly!