[DebugInfo] RFC: Introduce LLVM DI Checker utility

Hi,

I am sharing the proposal [0], which gives a brief introduction to the implementation of the LLVM DI Checker utility. At a very high level, it is a pair of LLVM (IR) passes that check the preservation of the original debug info across optimizations. There are options controlling the passes that can be invoked from ``clang`` as well as from ``opt``.
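To sketch the core idea, here is a simplified, standalone model of the before/after check the two passes perform (all types here are made up stand-ins, not the actual LLVM pass API): the first pass snapshots which functions carry real debug info, and the second compares that snapshot against the module after optimization.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified stand-ins for LLVM IR entities.
struct Function {
  std::string Name;
  bool HasSubprogram;    // models a DISubprogram attachment
  int NumDebugLocations; // models !dbg attachments on instructions
};
using Module = std::vector<Function>;

// "First" pass: snapshot the original (-g) debug info before optimization.
std::map<std::string, std::pair<bool, int>> collectDI(const Module &M) {
  std::map<std::string, std::pair<bool, int>> Snapshot;
  for (const Function &F : M)
    Snapshot[F.Name] = {F.HasSubprogram, F.NumDebugLocations};
  return Snapshot;
}

// "Second" pass: report anything the optimizations dropped.
std::vector<std::string>
checkDI(const Module &M,
        const std::map<std::string, std::pair<bool, int>> &Snapshot) {
  std::vector<std::string> Report;
  for (const Function &F : M) {
    auto It = Snapshot.find(F.Name);
    if (It == Snapshot.end())
      continue; // function created by a pass; nothing to compare against
    if (It->second.first && !F.HasSubprogram)
      Report.push_back(F.Name + ": dropped DISubprogram");
    if (F.NumDebugLocations < It->second.second)
      Report.push_back(F.Name + ": dropped debug locations");
  }
  return Report;
}
```

In the real implementation the comparison is of course done on the metadata nodes themselves; the point of the sketch is only that the checker compares real, compiler-emitted debug info before and after a pass rather than synthesizing it.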

Testing the utility on the GDB 7.11 project (used as a testbed) uncovered a number of potential issues with DILocations (running it on a build of the LLVM project itself found one bug involving DISubprogram metadata). Please take a look at the final report (for the GDB 7.11 testbed), generated by the script that collects the data, at [1]. Judging from these data, a utility like this could be useful for detecting real issues in the debug info produced by the compiler.

Any thoughts on this? Thanks in advance!

[0] https://github.com/djolertrk/llvm-di-checker
[1] https://djolertrk.github.io/di-checker-html-report-example/

Best regards,
Djordje

That's a neat idea!

How would a tool like this distinguish between situations where debug locations are expected to be dropped or merged, such as the ones outlined in https://reviews.llvm.org/D81198? Is it generating false positives?

You mention that "An alternative to this is the debugify utility, but the difference is that the LLVM DI Checker deals with real debug info, rather than with the synthetic ones". How is that an advantage? Are you seeing too many false positives with the debugify-generated debug locations?

-- adrian

Hey Djordje,

It looks like a lot of the new infrastructure introduced here consists of logic copied from the debugify implementation. Why is introducing a new pair of passes better than extending the ones we have? The core infrastructure needed to track location loss for real (non-synthetic) source variables is already in place.

Stepping back a bit, I’m also surprised by the decision to move away from synthetic testing when there’s still so much low-hanging fruit to pick using that technique. The example from https://reviews.llvm.org/D81939 illustrates this perfectly: in this case it’s not necessary to invent a new testing technique to uncover the bug, because simply running ./bin/llvm-lit -Dopt="opt -debugify-each" test/Transforms/DeadArgElim finds the same issue.

In D81939, you discuss finding the new tool useful when responding to bug reports about optimized-out variables or missing locations. We sorely do need something better than -opt-bisect-limit, but why not start with something simple? -check-debugify already knows how to report when & where a location is dropped; it would be simple to teach it to emit a report when a variable is fully optimized out.

Hi Adrian,

Thanks for the comments!

> How would a tool like this distinguish between situations where debug locations are expected to be dropped or merged, such as the ones outlined in https://reviews.llvm.org/D81198? Is it generating false positives?

Since this is still a proposal, it does not cover those cases yet, but it shouldn't generate false positives there. My impression is that we can check whether dropping/merging a location meets the requirements outlined in D81198 (e.g. check whether the instructions are in the same basic block when the drop occurs) and mark it as a "known drop".
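To make that concrete, the heuristic could be sketched roughly like this (standalone C++ with made-up types, not LLVM's API): a dropped location is classified as "known" only when D81198-style rules plausibly apply, and is otherwise flagged for the report.

```cpp
// Hypothetical, simplified record of a location drop observed by the checker.
struct DropEvent {
  bool MergedWithOtherInst;    // location lost by merging two instructions
  bool SameBasicBlock;         // both instructions lived in the same block
  bool InstMovedBetweenBlocks; // instruction was hoisted/sunk across blocks
};

enum class DropKind { Known, Suspicious };

// Classify a drop as a "known" (expected) drop per D81198-style rules,
// or as a potential bug the report should flag.
DropKind classifyDrop(const DropEvent &E) {
  // Merging instructions from the same block may legitimately produce a
  // merged (or empty) location.
  if (E.MergedWithOtherInst && E.SameBasicBlock)
    return DropKind::Known;
  // Dropping the location when code moves between basic blocks is one of
  // the documented cases where keeping it would mislead the debugger.
  if (E.InstMovedBetweenBlocks)
    return DropKind::Known;
  // Anything else is worth reporting.
  return DropKind::Suspicious;
}
```

The exact set of rules would have to mirror whatever D81198 ends up documenting; the sketch only shows where such a filter would sit in the checker.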

> You mention that "An alternative to this is the debugify utility, but the difference is that the LLVM DI Checker deals with real debug info, rather than with the synthetic ones". How is that an advantage? Are you seeing too many false positives with the debugify-generated debug locations?

I was wrong to say "alternative"; the two are more likely to be used in combination. There are no false positives in the debugify reports (at least I haven't seen any; the same core logic is used by the di-checker), but since debugify deals with synthetic debug info, I think it is potentially limited to the set of metadata kinds that can be generated synthetically (though I might be mistaken about that). Also, debugify is part of the Transforms library, whereas the di-checker performs analysis only (I am not sure what the overhead is if we run debugify on every single CU of a large project; my impression was that this analysis is cheaper). Finally, the di-checker reports failures for real entities such as "a", "b", etc. (rather than for synthetic variables named "1", "2", etc.), and these are the entities users actually report: "My var 'a' is optimized out..." or "I cannot set a breakpoint on function 'fn1()'". I don't want to give the impression that we are choosing between the two, since I really think debugify is a great tool and the two can/should coexist. I use the di-checker to detect failures at clang's level and then run debugify on the relevant pass test directory. As I just mentioned, the di-checker option can be invoked from clang's level, since it has been linked into the IR library. In addition, the di-checker should be extended to support all kinds of debug info metadata, such as DILexicalScopes, DIGlobalVariables, dbg_labels, and so on.

Best,

Djordje

Hi Vedant,

Thanks a lot for your comments!

> It looks like a lot of the new infrastructure introduced here consists of logic copied from the debugify implementation. Why is introducing a new pair of passes better than extending the ones we have? The core infrastructure needed to track location loss for real (non-synthetic) source variables is already in place.

Since it is a proposal, I thought it'd be easier to understand the idea if I duplicated things. Ideally, we can make an API that could be used by both tools. Initially, I made a few patches locally turning the real debug info into debugify-style info, but I realized it breaks the original idea/design of debugify, which is why I decided to make separate pass(es). The implementation cannot stay as is; it should either be merged into the debugify file(s) or refactored to use the API mentioned above. Another reason for implementing it as a separate pass was that debugify is meant to be used from the ‘opt’ level only; if we want to invoke the option from the frontend, we need to merge it into the IR library.

> Stepping back a bit, I’m also surprised by the decision to move away from synthetic testing when there’s still so much low-hanging fruit to pick using that technique. The example from https://reviews.llvm.org/D81939 illustrates this perfectly: in this case it’s not necessary to invent a new testing technique to uncover the bug, because simply running ./bin/llvm-lit -Dopt="opt -debugify-each" test/Transforms/DeadArgElim finds the same issue.

As I mentioned in the previous mail, I really do think the debugify technique is great, and I use it. But in order to detect that variable “x” was optimized out starting from pass Y, I only run the di-checker option (which performs analysis only) and find the variable in the final HTML report. I think that is a very user-friendly concept. In the end, once we have detected where the location was lost, we can run debugify on the tests for that pass directory (though there is a concern that the tests do not cover all possible cases, and the case found at the high level could be new to the pass). In addition, the di-checker detects issues for metadata other than locations (currently, the preservation map keeps only the DISubprograms, but it should keep other kinds too).

> In D81939, you discuss finding the new tool useful when responding to bug reports about optimized-out variables or missing locations. We sorely do need something better than -opt-bisect-limit, but why not start with something simple? -check-debugify already knows how to report when & where a location is dropped; it would be simple to teach it to emit a report when a variable is fully optimized out.

I agree. We can do that and that could be used from both utilities.

Best regards,

Djordje

> Hi Vedant,
>
> Thanks a lot for your comments!

>> It looks like a lot of the new infrastructure introduced here consists of logic copied from the debugify implementation. Why is introducing a new pair of passes better than extending the ones we have? The core infrastructure needed to track location loss for real (non-synthetic) source variables is already in place.

> Since it is a proposal, I thought it'd be easier to understand the idea if I duplicated things. Ideally, we can make an API that could be used by both tools. Initially, I made a few patches locally turning the real debug info into debugify-style info, but I realized it breaks the original idea/design of debugify, which is why I decided to make separate pass(es). The implementation cannot stay as is; it should either be merged into the debugify file(s) or refactored to use the API mentioned above.

Oh, this wasn’t clear to me. In the future, please describe what is/isn’t part of the proposal. It’d be especially helpful to include some discussion of the pros & cons of the prototype design and its alternatives.

> Another reason for implementing it as a separate pass was that debugify is meant to be used from the ‘opt’ level only; if we want to invoke the option from the frontend, we need to merge it into the IR library.

The debugify passes are now linked by llc via the Transforms library as part of the -mir-debugify implementation. A frontend should be able to use them.

>> Stepping back a bit, I’m also surprised by the decision to move away from synthetic testing when there’s still so much low-hanging fruit to pick using that technique. The example from https://reviews.llvm.org/D81939 illustrates this perfectly: in this case it’s not necessary to invent a new testing technique to uncover the bug, because simply running ./bin/llvm-lit -Dopt="opt -debugify-each" test/Transforms/DeadArgElim finds the same issue.

> As I mentioned in the previous mail, I really do think the debugify technique is great, and I use it. But in order to detect that variable “x” was optimized out starting from pass Y, I only run the di-checker option (which performs analysis only) and find the variable in the final HTML report. I think that is a very user-friendly concept.

About the analysis — why not emit a report in -check-debugify when the # of non-undef debug uses of a variable drops to 0? This doesn’t require the -debugify step. The list of local variables is preserved via DISubprogram::retainedNodes.
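As a sketch of what I mean (simplified, standalone C++; in LLVM the variables would come from DISubprogram::retainedNodes and the uses would be dbg.value intrinsics, so the types here are only hypothetical stand-ins):

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a dbg.value use of a source variable.
struct DebugUse {
  std::string Variable; // name of the source variable it describes
  bool IsUndef;         // models dbg.value(undef, ...)
};

// Given the variables a function retains (modeling
// DISubprogram::retainedNodes) and its remaining debug uses, return the
// variables whose non-undef debug use count has dropped to zero, i.e. the
// ones that are now fully optimized out.
std::vector<std::string>
findOptimizedOutVars(const std::vector<std::string> &RetainedVars,
                     const std::vector<DebugUse> &Uses) {
  std::map<std::string, int> NonUndefUses;
  for (const std::string &V : RetainedVars)
    NonUndefUses[V] = 0; // every retained variable starts with zero uses
  for (const DebugUse &U : Uses)
    if (!U.IsUndef)
      ++NonUndefUses[U.Variable];
  std::vector<std::string> Gone;
  for (const auto &KV : NonUndefUses)
    if (KV.second == 0)
      Gone.push_back(KV.first);
  return Gone;
}
```

Running this check at -check-debugify time would catch the "my variable is optimized out" class of reports without any synthetic -debugify step.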

> In the end, once we have detected where the location was lost, we can run debugify on the tests for that pass directory (though there is a concern that the tests do not cover all possible cases, and the case found at the high level could be new to the pass). In addition, the di-checker detects issues for metadata other than locations (currently, the preservation map keeps only the DISubprograms, but it should keep other kinds too).

Note that it’s possible to increase code coverage by running a -debugify-each pipeline on -g0 IR produced by a frontend.

Is it common for a pass to drop an entire DISubprogram? I would hope this either never happens or is extremely rare.

best,
vedant

Sorry if I wasn’t clear enough. I thought that if I wrote [PROPOSAL], all of it would be considered part of the proposal. I’ll try that, thanks.

Hi Vedant,

I’ve managed to merge this idea into the existing Debugify Pass by introducing two modes:

  1. synthetic - everything stays as is; this mode deals with the synthetic debug info

  2. original - this is the new mode; it checks debug info metadata preservation on real/original/-g debug info

The initial patches are: https://reviews.llvm.org/D82545 and https://reviews.llvm.org/D82546.

There are still TODOs (such as covering dbg.value tracking using the idea you pointed out in a previous mail, or covering metadata other than DILocations and DISubprograms, if there are any others that make sense).

Best regards,

Djordje