RFC: DWARF reduction tool

Proposal

To add a new tool, similar to llvm-reduce, that works on DWARF object files. The tool would progressively remove parts of the debug info that are not essential to reproducing a bug, while checking after each step that the reduced file still passes a user-provided test program.

Motivation

Recently I’ve been debugging Unreal Engine with LLDB, and it was quite tricky to catch some of the issues. Debug info is ~11GB combined, and the engine takes 20 minutes to start under the debugger, making troubleshooting of LLDB issues especially fun.

In some cases it is possible to narrow down the problem by reducing the source code, by disabling debug info for a part of the project, or by reducing at the LLVM IR level. These methods are quite difficult to apply to UE due to the scale of the codebase and the complexity of the build system, and they do not help at all when a bug happens with pre-built binaries.

However, we can attempt to reduce the binaries instead - the debug info sections specifically. Only a small portion of the debug info may be relevant to the problem, so by cutting away everything else we can get a much smaller reproducer.

I’ve tried this method with Unreal Engine and cobbled together a naive reduction program using the LLVM DWARFLinker (taking llvm-dwarfutil as an example) and parts of llvm-reduce. It required a few changes to DWARFLinker to override which DWARF elements we want to discard or keep. I’m not sure whether it is a good idea to use DWARFLinker, since we don’t really need any “linking”, just filtering. Perhaps a separate implementation could be simpler.

The algorithm that decides which parts of the DWARF are not necessary is probably the most important piece here. As a quick solution I used an algorithm that removes DWARF elements randomly. Restricting it to only some tags (compile units, subprograms, etc.) helps the algorithm converge faster.

With this approach I was able to reduce one of the UE libraries from ~34MB of debug info down to 1KB, and converted it into a LIT test (bug report: llvm/llvm-project#90938, “LLDB ‘forcefully completed’ types cause an assert in clang”). It took about 3 hours to reduce with this simple algorithm.

Questions

I wonder how people troubleshoot DWARF issues without a reduction tool like this? Do you think such a tool is useful enough to have in LLVM? Is there a better way of doing this?

The code is not published yet, though I plan to make it public eventually and contribute to LLVM mainline. I wanted to get some feedback, and maybe change the design accordingly.


11 gigs of debug info? And I thought lldb’s 1.5 gigs was a lot!

How do people troubleshoot issues on large codebases (not just DWARF issues!)?

  • Suffer
  • Use DWO/split DWARF (-gsplit-dwarf)
  • Selectively strip debug info from objects/libraries that don’t matter for the issue

As an occasional LLDB contributor, I see the value in such a tool.

My experience with creduce suggests that users of such a tool would benefit from additional knobs being exposed. For instance, creduce allows individual passes to be disabled. Not something a first iteration needs, but useful to keep in mind.


@asavonic, is your target using static linking or dynamic linking? If most of the debug info is evenly distributed across many shared libraries (dynamic linking), you can try enabling the symbol on-demand feature, which lazily parses debug info only when it is required by a breakpoint, a callstack, etc.:
https://lldb.llvm.org/use/ondemand.html

Generally on Linux, large debug info can cause slow startup times because LLDB has to manually index symbols. If that is the bottleneck, you should consider enabling the DWARF 5 .debug_names accelerator table, which requires near-zero indexing time.

As some example data, 2/3 of the targets in our company are >10GB, and some targets have more than 70GB of debug info. We commonly have to enable split DWARF + -gsimple-template-names just to build the targets successfully. The two features above are the key features enabling our customers to debug targets at scale.

Btw: a quick way to identify the cost distribution is to run the “statistics dump” command.

I’d worry that DWARF consumers aren’t that robust to arbitrary DWARF - so such a tool is likely to trip over situations the consumer would never see in practice, and has limited interest in supporting. This is why I’ve generally advocated for source reduction/source fuzzing to make the debugger more robust to the sort of things it’s likely to see in practice.

but I realize that isn’t always practical, as you say, in situations with prebuilt code (though prebuilt code with debug info? If you’ve got debug info, presumably you’ve also got the source, or you may be better off just stripping out the debug info?)

I’d generally encourage more work into automated tools for the techniques you mentioned, like selectively disabling debug info, selectively disabling LTO, etc.

If we were going to go for DWARF reduction that’s likely to look more like the DWARF generated by existing compilers - I’d consider a graph analysis of DWARF (starting at function definitions and global/namespace-scoped variables) and then trimming things in certain ways - eg: remove one of these top level entities, and discard any referenced entities that aren’t otherwise referenced - and demoting type definitions to declarations, removing anything that becomes unreferenced by that transformation.

(That latter one is more likely to trip up debuggers, since it could create some more novel/unrealistic situations - but it should be closer to common idioms/reality.)

(Also, I’d love a tool to render the DWARF reference graph - to help identify pinch-points in the graph that might be broken to help reduce debug info size. I’d love such a tool to be really generic - able to be reused for the linker dependence/GC graph, the build dependence graph, etc.)


Good point. Split DWARF is very helpful, and it allows reducing debug info without affecting the libraries in any way.

Selectively removing debug info from libraries can be the first step of reduction. Very coarse-grained, but effective at removing completely irrelevant libraries.

Agree, the tool must be adjustable. An algorithm that performs reduction on DWARF is not trivial, so if it gets stuck somewhere we need a way to guide it.

The target uses dynamic linking. There are ~830 shared libraries in total, but most of them are small (maybe not even loaded at the same time, not sure).

Thank you, I’ll try the “on demand” feature. It sounds great for actually using LLDB, but it may change things too much and hide some of the LLDB issues that I’m trying to catch. It is easy to check though, and I can fall back to the regular mode if it doesn’t work.

Great to see these optimizations. I’m stuck with prebuilt binaries for now that don’t have this section, but it sure is helpful.

So if LLDB crashes on one of these monsters, how do you reduce the problem down to something that can be used as a LIT test, for example?

Oh, I didn’t know about this option. Thank you!

My assumption is that if we start from valid DWARF and remove the parts of it that are not relevant, we should still get valid DWARF, just incomplete. Granted, if we remove fields of structures, for example, the result will not have the same memory layout, but that should be OK for troubleshooting LLDB itself.

The goal for a reduced DWARF is to execute the same code path in LLDB as the original does, and trigger the same bug. Anything else should be interpreted as a “bad” attempt at reduction, and the tool should try again.

Build systems can be tricky to set up. It can be a customer’s code that they are not willing to share. Or, as you’ve mentioned, there may not be an automated way to perform source code reduction. Actually, I don’t know if anything like this is available; it has always been a manual process for me.

Therefore if a DWARF reduction tool can bypass all of this, and just give me a snippet of DWARF that triggers the issue - great.

Thank you for summarizing this! Some of this is how the prototype based on DWARFLinker works - DWARFLinker has to pull in all dependencies (like fields for classes, or argument types for functions), and with a bit of hacking we can trim them all at once. The order of operations matters to keep the original bug present and not erase everything.

Building a graph doesn’t seem too difficult, and perhaps it is already done in LLVM tools to some degree. What I couldn’t find is the ability to build a graph, modify it without breaking it, and write it back.

So perhaps having this as a separate library can also enable tooling that you mentioned.

So I have quite a bit of experience reducing things, and what David is saying here is absolutely the way to go. Focus on the most coarse-grained transformations possible, and aim specifically at those that do not seem likely to create weirdly broken DWARF. Test input reducers like C-Reduce often end up accidentally being highly capable fuzzers. If that’s what you want, then great. But if you just need to get a specific bug fixed, then that’s not very useful.

The specific bad thing that can happen here is something we’ve called hijacking: you start out reducing a trigger for one bug and end up reducing a trigger for a totally different bug. This is extremely likely to happen when:

  1. the software under test has a high latent bug rate
  2. the reducer does all sorts of crazy transformations
  3. the interestingness test is somewhat non-specific, for example it simply looks for a crash

Item 1 isn’t really under our control. Item 2 can be addressed by taming down the reducer. Item 3 is the user’s responsibility, but it’s tricky in practice.


Thanks everyone for insightful comments and suggestions! I scoped this a bit more, and prepared a draft implementation plan. Any feedback is welcome.

TLDR: we can add some components as generic libraries to LLVM and make them useful for other tools:

  • DWARF Write - there is a library for parsing DWARF objects, but no generic library for writing them.

  • DWARF Graph - a set of classes and routines for building a directed graph that represents the DWARF of a compile unit, an object file, or multiple files.

Other components are specific to dwarfreduce:

  • Passes that transform a DWARF graph.
  • Reduction logic.
  • Interface with the test program.

DWARF reduce algorithm

  1. Input DWARF: read DWARF from an object file, or from a separate file in the case of split DWARF.

  2. Build a graph where nodes represent DIEs and edges are dependencies (references) between them.

  3. Run a sequence of passes to remove some nodes from the graph.

  4. Output DWARF: write the graph back in the same format as the original.

  5. Output Metadata: write additional information attached to nodes.

  6. Call a test program:

    • If it succeeds (the reduced DWARF is still “interesting”), goto (3).

    • If the test program fails, roll back the changes made to the graph, mark the affected nodes as “interesting”, goto (3), and take this new information into account when selecting nodes to remove. (A sketch of this loop follows.)
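
To make the loop concrete, here is a minimal sketch of steps (3)-(6). Everything in it (reduceLoop, the pass and callback signatures) is hypothetical scaffolding for this proposal, not an existing API:

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical driver for steps (3)-(6). A pass mutates the graph and
// returns true if it removed anything; Write emits the candidate DWARF
// plus metadata; TestIsInteresting runs the user's test program.
template <typename Graph>
void reduceLoop(Graph &G,
                std::vector<std::function<bool(Graph &)>> Passes,
                std::function<void(const Graph &, const std::string &)> Write,
                std::function<bool(const std::string &)> TestIsInteresting) {
  bool Progress = true;
  while (Progress) {
    Progress = false;
    for (auto &Pass : Passes) {
      Graph Snapshot = G;           // rollback point for step (6)
      if (!Pass(G))                 // nothing left for this pass to remove
        continue;
      Write(G, "candidate.o");      // steps (4) and (5)
      if (TestIsInteresting("candidate.o")) {
        Progress = true;            // keep the smaller graph, goto (3)
      } else {
        G = std::move(Snapshot);    // bad reduction: roll back; a real
        // implementation would also mark the removed nodes "interesting"
        // so that later iterations avoid them
      }
    }
  }
}
```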

Input DWARF

Use existing LLVM API for reading a DWARF object: llvm/DebugInfo/DWARF/DWARFContext.h, DWARFObject.h, and others.

DWARFContext parses the input and allows iterating over compile units recursively and accessing individual DWARFDie objects, as in the sketch below.
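
A minimal reader built on that API might look like this (visitDie is a placeholder for node creation):

```cpp
#include "llvm/DebugInfo/DWARF/DWARFContext.h"
#include "llvm/DebugInfo/DWARF/DWARFDie.h"
#include "llvm/Object/ObjectFile.h"

using namespace llvm;

// Placeholder for graph-node creation; visits every DIE recursively.
static void visitDie(DWARFDie Die) {
  // ... record Die as a future graph node ...
  for (DWARFDie Child : Die.children())
    visitDie(Child);
}

static Error readInput(StringRef Path) {
  Expected<object::OwningBinary<object::ObjectFile>> ObjOrErr =
      object::ObjectFile::createObjectFile(Path);
  if (!ObjOrErr)
    return ObjOrErr.takeError();
  std::unique_ptr<DWARFContext> Ctx =
      DWARFContext::create(*ObjOrErr->getBinary());
  for (const std::unique_ptr<DWARFUnit> &CU : Ctx->compile_units())
    visitDie(CU->getUnitDIE());
  return Error::success();
}
```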

There should be an ability to run reduction on multiple input object files. This can be done in a script, separate from the main llvm-dwarfreduce program.

Build a DWARF Graph

DWARF Graph is a directed graph whose nodes represent DWARFDie objects such as types, functions, arguments, etc. Edges represent dependencies between them: a record depends on its fields and member functions, functions depend on their arguments, arguments depend on their types, and so on. Cycles are possible when a node indirectly references itself (a linked-list type, for example, contains a pointer to itself).

Nodes are created by iterating recursively over DWARFDie objects of a compile unit.

Edges (dependencies) may be a bit tricky:

  • We can find direct dependencies by checking DIE attributes whose form values are in the DWARFFormValue::FC_Reference class. These can be dereferenced to find the offset of the dependency DIE (see the sketch after this list).

  • If any other kinds of dependencies are possible (I don’t know), we have to find them as well. If we miss any, the output DWARF will be invalid.
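
A sketch of that edge discovery, storing edges as (from, to) section offsets so it stays independent of the final graph classes; parent-to-child ownership edges are included as well, since a DIE depends on its children:

```cpp
#include "llvm/DebugInfo/DWARF/DWARFDie.h"
#include "llvm/DebugInfo/DWARF/DWARFFormValue.h"
#include <cstdint>
#include <utility>
#include <vector>

using namespace llvm;

// Follow every attribute in the reference form class, plus the implicit
// parent->child edges, recording (dependent, dependency) offset pairs.
static void collectEdges(DWARFDie Die,
                         std::vector<std::pair<uint64_t, uint64_t>> &Edges) {
  for (const DWARFAttribute &A : Die.attributes()) {
    if (!A.Value.isFormClass(DWARFFormValue::FC_Reference))
      continue;
    if (DWARFDie Target = Die.getAttributeValueAsReferencedDie(A.Value))
      Edges.emplace_back(Die.getOffset(), Target.getOffset());
  }
  for (DWARFDie Child : Die.children()) {
    Edges.emplace_back(Die.getOffset(), Child.getOffset());
    collectEdges(Child, Edges);
  }
}
```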

llvm::DirectedGraph, DGNode, and DGEdge can be used to implement the DWARF Graph. With llvm::GraphTraits and llvm::DOTGraphTraits it is possible to reuse LLVM’s graph algorithms and Graphviz output support.

Although Graphviz output is quite easy to produce (see the sketch below), visualization may be problematic due to the size of DWARF graphs even for simple programs. That is a difficult problem in its own right, and there are various methods and programs that help visualize large graphs.
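
As a starting point before wiring up DOTGraphTraits, the edge list from the previous sketch can be dumped to DOT by hand:

```cpp
#include <cinttypes>
#include <cstdint>
#include <cstdio>
#include <utility>
#include <vector>

// Emit the (from, to) offset pairs as a Graphviz digraph.
static void writeDot(FILE *Out,
                     const std::vector<std::pair<uint64_t, uint64_t>> &Edges) {
  fprintf(Out, "digraph dwarf {\n");
  for (auto [From, To] : Edges)
    fprintf(Out, "  n%" PRIu64 " -> n%" PRIu64 ";\n", From, To);
  fprintf(Out, "}\n");
}
```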

DWARF reduction passes

Passes transform a DWARF graph by removing nodes and edges, taking information from previous passes and test runs into account.

The sequence of passes is important. Due to the size of a typical graph, we first need to run passes with low computational complexity. These passes should remove a significant portion of the graph and allow passes with higher complexity to execute in reasonable time.

Passes should be configurable via options, so a user can limit or guide them to some degree when the generic algorithm fails.

The following passes may be useful:

  1. Remove leaf nodes. A leaf node is a DIE that is not referenced by any other DIE. These can be removed without breaking the DWARF structure. The pass can iterate until there are no more leaf nodes (i.e. only nodes that form cycles remain); see the sketch after this list.

  2. Remove edges. Different types of edges (DIE references) may be eliminated or substituted: typed pointers can be replaced by void pointers, fields can be replaced by an untyped “padding” field, friend references can be removed, etc. Removing such edges may break cycles and allow more leaf nodes to be removed by pass (1).

  3. Remove sub-graphs. Remove a node (the root) and every node that depends on it. Selecting a root can be tricky. Ideally we want to identify clusters of interconnected nodes and remove these clusters one by one, but such algorithms have high complexity. We can use heuristics instead - the goal is to make progress with the reduction, not to find the most optimal way of doing it. Feedback from the test program can be used to improve accuracy.
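
As an illustration of pass (1), a minimal sketch of iterative leaf peeling over the offset-based edge list from the earlier sketches; the Keep set stands in for nodes pinned as roots or marked “interesting” by earlier test feedback:

```cpp
#include <cstdint>
#include <unordered_map>
#include <unordered_set>
#include <utility>
#include <vector>

// Repeatedly remove nodes that no other DIE references (in-degree 0),
// re-checking their dependencies, until only cycles and kept nodes remain.
static std::vector<uint64_t>
peelLeaves(const std::unordered_set<uint64_t> &AllNodes,
           const std::vector<std::pair<uint64_t, uint64_t>> &Edges,
           const std::unordered_set<uint64_t> &Keep) {
  std::unordered_map<uint64_t, unsigned> InDegree;
  std::unordered_map<uint64_t, std::vector<uint64_t>> Deps;
  for (auto [From, To] : Edges) {
    ++InDegree[To];           // From depends on To
    Deps[From].push_back(To);
  }
  std::vector<uint64_t> Removed, Worklist;
  for (uint64_t N : AllNodes)
    if (InDegree[N] == 0 && !Keep.count(N))
      Worklist.push_back(N);
  while (!Worklist.empty()) {
    uint64_t N = Worklist.back();
    Worklist.pop_back();
    Removed.push_back(N);
    for (uint64_t Dep : Deps[N]) // removing N may expose new leaves
      if (--InDegree[Dep] == 0 && !Keep.count(Dep))
        Worklist.push_back(Dep);
  }
  return Removed;
}
```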

Output DWARF

Unlike DWARF input, a nice standalone LLVM library for writing DWARF files does not exist (yet). There are at least two independent implementations: one in AsmPrinter, another in DWARFLinker.

AsmPrinter uses its own version of the DWARF classes internally (as opposed to the DWARF reading API), and there is no generic library interface - it works with LLVM IR metadata as part of regular CodeGen.

DWARFLinker uses the same API that we use to read DWARF, but the implementation is tied to the linker. We can refactor it into a generic library and use it in llvm-dwarfreduce to output a modified graph.

Metadata

Nodes of a DWARF graph (DIEs) can have metadata attached to them.

When removing a node causes the test program to fail, we can mark the node as “interesting” and avoid removing it during subsequent passes.

DWARF DIEs are identified by their offset from the beginning of the section, so these offsets change whenever we modify the DWARF. We can attach a persistent ID as metadata to track a node between iterations; a sketch follows.
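
A minimal sketch of such a side table; all names here (NodeId, MetadataTable) are hypothetical, and the only requirement is some stable key that is independent of section offsets:

```cpp
#include <cstdint>
#include <unordered_map>
#include <utility>

// Stable identifier assigned when a DIE is first seen; unlike the
// section offset, it does not change when the DWARF is rewritten.
using NodeId = uint64_t;

class MetadataTable {
  NodeId NextId = 0;
  // Offset of a DIE in the *current* output -> persistent ID. The
  // writer is expected to refresh this map on every iteration.
  std::unordered_map<uint64_t, NodeId> OffsetToId;
  std::unordered_map<NodeId, bool> Interesting;

public:
  NodeId getOrCreate(uint64_t DieOffset) {
    auto It = OffsetToId.find(DieOffset);
    if (It != OffsetToId.end())
      return It->second;
    NodeId Id = NextId++;
    OffsetToId.emplace(DieOffset, Id);
    return Id;
  }
  void markInteresting(NodeId Id) { Interesting[Id] = true; }
  bool isInteresting(NodeId Id) const {
    auto It = Interesting.find(Id);
    return It != Interesting.end() && It->second;
  }
  // Called by the output stage with the offsets of the new file.
  void rebindOffsets(std::unordered_map<uint64_t, NodeId> NewMap) {
    OffsetToId = std::move(NewMap);
  }
};
```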

Test program

The test program is provided by the user. It checks whether a modified DWARF is still “interesting” (for example, whether it still triggers the bug). If it is not, we have to roll back the graph to the previous state, apply the feedback to the algorithm, and try again.

If the test program shows that a modified DWARF is fine, we can use it as an input and perform another iteration of reduction.

It is important to balance the run time of the reduction algorithm against the run time of the test program: spending more compute on selecting better candidates only pays off if the test program is expensive to run.

The interface between dwarfreduce and the test program is still TBD. We can output the DWARF separately and let the test program integrate it back into the source binary, or we can try to modify the source binary in place.
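
Either way, the invocation itself can use LLVM’s existing process support. A sketch, assuming the llvm-reduce convention that exit code 0 means “still interesting” (that convention is an assumption, not a settled part of this proposal):

```cpp
#include "llvm/ADT/StringRef.h"
#include "llvm/Support/Program.h"

// Run the user-provided test program on a candidate reduction.
static bool isStillInteresting(llvm::StringRef TestProgram,
                               llvm::StringRef CandidatePath) {
  // Args[0] is conventionally the program name itself.
  int RC = llvm::sys::ExecuteAndWait(TestProgram,
                                     {TestProgram, CandidatePath});
  return RC == 0;
}
```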

Have you tried llvm-dwarfutil --linker=parallel on a -gdwarf-4 -glldb binary with no type units and no split DWARF? It requires that you are able to link your binary without hitting any 4GB DWARF limitations, but it allows you to produce a full-DWARF binary and then lets llvm-dwarfutil unique the type information into a single compile unit; it also dead-strips the DWARF and produces a new .debug_names accelerator table if you ask it to. We have seen up to 60% size reductions from this, and there are still some bugs that cause some classes to keep duplicate C++ class definitions in actual compile units, which llvm-dwarfutil could be taught to dead-strip.

My idea is to eventually teach llvm-dwarfutil to link -gdwarf-5 -glldb + split DWARF, have it replace llvm-dwp, and have it produce vanilla DWARF, removing all the split-DWARF’isms while uniquing types and dead-stripping what isn’t needed.

I can see the need for a tool like you propose for reducing debug info for test case reproduction, but for day to day building and debugging, I am interested in having a smart DWARF linker that can produce smaller output files with complete debug info.

I’d at least recommend looking at llvm-dwarfutil (that @clayborg mentioned) and llvm-dsymutil (they share some of their implementation), which do implement parsing, modification, and writing.

Whatever solution you implement should probably share a lot of its implementation with these tools - if you introduce a more structured reading/writing API, it’s probably worth refactoring the DWARFLinker to use such an API. Alternatively, extracting the functionality that DWARFLinker already has/uses into a separate API that enables other DWARF modifications, etc., would be a way to go about it.

Thank you! That is a great way to troubleshoot issues when multiple object files are involved: link everything together, unique the type info, and then you have a single file to debug (assuming that the issue is still there). Size reduction is important too; it is a good first step.

I can see the need for a tool like you propose for reducing debug info for test case reproduction, but for day to day building and debugging, I am interested in having a smart DWARF linker that can produce smaller output files with complete debug info.

Great! Perhaps some components might be useful for such a tool as well.

Whatever solution you implement should probably share a lot of its implementation with these tools

Absolutely. Looking into the DWARFLinker code, the algorithm is straightforward enough, but there are a ton of special cases and optional, platform-specific features. I think starting from the current DWARFLinker code is probably the best way to go.

if you introduce a more structured reading/writing API, it’s probably worth refactoring the DWARFLinker to use such an API. Alternatively, extracting the functionality that DWARFLinker already has/uses into a separate API that enables other DWARF modifications, etc., would be a way to go about it.

Agree. It is important that the people working on DWARFLinker are on board with this idea. Otherwise, we can make a separate library.

Either way, I’m going to start on the implementation and the refactoring of DWARFLinker at the same time. That way I can test the implementation against all the current LIT tests. Later we can drop the refactoring part if necessary.