[RFC] Improve LLDB progress reporting

cassanova · December 15, 2023, 4:56pm

LLDB uses a progress reporting infrastructure to inform the user when long-running operations are taking place so that the user does not think that LLDB has frozen while running. Long-running operations in upstream LLDB include:

Symbol table parsing
DWARF index loading
Manual DWARF indexing
Locating external symbol files

This infrastructure is implemented with the Progress class as a part of LLDB’s core. Currently progress reports are delivered individually by creating one Progress object per action being done for an operation. This means that if 50 symbol tables are being parsed, 50 Progress objects are created, one for each symbol table. This approach works, but it causes users to be notified for each action within an overall operation. In an IDE this means every action in an operation gets its own progress bar, which is not ideal. Progress reports in LLDB can currently be visualized as such:

Parsing symbol table for foo [         ]
Parsing symbol table for foo [|||||||||]
Parsing symbol table for bar [         ]
Parsing symbol table for bar [|||||||||]

Information about the operation being performed is currently all placed within the title field of a progress report, but progress reports have had the functionality to add details to a report (D143690: [lldb] Add the ability to provide a message to a progress event update). This would allow for umbrella progress reports wherein the title denotes the overall operation being performed (e.g. parsing symbol tables) and the detail field denotes the specific action for that operation (e.g. the fact that the “foo” symbol table is what’s currently being parsed). These changes would allow progress reports to instead be visualized like this:

Parsing symbol table [         ] (foo)
Parsing symbol table [||||     ] (bar)
Parsing symbol table [|||||||||]

This would allow for UIs or IDEs to show progress reports as one main progress bar per operation. Progress reporting for long-running Swift operations has been modified to use the detail field downstream on the Apple Swift fork (https://github.com/apple/llvm-project/pull/7769).
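As a rough illustration of the umbrella idea, here is a minimal sketch of how a client might render one bar per operation from a title plus a detail. The `ProgressEvent` struct and `RenderProgressLine` function are hypothetical names invented for this example; they are not LLDB's event class or API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical event shape for illustration; not LLDB's actual event class.
struct ProgressEvent {
  std::string title;  // overall operation, e.g. "Parsing symbol table"
  std::string detail; // current action, e.g. "foo"; empty when there is none
  uint64_t completed; // units of work done so far
  uint64_t total;     // total units of work; must be non-zero
};

// Renders a single line such as: Parsing symbol table [||||     ] (bar)
std::string RenderProgressLine(const ProgressEvent &event,
                               std::size_t width = 9) {
  assert(event.total > 0 && event.completed <= event.total);
  // Scale the completed work to the width of the bar (rounds down).
  const std::size_t filled =
      static_cast<std::size_t>(event.completed * width / event.total);
  std::string line = event.title + " [" + std::string(filled, '|') +
                     std::string(width - filled, ' ') + "]";
  if (!event.detail.empty())
    line += " (" + event.detail + ")";
  return line;
}
```

Because the title stays constant and only the detail changes, a UI can keep redrawing the same line instead of opening a new bar for every symbol table.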

We’d like to implement this change upstream for general long-running operations (as shown above) but the architecture of LLDB makes it difficult to have a single place to have an umbrella progress report for these operations. One solution that we came up with is to use an enum flag that denotes whether progress reports are considered “aggregate” or not. “Aggregate” progress reports are those that happen for operations that can be triggered multiple times during a debugging session (e.g. parsing a new set of symbol tables that have been added after initially starting up a debug session) and happen at such a speed that they cause a large number of rapid reports. “Non-aggregate” reports denote those that only happen once during a debug session (e.g. importing Swift modules).

This flag would allow our IDE team to filter out “aggregate” progress reports, but after some external discussions it became clear that this solution would cause upstream progress reports to all be considered too numerous and spammy to display in an IDE, which isn’t the best course of action for the wider community. @clayborg suggests using strings as a category for a progress report so that an IDE can filter out progress reports as needed. We want to make sure that we have something that works well for the entire LLDB community, so we’re open to suggestions and alternatives for how to solve this problem.

Endill · December 15, 2023, 8:30pm

Sounds good. I hope this doesn’t mess various editor integrations too much.

clayborg · December 15, 2023, 8:36pm

Right now we have the ability to have a title and add some detail as we increment the progress for progress reporting. In upstream, the only progress reports that take advantage of the detail functionality seem to be in llvm-project/lldb/source/Plugins/ExpressionParser/Clang/ClangModulesDeclVendor.cpp. All other progress dialogs set the title one time ("Parsing symbol table for <path>" or "Manually indexing DWARF for <path>"), so the title strings are all different. This can make it hard to group things together in a UI where progress reports can be started and finished very quickly, and it makes things noisy when trying to display them effectively.

For symbol table parsing, we make a single progress object whose title is always Parsing symbol table for <path> where <path> changes for each module. We don’t have centralized code that parses all symbol tables. We could in the symbol preload code if we wanted to, but we don’t right now. If preloading is off, then any symbol tables can be parsed anywhere in the code at any time depending on what APIs trigger the symbol tables to need to be loaded. So that makes it hard to create a single progress object for all symbol table parsing. So the question is how to allow clients that want to receive progress notifications to do something better for display in a UI.

Right now if we do have a command that triggers all symbol tables to be loaded, they usually happen serially, so we will get a progress start notification for each module, and then a progress finished notification for that same module, and then repeat for all other modules. Something like breakpoint set --name foo can trigger an iteration over all modules and any modules that don’t have their symbol tables parsed yet, will have them parsed on demand. But a command like “breakpoint set --module a.out --name main” will cause only one symbol table to need to be parsed.

This is very similar to DWARF indexing, which tends to take a lot of time on platforms that don’t have really good accelerator tables, like any non-Darwin targets (Linux, Android, etc).

If we don’t want to update how progress reports are created and reported, one idea is to add the notion of whether a progress is considered “aggregate”. If this is true, then the idea is that user interfaces can decide to not show these reports. This method has been submitted as a PR: “[lldb][progress] Add discrete boolean flag to progress reports” by chelcassanova (Pull Request #69516, llvm/llvm-project). My reservation with this approach is that it currently marks almost every current progress as aggregate and might cause user interfaces to decide not to show this progress. In my experience on Linux, many of these progress notifications are vital to knowing why the debug session is taking a long time to start up. Symbol table parsing takes some time, but manually indexing DWARF takes a large amount of time (up to 10 - 20 minutes if you have 70 GB of unindexed debug info), and without being able to see progress updates in a UI, the debugger would seem to be hung and doing nothing, which causes many users to kill the debugger and try again.

One idea is to add a category string to progress notifications. For symbol table parsing, this would look something like:

category = "Symbol table parsing"
title = "Parsing symbol table for /tmp/a.out"
detail = ""

For manually indexing DWARF, it would be:

category = "Manually indexing DWARF"
title = "Manually indexing DWARF for /tmp/a.out"
detail = ""

If a user interface gets too many notifications, then the UI could create an umbrella progress using category as the progress title, and then make it show a continuous spinning progress UI. Even if the one symbol table progress finishes, a small timeout can keep the progress alive waiting for more progress notifications that have the same category value. If another one comes in, continue and renew the timeout, else close the progress.
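The UI-side umbrella described above can be sketched roughly as follows. This is only an illustration under stated assumptions: `CategoryUmbrella`, its method names, and the grace period are all hypothetical, and time is passed in explicitly so the logic is easy to follow and test.

```cpp
#include <cassert>
#include <chrono>

using Clock = std::chrono::steady_clock;
using Ms = std::chrono::milliseconds;

// Hypothetical UI-side helper: keeps one spinner alive per category while
// progress notifications with that category keep arriving.
class CategoryUmbrella {
public:
  explicit CategoryUmbrella(Ms grace_period) : m_grace_period(grace_period) {}

  // A progress with this category started: show (or keep showing) the spinner.
  void OnProgressStarted() {
    ++m_active;
    m_visible = true;
  }

  // A progress with this category finished: remember when, so the grace
  // period can be measured from the most recent completion.
  void OnProgressEnded(Clock::time_point now) {
    assert(m_active > 0);
    --m_active;
    m_last_end = now;
  }

  // The spinner goes away only once nothing is running and the grace period
  // has elapsed without another notification renewing it.
  bool IsVisible(Clock::time_point now) {
    if (m_visible && m_active == 0 && now - m_last_end >= m_grace_period)
      m_visible = false;
    return m_visible;
  }

private:
  Ms m_grace_period;
  unsigned m_active = 0;
  bool m_visible = false;
  Clock::time_point m_last_end{};
};
```

A UI would keep one such object per category string and poll `IsVisible` from its redraw loop.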

The code for progress reports that don’t update a count looks like:

Progress progress(llvm::formatv("Parsing symbol table for {0}", file_name));

This will create a RAII C++ object where on construction, it will report a progress with a value of 0 to indicate the progress started event, and on destruction, will report a progress with a value of UINT64_MAX to indicate completion.

For progress reporting where you know you have N units of work to do, the code looks like:

  const uint64_t total_progress = ...;
  Progress progress(
      llvm::formatv("Manually indexing DWARF for {0}", module_desc.GetData()),
      total_progress);
  for (...) {
     do_somework();
     progress.Increment();
  }

This allows progress objects to keep people up to date with the progress (or lack thereof). When the object is constructed, it will report a progress with a value of 0 to indicate the progress started event, and on destruction, will report a progress with a value of total_progress to indicate completion. And during the expensive work, it will update the progress as work gets done.
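To make the RAII behavior concrete, here is a minimal stand-in that mimics the start, increment, and completion reports described above. `ScopedProgress` and its `Reporter` callback are hypothetical names for this sketch; the real lldb_private::Progress class broadcasts events through the debugger and differs in its details.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <string>
#include <utility>

// Illustrative stand-in for the RAII pattern; not lldb_private::Progress.
class ScopedProgress {
public:
  using Reporter = std::function<void(const std::string &title,
                                      uint64_t completed, uint64_t total)>;

  // A total of UINT64_MAX means the amount of work is unknown.
  ScopedProgress(std::string title, Reporter reporter,
                 uint64_t total = UINT64_MAX)
      : m_title(std::move(title)), m_reporter(std::move(reporter)),
        m_total(total) {
    assert(m_reporter && "a reporter callback is required");
    m_reporter(m_title, 0, m_total); // the "progress started" report
  }

  // Reports completion on scope exit unless Increment already reached it.
  ~ScopedProgress() {
    if (m_completed < m_total) {
      m_completed = m_total;
      m_reporter(m_title, m_completed, m_total);
    }
  }

  ScopedProgress(const ScopedProgress &) = delete;
  ScopedProgress &operator=(const ScopedProgress &) = delete;

  // Adds finished work, clamped so completed never exceeds total.
  void Increment(uint64_t amount = 1) {
    if (amount == 0 || m_completed >= m_total)
      return;
    const uint64_t remaining = m_total - m_completed;
    m_completed += (amount < remaining) ? amount : remaining;
    m_reporter(m_title, m_completed, m_total);
  }

private:
  std::string m_title;
  Reporter m_reporter;
  uint64_t m_total;
  uint64_t m_completed = 0;
};
```

One design choice in this sketch is that the destructor stays silent if the last `Increment` already reported the full total, so a finite progress does not announce completion twice.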

The main thing we are trying to solve is how to manage how progress gets reported so that everyone is happy.

If we have a category for each progress object, it could help an IDE do something better: multiple progress start and end notifications would still be received, but the IDE could now group them together better. Code would look like:

Progress progress(
    /*category=*/"Parsing symbol tables",
    /*title=*/"/tmp/a.out");

And

  const uint64_t total_progress = ...;
  Progress progress(
    /*category=*/"Manually indexing DWARF",
    /*title=*/"/tmp/a.out",
    /*total=*/total_progress);
  for (...) {
     do_somework();
     progress.Increment();
  }

The IDE now has the ability to notice it is getting many progress start and stop notifications from a given category and then put them into a group.

Another idea is to enforce the category stuff shown above and then change the way that progress notifications are delivered to the IDE. Code can be added to automatically aggregate notifications by category, even if we have internal notifications like:

progress started for {category="Parsing symbol tables", title="/tmp/a.out"}
progress ended for {category="Parsing symbol tables", title="/tmp/a.out"}
progress started for {category="Parsing symbol tables", title="/usr/lib/libc.so"}
progress ended for {category="Parsing symbol tables", title="/usr/lib/libc.so"}

We could always notify the IDE with the same API, but LLDB would manage the notifications so that it would send the IDE notifications like:

progress started for {title="Parsing symbol tables"}
progress update for {title="Parsing symbol tables", detail = "/tmp/a.out"}
progress update for {title="Parsing symbol tables", detail = "/usr/lib/libc.so"}
// wait for a small timeout of maybe 1 second and if no other progress for symbol tables comes in...
progress ended for {title="Parsing symbol tables"}

This leaves the public API untouched and reuses the existing detail functionality that was added, but adds new code into LLDB to manage how progress updates are delivered. We could also add code to throttle notifications to only be sent once per second for progress items that update too quickly, like in the DWARF case. For example, we can make a fake progress that would increment way too quickly and cause a flurry of notifications with sample code like:

  Progress progress(
    /*category=*/"Testing spammy progress",
    /*title=*/"task a",
    /*total=*/100000);
  for (int i = 0; i<100000; ++i) {
     usleep(1000);
     progress.Increment();
  }

This would send a ton of progress updates only about a millisecond apart, so having code in LLDB that could throttle progress updates to go out only once per second would help cut down on spammy progress reports. When multiple progress events with valid totals are running in different threads, we would need to make sure the totals are combined when doing the reporting.
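A throttle of this kind could look roughly like the sketch below. `UpdateThrottle` and `ShouldSend` are hypothetical names, the clock is injected rather than read internally, and this is not the code from any LLDB patch.

```cpp
#include <cassert>
#include <chrono>
#include <optional>

using Clock = std::chrono::steady_clock;
using Ms = std::chrono::milliseconds;

// Hypothetical rate limiter for one progress: forwards an update only if
// enough time has passed since the last forwarded update.
class UpdateThrottle {
public:
  explicit UpdateThrottle(Ms min_interval) : m_min_interval(min_interval) {
    assert(min_interval >= Ms::zero());
  }

  // Returns true if the update happening at `now` should be sent on.
  // The first update and the final (completion) update always go out.
  bool ShouldSend(Clock::time_point now, bool is_final = false) {
    if (is_final || !m_last_sent || now - *m_last_sent >= m_min_interval) {
      m_last_sent = now;
      return true;
    }
    return false;
  }

private:
  Ms m_min_interval;
  std::optional<Clock::time_point> m_last_sent;
};
```

Assuming the increments in the loop above arrive exactly one millisecond apart, a 1000 ms interval would let the first update through and then one more each second, which turns 100000 increments into 100 forwarded updates.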

This solution would help us not have to change the public API and could improve things for everyone.

clayborg · December 18, 2023, 7:08am

I have fully implemented my idea of having the debugger throttle progress dialogs using some new settings in this PR:

github.com/llvm/llvm-project

Add settings and code that limits the number of progress events.

llvm:main ← clayborg:progress

opened 07:06AM - 18 Dec 23 UTC

clayborg

+344 -5

Currently in LLDB if clients listen for Debugger::eBroadcastBitProgress events, this can cause too many progress events to get sent to clients, which makes the feature a bit less useful than originally planned. This patch fixes this by adding two new settings in LLDB:

```
(lldb) apropos progress
The following settings variables may relate to 'progress':
  progress-minimum-duration -- The minimum time in milliseconds required for a progress to get reported. If a progress starts and completes and the elapsed time is under this threshold, then no progress will get reported.
  progress-update-frequency -- The minimum time in milliseconds between progress updates for progress notifications that have a valid count.
```

This patch also adds code that spawns a progress thread in each debugger that manages all of the incoming events and only sends out progress events if they meet the setting requirements. The default settings are:

```
(lldb) settings show progress-minimum-duration progress-update-frequency
progress-minimum-duration (unsigned) = 500
progress-update-frequency (unsigned) = 1000
```

The `progress-minimum-duration` setting is the number of milliseconds that we wait before broadcasting any events for a progress. If the progress start and end events happen quicker than this number, no progress events get broadcast. This keeps the progress events from becoming too cumbersome on clients and makes sure that if we have thousands of very quick internal progress reports, we don't have to show any of them. The default value of 500 ms is open to suggestions; maybe 250 ms would be better?

The `progress-update-frequency` setting is for progress dialogs with finite amounts of work. When we index DWARF, we create progress dialogs with thousands and thousands of compile units, and previously we would send a progress update for each compile unit we manually indexed, no matter how quickly we indexed them. With this new setting, we will first wait `progress-minimum-duration` milliseconds before broadcasting any progress events, and then after that we will broadcast update progress events only every 1000 ms, or 1 second. This really keeps the number of progress notifications down to a minimum.

Posting this as a solution to the RFC: https://discourse.llvm.org/t/rfc-improve-lldb-progress-reporting/75717

Please download this and try it out and let me know what you think. Maybe this will help alleviate the need for categories? Although I do still like the idea of categories in case we do want to group things in user interfaces in the future.
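For readers who want a feel for the minimum-duration rule without reading the patch, here is a small sketch of the decision it describes. `PendingProgress` and `IsReportable` are hypothetical names for illustration and do not come from the PR.

```cpp
#include <cassert>
#include <chrono>
#include <optional>

using Clock = std::chrono::steady_clock;
using Ms = std::chrono::milliseconds;

// Hypothetical bookkeeping for one progress that has not been shown yet.
struct PendingProgress {
  Clock::time_point started;
  std::optional<Clock::time_point> finished; // empty while still running
};

// A progress becomes reportable once it has been alive for at least the
// minimum duration. One that finished sooner than that never qualifies,
// no matter how much later the check is made.
bool IsReportable(const PendingProgress &progress, Clock::time_point now,
                  Ms minimum_duration) {
  const Clock::time_point reference = progress.finished.value_or(now);
  assert(reference >= progress.started);
  return reference - progress.started >= minimum_duration;
}
```

With a 500 ms minimum, a symbol table parse that takes 100 ms would never be shown, while a DWARF indexing run that is still going after 500 ms would start being reported at that point.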

JDevlieghere · December 18, 2023, 5:27pm

I think this example highlights (what I consider to be) the real problem with progress reports: the majority of the uses in upstream lldb don’t actually report progress. They’re more like status updates saying “hey I’m doing this particular operation right now”. When those operations happen consecutively, they appear to be progress. As Chelsea pointed out, that works in a UI that can show all the individual operations (like the command line and VSCode) but might not for others that want to be more conservative and show something like a spinner that you have to click on to show the details.

This is not so much a design limitation of the progress reports themselves but the result of LLDB’s design to be lazy and do work on demand.

I share that concern and we’re very much open to change the way progress reports are created and reported. For all the current instances, we considered if we could hoist the reporting up. Like the symbol table parsing and the DWARF indexing, the answer is usually “no” because these things happen on-demand rather than consecutively.

My personal preference would be to change where we report progress to places where we can report work towards a common goal (i.e. non-aggregate). That said, that requires changing what progress we report, and I very much recognize that things like symbol table parsing and manual DWARF indexing are very important operations to report to users.

If I understand correctly, the category idea you’re proposing consists of two things:

Add another field to the progress reports to group separate instances together.
Use a small timeout to report instances belonging to the same group as one.

This is something we had considered as well, but without the need for an extra field and instead using the title as the category name. So instead of:

category = "Manually indexing DWARF"
title = "Manually indexing DWARF for /tmp/a.out"
detail = ""

You just have:

title = "Manually indexing DWARF"
detail = "for /tmp/a.out"

The progress class can use the uniqueness of the title, or lack thereof, to group events by. It looks like you ended up with something similar:

Progress progress(
    /*category=*/ "Parsing symbol tables", 
    /*title=*/ "/tmp/a.out")

if you just swap category with title and title with detail.

Which brings us to the second part on timeouts. If we want to double down on the “status updates as progress updates” then I don’t see another solution than using time to group things together.

The timeout needs to be large enough that we don’t end up with something like this:

progress started for {title="Parsing symbol tables"}
progress update for {title="Parsing symbol tables", detail = "/tmp/a.out"}
progress update for {title="Parsing symbol tables", detail = "/usr/lib/libc.so"}
// Wait for the small timeout to be exceeded
progress ended for {title="Parsing symbol tables"}
// Oops one just came in right after the timeout 
progress started for {title="Parsing symbol tables"}
progress update for {title="Parsing symbol tables", detail = "/usr/lib/bar.so"}
// Wait for the small timeout to be exceeded
progress ended for {title="Parsing symbol tables"}

But it should also be short enough as to not lie to the user and claim work is happening while there isn’t. We could probably make this value dynamic (something like twice the median time of all the previously seen events belonging to a certain category/group).
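The "twice the median" heuristic can be sketched in a few lines. `EstimateGroupingTimeout` and its fallback parameter are hypothetical names for this illustration, not part of any proposed patch.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

using Ms = std::chrono::milliseconds;

// Returns twice the median of the durations seen so far for one group, or
// `fallback` when no events have been observed yet.
Ms EstimateGroupingTimeout(std::vector<Ms> durations, Ms fallback) {
  if (durations.empty())
    return fallback;
  std::sort(durations.begin(), durations.end());
  const std::size_t mid = durations.size() / 2;
  Ms median = durations[mid];
  // For an even count, average the two middle values.
  if (durations.size() % 2 == 0)
    median = (durations[mid - 1] + durations[mid]) / 2;
  assert(median >= Ms::zero());
  return 2 * median;
}
```

The vector is taken by value so the caller's history is left unsorted; a real implementation would more likely keep a bounded window of recent durations per title.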

You also mention throttling/rate limiting. We had some discussion on that topic in https://reviews.llvm.org/D150805 and https://reviews.llvm.org/D152364. I do have more concerns about that, especially the notion of a minimum duration, which creates an additional error. I have a bunch of thoughts on this, but unless it impacts the design here, I think this is a problem that’s sufficiently different from what the RFC is trying to achieve that we can probably discuss that separately.

–

TL;DR:

I would prefer progress reports to report progress instead of status which requires changing what we report.
If we don’t want to change what we report, I think grouping events using time is the only way to solve these events without ignoring them altogether (using the aggregate flag).

clayborg · December 18, 2023, 8:05pm

The original goal of adding the progress reporting was to keep people apprised of long delays in debug sessions. Prior to the progress reports we could have huge delays in LLDB that would confuse users wondering what was going on. Now people know what is going on, which is good. So the goal of progress reports has always been to inform the user of what is going on in LLDB so they are not in the dark.

I like the idea of switching the internals of the progress report to use the current title as a category and always using the detail as you mentioned:

Progress progress(
    /*category=*/ "Parsing symbol tables", 
    /*title=*/ "/tmp/a.out")

Which brings us to the second part on timeouts. If we want to double down on the “status updates as progress updates” then I don’t see another solution than using time to group things together.

This is what they were originally designed for. One idea is to add something that clarifies the intent of the progress. Maybe instead of “aggregate” we have an enum that clarifies the intention:

enum class ProgressType {
  /// For anything that can cause delays during debugging that the user might
  /// want insight into, like symbol table parsing, DWARF indexing, or
  /// downloading symbols.
  BackgroundWork,
  /// For a user-initiated task. UIs will always want to report these, since
  /// the user has requested something to happen and we can show progress.
  UserProgress
};

If we have these types, it would allow the UI to ignore all ProgressType::BackgroundWork progress events if desired. We could also add a setting to allow enabling and disabling progresses by type so they never even make it to the IDE if desired.

(lldb) progress disable --type [background|user|all]

or using a setting:

(lldb) settings set progress-disabled-types [background|user|all]

The above enum seems like the equivalent of the “aggregate” approach you guys want, though “aggregate” didn’t seem as clear as the above enum. This still doesn’t allow for customizing what you would want to see on a fine-grained level. Maybe you want to see DWARF indexing but not symbol table parsing; they would both be marked as aggregate or ProgressType::BackgroundWork. If we do enable categories and use the detail, though, we could have settings that let users list categories they don’t want to see, like:

(lldb) progress disable --category "Parsing Symbol tables"

or using a setting:

(lldb) settings set progress-disabled-categories "Parsing Symbol tables"
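Both kinds of filtering (by type and by category) boil down to a membership check before an event is delivered. The sketch below assumes the proposed ProgressType enum from above; `ProgressFilter` and `ShouldDeliver` are hypothetical names, and the commands and settings shown in this post are proposals rather than existing LLDB features.

```cpp
#include <cassert>
#include <set>
#include <string>

// The proposed enum from the post above, repeated so the sketch compiles.
enum class ProgressType { BackgroundWork, UserProgress };

// Hypothetical holder for the two proposed "disabled" lists.
struct ProgressFilter {
  std::set<ProgressType> disabled_types;
  std::set<std::string> disabled_categories;

  // An event is delivered only if neither its type nor its category has
  // been disabled.
  bool ShouldDeliver(ProgressType type, const std::string &category) const {
    assert(!category.empty() && "categories are expected to be non-empty");
    return disabled_types.count(type) == 0 &&
           disabled_categories.count(category) == 0;
  }
};
```

Note that the category match here is an exact string comparison, so "Parsing Symbol tables" and "Parsing symbol tables" would be treated as different categories unless the strings are normalized somewhere.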

The timeout needs to be large enough that we don’t end up with something like this:

progress started for {title="Parsing symbol tables"}
progress update for {title="Parsing symbol tables", detail = "/tmp/a.out"}
progress update for {title="Parsing symbol tables", detail = "/usr/lib/libc.so"}
// Wait for the small timeout to be exceeded
progress ended for {title="Parsing symbol tables"}
// Oops one just came in right after the timeout 
progress started for {title="Parsing symbol tables"}
progress update for {title="Parsing symbol tables", detail = "/usr/lib/bar.so"}
// Wait for the small timeout to be exceeded
progress ended for {title="Parsing symbol tables"}

For symbol tables this is not how things would come in, they would come in as:

progress started for {category="Parsing symbol tables", detail = "/tmp/a.out"}
progress ended for {category="Parsing symbol tables", detail = "/tmp/a.out"}
progress started for {category="Parsing symbol tables", detail = "/usr/lib/libc.so"}
progress ended for {category="Parsing symbol tables", detail = "/usr/lib/libc.so"}

So the timeout would only apply to each individual progress. So if “a.out” takes longer but “libc.so” doesn’t, “a.out” would get reported and “libc.so” wouldn’t.

With the way the code is, there is no way to create the major category that says “I am going to start parsing symbol tables”, these just happen lazily as debugging happens. Same thing for DWARF manual indexing, LLDB will lazily parse the DWARF if and only if it needs to, unless we have symbol preloading setting set to true. And even then, symbol preloading is in Target::GetOrCreateModule and also in TargetList::CreateTargetInternal(...) but I don’t think that does anything because Target::GetOrCreateModule will already preload the symbols.

So lldb_private::Progress started out as a way to solely inform the user that background tasks are going on that they might want to know about, and now has uses that are trying to be more standard progress dialogs.

I like your idea to switch “title” to “category” and then always use the “detail”. If we do this, we need the detail to be settable in the Progress constructor, not only at Progress::Increment(..., const std::string &detail) time, as we don’t want to have to send 3 events for progress objects that only send two right now. I would like to avoid the need for three events like:

progress started for {title="Parsing symbol tables" (no detail)}
progress update for {title="Parsing symbol tables", detail = "/tmp/a.out"}
progress ended for {title="Parsing symbol tables", detail = "/tmp/a.out"}

This would end up sending 3 progress events instead of just two. I would rather see:

progress started for {title="Parsing symbol tables", detail = "/tmp/a.out"}
progress ended for {title="Parsing symbol tables", detail = "/tmp/a.out"}

JDevlieghere · December 18, 2023, 10:05pm

clayborg:

So the timeout would only apply to each individual progress. So if “a.out” takes longer but “libc.so” doesn’t, “a.out” would get reported and “libc.so” wouldn’t.

With the way the code is, there is no way to create the major category that says “I am going to start parsing symbol tables”, these just happen lazily as debugging happens. Same thing for DWARF manual indexing, LLDB will lazily parse the DWARF if and only if it needs to, unless we have symbol preloading setting set to true. And even then, symbol preloading is in Target::GetOrCreateModule and also in TargetList::CreateTargetInternal(...) but I don’t think that does anything because Target::GetOrCreateModule will already preload the symbols.

So lldb_private::Progress started out as a way to solely inform the user that background tasks are going on that they might want to know about, and now has uses that are trying to be more standard progress dialogs.

I think I might have read something different in your suggestion, probably because we considered something similar when we were thinking about the aggregate approach. So let’s take the symbol table parsing. As you said, this generally happens lazily as debugging happens.

Let’s take a hypothetical scenario where I’m setting a symbolic breakpoint on foo and there are 3 libraries (libfoo, libbar, libbaz) left for which we haven’t parsed the symbol table yet. The details don’t really matter but at some point we’re going to query SymbolFileSymtab which might trigger parsing the symbol table. Today, that would result in 3 unrelated progress instances and 6 events:

progress started for {title="Parsing symbol table", detail = "libfoo"}
progress ended for   {title="Parsing symbol table", detail = "libfoo"}
progress started for {title="Parsing symbol table", detail = "libbar"}
progress ended for   {title="Parsing symbol table", detail = "libbar"}
progress started for {title="Parsing symbol table", detail = "libbaz"}
progress ended for   {title="Parsing symbol table", detail = "libbaz"}

We could modify the progress infrastructure to keep a small timeout. Whenever it sees a progress instance being created with a title and a detail like above, it starts a new top level event (1) and immediately issues an update (2):

1 | progress started for {title="Parsing symbol table"}
2 | progress update      {title="Parsing symbol table", detail = "libfoo"}

When the work for libfoo completes, it starts a small timer. If we see another operation with the same title before the timer goes off, we report it as an update to the current top level event (3) and disable the timer:

1 | progress started for {title="Parsing symbol table"}
2 | progress update      {title="Parsing symbol table", detail = "libfoo"}
3 | progress update      {title="Parsing symbol table", detail = "libbar"}

[I’m repeating the same events (1, 2) as above to show the continuity, they are not new events]

When the work for libbar completes, we start a new timer again. Let’s assume the same thing happens for libbaz (4):

1 | progress started for {title="Parsing symbol table"}
2 | progress update      {title="Parsing symbol table", detail = "libfoo"}
3 | progress update      {title="Parsing symbol table", detail = "libbar"}
4 | progress update      {title="Parsing symbol table", detail = "libbaz"}

We start the timer again and this time, it goes off, which means we consider the top level event done. Here’s the full sequence of events received by the user:

1 | progress started for {title="Parsing symbol table"}
2 | progress update      {title="Parsing symbol table", detail = "libfoo"}
3 | progress update      {title="Parsing symbol table", detail = "libbar"}
4 | progress update      {title="Parsing symbol table", detail = "libbaz"}
5 | progress ended for   {title="Parsing symbol table"}

The error, defined as the time an event is shown to the user while the corresponding work is not ongoing, is bounded by the timeout.
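The walkthrough above can be condensed into a small state machine. This is a single-threaded sketch under stated assumptions: `TitleCoalescer` and its methods are hypothetical names, events are recorded as plain strings, and the caller supplies the current time and calls `Poll` periodically instead of a real timer firing.

```cpp
#include <cassert>
#include <chrono>
#include <optional>
#include <string>
#include <vector>

using Clock = std::chrono::steady_clock;
using Ms = std::chrono::milliseconds;

// Hypothetical coalescer for one title: merges back-to-back progress
// instances into a single started / update... / ended sequence.
class TitleCoalescer {
public:
  explicit TitleCoalescer(Ms timeout) : m_timeout(timeout) {}

  // A progress instance with this title was created.
  void OnInstanceStarted(Clock::time_point now, const std::string &detail,
                         std::vector<std::string> &out) {
    Poll(now, out); // close a stale group first if its timer already fired
    if (!m_group_open) {
      out.push_back("started");
      m_group_open = true;
    }
    out.push_back("update " + detail);
    ++m_active;
    m_deadline.reset(); // work is ongoing again, so disable the timer
  }

  // A progress instance with this title finished: arm the timer once no
  // instance is running anymore.
  void OnInstanceEnded(Clock::time_point now) {
    assert(m_active > 0);
    if (--m_active == 0)
      m_deadline = now + m_timeout;
  }

  // Ends the top level event if the timer has gone off.
  void Poll(Clock::time_point now, std::vector<std::string> &out) {
    if (m_group_open && m_deadline && now >= *m_deadline) {
      out.push_back("ended");
      m_group_open = false;
      m_deadline.reset();
    }
  }

private:
  Ms m_timeout;
  unsigned m_active = 0;
  bool m_group_open = false;
  std::optional<Clock::time_point> m_deadline;
};
```

Since the "ended" event can only be emitted after the deadline, the time the group is shown without work happening is bounded by the timeout plus however late `Poll` is called.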

What I like about this approach is that:

It always over-reports (showing more work than is actually happening) as opposed to under-reporting (showing no work is happening when there is). The latter makes LLDB look slow/stuck while the former is mostly harmless.
We can tweak the timeout as more events come in. The more events we see, the more accurately we can estimate how long one of the events lasts and the smaller we can make the error.

My example above shows a scenario where we get it right. As I said in my previous post, if the timeout is too small, we could get it wrong. If for whatever reason the time between parsing the symbol table for libbar and libbaz exceeded the timeout, the result would look like this:

1 | progress started for {title="Parsing symbol table"}
2 | progress update      {title="Parsing symbol table", detail = "libfoo"}
3 | progress update      {title="Parsing symbol table", detail = "libbar"}
4 | progress ended for   {title="Parsing symbol table"}
5 | progress started for {title="Parsing symbol table"}
6 | progress update      {title="Parsing symbol table", detail = "libbaz"}
7 | progress ended for   {title="Parsing symbol table"}

WDYT about this approach?

clayborg · December 18, 2023, 11:12pm

My example above shows a scenario where we get it right. As I said in my previous post, if the timeout is too small, we could get it wrong. If for whatever reason the time between parsing the symbol table for libbar and libbaz exceeded the timeout, the result would look like this:
1 | progress started for {title="Parsing symbol table"}
2 | progress update      {title="Parsing symbol table", detail = "libfoo"}
3 | progress update      {title="Parsing symbol table", detail = "libbar"}
4 | progress ended for   {title="Parsing symbol table"}
5 | progress started for {title="Parsing symbol table"}
6 | progress update      {title="Parsing symbol table", detail = "libbaz"}
7 | progress ended for   {title="Parsing symbol table"}

I like this idea as long as no real changes are needed from an internal code standpoint except for changing the title → category, and always specifying the detail. IIUC, the internal changes for the mach-o symbol table would be from the old code:

  Progress progress(llvm::formatv("Parsing symbol table for {0}", file_name));

The new code would be something like:

  Progress progress(
    /*category=*/SymbolFile::GetParsingSymbolTableProgressCategory(), 
    /*detail=*/file_name);

Since we want category strings to match from symbol file to symbol file, I assume we can add a ConstString SymbolFile::GetParsingSymbolTableProgressCategory() to the symbol file that returns the right string. I suggest that we use a ConstString for this to allow quick comparisons in any infrastructure we need internally to be able to group these correctly per your above approach so we can make efficient maps.
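To show why an interned string helps here, the sketch below uses a tiny hand-rolled string pool as a stand-in for ConstString. `InternCategory` and `ActiveCategoryCounts` are hypothetical names, and the pool is deliberately simplistic (no thread safety), unlike LLDB's real ConstString.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <unordered_set>

// Returns a stable pointer that is identical for equal strings. Elements of
// std::unordered_set keep their address across rehashing. Not thread-safe.
const std::string *InternCategory(const std::string &text) {
  static std::unordered_set<std::string> pool;
  return &*pool.insert(text).first;
}

// Counts running progress instances per category. The map is keyed by the
// interned pointer, so lookups hash and compare a pointer instead of the
// string contents.
class ActiveCategoryCounts {
public:
  void Started(const std::string *category) {
    assert(category != nullptr);
    ++m_counts[category];
  }

  void Ended(const std::string *category) {
    auto it = m_counts.find(category);
    assert(it != m_counts.end() && it->second > 0);
    if (--it->second == 0)
      m_counts.erase(it);
  }

  bool IsActive(const std::string *category) const {
    return m_counts.count(category) != 0;
  }

private:
  std::unordered_map<const std::string *, unsigned> m_counts;
};
```

Every symbol file plugin that obtains its category through one shared accessor would end up with the same pointer, which is what makes the grouping cheap.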

One question with this new approach is how to handle DWARF indexing where we have a finite amount of work to do. Let’s say we have “a.out” with 1000 units of work and “b.out” that has 2000 units of work to do. What would that look like? Like this?:

progress started for {title="Indexing DWARF"}
progress update      {title="Indexing DWARF", detail = "a.out" 0/1000}
progress update      {title="Indexing DWARF", detail = "a.out" 1/1000}
...
progress update      {title="Indexing DWARF", detail = "a.out" 1000/1000}

progress update      {title="Indexing DWARF", detail = "b.out" 0/2000}
progress update      {title="Indexing DWARF", detail = "b.out" 1/2000}
...
progress update      {title="Indexing DWARF", detail = "b.out" 2000/2000}
progress ended for   {title="Indexing DWARF"}

It would be hard to combine two finite progresses into a single total if things come and go. If a.out and b.out start indexing at different times but overlap, it would be hard to make a nice UI for that, since the workload count would change as different DWARF indexes start and complete.
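A small sketch of the problem, assuming a naive aggregator that sums completed and total units over whatever jobs are currently in flight. `CombinedProgress` and its methods are hypothetical names for this example only.

```cpp
#include <algorithm>
#include <cstdint>
#include <map>
#include <string>

struct Job {
  uint64_t completed;
  uint64_t total;
};

// Hypothetical aggregator: sums the work of all jobs that are in flight.
class CombinedProgress {
public:
  void Start(const std::string &detail, uint64_t total) {
    jobs_[detail] = Job{0, total};
  }
  void Advance(const std::string &detail, uint64_t units) {
    auto it = jobs_.find(detail);
    if (it == jobs_.end())
      return;
    Job &job = it->second;
    job.completed = std::min(job.total, job.completed + units);
  }
  void End(const std::string &detail) { jobs_.erase(detail); }

  uint64_t Completed() const {
    uint64_t sum = 0;
    for (const auto &entry : jobs_)
      sum += entry.second.completed;
    return sum;
  }
  uint64_t Total() const {
    uint64_t sum = 0;
    for (const auto &entry : jobs_)
      sum += entry.second.total;
    return sum;
  }
  // Fraction of work done, or 0 when nothing is in flight.
  double Fraction() const {
    const uint64_t total = Total();
    return total == 0 ? 0.0 : static_cast<double>(Completed()) / total;
  }

private:
  std::map<std::string, Job> jobs_;
};
```

If a.out is halfway done (500/1000) when b.out starts, the combined bar drops from 50% to 500/3000, and it moves again when a.out finishes and is removed. That backwards jump is the UI problem described above.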

We do still need to throttle the events, so the PR I posted, which isn’t related to this RFC’s purpose, is still really needed to make sure clients don’t get spammed with too many updates like we do now.
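As an illustration of what throttling could look like, here is a generic rate limiter that forwards at most one update per interval but never drops the first or the final update. This is not the code from the PR mentioned above; `UpdateThrottle` and its parameters are assumptions made for this sketch, and timestamps are passed in explicitly so the behavior is deterministic.

```cpp
#include <cstdint>

// Hypothetical throttle: decides whether a progress update should be
// forwarded to listeners.
class UpdateThrottle {
public:
  explicit UpdateThrottle(uint64_t min_interval_ms)
      : min_interval_ms_(min_interval_ms) {}

  // Returns true if the update at time now_ms should be sent.
  bool ShouldSend(uint64_t now_ms, uint64_t completed, uint64_t total) {
    const bool is_final = completed >= total;
    const bool interval_elapsed =
        has_sent_ && now_ms - last_sent_ms_ >= min_interval_ms_;
    if (!has_sent_ || is_final || interval_elapsed) {
      has_sent_ = true;
      last_sent_ms_ = now_ms;
      return true;
    }
    return false;
  }

private:
  uint64_t min_interval_ms_;
  bool has_sent_ = false;
  uint64_t last_sent_ms_ = 0;
};
```

With a 100 ms interval, an indexing job that reports thousands of units per second would still produce only about ten events per second, plus the completion event.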

clayborg · December 18, 2023, 11:16pm

Also, the PR I posted would sit on top of all this: with your new approach we would manipulate the progress events we want to send, and still use the same old API to eventually send them:

static void PrivateReportProgress(Debugger &debugger, uint64_t progress_id,
                                  std::string title, std::string details,
                                  uint64_t completed, uint64_t total,
                                  bool is_debugger_specific);

clayborg · December 19, 2023, 12:58am

With this new approach we also don’t get to know when the symbol table parsing for a given file is done until the timeout kicks in, since we send an update when parsing starts for a file but not when it finishes.

Another idea is to make the change you suggest, converting to category + detail, but leave the existing progress events alone. Then we add a new broadcast bit for new events that merge the reports in each category into one combined progress. So if the user listens to Debugger::eBroadcastBitProgressByCategory, we would send the same kind of events, but combined as you suggest, with a timeout that would cause the progress to go away.

This way clients of the existing progress can continue to see the individual progress dialogs if they listen to Debugger::eBroadcastBitProgress and see the start and end notifications, and clients that want combined can use the new combined progress notifications.
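A rough sketch of how the two bits could coexist, assuming the combined events are derived by counting in-flight reports per category. The enum values, `ProgressBroadcaster`, and the event text are all hypothetical and only stand in for Debugger::eBroadcastBitProgress and the proposed Debugger::eBroadcastBitProgressByCategory. The timeout is left out to keep the example short.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <vector>

enum BroadcastBit : uint32_t {
  kProgress = 1u << 0,           // stand-in for eBroadcastBitProgress
  kProgressByCategory = 1u << 1, // stand-in for the proposed new bit
};

struct Event {
  uint32_t bit;
  std::string text;
};

// Hypothetical broadcaster that records the events one listener would see.
class ProgressBroadcaster {
public:
  explicit ProgressBroadcaster(uint32_t listener_mask) : mask_(listener_mask) {}

  void Started(const std::string &category, const std::string &detail) {
    Emit(kProgress, "started " + category + " (" + detail + ")");
    // Only the first report in a category opens the combined progress.
    if (in_flight_[category]++ == 0)
      Emit(kProgressByCategory, "started " + category);
  }

  void Ended(const std::string &category, const std::string &detail) {
    Emit(kProgress, "ended " + category + " (" + detail + ")");
    auto it = in_flight_.find(category);
    if (it == in_flight_.end())
      return;
    // Only the last report in a category closes the combined progress.
    if (--it->second == 0) {
      in_flight_.erase(it);
      Emit(kProgressByCategory, "ended " + category);
    }
  }

  const std::vector<Event> &events() const { return events_; }

private:
  void Emit(uint32_t bit, const std::string &text) {
    if (mask_ & bit)
      events_.push_back(Event{bit, text});
  }

  uint32_t mask_;
  std::map<std::string, int> in_flight_;
  std::vector<Event> events_;
};
```

A listener on the first bit sees one start and one end per file, exactly as today, while a listener on the second bit sees a single start and end for the whole category.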

JDevlieghere · December 19, 2023, 5:11am

clayborg:

I like this idea as long as no real changes are needed from an internal code standpoint except for changing the title → category, and always specifying the detail. IIUC, the internal changes for the mach-o symbol table would be from the old code:
  Progress progress(llvm::formatv("Parsing symbol table for {0}", file_name));
The new code would be something like:
  Progress progress(
    /*category=*/SymbolFile::GetParsingSymbolTableProgressCategory(), 
    /*detail=*/file_name);
Since we want category strings to match from symbol file to symbol file, I assume we can add a ConstString SymbolFile::GetParsingSymbolTableProgressCategory() to the symbol file that returns the right string. I suggest that we use a ConstString for this to allow quick comparisons in any infrastructure we need internally to be able to group these correctly per your above approach so we can make efficient maps.

Yep, that matches what I had in mind. A ConstString makes sense in this context but if we use a StringMap that might be fine too.

Great question. I think that if there’s a finite amount of progress, we should treat it as a “top level” event. So I would imagine it looks like this:

progress start       {title="Indexing DWARF", detail = "a.out" 0/1000}
progress update      {title="Indexing DWARF", detail = "a.out" 1/1000}
[...]
progress end         {title="Indexing DWARF", detail = "a.out" 1000/1000}

We can add a boolean to the progress instance to indicate whether something should be considered a top level event (like Indexing DWARF here) or not (like the symbol table parsing). We could use the finite amount of work to automatically set it to true or false, and have the option for us to override it if we ever deem that necessary.
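One way to sketch that default-plus-override rule, assuming a report with an unknown amount of work is marked with a sentinel total. `ProgressSpec`, `TopLevel`, and `kUnknownTotal` are hypothetical names used only for illustration.

```cpp
#include <cstdint>
#include <limits>
#include <string>

// Hypothetical sentinel for "no finite amount of work".
const uint64_t kUnknownTotal = std::numeric_limits<uint64_t>::max();

// Auto derives the answer from the total; Yes and No are explicit overrides.
enum class TopLevel { Auto, Yes, No };

struct ProgressSpec {
  std::string title;
  std::string detail;
  uint64_t total;
  TopLevel top_level;
};

// A report is top level if explicitly marked so, or by default when it has a
// finite amount of work.
bool IsTopLevel(const ProgressSpec &spec) {
  switch (spec.top_level) {
  case TopLevel::Yes:
    return true;
  case TopLevel::No:
    return false;
  case TopLevel::Auto:
    break;
  }
  return spec.total != kUnknownTotal;
}
```

Under this rule "Indexing DWARF" for a.out with 1000 units is top level by default, while "Parsing symbol table" with no total is grouped unless someone overrides it.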

I really like this idea. The concept of grouping events might be generic enough that other broadcasters could try to benefit from something similar in the future.

cassanova · December 19, 2023, 6:52pm

Right now in the CLI, progress reports are displayed one at a time: if a new report comes in while one is already in progress, the one in progress keeps priority and the new one won’t be displayed. It would be nice if we could display several reports at once, something like this:

Indexing DWARF:
╰─ a.out (1/1000)
╰─ b.out (3/2000)

Using the categories as strings (or using the titles to categorize the reports) here would make it easier to display the reports like this if we wanted to.
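A minimal sketch of that grouping, assuming each in-flight report carries a category, a detail, and its counts. `Report` and `RenderGrouped` are hypothetical names, and the output simply imitates the mock-up above.

```cpp
#include <cstdint>
#include <map>
#include <sstream>
#include <string>
#include <vector>

struct Report {
  std::string category;
  std::string detail;
  uint64_t completed;
  uint64_t total;
};

// Groups in-flight reports by category and renders one header per category
// followed by one line per report.
std::string RenderGrouped(const std::vector<Report> &reports) {
  std::map<std::string, std::vector<const Report *>> by_category;
  for (const Report &r : reports)
    by_category[r.category].push_back(&r);

  std::ostringstream out;
  for (const auto &entry : by_category) {
    out << entry.first << ":\n";
    for (const Report *r : entry.second)
      out << "╰─ " << r->detail << " (" << r->completed << "/" << r->total
          << ")\n";
  }
  return out.str();
}
```

Because the category is the map key, reports from different modules land under the same header as long as they use the identical category string.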

clayborg · December 19, 2023, 7:03pm

If we implement the new Debugger::eBroadcastBitProgressByCategory, then the command interpreter can do something like this by listening to the new broadcast event instead of the raw progress events with Debugger::eBroadcastBitProgress.

cassanova · December 19, 2023, 7:34pm

Even better! I also like the idea of having 2 different broadcast bits for clients to listen to.
