logging in lldb

Hello all,

I've been thinking about how to simplify the way we do logging in
lldb. The main two problems I see are:
a) inability to print complex types easily - this was always tricky,
but lately has become even more problematic, as we're starting to
introduce llvm::StringRef, chrono, and other non-C types everywhere
b) the need to explicitly specify the log source (class and function
names) explicitly - this leads to increased verboseness and
inconsistency
c) the need to write if(log) everywhere -- also increases visual noise

I tried to not reinvent the wheel as much as possible, so I decided to
model my proposal based on how gtest logs it's failure messages. It is
not a super-well-known feature of gtest, but you can stream additional
data into the ASSERT statement, in case you think the plain gtest log
message will not provide enough context for the user. For example you
can write:
for (auto test: test_cases)
  EXPECT_TRUE(do_test(test)) << "Input was: " << test;

I think this pattern has potential to solve all the problems I
outlined above. Usage of streams (llvm streams in our case) enables us
to do any fancy formatting, including the use of recently introduced
llvm::formatv. And the presence of the macro means that we can hide
things like "if(log)" and __FUNCTION__ there. If the macro is written
carefully, the whole log statement will have minimal overhead (the
same as our current solution) if logging is disabled.

I have created straw-man implementation of such an interface in
Dxxxxx, together with some examples of how it's used, and I'd like to
hear you comments. I've deliberately made it as a custom layer added
on top of the current implementation, to show how simple the thing can
be -- after converting just half of a single file to the new
interface, the implementation has already "payed off" in the number of
LoC added/removed. If I will be doing the thing it for real, I'd like
make the integration with llvm streams much deeper.

So, my question is: Do you like the design in general (if not, why
not?). If you do, then I'd like to hear your thoughts about the
particular details of this design:
- the new implementation prints File::Function as the source, instead
of the conventional Class::Function. The reason for this is that the
the class name is not easily accessible from the preprocessor. I think
this is fine as generally file_name = class_name+".cpp", but possible
implementations I can think of are:
  - file+line: I didn't choose this one as lines do not say much, if
you don't have the source file open. They can also change from
revision to revision due to unrelated changes.
  - __PRETTY_FUNCTION__: It contains the class name, but can get
pretty messy for complex functions. Possibly we could do some
processing on this value to shorten it (?)...
  - Have each (loggable) class contain a static consexpr field NAME__,
and then reference that from the log macro: works, but it's quite
messy

- LLDB_LOG() vs. LLDB_LOG(log) -- All our log variables are named log,
so maybe we don't have to reference them explicitly? I couldn't decide
whether that helps or hurts readability. (I wanted to keep the log
variables, instead of specifying the log flags explicitly, as that
means individual log statements can be very cheap, and short.)

Note, I am not proposing we do a wholesale replacement of the whole
lldb code base, as that will be a huge job -- the new implementation
can live together with the old one quite happily (as it does in my
example). I will certainly go and refactor code I own (linux and
android), and anyone else can do the same but I am proposing a more
gradual migration path, where the log statements are updated as we
touch the affected functions -- I am hoping people will gravitate
towards that naturally, as the new log statements will be shorter and
easier to use with more complex types. After usage of the old logs
drops below a certain percentage, we can go and replace the rest.

So, what do you think?
pl

Sorry, I'm being dense. What is Dxxxxx?

Jim

Code review with an unspecified number I think

https://reviews.llvm.org/D27459 :smiley:

Sorry, I forgot to replace this by actual revision when I submitted the code.

pl

Ah, cool! I'll have a look.

Jim

My only concern reading this is that LLVM already has a similar mechanism for channel-based logging that is used throughout the backends. Is there a reason to invent our own? Maybe we can take what is in LLVM, enhance it and use shared functionality?

There are a few differences between your solution and the LLVM one, but I'm not sure they are sufficient to justify creating a new solution.

The LLVM debug logging solution is documented in the LLVM Programmers Manual here:
http://llvm.org/docs/ProgrammersManual.html#the-debug-macro-and-debug-option

-Chris

Ah, I totally forgot about the llvm DEBUG macro. I would be open to
trying merge these two approaches together. However, I think our's is
a much more heavy-weight system, so it might be too heavy for llvm,
and I am not sure how much would we be willing to lose in the process:
- we have two level logging architecture ("process" category in "lldb" channel)
- logging can be enabled/disabled at runtime
- logs can be redirected to various destinations

Maybe it will be possible to design it in a way that the complexity is
apparent to those who wish to use it. I'll need to think about that...

pl

Hello all,

I've been thinking about how to simplify the way we do logging in
lldb. The main two problems I see are:
a) inability to print complex types easily - this was always tricky,
but lately has become even more problematic, as we're starting to
introduce llvm::StringRef, chrono, and other non-C types everywhere

Streaming output tends to chunk the output text into smaller bits. That's a problem when you are doing "I have a log line from some customer, I have no idea where it is in the code, so I'm going to search for the text in the code." That actually happens pretty frequently. The more you chunk up the text, the harder this task is. This problem exists in the current method especially since the restriction to 80 chars meant that lots of contiguous log lines are now nonsensically broken up into chunks. It would be nice not to make it worse. This is more about ways to write log lines, however; more cultural than implementation.

Like Greg, I find streaming harder to read than printf, particularly if I'm trying to make things indent nicely for reading. So again, I'd argue for adding these macros, but not trying to replace the current mechanism.

b) the need to explicitly specify the log source (class and function
names) explicitly - this leads to increased verboseness and
inconsistency

I think having a convenience to add function name decorators is fine, but we need a way not to do it all the time. We have logs that don't have function names because they log over several functions, and having the different names adds visual noise, would make the rest of the log impossible to line up nicely, and adds no important information. You will write log statements infrequently, but you will have to read them much more frequently, so taking a little more effort to make them readable is well worth it.

c) the need to write if(log) everywhere -- also increases visual noise

Sometimes you need to do some preliminary calculations to gather the materials you want to log, so the code in the "if (log)" is not trivial. It has to be inside the "if (log)" however, since you only want to do logging computations if you are logging. I'd really rather not have to write this in a macro as that makes it hard to read and harder to debug. That doesn't argue against adding these macros, but it does argue that we need to keep the old form.

Aesthetically, your form does not make it clear that the logging statements are not getting done if logging is off. I don't want somebody casually working on the code not to log something because they were unsure of this. We could address that through documentation, a solution whose goodness is weighted by the faith you have in how many people read this sort of documentation.

I tried to not reinvent the wheel as much as possible, so I decided to
model my proposal based on how gtest logs it's failure messages. It is
not a super-well-known feature of gtest, but you can stream additional
data into the ASSERT statement, in case you think the plain gtest log
message will not provide enough context for the user. For example you
can write:
for (auto test: test_cases)
EXPECT_TRUE(do_test(test)) << "Input was: " << test;

I think this pattern has potential to solve all the problems I
outlined above. Usage of streams (llvm streams in our case) enables us
to do any fancy formatting, including the use of recently introduced
llvm::formatv. And the presence of the macro means that we can hide
things like "if(log)" and __FUNCTION__ there. If the macro is written
carefully, the whole log statement will have minimal overhead (the
same as our current solution) if logging is disabled.

I have created straw-man implementation of such an interface in
Dxxxxx, together with some examples of how it's used, and I'd like to
hear you comments. I've deliberately made it as a custom layer added
on top of the current implementation, to show how simple the thing can
be -- after converting just half of a single file to the new
interface, the implementation has already "payed off" in the number of
LoC added/removed. If I will be doing the thing it for real, I'd like
make the integration with llvm streams much deeper.

So, my question is: Do you like the design in general (if not, why
not?). If you do, then I'd like to hear your thoughts about the
particular details of this design:
- the new implementation prints File::Function as the source, instead
of the conventional Class::Function. The reason for this is that the
the class name is not easily accessible from the preprocessor. I think
this is fine as generally file_name = class_name+".cpp", but possible
implementations I can think of are:
- file+line: I didn't choose this one as lines do not say much, if
you don't have the source file open. They can also change from
revision to revision due to unrelated changes.
- __PRETTY_FUNCTION__: It contains the class name, but can get
pretty messy for complex functions. Possibly we could do some
processing on this value to shorten it (?)...
- Have each (loggable) class contain a static consexpr field NAME__,
and then reference that from the log macro: works, but it's quite
messy

- LLDB_LOG() vs. LLDB_LOG(log) -- All our log variables are named log,
so maybe we don't have to reference them explicitly? I couldn't decide
whether that helps or hurts readability. (I wanted to keep the log
variables, instead of specifying the log flags explicitly, as that
means individual log statements can be very cheap, and short.)

I'd rather keep this explicit. I don't know if we do it ever, but it is totally reasonable that one function might want to funnel information to two different channels. Also, it makes explicit the connection between the log object and the channel it is opened for. It is a bit weird to have a seemingly unused log creation line, then a bunch of logs later on that don't look connected to it. The more oddities you put in the code the harder it is for a new engineer to begin to grok it.

Jim

Ah, I totally forgot about the llvm DEBUG macro. I would be open to
trying merge these two approaches together. However, I think our's is
a much more heavy-weight system, so it might be too heavy for llvm,
and I am not sure how much would we be willing to lose in the process:
- we have two level logging architecture ("process" category in "lldb" channel)

This might not map to LLVM, but it might be worth looking into.

- logging can be enabled/disabled at runtime

This actually should be doable in LLVM. Since the LLVM DEBUG macro controls the -debug=... flag, I can't immediately see a reason why that wouldn't work for runtime toggling.

- logs can be redirected to various destinations

The LLVM logging uses streams, so I don't think this is really different. You just change the stream under it.

-Chris

BTW, I just want to make it clear in this discussion that logging is a really important part of lldb. This isn't a matter of "you had some debug logging you put in to solve a problem, and you didn't want to have to take it out next time you use it." We very often get bugs from customers who can't reduce the problem to a small test case, and can't make their code available to us. So logging is the only way you are ever going to solve this class of problems. Fortunately the logging in lldb is pretty good at present, and we have all often been able to solve this sort of problem on logging alone.

As you write code in lldb you should be thinking "If I had to figure out what was going on in some tricky circumstance, and all I had was this log, would I be able to figure it out." And to this end, being able to output complex log messages is crucial.

Sorry if this is obvious to folks, but on the off chance it isn't, it's worth a few bytes to restate it.

Jim

Merging the thread over from lldb-commits:

The other problem with using streams is we need each log line to come out in a single chunk. So we somehow need to specify the start and end of a log. For example if two threads are doing:

if (log)
  log->Printf("%i %i %i\n", x, y z);

And we switch over to using:
   if (log) *log << 1 << ' ' << 2 << ' ' << 3 << '\n';

We can't have that come out separately. We need to be able to have the logging know that everything has to come out at once as we don't want to see the following if two threads were logging these lines at the same time:

1 12 2 3 3\n\n

So the stream stuff all now needs to mark where the logging begins and ends:

   if (log) *log << log::start_message << 1 << ' ' << 2 << ' ' << 3 << '\n' << std::end_message;

And that starts to ruin the nice concrete log calls we had with log->Printf() since each call to log->Printf() was a complete message. I am open to ideas on how to solve this. Anyone have any ideas?

Zach, how many people are using llvm::formatv already? Will it be hard to change/evolve the format strings and contents if necessary? Might be hard if there are a lot of clients, but I am just trying to get a feel.

I have had to do the "where the heck does this log line come from in the code" search many many times debugging external reports. I worry a bit that anything that makes output more magical as you are suggesting also makes this task harder. Again, IME you will write log code fairly infrequently, but you will grub around trying to figure out what the logs mean quite frequently. So you always want to optimize for easy of use rather than of emission.

That being said, the formatv stuff does sound pretty cool.

Jim

I like the the llvm::formatv stuff and could see using this exclusively. I am getting scared of using streams and << the more I think about it after my previous email since we would need to mark a log message as starting and ending somehow and that would make things messy.

Note that our logging has many options:

(lldb) help log enable
     Enable logging for a single log channel.

Syntax: log enable <cmd-options> <log-channel> <log-category> [<log-category> [...]]

Command Options Usage:
  log enable [-STagnpstv] [-f <filename>] <log-channel> <log-category> [<log-category> [...]]

       -S ( --stack )
            Append a stack backtrace to each log line.

       -T ( --timestamp )
            Prepend all log lines with a timestamp.

       -a ( --append )
            Append to the log file instead of overwriting.

       -f <filename> ( --file <filename> )
            Set the destination file to log to.

       -g ( --debug )
            Enable debug logging.

       -n ( --thread-name )
            Prepend all log lines with the thread name for the thread that generates the log line.

       -p ( --pid-tid )
            Prepend all log lines with the process and thread ID that generates the log line.

       -s ( --sequence )
            Prepend all log lines with an increasing integer sequence id.

       -t ( --threadsafe )
            Enable thread safe logging to avoid interweaved log lines.

       -v ( --verbose )
            Enable verbose logging.
     
So we will need to be able to do each of these with each log line. Streams will make things messy.

Greg

Hello all,

I've been thinking about how to simplify the way we do logging in
lldb. The main two problems I see are:
a) inability to print complex types easily - this was always tricky,
but lately has become even more problematic, as we're starting to
introduce llvm::StringRef, chrono, and other non-C types everywhere

Streaming output tends to chunk the output text into smaller bits. That's a problem when you are doing "I have a log line from some customer, I have no idea where it is in the code, so I'm going to search for the text in the code." That actually happens pretty frequently. The more you chunk up the text, the harder this task is. This problem exists in the current method especially since the restriction to 80 chars meant that lots of contiguous log lines are now nonsensically broken up into chunks. It would be nice not to make it worse. This is more about ways to write log lines, however; more cultural than implementation.

Like Greg, I find streaming harder to read than printf, particularly if I'm trying to make things indent nicely for reading. So again, I'd argue for adding these macros, but not trying to replace the current mechanism.

Fair enough. llvm::formatv() (see lldb-commits thread, we seem do have
diverged a bit...) should satisfy both of these requirements
("monolithic strings" and "not-streams"). Would you be OK with that?
(BTW, having a file+function name in each log message makes it easier
to find the source)

b) the need to explicitly specify the log source (class and function
names) explicitly - this leads to increased verboseness and
inconsistency

I think having a convenience to add function name decorators is fine, but we need a way not to do it all the time. We have logs that don't have function names because they log over several functions, and having the different names adds visual noise, would make the rest of the log impossible to line up nicely, and adds no important information. You will write log statements infrequently, but you will have to read them much more frequently, so taking a little more effort to make them readable is well worth it.

What I am optimizing for is readability of the source code (the fact
that it coincides with "the time to write the log statement" is a
lucky coincidence). I often find that I am looking at a function which
spans several screens, even though it doesn't actually do much,
because half of the function is logging. Normally, I would try to
split this function up into smaller ones, but I find that the amount
of work done by the function is actually quite reasonable and
undivisible, but due to logging it gets spread out enormously.

I would be fine with some way of inhibiting decorations. But, I want
to hear what you think about this alternative: Just have file names
and line numbers always aligned. For lines, we could reserve four
characters, for filenames 35 (only 7 files currently are longer than
35 chars -- these would get truncated, or misaligned, as you see fit).

c) the need to write if(log) everywhere -- also increases visual noise

Sometimes you need to do some preliminary calculations to gather the materials you want to log, so the code in the "if (log)" is not trivial. It has to be inside the "if (log)" however, since you only want to do logging computations if you are logging. I'd really rather not have to write this in a macro as that makes it hard to read and harder to debug. That doesn't argue against adding these macros, but it does argue that we need to keep the old form.

Aesthetically, your form does not make it clear that the logging statements are not getting done if logging is off. I don't want somebody casually working on the code not to log something because they were unsure of this. We could address that through documentation, a solution whose goodness is weighted by the faith you have in how many people read this sort of documentation.

I would assume people who care about performance to this level will be
familiar with the performance characteristics of logging. For me this
potential confusion is less important than making the code flow
nicely, although I see how you may have different opinion.

- LLDB_LOG() vs. LLDB_LOG(log) -- All our log variables are named log,
so maybe we don't have to reference them explicitly? I couldn't decide
whether that helps or hurts readability. (I wanted to keep the log
variables, instead of specifying the log flags explicitly, as that
means individual log statements can be very cheap, and short.)

I'd rather keep this explicit. I don't know if we do it ever, but it is totally reasonable that one function might want to funnel information to two different channels. Also, it makes explicit the connection between the log object and the channel it is opened for. It is a bit weird to have a seemingly unused log creation line, then a bunch of logs later on that don't look connected to it. The more oddities you put in the code the harder it is for a new engineer to begin to grok it.

OK.

For starters, it has builtin support for formatting ranges. So if you're ever looping over entities formatting them, as long as the entity itself has a formatter defined, you can simply pass it directly to formatv and it will do the right thing. (It has some customization support built in as well so you can control how to separate items).
Secondly, When you have to do some work to get the value you want to format, you could either define a custom formatter for the type in question, and the formatter does all the work, or you could write a function to return the value you want to format, and call that in the argument list, as long as the return type has an associated formatter.
I'm sure there will still be cases where you want to use the old method, but I think this would handle quite a few of them and reduce the amount of code you have to read / write when dealing with logging code.

Agreed. We're bound to run into cases where we want to have code that
is run only when logging is enabled, but it isn't directly a logging
statement. So we should have some form of "if(log)", but it should not
need to be used that often.

So the stream stuff all now needs to mark where the logging begins and ends:

   if (log) *log << log::start_message << 1 << ' ' << 2 << ' ' << 3 << '\n' << std::end_message;
And that starts to ruin the nice concrete log calls we had with log->Printf() since each call to log->Printf() was a complete message. I am open to ideas on how to solve this. Anyone have any ideas?

So, the way I solve that in my mockup is by having a temporary object.
It's constructor does any initialization work necessary (in my case,
printing file and line number). In the destructor I flush the log
message to the underlying streams. You could use the same approach,
but synchronize via mutexes, or whatever else you want.

Take a look at the step logs, for instance, which I have to read all the time. In that case, when reading them you are trying to follow the logical flow of the stepping algorithm, and there is already a lot of noise there. In that case, I don't need to know where the log lines come from, that is clear from the log content. So I definitely not want to have to peer past useless file & line or function information.

And more generally, I do think prioritizing ease of reading (or scrolling past) log statements in code over ease of making them complete and easy to read them is the wrong tradeoff. The number of hours of my life that have gone into peering at logs far outweighs the trouble of scrolling past "if(log)" blocks when reading code. If both goals can be achieved, that's great. And for simple "X was weird" kind of logs, having a nice compact form is great. But for things like the unwind logs, packet, step or expression logs, you don't want anything getting in the way of laying out the output in the easiest to read form possible.

Jim

Hello all,

I've been thinking about how to simplify the way we do logging in
lldb. The main two problems I see are:
a) inability to print complex types easily - this was always tricky,
but lately has become even more problematic, as we're starting to
introduce llvm::StringRef, chrono, and other non-C types everywhere

Streaming output tends to chunk the output text into smaller bits. That's a problem when you are doing "I have a log line from some customer, I have no idea where it is in the code, so I'm going to search for the text in the code." That actually happens pretty frequently. The more you chunk up the text, the harder this task is. This problem exists in the current method especially since the restriction to 80 chars meant that lots of contiguous log lines are now nonsensically broken up into chunks. It would be nice not to make it worse. This is more about ways to write log lines, however; more cultural than implementation.

Like Greg, I find streaming harder to read than printf, particularly if I'm trying to make things indent nicely for reading. So again, I'd argue for adding these macros, but not trying to replace the current mechanism.

Fair enough. llvm::formatv() (see lldb-commits thread, we seem do have
diverged a bit...) should satisfy both of these requirements
("monolithic strings" and "not-streams"). Would you be OK with that?
(BTW, having a file+function name in each log message makes it easier
to find the source)

That would be a great option to add to "log enable"!

b) the need to explicitly specify the log source (class and function
names) explicitly - this leads to increased verboseness and
inconsistency

I think having a convenience to add function name decorators is fine, but we need a way not to do it all the time. We have logs that don't have function names because they log over several functions, and having the different names adds visual noise, would make the rest of the log impossible to line up nicely, and adds no important information. You will write log statements infrequently, but you will have to read them much more frequently, so taking a little more effort to make them readable is well worth it.

What I am optimizing for is readability of the source code (the fact
that it coincides with "the time to write the log statement" is a
lucky coincidence). I often find that I am looking at a function which
spans several screens, even though it doesn't actually do much,
because half of the function is logging. Normally, I would try to
split this function up into smaller ones, but I find that the amount
of work done by the function is actually quite reasonable and
undivisible, but due to logging it gets spread out enormously.

I would be fine with some way of inhibiting decorations. But, I want
to hear what you think about this alternative: Just have file names
and line numbers always aligned. For lines, we could reserve four
characters, for filenames 35 (only 7 files currently are longer than
35 chars -- these would get truncated, or misaligned, as you see fit).

I would like this to be an option you can enable and not require it to be on.

c) the need to write if(log) everywhere -- also increases visual noise

Sometimes you need to do some preliminary calculations to gather the materials you want to log, so the code in the "if (log)" is not trivial. It has to be inside the "if (log)" however, since you only want to do logging computations if you are logging. I'd really rather not have to write this in a macro as that makes it hard to read and harder to debug. That doesn't argue against adding these macros, but it does argue that we need to keep the old form.

Aesthetically, your form does not make it clear that the logging statements are not getting done if logging is off. I don't want somebody casually working on the code not to log something because they were unsure of this. We could address that through documentation, a solution whose goodness is weighted by the faith you have in how many people read this sort of documentation.

I would assume people who care about performance to this level will be
familiar with the performance characteristics of logging. For me this
potential confusion is less important than making the code flow
nicely, although I see how you may have different opinion.

Different parts of the code do different amounts of logging. The expression parses is quite verbose in its logging, but this logging is required to figure out bugs that people run into.

- LLDB_LOG() vs. LLDB_LOG(log) -- All our log variables are named log,
so maybe we don't have to reference them explicitly? I couldn't decide
whether that helps or hurts readability. (I wanted to keep the log
variables, instead of specifying the log flags explicitly, as that
means individual log statements can be very cheap, and short.)

I'd rather keep this explicit. I don't know if we do it ever, but it is totally reasonable that one function might want to funnel information to two different channels. Also, it makes explicit the connection between the log object and the channel it is opened for. It is a bit weird to have a seemingly unused log creation line, then a bunch of logs later on that don't look connected to it. The more oddities you put in the code the harder it is for a new engineer to begin to grok it.

OK.

For starters, it has builtin support for formatting ranges. So if you're ever looping over entities formatting them, as long as the entity itself has a formatter defined, you can simply pass it directly to formatv and it will do the right thing. (It has some customization support built in as well so you can control how to separate items).
Secondly, When you have to do some work to get the value you want to format, you could either define a custom formatter for the type in question, and the formatter does all the work, or you could write a function to return the value you want to format, and call that in the argument list, as long as the return type has an associated formatter.
I'm sure there will still be cases where you want to use the old method, but I think this would handle quite a few of them and reduce the amount of code you have to read / write when dealing with logging code.

Agreed. We're bound to run into cases where we want to have code that
is run only when logging is enabled, but it isn't directly a logging
statement. So we should have some form of "if(log)", but it should not
need to be used that often.

It should be a formatter option for collections to allow you to print out different ranges. If I have a target and a target knows how to print the process and the process knows how to print the threads and the threads know how to print the stack frames, it doesn't mean I always want to see everything. Sounds like we can easily control this in the llvm::formatv case with extra formatting options:

This might print just a summary of the target:

llvm::formatv("{0}", target);

This might print the target and info about the process:

llvm::formatv("{0;process}", target);

The question becomes, can we then forward arguments into the "Process" logger to control if the threads are displayed?

llvm::formatv("{0;process{threads{frames}}}", target);

This would allow recursive formatters to be called with specified arguments. Just a thought, but if we are going to delve into recursive formatters, we might need to think about that.

So the stream stuff all now needs to mark where the logging begins and ends:

  if (log) *log << log::start_message << 1 << ' ' << 2 << ' ' << 3 << '\n' << std::end_message;
And that starts to ruin the nice concrete log calls we had with log->Printf() since each call to log->Printf() was a complete message. I am open to ideas on how to solve this. Anyone have any ideas?

So, the way I solve that in my mockup is by having a temporary object.
It's constructor does any initialization work necessary (in my case,
printing file and line number). In the destructor I flush the log
message to the underlying streams. You could use the same approach,
but synchronize via mutexes, or whatever else you want.

So the file and line number should be done by the logging logic probably at the same time as any of the other options (pid/tid, thread name, timestamp, backtrace, etc).

I probably haven't read as many logs as you have, but I've done my
share. FWIW, I actually believe adding more structure to the logs
makes them more understandable. If you have the file name (nicely
formatted in a table), then it's easy to ignore it by just scrolling
the editor window sideways. And it means you can even grep by it, and
remove the unrelated lines completely. I find the log sources which do
multi-line log lines moderately annoying (data formatters come to mind
mosly, they tend to add empty lines to group things visually), as they
assume that they are the ones you care about. If you do it's great,
but if you don't, they just break up the flow of the log file for no
reason.

That said, I like Greg's idea of making the file information a log
enable flag, although I am afraid some of the current messages would
be somewhat useless without it.

pl

A formatter can literally do anything with the style string. So for example, suppose you have formatters defined for Process and Thread. each of these ultimately boils down to a function:

void format(llvm::raw_ostream &Out, const Process &P, StringRef Style);
void format(llvm::raw_ostream &Out, const Thread &P, StringRef Style);

We implement each of these, so Process can define its own Style language, as can Target, as can Thread. We can come up with something better than this, but say for the sake of illustration Thread’s style string worked like this: If empty, print thread id, otherwise treat each alphabetic character as representing a “code” of what to print. P = process, T = thread id, N = thread name. So “P - T - N” would print the process id, then the thread id, then the thread name, separated by dashes.

Now suppose you want to print a process. If style string is empty, it prints only the process id, otherwise if it equals T then it prints the first thread, or if it equals T* it prints all threads. T or T* can optionally be followed by arguments to forward to the thread in square brackets. So you could have:

formatv(“Process {0;P - T*[N]}”, process);

There’s obviously a balance between flexibility and conciseness, but I guess that’s no different than C++. It’s as flexible as you could possibly need, sometimes to the point of allowing you to make bad design decisions :slight_smile:

In the beginning probably best to just define some simple formatters that don’t do anything recursive, and then some obvious patterns might start to fall out and we can refine the styles we support for all of the various LLDB types.

Pavel: You might start this effort to improve logging by just adding a Formatv member function to log to replace printf.

I suspect many call sites would be made much less verbose this way. I think we’ll still need to do the rest too, but this seems like the least amount of work to make incremental progress