LLDB Evolution - Final Form

Ok, in that case I agree with you more. We should test the scripting interface. It’s part of the software, it should be treated as such. 100% on board. But if we find that it is lacking (or even just inconvenient) to test the full capabilities of the debugger, we shouldn’t force it to.

All of our tests right now are full-scale integration tests. Integration tests are certainly helpful, but they aren’t the entire world. And they shouldn’t even be a first line of defense, but rather a 3rd or 4th line of defense due to how general they are. Like the example I gave earlier about TestSigned and TestUnsigned on Windows, where one was failing and one was passing, and the reason had nothing to do with signedness.

Ok, in that case I agree with you more. We should test the scripting interface. It's part of the software, it should be treated as such. 100% on board. But if we find that it is lacking (or even just inconvenient) to test the full capabilities of the debugger, we shouldn't force it to.

I would make an even stronger statement. As lldb developers, if there's something we want to do and it isn't available, we go add it to lldb_private, and maybe wire it up to a command so we can use it. That's the path of least resistance. If it weren't for writing the testsuite using the SB API's, we wouldn't tend to write much code using it ourselves. Writing test cases for the Python API in the testsuite is a nice forcing function to get US using the SB API's to solve actual problems. So where reasonable, I think we actually SHOULD force ourselves to use it.

There are some things it is annoying to test from the outside, particularly small utility functions that you need to test error cases and the like that it's just hard to get the actual debugger to generate reliably. Those are great candidates for white box type unit testing. But the full capabilities of the debugger should either be available through the SB API's or they really aren't doing anybody any good. So if the SB API's are not up to revealing the full capabilities of the debugger, we need to fix that, not fall back on using stuff only lldb developers can have access just so we can get our tests written more quickly.

Jim

BTW, another way to achieve this worthy goal is to require that all the lldb command-line commands be written using the C++ Version of the SB API's. It should be its own module, and the driver should link to that, not directly to the LLDB.framework at all. That would also force us to provide a way to make "real" command line commands using the SB API, rather than commands that parse their arguments and options and provide help in their own ad-hoc way, and don't do command line completion at all.

I don't think we can force Python here, partly because I don't want to favor a particular scripting language at this fundamental a level, but more because there are legit targets where providing Python is problematic. But this would test everything up to the SWIG bindings, and make sure they stay full-featured enough for real work.

We didn't do it this way originally because getting the commands up and working and building on them provided a more plausible development path than having to design our public API before we could start testing it well. I am pretty sure that was the right choice at the time, but the reasons are less relevant now that we've got a working debugger and a pretty well designed public API.

Jim

But StringRef is a smart string wrapper and it is there for exactly this reason: to make string handling correct. So let us let it be smart and not crash if you make it with null and call .size() on it...

FWIW I added a function to StringRef the other day that looks like this:

static StringRef withNullAsEmpty(const char *Str) {
return StringRef(Str ? Str : “”);
}

I’ve been using this code in converting existing uses of const char * over to StringRef. It’s not 100% what you want, but at least it’s better than nothing.

Following up with Kate's post from a few weeks ago, I think the dust has settled on the code reformat and it went over pretty smoothly for the most part. So I thought it might be worth throwing out some ideas for where we go from here. I have a large list of ideas (more ideas than time, sadly) that I've been collecting over the past few weeks, so I figured I would throw them out in the open for discussion.

I’ve grouped the areas for improvement into 3 high level categories.

  • De-inventing the wheel - We should use more code from LLVM, and delete code in LLDB where LLVM provides a solution. In cases where there is an LLVM thing that is *similar* to what we need, we should extend the LLVM thing to support what we need, and then use it. Following are some areas I've identified. This list is by no means complete. For each one, I've given a personal assessment of how likely it is to cause some (temporary) hiccups, how much it would help us in the long run, and how difficult it would be to do. Without further ado:
    • Use llvm::Regex instead of lldb::Regex
      • llvm::Regex doesn’t support enhanced mode. Could we add support for this to llvm::Regex?
      • Risk: 6
      • Impact: 3
      • Difficulty / Effort: 3 (5 if we have to add enhanced mode support)

As long as it supports all of the features, then this is fine. Otherwise we will need to modify the regular expressions to work with the LLVM version or back port any advances regex features we need into the LLVM regex parser.

    • Use llvm streams instead of lldb::StreamString
      • Supports output re-targeting (stderr, stdout, std::string, etc), printf style formatting, and type-safe streaming operators.
      • Interoperates nicely with many existing llvm utility classes
      • Risk: 4
      • Impact: 5
      • Difficulty / Effort: 7

I have worked extensively with both C++ streams and with printf. I don't find the type safe streaming operators to be readable and a great way to place things into streams. Part of my objection to this will be quelled if the LLVM streams are not stateful like C++ streams are (add an extra "std::hex" into the stream and suddenly other things down stream start coming out in hex). As long as printf functionality is in the LLVM streams I am ok with it.

    • Use llvm::Error instead of lldb::Error
      • llvm::Error is an error class that *requires* you to check whether it succeeded or it will assert. In a way, it's similar to a C++ exception, except that it doesn't come with the performance hit associated with exceptions. It's extensible, and can be easily extended to support the various ways LLDB needs to construct errors and error messages.
      • Would need to first rename lldb::Error to LLDBError so that te conversion from LLDBError to llvm::Error could be done incrementally.
      • Risk: 7
      • Impact: 7
      • Difficulty / Effort: 8

We must be very careful here. Nothing can crash LLDB and adding something that will be known to crash in some cases can only be changed if there is a test that can guarantee that it won't crash. assert() calls in a shared library like LLDB are not OK and shouldn't fire. Even if they do, when asserts are off, then it shouldn't crash LLDB. We have made efforts to only use asserts in LLDB for things that really can't happen, but for things like constructing a StringRef with NULL is one example of why I don't want to use certain classes in LLVM. If we can remove the crash threat, then lets use them. But many people firmly believe that asserts belong in llvm and clang code and I don't agree.

I’m surprise by your aversion to assertions, what is your suggested alternative? Are you expecting to check and handle every possible invariants everywhere and recover (or signal an error) properly? That does not seem practical, and it seems to me that it'd lead to just involving UB (or breaking design invariant) without actually noticing it.

We have to as a debugger. We get input from a variety of different compilers and we parse and use things there were produced by our toolchains and others. Crashing Xcode because someone accidentally created a llvm::StringRef(NULL) should not happen. It is not OK for library code to crash. This is quite OK for compilers, linkers and other tools, but it isn't for any other code.

Well, except that no, it is not OK for the compiler either, we want to use it as a library (JIT, etc.).
But it comes back to my point: you are against asserting for enforcing “external input” where there should be sanitization done with proper error handling, which does not say anything about the majority of assertions which are not dealing with user input.

I am all for asserting in debug builds. Just not in release builds. And just because you assert, this doesn't mean it is ok to not handle what is being asserted. There is code in clang like:

void do_something(foo *p)
{
    assert(p);
    p->crash();
}

This should be:

void do_something(foo *p)
{
    assert(p);
    if (p)
  p->crash();
}

For instance, it is still not clear to me what’s wrong with the asserting Error class mentioned before.

If someone adds code that succeeds almost all of the time, including in the test suite, and then we release LLDB and someone uses bad debug info and the error condition suddenly fires and they didn't check the error, it is not acceptable to crash.

Since there are so many asserts, LLDB code is currently responsible for figuring out what will make clang unhappy and we must be sure to not feed it anything that makes it unhappy or we crash. So I don't see that as better.

Also many asserts are in header files so even if you build llvm and clang with asserts off, but then build our project that uses LLVM with asserts on, you get LLVM and clang asserts when you don't want them.

It is not really supported to use the LLVM C++ headers without assertions and link to a LLVM build with assertions (and vice-versa).

As we have found out. We claim llvm and clang can be linked into tools a libraries, but it really means, you should really use it out of process because it might crash at any time so don't try and use it in a program you actually want to be able run for a long time.

I’m not sure how this sentence relates to the fact that the headers are not ABI insensitive to -DDEBUG.
It seems totally orthogonal.

It causes us to crash. There is nothing that currently enforces the fact that you can build llvm/clang without asserts yet use the headers with asserts on. It happens, and it is happening to us. If you are designing a library, then asserts in header files are just a bad idea. Very easy to avoid and fix.

This kind of philisophical debate is probably worthy of a separate thread :slight_smile: That being said, I think asserts are typically used in places where the assert firing means “You’re going to crash anyway

It’s like trying to handle a bad alloc. What are you going to do anyway? You can crash now or you can crash later.

It’s for guaranteeing the invariants of a function. If you violate the invariants of a function, then you might as well be passing null to strlen(), or aliasing a value through a union, or dereferencing a null pointer, or making a signed overflow, or accessing an array out of bounds. It’s all equally undefined behavior, so crashing as soon as possible at least alerts you to the source of the undefined behavior, rather than trying to hold on as long as possible and then try to reconstruct the crash dynamics from the crime scene after the fact, so to speak.

The example you gave about
assert(p);
if (p)
p->crash();

I disagree with. There are two ways to look at this. One way to look at it is “I made my code safer by being defensive. I don’t know why p might be null, but if it is, at least we won’t crash”. Another way to look at it is “I made my code more complex. Now it’s harder to reason about, harder to analyze, and harder to test”.

Which one is better? We’re in subjective territory, but peraonslly I lean towards simple = better. Simple code is easier to test because the are fewer code paths, and I believe that if a code path isn’t tested, then it’s broken.

Following up with Kate’s post from a few weeks ago, I think the dust has settled on the code reformat and it went over pretty smoothly for the most part. So I thought it might be worth throwing out some ideas for where we go from here. I have a large list of ideas (more ideas than time, sadly) that I’ve been collecting over the past few weeks, so I figured I would throw them out in the open for discussion.

I’ve grouped the areas for improvement into 3 high level categories.

• De-inventing the wheel - We should use more code from LLVM, and delete code in LLDB where LLVM provides a solution. In cases where there is an LLVM thing that is similar to what we need, we should extend the LLVM thing to support what we need, and then use it. Following are some areas I’ve identified. This list is by no means complete. For each one, I’ve given a personal assessment of how likely it is to cause some (temporary) hiccups, how much it would help us in the long run, and how difficult it would be to do. Without further ado:
• Use llvm::Regex instead of lldb::Regex
• llvm::Regex doesn’t support enhanced mode. Could we add support for this to llvm::Regex?
• Risk: 6
• Impact: 3
• Difficulty / Effort: 3 (5 if we have to add enhanced mode support)

As long as it supports all of the features, then this is fine. Otherwise we will need to modify the regular expressions to work with the LLVM version or back port any advances regex features we need into the LLVM regex parser.

• Use llvm streams instead of lldb::StreamString
• Supports output re-targeting (stderr, stdout, std::string, etc), printf style formatting, and type-safe streaming operators.
• Interoperates nicely with many existing llvm utility classes
• Risk: 4
• Impact: 5
• Difficulty / Effort: 7

I have worked extensively with both C++ streams and with printf. I don’t find the type safe streaming operators to be readable and a great way to place things into streams. Part of my objection to this will be quelled if the LLVM streams are not stateful like C++ streams are (add an extra “std::hex” into the stream and suddenly other things down stream start coming out in hex). As long as printf functionality is in the LLVM streams I am ok with it.

• Use llvm::Error instead of lldb::Error
• llvm::Error is an error class that requires you to check whether it succeeded or it will assert. In a way, it’s similar to a C++ exception, except that it doesn’t come with the performance hit associated with exceptions. It’s extensible, and can be easily extended to support the various ways LLDB needs to construct errors and error messages.
• Would need to first rename lldb::Error to LLDBError so that te conversion from LLDBError to llvm::Error could be done incrementally.
• Risk: 7
• Impact: 7
• Difficulty / Effort: 8

We must be very careful here. Nothing can crash LLDB and adding something that will be known to crash in some cases can only be changed if there is a test that can guarantee that it won’t crash. assert() calls in a shared library like LLDB are not OK and shouldn’t fire. Even if they do, when asserts are off, then it shouldn’t crash LLDB. We have made efforts to only use asserts in LLDB for things that really can’t happen, but for things like constructing a StringRef with NULL is one example of why I don’t want to use certain classes in LLVM. If we can remove the crash threat, then lets use them. But many people firmly believe that asserts belong in llvm and clang code and I don’t agree.

I’m surprise by your aversion to assertions, what is your suggested alternative? Are you expecting to check and handle every possible invariants everywhere and recover (or signal an error) properly? That does not seem practical, and it seems to me that it’d lead to just involving UB (or breaking design invariant) without actually noticing it.

We have to as a debugger. We get input from a variety of different compilers and we parse and use things there were produced by our toolchains and others. Crashing Xcode because someone accidentally created a llvm::StringRef(NULL) should not happen. It is not OK for library code to crash. This is quite OK for compilers, linkers and other tools, but it isn’t for any other code.

Well, except that no, it is not OK for the compiler either, we want to use it as a library (JIT, etc.).
But it comes back to my point: you are against asserting for enforcing “external input” where there should be sanitization done with proper error handling, which does not say anything about the majority of assertions which are not dealing with user input.

I am all for asserting in debug builds. Just not in release builds.

Right, we’re on the same page, and that’s what we do in LLVM.

And just because you assert, this doesn’t mean it is ok to not handle what is being asserted. There is code in clang like:

void do_something(foo *p)
{
assert(p);
p->crash();
}

This should be:

void do_something(foo *p)
{
assert(p);
if (p)
p->crash();
}

Well to be honest I wouldn’t use a pointer in the first place but a reference in such case.
Otherwise, I’d be fine with such code inside a private API (inside a LLVM library for example), where you sanitize the input at the API boundary and assert on invariant inside the library.

For instance, it is still not clear to me what’s wrong with the asserting Error class mentioned before.

If someone adds code that succeeds almost all of the time, including in the test suite, and then we release LLDB and someone uses bad debug info and the error condition suddenly fires and they didn’t check the error, it is not acceptable to crash.

This is not what it is about, the assertion fires when the error is unchecked, independently of success or failure, for example:

ErrorOr getFoo();
Foo bar() {
auto F = getFoo();
return F.getValue();
}

Here the developer does not check the returned error from the call to getFoo(). This is a case where bad-things would happen when there is an actual error, but it wouldn’t be caught during testing until the error actually happens.

The LLVM Error class will always assert, even when getFoo() succeeds and there is no actual error.

This kind of philisophical debate is probably worthy of a separate thread :slight_smile: That being said, I think asserts are typically used in places where the assert firing means "You're going to crash *anyway*"

It's like trying to handle a bad alloc. What are you going to do anyway? You can crash now or you can crash later.

With a StringRef being init'ed with NULL? You don't need to crash for that. In case where this is the only thing that you can do, that might be ok, but there are many many places where crashing isn't necessary.

It's for guaranteeing the invariants of a function. If you violate the invariants of a function, then you might as well be passing null to strlen(), or aliasing a value through a union, or dereferencing a null pointer, or making a signed overflow, or accessing an array out of bounds. It's all equally undefined behavior, so crashing as soon as possible at least alerts you to the source of the undefined behavior, rather than trying to hold on as long as possible and then try to reconstruct the crash dynamics from the crime scene after the fact, so to speak.

The example you gave about
assert(p);
if (p)
  p->crash();

I disagree with. There are two ways to look at this. One way to look at it is "I made my code safer by being defensive. I don't know why p might be null, but if it is, at least we won't crash". Another way to look at it is "I made my code more complex. Now it's harder to reason about, harder to analyze, and harder to test".

Which one is better? We're in subjective territory, but peraonslly I lean towards simple = better. Simple code is easier to test because the are fewer code paths, and I believe that if a code path isn't tested, then it's broken.

My only point is: if you are a library, you can't crash because you are unhappy with what you have. What that really means is take it out of process otherwise you will crash. I think the code should be as resilient as possible. I just moves the onus onto users of your library to figure out what makes you assert before you can possibly assert. And if there are rare cases that don't happen often, you find out about them when your users crash as they do something that causes the problem to be seen for the first time which doesn't make for a great experience. If all assertions were documented, then that would be one thing. But they aren't. You just run into them and must decipher what it means.

Then I am fine with it. If it is always enforced, then there is no problem.

This kind of philisophical debate is probably worthy of a separate thread :slight_smile: That being said, I think asserts are typically used in places where the assert firing means "You're going to crash *anyway*"

This is emphatically NOT the case.

One of the first tasks I was given when I started at Qualcomm was to fix the disassembler for Hexagon. It was a mess - it would assert if the disassembly tables couldn't identify an opcode. Maybe that's fine for an assembler, assert if you can't generate an opcode, but an unidentified opcode should print a warning and move on. It's not a fatal error to disassemble data!

There are other instances of this in llvm. I agree with Greg - libraries shouldn't assert. Send an error back to the caller, and let the caller handle it. A typo in my expression that lldb sends to clang shouldn't crash my debug session.

From: lldb-dev [mailto:lldb-dev-bounces@lists.llvm.org] On Behalf Of
Zachary Turner via lldb-dev
Sent: Tuesday, September 20, 2016 12:47 PM

> This kind of philisophical debate is probably worthy of a separate
thread :slight_smile: That being said, I think asserts are typically used in places
where the assert firing means "You're going to crash *anyway*"

This is emphatically NOT the case.

One of the first tasks I was given when I started at Qualcomm was to fix
the disassembler for Hexagon. It was a mess - it would assert if the
disassembly tables couldn't identify an opcode. Maybe that's fine for an
assembler, assert if you can't generate an opcode, but an unidentified
opcode should print a warning and move on. It's not a fatal error to
disassemble data!

There are other instances of this in llvm. I agree with Greg - libraries
shouldn't assert. Send an error back to the caller, and let the caller
handle it. A typo in my expression that lldb sends to clang shouldn't crash
my debug session.

It depends on what you use the assert for, right? If it's used to document
invariants that you control, then having them crash is ok, since those
should always be true. You shouldn't assert on input you don't control.

So in a public api, you'd do:

my_err my_public_api(int p) {
  if (p < 0 || p > 10) {
    return make_err("p must be between 0 and 10");
  internal_fun(p);
}

But then in internal_fun you can do

void internal_fun(int p) {
  assert((p >= 0 && p <= 10) && "must pass correct p");
}

I do agree that asserts are sometimes used improperly. But who’s to say that the bug was the assert, and not the surrounding code? For example, consider this code:

assert(p);
int x = *p;

Should this assert also not be here in library code? I mean it’s obvious that the program is about to crash if p is invalid. Asserts should mean “you’re about to invoke undefined behavior”, and a crash is better than undefined behavior. It surfaces the problem so that you can’t let it slip under the radar, and it also alerts you to the point that the UB is invoked, rather than later.

What about this assert?

assert(ptr);
int x = strlen(ptr);

Surely that assert is ok right? Do we need to check whether ptr is valid EVERY SINGLE TIME we invoke strlen, or any other function for that matter? The code would be a disastrous mess.

If you think you’ve found an improper assert, the thing to do is raise it on the proper mailing list and present your case of why you think it’s an improper assert. asserting is not inherently wrong, but like anything, you have to use it right.

My 2¢:

assert(p);
int x = *p;

assert(ptr);
int x = strlen(ptr);

Both of these should either check for null, be in a situation where p is obviously good (e.g., p is data() from a stack-allocated std::vector), or use references. The assertion to my mind is like an admission "I'm not 100% sure, so let me crash if I'm wrong..." – if we're making that admission and not doing error handling, we're already a little shady.

Sean

I do agree that asserts are sometimes used improperly. But who's to say that the bug was the assert, and not the surrounding code? For example, consider this code:

assert(p);
int x = *p;

Should be written as:

assert(p);
if (!p)
    do_something_correct();
else
    int x = *p;

Should this assert also not be here in library code? I mean it's obvious that the program is about to crash if p is invalid. Asserts should mean "you're about to invoke undefined behavior", and a crash is *better* than undefined behavior. It surfaces the problem so that you can't let it slip under the radar, and it also alerts you to the point that the UB is invoked, rather than later.

What about this assert?

assert(ptr);
int x = strlen(ptr);

Surely that assert is ok right? Do we need to check whether ptr is valid EVERY SINGLE TIME we invoke strlen, or any other function for that matter? The code would be a disastrous mess.

Again, check before you call if this is in a shared library! What is so hard about that? It is called software that doesn't crash.

assert(ptr)
int x = ptr ? strlen(ptr) : 0;

If you think you've found an improper assert, the thing to do is raise it on the proper mailing list and present your case of why you think it's an improper assert. asserting is not inherently wrong, but like anything, you have to use it right.

The code to strlen() is well documented. All code in llvm and clang is not. Asserts fire off all the time for reasons the original author and others close to the code know about, but when the llvm and clang APIs are used by others, we don't know all of these cases. They aren't documented. The only way to find them is to crash when you run into them with questionable input. Not acceptable for a library.

I find it hard to believe that you are arguing that you cannot EVER know ANYTHING about the state of your program. :-/

This is like arguing that you should run a full heap integrity check every time you perform a memory write, just to be sure you aren’t about to crash.

If you make a std::vector<>, do we need to verify that its internal pointer is not null before we write to it? Probably not, right? Why not? Because it has a specification of how it works, and it is documented that you can construct one, you can use it.

It’s ok to document how functions work, and it is ok to assume that functions work the way they claim to work.

My concern about this example:

void do_something(foo *p)
{
assert(p);
if (p)
p->crash();
}

Is that by skipping the operation when the pointer is null is that it’s not clear what it should do if it’s precondition isn’t met. Sure, it won’t crash, but it’s also not going to “do something.” This can lead to corrupt state and postpones discovery of the bug.

If do_something is a public API, then, yes, you have to decide, document, and implement what to do if the caller passes in a null pointer. If it’s an internal API, then the silent elision of the crash merely hides the bug possibly corrupts the debugger’s state. A corrupt debugging state seems (to me) at least as bad as an obvious crash to the user. Crashes are going to get complained about and investigated. Silently doing the wrong thing just wastes everyone’s time.

Again, strlen is a stupid example as it is well documented. All of llvm and clang are not.

Another example of duplicated code is the debug info parsing (LLDB
source/Plugins/SymbolFile/DWARF vs LLVM lib/DebugInfo/DWARF). I took a
look at trying to rationalize the two when I first started working
with LLDB and it looked like a large task then, and they've only
continued to diverge.

However, diffs between the two trees are now at least not cluttered
with whitespace and formatting differences. I'll try to take another
look at these.

A better solution would be to return an error indicating what went wrong with llvm::Error.

I really can't believe people are ok with "well you called me with the wrong parameters that aren't documented, I am unhappy and will crash your program" mentality.