Delayed typo correction is fragile

Hi cfe-dev,

In clangd, we often hit the following assertion in the Sema’s destructor:

assert(DelayedTypos.empty() && “Uncorrected typos!”);

It’s purpose seems to be to indicate that there are places that should call CorrectDelayedTypos* and forget to do so.

However, when the assertion fires it’s hard to track down the exact place where the CorrectDelayedTypos* call should be inserted. Moreover, it never fires in normal clang executions because the driver passes -disable-free by default, which stops Sema’s destructor from running.

That leaves clangd and libclang in an unfortunate situation of being the only clients that experience crashes (apart from lit that run clang -cc1 mode and don’t pass -disable-free).

We would like to avoid assertion failures for those, which leads me to the following questions:

  • Is there a way to quickly track down the place that miss the CorrectDelayedTypos* call?
  • If no, would it be ok to turn this assertion into some kind of debug-only warning and
    document that some typos are never actually corrected due to limitations?
  • Even broader, are there any ideas for an alternative design that would be more resilient to
    changes in the codebase?
    E.g. no delayed typo corrections or easy-to-audit places that should run CorrectDelayedTypos*, etc.

Richard and I happened to discuss this problem two years ago, I think. What I half remember from the conversation is that the problem of delayed typo checking is somewhat equivalent to memory ownership management. Some methods produce Stmt nodes that may contain delayed typos, and others do not, and the caller just has to “know” if they are responsible for checking delayed typos or not. I guess it’s different in that you can check the same Stmt tree twice for typos, but you can’t free memory twice. The consequences of forgetting to check delayed typos are that the compiler may accept invalid code containing typos, only to crash later in CodeGen or analysis.

Richard explained that Clang used to do manual memory management for Stmt nodes. You can see evidence of this in clang/Sema/Ownership.h header. Experience showed that managing ownership was onerous, so a decision was made to make all nodes implicitly owned by the ASTContext. Given that clang removed one form of ownership management, it seems like it would be awkward to try to use the type system to track delayed typo correction.

Based on that background, I think we need a design that has safe failure modes. For example, if we could emit errors at end of TU instead of asserting, that would probably be good enough. Then placing the calls to CorrectDelayedTyposInExpr well is a matter of diagnostic quality, not correctness.

Anyway, we got this far in the discussion, but then had to run off and go do other things as usual. :slight_smile: If someone has time to improve this, I agree, it would be great, this assertion pops up in the clang fuzzer reports again and again.

Thanks for starting this discussion, this does need a systematic fix.

We would like to avoid assertion failures for those, which leads me to the following questions:

  • Is there a way to quickly track down the place that miss the CorrectDelayedTypos* call?

A common pattern is that an error causes an Expr subtree to be discarded, and the code that does so “forgets” to call CorrectDelayedTypos.
e.g. https://reviews.llvm.org/rL366200
There’s usually a diagnostic emitted before the Expr is discarded, so in these cases poking around the diag emit location often sheds light. But my fear is there are tens or hundreds of these bugs, and it’s hard to enumerate them.

At some level, this seems silly - if the Expr doesn’t survive, its typos don’t need to be corrected to protect CodeGen from them. The diagnostics are probably important though.
If we could ensure the diagnostics are emitted as Reid says, and reduce the requirement to be that Exprs that survive parsing get typo-corrected, then this might be tractable.

We also don’t have the capacity to completely redesign the typo correction, so ensuring it (1) does not crash us and (2) produces diagnostics without typo-correction looks like a reasonable short-term fix.
I will look into emitting the diagnostics emitted instead of asserting.

Thanks for the input!

https://reviews.llvm.org/D64799 removes the assertion and emits diagnostics at the end of the TU.

End of the TU sounds too late to me - IR generation is done incrementally (at the end of functions, for instance - though I’m not sure that’s the only point), so leaving typos in until the end of the TU could lead to the “IR generation getting weird because of pending typo corrections” issue, no?

End of the TU sounds too late to me - IR generation is done incrementally (at the end of functions, for instance - though I’m not sure that’s the only point), so leaving typos in until the end of the TU could lead to the “IR generation getting weird because of pending typo corrections” issue, no?

Well, we currently assert in ~Sema, which is after the end of the TU, so if we hit today’s assert, we’ve already done incremental codegen without crashing. Diagnosing at end of TU doesn’t make the situation worse. The way I understand clang’s incremental codegen strategy, we generate code after every top level decl. We sometimes skip incremental codegen if errors have been reported or if a Decl is invalid. I think it’s a bug if an error hasn’t been reported but a delayed typo expr gets sent to codegen, and we should add asserts to defend against that.

Thanks for pointing this out, the codegen does run on every top level declaration.
However, it does nothing if any errors were reported.

That means we could prevent codegen by:

  1. emitting the diagnostics for uncorrected typos on each top-level declaration, before the codegen kick in,
  2. checking if there are any “pending” typos in addition to checking for errors before doing the codegen.

Either should be doable. (1) has the advantage of reporting the errors earlier, making them easier to fix/diagnose.

However, (1) might not be a little involved. At least I got the impression from talking to various people that some typos are only fixed at template instantiation time.
The code to figure out at which point the uncorrected typos should be emitted for template instantiations might be a little involved because of this.

I would be surprised if the proposed assertion(!HasErrors || Typos.empty()) ever fired in practice. It’s rare to see only a single compiler error coming from clang, so I would expect almost any typo to induce at least another error right away. That’s actually why I’d expect the “broken codegen” to be hardly possible in practice.

Out of the options we have, I’ll probably add checks for (2) to codegen and emit the delayed typos at the end of TU. That seems to be the simplest option, at least.
Happy to go with (1) or the alternative assertion if people think the proposed approach would lead to too many diagnostics.

I think what you said about template instantiation means we can’t do 2, but I might not understand how the system works. What I had in mind was making clang assert (or making sure it already does) when trying to codegen TypoExpr. I see TypoExpr has no hits in clang/lib/CodeGen. It might already, though, in which case I don’t think we need to do anything more than the end-of-tu diagnosis.

Want to share some of my experience with non-empty DelayedTypos in ~Sema, hopefully it can be useful. Many times there are lingering delayed typos because of multiple typos in an expression. So when we try to fix one typo, it fails and we discard the entire expression which can contain more TypoExpr in it. Correct way seems to check discarded expressions and clean up corresponding TypoExpr. But with the current design there is no simple way to do that and it has limited value when assertions are disabled.

I think the change suggested in https://reviews.llvm.org/D64799 is reasonable and should help with low-value assertion failures.

Thanks,
Volodymyr