[ubsan] Check recovery

Many of ubsan's checks are recoverable, and it'd be great to support
doing so when possible. Example use-case might be trying
-fsanitize=undefined on a new code base that has many errors, where
recovery would allow fixing them in bulk and require fewer
build/test/fix cycles.

This has been discussed a bit previously, and one of the biggest
arguments against this was that making these paths recoverable might
have a negative impact on performance.

To help address this concern, attached are patches that add a
-fsanitize-recover /cc1/ flag to enable performance tests with and
without recovery enabled. Suggestions on how to proceed with such
testing would be appreciated :).

If performance testing proves the difference is not a concern, the
next step would be making recovery the default and
-fsanitize-no-recover a driver-level flag (with no need for
-fsanitize-recover). It's certainly important to enable users to
specify they want program execution halted on the first detected error
as is presently the default.

The "recover" phrasing is used (as opposed to "abort" of "trap"
terminology) because many checks cannot be recovered from, so a flag
like -fsanitize-no-abort would be misleading what it does.

Review and feedback on this approach and its direction would be much
appreciated.

Thanks!

~Will

0001-ubsan-Add-flag-to-enable-recovery-from-checks-when-p.patch (13.3 KB)

0001-ubsan-Refactor-handlers-to-never-abort-use-separate-.patch (5.5 KB)

Just a thought: it seems like it'd be useful to enable recovery for some checks but not others. "-fsanitize=undefined -fsanitize-recover=integer"?

Forgive me, the previous patches erroneously aborted on a dynamic type
cache miss, updated patches attached. Only the clang patch has
changed: added a parameter specifying this call should *always* be
recovered from.

Thanks!

0001-ubsan-Add-flag-to-enable-recovery-from-checks-when-p.patch (14 KB)

0001-ubsan-Refactor-handlers-to-never-abort-use-separate-.patch (5.5 KB)

I’ll leave all the ubsan-relevant part to Richard. Just a general comment:
I think you’d need driver support for these flags (especially if you follow Jordan’s idea).
Your flags targets -fsanitize=undefined, but it wouldn’t make any sense for e.g. -fsanitize=address
(as it cannot recover from errors by design).

Many of ubsan’s checks are recoverable, and it’d be great to support
doing so when possible. Example use-case might be trying
-fsanitize=undefined on a new code base that has many errors, where
recovery would allow fixing them in bulk and require fewer
build/test/fix cycles.

This has been discussed a bit previously, and one of the biggest
arguments against this was that making these paths recoverable might
have a negative impact on performance.

To help address this concern, attached are patches that add a
-fsanitize-recover /cc1/ flag to enable performance tests with and
without recovery enabled. Suggestions on how to proceed with such
testing would be appreciated :).

If performance testing proves the difference is not a concern, the
next step would be making recovery the default and
-fsanitize-no-recover a driver-level flag (with no need for
-fsanitize-recover). It’s certainly important to enable users to
specify they want program execution halted on the first detected error
as is presently the default.

As we discussed previously, I think this is a great direction. (Although at the driver level, we should have the no-recover flag, so that an earlier command-line argument can be overridden.) We would also need to make it clear in the documentation that we only make a best-effort attempt to recover, and that some sanitizers simply cannot recover (for instance, the vptr sanitizer might crash on an error, and asan doesn’t have the ability to recover).

The “recover” phrasing is used (as opposed to “abort” of “trap”
terminology) because many checks cannot be recovered from, so a flag
like -fsanitize-no-abort would be misleading what it does.

Review and feedback on this approach and its direction would be much
appreciated.

To keep the instrumented code size down, could you add separate handler entry points for the recovering and not-recovering cases? (Alternatively, find a spare bit in the static data and put the flag there.)

— a/include/clang/Basic/LangOptions.def
+++ b/include/clang/Basic/LangOptions.def
@@ -173,6 +173,8 @@ BENIGN_LANGOPT(RetainCommentsFromSystemHeaders, 1, 0, “retain documentation comm
BENIGN_LANGOPT(Sanitize##ID, 1, 0, NAME " sanitizer”)
#include “clang/Basic/Sanitizers.def”

+BENIGN_LANGOPT(SanitizeNoRecover, 1, 1, “abort at runtime when a sanitizer check fails”)

This doesn’t need to be a language option. Just add it to CodeGenOpts?

— a/include/clang/Driver/CC1Options.td
+++ b/include/clang/Driver/CC1Options.td
@@ -450,6 +450,10 @@ def fdeprecated_macro : Flag<[“-”], “fdeprecated-macro”>,
HelpText<“Defines the __DEPRECATED macro”>;
def fno_deprecated_macro : Flag<[“-”], “fno-deprecated-macro”>,
HelpText<“Undefines the __DEPRECATED macro”>;
+def fsanitize_recover : Flag<[“-”], “fsanitize-recover”>,

  • HelpText<“Attempt to recover if a sanitizer check fails at runtime”>;
    +def fsanitize_no_recover : Flag<[“-”], “fsanitize-no-recover”>,
  • HelpText<“Don’t attempt to recover if a sanitizer check fails at runtime”>;

Remove fsanitize_no_recover. -cc1 isn’t supposed to be user-friendly, and usually only a flag for the non-default value of an option.

— a/lib/CodeGen/CodeGenFunction.h
+++ b/lib/CodeGen/CodeGenFunction.h
@@ -2581,7 +2581,7 @@ public:
void EmitCheck(llvm::Value *Checked, StringRef CheckName,
llvm::ArrayRef<llvm::Constant *> StaticArgs,
llvm::ArrayRef<llvm::Value *> DynamicArgs,

  • bool Recoverable = false);
  • bool Recoverable = true, bool AlwaysRecover = false);

Please add a three-value enum for this.

Thanks for the feedback, updated patches attached.

Many of ubsan's checks are recoverable, and it'd be great to support
doing so when possible. Example use-case might be trying
-fsanitize=undefined on a new code base that has many errors, where
recovery would allow fixing them in bulk and require fewer
build/test/fix cycles.

This has been discussed a bit previously, and one of the biggest
arguments against this was that making these paths recoverable might
have a negative impact on performance.

To help address this concern, attached are patches that add a
-fsanitize-recover /cc1/ flag to enable performance tests with and
without recovery enabled. Suggestions on how to proceed with such
testing would be appreciated :).

If performance testing proves the difference is not a concern, the
next step would be making recovery the default and
-fsanitize-no-recover a driver-level flag (with no need for
-fsanitize-recover). It's certainly important to enable users to
specify they want program execution halted on the first detected error
as is presently the default.

As we discussed previously, I think this is a great direction. (Although at
the driver level, we should have the no-recover flag, so that an earlier
command-line argument can be overridden.) We would also need to make it
clear in the documentation that we only make a best-effort attempt to
recover, and that some sanitizers simply cannot recover (for instance, the
vptr sanitizer might crash on an error, and asan doesn't have the ability to
recover).

These comments are intended for the next iteration, yes? The one that
actually adds any driver flags?

If so, duly noted and agreed regarding documentation and ensuring
overrides work. If not, can you please explain? Thanks!

The "recover" phrasing is used (as opposed to "abort" of "trap"
terminology) because many checks cannot be recovered from, so a flag
like -fsanitize-no-abort would be misleading what it does.

Review and feedback on this approach and its direction would be much
appreciated.

To keep the instrumented code size down, could you add separate handler
entry points for the recovering and not-recovering cases? (Alternatively,
find a spare bit in the static data and put the flag there.)

Done. Renamed the aborting variants since they are just wrappers for
the non-aborting versions.

--- a/include/clang/Basic/LangOptions.def
+++ b/include/clang/Basic/LangOptions.def
@@ -173,6 +173,8 @@ BENIGN_LANGOPT(RetainCommentsFromSystemHeaders, 1, 0,
"retain documentation comm
BENIGN_LANGOPT(Sanitize##ID, 1, 0, NAME " sanitizer")
#include "clang/Basic/Sanitizers.def"

+BENIGN_LANGOPT(SanitizeNoRecover, 1, 1, "abort at runtime when a sanitizer
check fails")

This doesn't need to be a language option. Just add it to CodeGenOpts?

--- a/include/clang/Driver/CC1Options.td
+++ b/include/clang/Driver/CC1Options.td
@@ -450,6 +450,10 @@ def fdeprecated_macro : Flag<["-"],
"fdeprecated-macro">,
   HelpText<"Defines the __DEPRECATED macro">;
def fno_deprecated_macro : Flag<["-"], "fno-deprecated-macro">,
   HelpText<"Undefines the __DEPRECATED macro">;
+def fsanitize_recover : Flag<["-"], "fsanitize-recover">,
+ HelpText<"Attempt to recover if a sanitizer check fails at runtime">;
+def fsanitize_no_recover : Flag<["-"], "fsanitize-no-recover">,
+ HelpText<"Don't attempt to recover if a sanitizer check fails at
runtime">;

Remove fsanitize_no_recover. -cc1 isn't supposed to be user-friendly, and
usually only a flag for the non-default value of an option.

Good call on both fronts, updated accordingly.

--- a/lib/CodeGen/CodeGenFunction.h
+++ b/lib/CodeGen/CodeGenFunction.h
@@ -2581,7 +2581,7 @@ public:
   void EmitCheck(llvm::Value *Checked, StringRef CheckName,
                  llvm::ArrayRef<llvm::Constant *> StaticArgs,
                  llvm::ArrayRef<llvm::Value *> DynamicArgs,
- bool Recoverable = false);
+ bool Recoverable = true, bool AlwaysRecover = false);

Please add a three-value enum for this.

Much better, thanks!

~Will

0001-ubsan-Add-flag-to-enable-recovery-from-checks-when-p.patch (18 KB)

0001-ubsan-Refactor-handlers-to-have-separate-entry-point.patch (10.7 KB)

Thanks for the feedback, updated patches attached.

Just a few things, then this is fine. Clang patch:

+def fsanitize_recover : Flag<["-"], "fsanitize-recover">,
+ HelpText<"Attempt to recover from failed sanitizer checks when possible.">;

Drop the full stop (we're not consistent on this, but no full stop is
more common).

+ // Checks that have two variants use a suffix to differentiate them
+ bool NeedsAbortSuffix = ((RecoverKind == CRK_Recoverable) &&
+ !CGM.getCodeGenOpts().SanitizeRecover);

I think this should be RecoverKind != CRK_Unrecoverable: for the vptr
sanitizer, we should still choose between an _abort and non-_abort
variant, but the _abort variant would only sometimes abort.

+ Twine FunctionName = "__ubsan_handle_" + CheckName +
+ Twine(NeedsAbortSuffix? "__abort" : "");

A single underscore would be fine here.

+ /// Never terminate program due to this check firing
+ CRK_AlwaysRecoverable

Likewise update this, to say something like: the runtime needs to
perform more tests to determine whether the check has failed, so this
check must always be able to recover.

compiler-rt patch:

+ // RECOVER-NOT: fatal error
+ // ABORT-NOT: fatal error

This is no longer correct -- we should change this from "fatal error"
to "runtime error". This is something I've been meaning to do anyway,
to make the ubsan diagnostics more easily distinguishable from Clang
diagnostics. Preferably do this as a separate change (you have my LGTM
in advance for that one!).