RFC: adding `__builtin_verbose_trap(string-literal)`

Summary

We propose adding a new Clang builtin, __builtin_verbose_trap(string-literal) , that makes the program stop execution abnormally and (potentially) shows a human-readable description of the reason for the termination when a debugger is attached or in a symbolicated crash log. The description message is stored in the debug info and has no impact on the binary size.

Motivation

For a tool that detects runtime bugs that can lead to a security vulnerability (e.g. an out-of-bounds access), it is desirable to stop execution as fast as possible when a runtime issue is detected. This makes using __builtin_trap a natural choice; however, this builtin doesn’t provide the best user experience because it does not contain any details about what exactly went wrong. Adding the ability to attach a descriptive message to a trap would significantly simplify debugging a crashing executable; this new builtin would be useful for the ongoing libc++ hardening and the C++ buffer hardening efforts.

Implementation

The challenge is encoding the message with minimal size overhead. From this perspective, storing the string literal inside the executable is suboptimal — that would increase the size of the binary with strings that would never actually be referenced at the runtime. Instead, we plan to store the messages in the debug info.

Specifically, the idea is to encode the message as the name of an artificial “synthesized” function; a call to __builtin_verbose_trap will be lowered to llvm.trap in the resulting LLVM IR, but the compiler will additionally emit debug metadata that represents an artificial inline frame whose name encodes the given string literal, prefixed by a “magic” prefix . A similar technique is being used by the Swift standard library for traps on, e.g., arithmetic overflow, as well as the -fbounds-safety effort; we would need to make sure that the infrastructure that adds the debug info supports different kinds of traps.

The string literal would have to be known at the compile time (we don’t see this as a significant limitation). Because the name of the artificial function only exists in the debug info, it is not limited to the character set of a valid C identifier and can use any UTF-8 characters.

To give an example, consider the following code:

void foo(int* p) {
  if (p == nullptr)
    __builtin_verbose_trap("Argument must not be null");
}

The debug metadata would look as if it were produced for the following code:

__attribute__((always_inline))
inline void __llvm_verbose_trap_Argument_must_not_be_null() {
  __builtin_trap();
}

void foo(int* p) {
if (p == nullptr)
  __llvm_verbose_trap_Argument_must_not_be_null();
}

(However, like mentioned above, the LLVM IR would not actually contain a call to the artificial function — it only exists in the debug metadata)

On the LLDB side, we would add logic to translate the name of the artificial inline frame into a human-readable stop reason. When a program traps, LLDB would recognize that the trap instruction was called from a specially-prefixed frame, parse the name of the function and present the resulting message as the stop reason.

As the name of the function is stored in the debug info, this would have no impact on the size of the binary. If the debug info is not available, calling __builtin_verbose_trap results in the same user experience as the existing __builtin_trap .

By default, the compiler is free to merge the traps, even if they have different messages. This is good for code size, but it can make debugging which exact source location corresponds to a crash more difficult. At least in the first iteration, we intend to let the compiler merge traps, but we are open to exploring ways of improving the debugging experience in the future.

For the quality of implementation, we would need to make sure the proposed mechanism interacts well with diagnostics that display source locations (e.g. optimization remarks).

6 Likes

Encoding arbitrary strings into function names can be tricky.
Here’s another suggestion: Define an artificial label at the point of the trap, and associate the string with the label info in the DWARF. (I don’t know whether CodeView would allow this trick.) Then the debugger can look up the trap address, find the label, and retrieve the string.

void foo(int *p) {
  if (p == nullptr) {
    __llvm_verbose_trap.1: // suffixed to allow multiple labels per function
      __builtin_trap();
  }
}

The DWARF can have a “comment” attribute with “Argument must not be null”. I suspect that LLVM currently doesn’t have a way to specify a comment, but it wouldn’t be hard to add.

1 Like

We have been using the inline function trick in the Swift standard library for a long time now, but this sounds like a reasonable alternative. IIUC, we would define a

DW_TAG_label
  DW_AT_low_pc (address of trap)
  DW_AT_name ("__llvm_verbose_trap.1")
  DW_AT_description ("Argument must not be null")

The only downside of this encoding is that with an inlined function, the error message would even show up in a symbolicated backtrace produced by a DWARF consumer that doesn’t know about this feature, whereas to look at the label the consumer needs to actively look for the trap label.

1 Like

(Pinging @pogo59)
If we use an encoding that is still mostly human-readable (e.g., punycode) the inline frame approach has the advantage that even tools that don’t know about this feature could surface a somewhat understandable error message to users. To me this sounds slightly preferable over using a label.

If the inlined function names can be ensured to be reasonable that’s fine. I’ve had problems with that in the past which is why I suggested labels. The point about appearing in a backtrace without any special knowledge by the consumer is valid.

By default, the compiler is free to merge the traps, even if they have different messages. This is good for code size, but it can make debugging which exact source location corresponds to a crash more difficult. At least in the first iteration, we intend to let the compiler merge traps, but we are open to exploring ways of improving the debugging experience in the future.

I’ve noticed the compiler tends to, even when the branches themselves aren’t merged, merge all the traps within a function into a single ud2 location. Though I don’t know what stage this happens in. Would __builtin_verbose_trap preserve that behavior? If so, it seems we’d still have everything get merged and you’d ultimately only learn “some check inlined into this function failed”.

In Chrome, we’ve used inline asm and other tricks to stop the compiler from doing this for debuggability, but this is not ideal also impedes a lot of optimizations which we would otherwise like, like merging related parts of bounds checks. (E.g. I think Consecutive comparisons are not combined into a single comparison (impacts safety checks) · Issue #78875 · llvm/llvm-project · GitHub would be impossible to optimize without merging the traps.) It’d be great to have something that wasn’t one of these two extremes.

A thought: what if traps (verbose and perhaps even normal?) could be compiled something like…

  1. Allow the compiler to reason about traps all it likes. We don’t want to impede optimizations like reordering, combining conditions, etc. It is totally free to merge traps together.
  2. However, after optimizing things, for each distinct branch leading to a trap-only basic block, emit a distinct ud2.

That is, we’d be saying, if the compiler is able to combine conditions, move them around, etc., do that. Such optimizations helps both binary size and runtime, and probably have cascading effects on other optimizations. But where it is unable to do this and ultimately emitted distinct branches, the secondary optimization of merging the ud2 instructions just saves a few bytes of code, and knowing the jump that led here on crash reports is probably more useful.

(I’m sure turning this into actually coherent compiler behavior will take more than that. This is just idle thoughts of a person who doesn’t work on compilers.)

FWIW, I think it is mostly controlled by the -fno-enable-tail-merge flag, which I think happens very late during MIR passes. We turn that option off in the Julia programming language because we noticed that it can really harm debugging quite badly (all abort and assert calls in a function often end up getting rewritten to have the same, incorrect, line number), whereas the bytes saved by doing a jmp+call instead of a direct call seem quite minimal.

Oh neat! Thanks for the pointer! I’ll ask around to see if we’d explored that one.

Do you know if that would impede an optimization like Consecutive comparisons are not combined into a single comparison (impacts safety checks) · Issue #78875 · llvm/llvm-project · GitHub. I suppose it’s hard to say definitively when Clang doesn’t do that optimization today anyway, but at least in the scenario where it came up, it seems a very desirable optimization. But I could easily imagine an optimization like this relying on the tail blocks already being merged first.

No, I think that particular flag happens too late for it to get in the way. Though I think perhaps the IR you want there, to have 1 test but preserving line info, is:

void f(const int *data, size_t size) {
    if (_unlikely(size <= 5)) {
        if (size == 5) __builtin_verbose_trap("size == 5");
        else __builtin_verbose_trap("size < 5");
    }
}

Clang and LLVM have the nomerge attribute to prevent this behavior: Compiler Explorer. I hope that would work for Clang builtins as well.

2 Likes

That is helpful to know! I see it was added pretty recently, so I hadn’t know about it yet (⚙ D78659 Add nomerge function attribute to supress tail merge optimization in simplifyCFG)

Thanks for mentioning the nomerge attribute! (cc @ZequanWu ) I think it would be reasonable to mark these verbose traps with this attribute by default, to avoid losing the extra string carried in the debug info.

1 Like

Though I think perhaps the IR you want there, to have 1 test but preserving line info, is […]

Ah, that could be interesting. But, TBH, I would be equally fine with:

void f(const int *data, size_t size) {
    if (_unlikely(size <= 5)) {
        __builtin_verbose_trap("size == 5 OR size < 5");
    }
}

That may even be preferable if the extra branch would otherwise impede, say, later inlining because the function got too big.
The fact that they’re separate is not actually important. It’s just a consequence of how iterator-level preconditions translate into algorithm-level preconditions. Of course, knowing size == 5 vs size < 5 might help debugging, but separate crash points and debuggability ultimately in conflict with performance or at least binary size. And so I’m wondering whether the model in RFC: adding `__builtin_verbose_trap(string-literal)` - #6 by davidben would balance it better: don’t impede optimizations, but if the compiler couldn’t actually take meaningful advantage of merging, split them back up.

In particular, I would guess that mergeable conditions will typically be related (e.g. two checks on sinze), so that granularity would still allow you to pinpoint roughly the condition that failed. But it wouldn’t actually impede optimizations because any time it would help the compiler to merge, we still permit it.

The string literal would have to be known at the compile time (we don’t see this as a significant limitation).

Thinking further on the possibilities here for your case, if this limitation wasn’t present, then the optimizer might be more fully empowered to combine those branches, since it would not have to preserve it as a constant and instead could optimize the code to have the call use a computed argument via a select call (possibly fully in the dwarf opcodes). To do that, perhaps the DW_TAG_label implementation suggested above could then accept any dwarf variables expression for the DW_AT_description to point at a UTF-8 string?

Then this feature could possibly also be used to improve the assert() function macro, such that, after printing the message, it also passed the message to the compiler via this mechanism?

 void assert_implementation(const char *msg) {
    fprintf(stderr, "assertion failed: %s\n", msg);
    __builtin_verbose_trap(msg);
}

Unfortunately, nomerge doesn’t work with __builtin_trap() yet.

Issue: Make [[clang::nomerge]] work for trap intrinsics such as __debugbreak and __builtin_trap · Issue #53011 · llvm/llvm-project · GitHub
Proposed fix: ⚙ D146164 Fix nomerge attribute not working with __builtin_trap().. It only fixes it for __builtin_trap(), but there are tons of built-ins need to handle it. I’m not sure if there’s an easier and comprehensive way to handle them at once.

1 Like

This sounds like a good idea that sanitizers could leverage. UBSan for example emits a bunch of metadata at each callsite that bloats rodata, but much of (if not all) that metadata can be moved to debuginfo and be represented with existing dwarf tags. -fsanitize-trap=alignment could be lowered to the inlined check and on the failing path call __builtin_verbose_trap with a unique potentially encoded error message that contains information like the original type accessed, access kind, etc. The debugger could then recognize this trap and decode the string producing the nice error message seen with the normal ubsan runtime.

+1 to this sentiment.

If this approach is taken, it will be necessary to either limit the permissible characters in the string literal to those that are allowed in a symbol name or to map unrepresentable characters (and any ill-formed code unit sequences introduced by numeric escapes) to an escape sequence. This is already done for non-type template parameters, but produces symbol names that are rather unpleasant to read. See Compiler Explorer. This can also contribute to very long symbol names. But then again, the battle for short symbol names seems to have been lost long ago anyway.

If this approach is taken, it will be necessary to either limit the permissible characters in the string literal …

Can you elaborate why? These “symbol names” will only appear in the debug info, not anywhere in the symbol table.

I’m no expert in this area, but do debug formats not provide any guarantees about the contents of symbol names?

I assume that symbol names are not really intended to be text, but since they are displayed to users, text semantics are relevant. String literals may contain ill-formed code unit sequences due to numeric escapes and may contain right-to-left characters or other non-printable characters that require special consideration.

DWARF strongly encourages all strings to be UTF-8 encoded, but other than that there are no restrictions. A debug info consumer already needs to be able to cope with any such strings, so this doesn’t make that aspect any worse than before.