Defining what happens when a bool isn’t 0 or 1

Given that you explained above that this is only relevant when you’re misinterpreting some random data as a bool, could you elaborate on how these bugs show up in your context? Is this in the context of security exploits where memory is already corrupted, but compiler optimizations based on the range of bool values make it worse, since they can make the program misbehave in a way that no non-poisonous bool value could?

1 Like

In our context, this mostly happens when engineers memcpy untrusted data onto structures that contain bool fields, or when they cast pointers to untrusted data to structures that contain bool fields.
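A minimal sketch of the first pattern, assuming a hypothetical struct request with a bool field (the names and sizes are illustrative, not from the original post): copying untrusted bytes over the structure can leave the bool with a bit pattern that is neither 0 nor 1.

#include <stdbool.h>
#include <stddef.h>
#include <string.h>

struct request {
    bool which_one;    // in-memory byte is assumed to be 0 or 1 under -fstrict-bool
    char payload[15];
};

void handle(const unsigned char *untrusted, size_t len) {
    struct request r;
    if (len >= sizeof r) {
        memcpy(&r, untrusted, sizeof r);   // the first byte could be, say, 0x02
        if (r.which_one) {                 // reading such a bool is where the trouble starts
            /* take the "true" path */
        }
    }
}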

2 Likes

Is the intent here to extend the logic of Boolean values to accommodate new values? Or is it just to support truthy values for performance optimization?

This is a refinement of undefined behavior. We’re defining what happens to a program that has broken the representation requirements for bool, so that the only permitted interpretations are ones with a coherent understanding of whether the bool is true or false. Correct programs see no observable difference in behavior under this flag.

2 Likes

I see. For such a specific use, I wonder if it makes sense to tag objects with a type qualifier, say __untrusted, to indicate that they have a binary representation that was obtained from an untrusted source, for example:

__untrusted struct request r; // tagging object as __untrusted because we directly
                              // read it from untrusted source
if (fread(&r, sizeof r, 1, external_stream) == 1) {
   // r.which_one is qualified __untrusted: should behave as either false or true,
   // even if ill-formed representation in memory
   if (r.which_one) ...
   else ...
}

The thing is that I think this problem is not limited to bool, but extends to any type that is stored, with a certain formatting, in more bits than it logically has. For example, _BitInt(24) is stored as a sign-extended i32. When storing it to memory, clang uses a sext i24 ... to i32 instruction. When reading it from memory, clang uses the inverse trunc i32 ... to i24 operation, which could well carry the nsw flag, unless the object is __untrusted, or unless a -fno-strict-bitint flag is given, etc.
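A hypothetical sketch of the analogous _BitInt hazard (assuming C23 _BitInt support; the function name and sizes are made up): the wider in-memory form of a _BitInt(24) is expected to be sign-extended, and untrusted bytes can violate that expectation, just as an untrusted bool byte can be something other than 0 or 1.

#include <string.h>

_BitInt(24) load_bitint24(const unsigned char bytes[4]) {
    _BitInt(24) v;                 // typically occupies 4 bytes in memory
    memcpy(&v, bytes, sizeof v);   // bytes[3] may not match the sign of bit 23
    return v;                      // ill-formed representation, like a bool byte of 0x02
}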

The problem is limited to types whose loads clang annotates with range metadata. Clang does not attach that metadata to loaded _BitInt values, so they don’t have this problem today. We could get ahead of the situation and create -fstrict-bitint (defaulting to no), as you suggest, and maybe also a -fstrict-integral-ranges flag that controls all three and any upcoming types that would land in that category.

While I think there’s a coherent model for __untrusted, I suspect it would be hard to get it off the ground. In our experience, there aren’t that many issues like invalid bool bit patterns that would compromise bounds safety even in the face of seemingly-correct (and/or compiler-inserted) checks. Because of that, it’s not something that I want to be leading, though I could tag along for the ride if somebody else made serious plans.

1 Like

While __untrusted works for systems where entry/egress of data is well defined, for the Sony use case a process is made up of numerous pieces of customer and middleware code, in which our SDK is a shared object with an API and numerous symbols. Pretty much every piece of memory passed over the API boundary would need to be considered “untrusted” under such a model, which would be unergonomic. A compiler flag is much more convenient to express that “this C++ language assumption does not hold”, as with other flags such as -fno-strict-aliasing or -fno-delete-null-pointer-checks.

3 Likes

Fair enough. I think it is plausible that clang evolves to remove the redundant formatting when loading _BitInts (e.g. to get rid of the shift instructions in test2: Compiler Explorer ), but I don’t know if this warrants proactive action.

Our implementation of -fno-strict-bool does not take a stance on how to interpret bool values that are neither 0 nor 1: the emergent behavior is that only the lowest bit is considered.

Extrapolating from this explanation, and a previous post about emitted assembly code: Does this mean that the current implementation logically bitwise-ands with 0x1 whenever it uses a bool (though mechanically some of the bitwise-ands may disappear after optimization)?

What is the desirable behavior for bool values that are in an invalid state? Users could find it more intuitive that all non-zero values are truthy, since that’s how it works for integers, but this is not the current implementation.

I have no opinion on which behavior is most intuitive.

However, as long as this implementation must (sometimes) emit extra instructions, there will always be demand for an implementation that emits fewer extra instructions, or none. Therefore, I favor an implementation that treats all non-zero values as truthy, because I believe that doing so requires the fewest extra instructions on most modern architectures.

Specifically, when loading a bool from memory, -fno-strict-bool logically bitwise-ands the loaded value with 1. If the bool did not start from memory (for instance, if it’s the result of a comparison operation that is consumed without being stored to memory), or you already had a copy of it loaded, then it isn’t bitwise-anded with anything. As you said, there are also cases where the operations disappear after optimization.
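A small illustration of that distinction, under the semantics described above (a sketch, not code from the patch):

#include <stdbool.h>

int test_from_memory(const bool *p) {
    return *p ? 1 : 2;   // loaded from memory: -fno-strict-bool applies the bitwise-and here
}

int test_from_comparison(int a, int b) {
    bool c = a < b;      // produced by a comparison and consumed directly (after optimization):
    return c ? 1 : 2;    // no extra operation is needed
}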

In the critical case of just testing the bool, this comes at no cost over -fstrict-bool: off the top of my head, on arm64, both will do ldrb w0, [bool_addr]; tbnz w0, #0, <dest>. In “truthy mode”, -fno-strict-bool would use cbnz instead of tbnz.

I expect truthy and low-bit interpretations will be a wash in terms of code size, but we’ll need to measure that. I take the point that we should take the better one and prefer truthy if they’re the same.

Thanks for the clarification. If the undefined behavior caused by compiler optimizations is unintentional and not exploited, then truthy values make sense. However, if such values are being used systematically to define program / application behavior, then it might make sense to model them as values and study their behavior in particular. The semantics of these undefined values are often not well specified, which could make putting the contracts etc. in place costly.

To summarize, if the undefined behavior is coincidental, this is okay; however, if it is intentional and being used as a pattern, it should be fixed as proposed above.

From the discussion here, I am curious how -fstrict-bool interacts with the backends, specifically target lowering. You can specify that a backend uses ZeroOrNegativeOneBooleanContent, for example, in which case more than just the lowest bit is filled. Does this mean that using a backend with ZeroOrNegativeOneBooleanContent leads to undefined behavior when a boolean is encoded as a uint8 of all 1s?

https://llvm.org/doxygen/classllvm_1_1TargetLoweringBase.html#aa61af767c51a95e2dd0dff2001168755

ZeroOrNegativeOneBooleanContent is almost completely unrelated; it specifies the output of ISD::SETCC and related nodes, which are internal implementation details of the backend. This proposal is about how booleans are represented externally (in memory, and in calling conventions).

Sorry for the delay, this has been a rocky summer. The PR is now up: [clang] Implement -fstrict-bool by apple-fcloutier · Pull Request #160790 · llvm/llvm-project · GitHub

As clarified, this introduces the following options:

  • -fstrict-bool [the default, maintaining the status quo]: Clang can optimize code based on the assumption that bool values always have a bit pattern of 0 or 1.
  • -fno-strict-bool=truncate: Clang does not optimize code on the assumption that bool values always have a bit pattern of 0 or 1: when loading a bool from memory, it will treat it as true if the least significant bit is set, and false otherwise (ie, value & 1).
  • -fno-strict-bool=nonzero: Clang does not optimize code on the assumption that bool values always have a bit pattern of 0 or 1: when loading a bool from memory, it will treat it as true if any bit is set, and false otherwise (ie, value != 0).
  • -fno-strict-bool: for now, the same as -fno-strict-bool=truncate.

The nonzero and truncate options were introduced so that the community can assess impact on their preferred targets, and we can change the default for specific targets if that makes sense. In my experience, nonzero is typically what people expect, but the code for truncate on arm64 is slightly better in the worst case.
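For concreteness, a hypothetical case where the two modes diverge (the byte value 2 is just one example of an invalid bool representation):

#include <stdbool.h>
#include <stdio.h>
#include <string.h>

int main(void) {
    unsigned char byte = 2;        // neither 0 nor 1
    bool b;
    memcpy(&b, &byte, sizeof b);   // give the bool an ill-formed representation
    if (b)
        puts("taken");     // -fno-strict-bool=nonzero: 2 != 0, so this branch runs
    else
        puts("not taken"); // -fno-strict-bool=truncate: 2 & 1 == 0, so this branch runs
    return 0;
}

Under the default -fstrict-bool, reading b here remains undefined behavior; both -fno-strict-bool modes give it a defined, though different, result.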

6 Likes

I filed https://gcc.gnu.org/bugzilla/show_bug.cgi?id=122068 for gcc compatibility on these options. Note I might be the one implementing it.

2 Likes

@erichkeane comments that this hasn’t received the OK from the @clang-area-team. Eli’s most recent questions were:

  • Can we clarify interactions with calling conventions? (This wasn’t done because it’s an LLVM responsibility rather than a clang responsibility, and it goes beyond what I feel comfortable changing, but I could document it.)
  • Can we document the behavior? (The behavior is now documented.)
  • Between testing that the value is non-zero and testing that the low bit is zero, are we sure that we have the right behavior? (The behavior can optionally be specified as -fno-strict-bool={truncate|nonzero}. Truncating is slightly better in the worst case on arm64; with some people contributing their findings, we could make the default depend on the target.)

Aaron’s last contribution:

  • -fstrict-bool is a reasonable default, but we could change it if people find it makes no real difference. (The PR keeps the status quo.)
  • Preference for nonzero as the default. (The PR did not switch the default to that, but I’m not opposed to it; it’s just not what we qualified for internally.)

1 Like

Maybe for -O0, default to -fno-strict-bool.

Note I would not have the default choice between nonzero and truncate differ per target. The major targets especially, that is x86_64, aarch64, and riscv, should all match. Otherwise you might end up with issues reporting different behavior when folks don’t read the manual.

FWIW, at -O0, Clang already does not attach range metadata to bool loads. It behaves like -fno-strict-bool=truncate. (If you use -fno-strict-bool=nonzero, it will do the non-zero test instead even at -O0.)

Ok. Because right now gcc’s default (for -O0) is effectively -fstrict-bool, I will change the default when I implement the gcc changes.