making -ftrivial-auto-var-init=zero a first-class option

To add to this: we (Microsoft) currently use zero-initialization technology in Visual Studio in a large amount of production code we ship to customers (all kernel components, a number of user-mode components). This code is both C and C++.

We have already had multiple vulnerabilities killed because we shipped this technology in production. We received bug reports with repros against both older versions of Windows without the mitigation and newer versions of Windows that do have it. The new versions don’t repro; the old ones do.

Using this sort of technology only in development (and not in production) is not sufficient. Some of these bugs will never produce crashes: the uninitialized data is simply copied across a trust boundary (e.g. from the kernel into an untrusted user-mode process). That never results in a crash, but it does result in a security issue. This is why shipping in production is a requirement, even if you had perfect test coverage that exercises all code paths (which nobody has).
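
To make the trust-boundary point concrete, here is a minimal sketch (invented names, not code from the thread) of the kind of bug being described: nothing ever crashes, but stale stack bytes cross the boundary.

#include <cstdint>
#include <cstring>

// Hypothetical stand-in for a copy across a trust boundary (e.g. kernel to
// user mode); the name and signature are invented for illustration.
void copy_to_untrusted(void *dst, const void *src, std::size_t n) {
  std::memcpy(dst, src, n);
}

struct Reply {
  std::uint32_t status;
  std::uint32_t flags;   // never written on the path below
};

void handle_request(void *user_buffer) {
  Reply r;                                       // uninitialized automatic variable
  r.status = 0;                                  // r.flags still holds stale stack bytes
  copy_to_untrusted(user_buffer, &r, sizeof r);  // no crash, but an infoleak;
                                                 // forced zero-init turns the
                                                 // leaked bytes into zeros
}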

We do enable pattern initialization for debug builds and internal retail builds (using a developer mode in the build environment). We do this to help prevent “forking of the language” and also to force non-determinism. If your code relies on the zero-init, it won’t work when we do pattern init. If your code only works with a non-zero value but doesn’t care what that value is (Booleans, certain bit tests, etc.), it won’t work with zero-init. Developers cannot depend on the automatic initialization for program correctness.
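
A minimal illustration of both failure modes (invented examples, not code from the thread):

#include <cstdio>

// Works only under zero-init: under pattern init, count starts at 0xAAAAAAAA.
void relies_on_zero() {
  int count;                 // never explicitly initialized
  for (int i = 0; i < 10; ++i)
    ++count;
  std::printf("count = %d\n", count);
}

// Works only with a non-zero junk value: under zero-init the branch is dead.
void relies_on_nonzero() {
  bool ready;                // never explicitly initialized
  if (ready)
    std::printf("doing work\n");
}

Building internal builds with pattern init surfaces the first kind of dependence, and zero init surfaces the second, which is exactly why neither can be relied on for correctness.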

How is this, as a compiler feature, shipped to users? Users don’t have a direct toggle available, but they say “I want memory init” with some compiler flag; then, if they’re building in debug (any unoptimized build, or only those with debug information emitted?), that translates to pattern init, and in a build with any optimizations enabled (even the lowest level) they get zero init?

While in a constrained environment within a company I can see how you could hold that bar, once you ship that to end users I’d be afraid that some of them would figure out the least-optimizing (most debuggable) way to get zero init, avoid the pattern-init behavior, and write code that depended on that zero init, thus forking the language.

  • Dave

What you’re proposing is, without question, a language extension. Our policy on language extensions is documented here: http://clang.llvm.org/get_involved.html

Right now, this fails at point 4. We do not want to create or encourage the creation of language dialects and non-portable code, so the place to have this discussion is in the C and C++ committees. Both committees have processes for specifying optional features these days, and they might be amenable to using those processes to standardize the behavior you’re asking for. (I mean, maybe not, but our policy requires that you at least try.)

However, there is a variant on what you’re proposing that might fare better: instead of guaranteeing zero-initialization, we could guarantee that any observation of an uninitialized variable either produces zero or results in a trap. That is: it’s still undefined to read from uninitialized variables (we still do not guarantee what will happen if you do, and will warn on uninitialized uses and so on), but we would bound the damage that can result from such accesses. You would get the security hardening benefits with the modest binary size impact. That approach would not introduce the risk of creating a language dialect (at least, not to the same extent), so our policy on avoiding language extensions would not apply.
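
A minimal illustration of the proposed guarantee (an invented example, not from the thread):

#include <cstdio>

int main() {
  int n;                   // deliberately left uninitialized: still a bug,
                           // and -Wuninitialized still warns about it
  std::printf("%d\n", n);  // under the proposed rule this either prints 0 or
                           // the program traps before the value is observed;
                           // either way the damage is bounded
}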

Richard, just to check here: it sounds to me like you’re raising more a point of specification than of implementation, right? That is, you’re not stating that the actual implementation must sometimes trap (when producing a zero wouldn’t), but that the specification in the flags and docs must leave that possibility open?

Well, I think it’s not sufficient to merely say that we might do something like trap, if our intent is that we never will. We would need to reasonably agree that (for example) if someone came forward with a patch that actually implemented said trapping behavior and didn’t introduce any significant code size or performance impact, we would consider such a change to be a quality-of-implementation improvement. But I don’t think we need anyone to have actually committed themselves to producing such a patch, or any timeline or expectation of when (or indeed whether) it would be done. Sorry if this is splitting a hair, but I think it’s an important hair to split.

Hair successfully split. I agree it is a key distinction.

Personally, I find that a bit too fine (but wouldn’t stand in the way of the decision) and would prefer at least a rough, most-basic trapping behavior for that, so really obvious intentional use of zero init would fail, making it hard for anyone to develop a coding convention/style around it, etc. It wouldn’t need to be fancy at all, that being the point: make it as easy/unintrusive to implement as possible, and catch the most blatant uses of zero init, so that if someone tried to write code against a zero-init language fork, their most obvious/common code would fail, and they’d have to contort enough that it’d be hard to argue it was an intentional/consistent way to write code. "Well, we explicitly zero-init /these/ simple cases, but rely on compiler-derived zero init in the more complicated cases where we can (currently) get away with it…"

We currently do not document the automatic initialization flags because we haven’t had customers ask for them through our official channels.

You are correct though, if we shipped this to customers they could do whatever they wanted with the flags. I’m simply stating how we are using the automatic initialization in our build environment.

Clang could also ban zero-init for debug builds and only allow it for retail. This does potentially make it harder to debug actual issues though.

Joe

They mention that "zero is the safest value from security point-of-view",
and you mentioned it also in multiple places. Is there a detailed analysis
somewhere that explains this? (I'm not knowledgeable but always interested
to learn more)

I don't have a good direct reference handy. This has mostly been
internal manual examination of past flaws. I'll see if I can find
something.

Overall, when I read their perf analysis, I would advocate killing the flag and making zero-init the default for non-array POD. This would simplify the perf-tuning aspect of the compiler (the optimizer heuristics would always be tuned for this), and most software out there could be hand-tuned with respect to this as well. This seems better than yet another mode in the compiler, in particular for a "production oriented" option.
This is also another way to address the "language fork" issue: if the major compilers out there (clang, gcc, MSVC at least) were to implement this behavior by default, this could then in turn help convince the standards committee to adopt it in a future revision.

While I'm all for this being a default, I think that's a much larger
step. At present, I'd just like to adjust the -enable... flag. Perhaps
we can revisit defaults at a later time.

Looks like you mention XNU at minute 24? I would infer that pattern init
is being used rather than zero?

Watching this also reminded me about the benefit of NULL pointers matching
existing "is the pointer NULL?" checks, instead of getting treated like
an allocation -- another behavioral improvement under zero-init.
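
A sketch of that benefit (an invented example, not code from the thread):

#include <cstdio>

struct Node { int value; Node *next; };

// Existing code is typically already written to handle a null pointer.
void walk(const Node *head) {
  for (const Node *p = head; p != nullptr; p = p->next)
    std::printf("%d\n", p->value);
}

void caller() {
  Node *head;   // accidentally left uninitialized
  // Zero-init: head is nullptr and the existing check handles it gracefully.
  // Pattern-init: head looks like a real allocation (0xAA...AA), the null
  // check passes, and the dereference faults.
  walk(head);
}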

The existence of the
--long-ugly-flag-name-that-says-we'll-remove-the-feature is the way we
currently try to avoid introducing a language dialect. If we remove that
flag as is proposed, then we are effectively relitigating the question of
whether to have the feature at all.

What about renaming the enable flag so it doesn't imply that zero-init
is going to be removed?

And indeed it might even be OK if the initial behavior is that we *always*
zero-initialize (as Philip asked), so long as our documentation clearly
says that we do not guarantee that the value will be zero (only that we
guarantee that *if the program continues*, the value will be zero), and our
intent is that we may still produce traps or otherwise abort the
computation.

Right -- I would see adding a trap path as a nice improvement. I still
think it'll be too much overhead, though, given the need to check all
corners of a struct: accessing any padding bytes would need to trap,
etc.
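
To make the padding point concrete, a small sketch (invented names; the copy function is a stand-in for "send the bytes somewhere"):

#include <cstdint>
#include <cstring>

struct Mixed {
  std::uint8_t  tag;      // 1 byte; on typical ABIs 3 padding bytes follow
  std::uint32_t length;
};

// Invented stand-in for any whole-object copy (serialization, copy to user
// space, ...).
void send_bytes(const void *p, std::size_t n) { (void)p; (void)n; }

void example() {
  Mixed m;
  m.tag = 1;
  m.length = 64;               // every named field has been written...
  send_bytes(&m, sizeof m);    // ...but this still reads the padding bytes,
                               // so a strict "trap on any uninitialized read"
                               // scheme would have to track those too
}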

What would this look like on the command line side of things? Would this
be a new mode, like -ftrivial-auto-var-init=trap, but initially "trap"
would just do the same as "zero", until it was improved to actually trap?

How are you going to efficiently check that something wasn’t initialized at runtime? In a way that results in better codegen than just doing pattern initialization? I’m happy to see a solution but I don’t see how this can be done in a way that doesn’t involve metadata and checks. If you could do this at compile-time, you’d just issue a warning rather than let the issue hang around for someone to discover at runtime.

Also not clear to me what the OS is expected to do with this trap. We have a number of information leak vulnerabilities where force initialization kills the bug silently. If you have a non-recoverable trap you are now turning these bugs into kernel crashes which is sort of a crappy user experience compared to just silently fixing the bug and allowing the OS to work as normal. As it is right now, we can just ignore the issues because they have no security or reliability impact which is great because it saves us time and money not having to service things, and customers don’t have to install a code update either.

Joe

How are you going to efficiently check that something wasn't initialized
at runtime? In a way that results in better codegen than just doing
pattern initialization? I'm happy to see a solution but I don't see how
this can be done in a way that doesn't involve metadata and checks. If
you could do this at compile-time, you'd just issue a warning rather
than let the issue hang around for someone to discover at runtime.

I share this skepticism. :wink:

Also not clear to me what the OS is expected to do with this
trap. We have a number of information leak vulnerabilities where force
initialization kills the bug silently. If you have a non-recoverable
trap you are now turning these bugs into kernel crashes which is sort
of a crappy user experience compared to just silently fixing the bug and
allowing the OS to work as normal. As it is right now, we can just ignore
the issues because they have no security or reliability impact which is
great because it saves us time and money not having to service things,
and customers don't have to install a code update either.

I don't think the intention is for it to be non-recoverable. (e.g.
earlier language was "if the execution continues, it would read zero")

How are you going to efficiently check that something wasn’t initialized at runtime? In a way that results in better codegen than just doing pattern initialization? I’m happy to see a solution but I don’t see how this can be done in a way that doesn’t involve metadata and checks. If you could do this at compile-time, you’d just issue a warning rather than let the issue hang around for someone to discover at runtime.

At least in Clang, what powers warnings versus what powers optimizations are quite different; there are lots of things we could trap on that we couldn’t warn on (because the warning would have to describe to the user what code path was taken, what values were needed to take those paths, etc., but a trap wouldn’t).

Also not clear to me what the OS is expected to do with this trap. We have a number of information leak vulnerabilities where force initialization kills the bug silently. If you have a non-recoverable trap you are now turning these bugs into kernel crashes which is sort of a crappy user experience compared to just silently fixing the bug and allowing the OS to work as normal. As it is right now, we can just ignore the issues because they have no security or reliability impact which is great because it saves us time and money not having to service things, and customers don’t have to install a code update either.

This is the thing the folks concerned about “forking the language” are trying to avoid: not wanting that code to be considered correct/OK/not in need of any changes/corrections.

We still make fixes to up-level code. We just don’t have to fix existing in-market code.

For the end-users who consume the binaries you are generating, the behavior I am describing is preferable to having to install an update to fix a fault that could have been silently fixed. People don’t like installing updates and often don’t. It also saves a lot of money on our end to not be required to service something that is working. We can verify the code is working correctly, make the change only in our active development branch for the next version of the product, and move on.

Joe

Ah - thanks for the clarification

A major part of the purpose of point 4 of the policy is to prevent creation of language dialects, such as have unfortunately been created by (for example) the -fno-exceptions and -fno-rtti flags. If people can rely on uninitialized variables behaving as if initialized to zero, then they will write code that assumes that to be the case, and such code will not work in compilers / compilation modes that don't provide that language dialect. If people cannot rely on uninitialized variables behaving as if initialized to zero, then we mitigate the risk of creating a language dialect.

The compiler can do anything it wants for something the specification
says is undefined behavior. If the programmer wants to write a
portable program, they should not rely on such behavior, and diagnostics
such as compiler warnings and the sanitizers may assist with that goal.

If you argue that it is possible to write programs that compile fine
with one compiler but miscompile with another, this has always been
the case. To quote Chris Lattner [1]:

It is also worth pointing out that both Clang and GCC nail down a few behaviors that the C standard leaves undefined. The things I'll describe are both undefined according to the standard and treated as undefined behavior by both of these compilers in their default modes.

I don't think we can (or should) prevent users from shooting themselves
in the foot.
The one concern I would have is that we introduce another "compile mode"
that yields different output. But so do -O0, -O1, -O2, -O3, -Oz, -Os,
-g, -march, -fno-unroll-loops, -fno-vectorize, ... and we never seemed
to have an issue with those.

Michael

[1] What Every C Programmer Should Know About Undefined Behavior #1/3 - The LLVM Project Blog

How are you going to efficiently check that something wasn’t initialized at runtime? In a way that results in better codegen than just doing pattern initialization? I’m happy to see a solution but I don’t see how this can be done in a way that doesn’t involve metadata and checks. If you could do this at compile-time, you’d just issue a warning rather than let the issue hang around for someone to discover at runtime.

Consider a case such as:

int f(int &r) { return r; }
int g() {
  int n;        // n is never initialized
  return f(n);  // its value is read through r: undefined behavior
}

We certainly won’t warn on this during compilation, but it’s easy for us to turn this into a trap after inlining.
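
For concreteness, a rough sketch (an illustration of the idea, not necessarily how an actual implementation would lower it) of what g() could become once f() is inlined and the compiler takes the trap outcome permitted by the zero-or-trap rule:

int g() {
  // After inlining, the only value g() could return is uninitialized, so
  // under the proposed semantics the whole body may be lowered to a trap.
  __builtin_trap();
}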

Also not clear to me what the OS is expected to do with this trap. We have a number of information leak vulnerabilities where force initialization kills the bug silently.

Do you really mean “kills the bug”? I would certainly believe you have a number of information leak vulnerabilities where zero-init fixes the vulnerability (and we should definitely provide tools to harden programs against such vulnerabilities), but the program is still using an uninitialized value and still has a bug. The idea that this compiler change fixes or removes the bug is precisely the language dialect problem that I’m concerned about. Developers must still think that reading an uninitialized value is a bug (even if it’s not a vulnerability any more) or they’re writing a program in a language dialect where doing that is not a bug.

I think part of the disconnect comes from security and kernel folks having something like this value scale:

Correct code >> buggy but not vulnerable/leaky code >> crashy but not vulnerable/leaky code >> code with infoleak >> vulnerable code

Everyone agrees that we want to avoid infoleaks and vulnerable code, and this mitigation helps with both. Everyone agrees correct code is better.

Where there’s disagreement in this sub-thread seems to be whether it’s OK to have buggy code that doesn’t crash, versus buggy code that crashes.

Yeah, this is another "different communities mean different things"
terminology glitch. For the security folks, "bug" tends to stand in for
"security bug" or "security flaw". But yes, as you say, the "bug"
(misuse of the C language) is present, but the "security flaw" gets
downgraded to "just a bug" in the zero-init case. :slight_smile:

I’ve consulted with folks in security, compilers, and developers of security-sensitive codebases. A few points:

  • They like that automatic variable initialization provides a security mitigation for a significant percentage of observed zero-day exploits, at extremely low cost, with little chance of regression.
  • They like that pattern initialization is a “smoking gun” when seen in a crash log. The size+performance costs have decreased in the last year, but they’d like to see it improve further.
  • They like the lower size+performance cost of zero initialization as well as the safety it offers for e.g. size variables (because a size of 0xAA…AA is “infinite”, which is bad for bounds checking, whereas zero isn’t). They don’t like that zero is often a valid pointer sentinel value (i.e. initializing pointers to zero can be used in unexpected data flow); see the sketch after this list for both effects. They don’t like the silly long compiler flag.
  • We’ve deployed automatic variable initialization in a significant amount of code. The vast majority of our deployment uses pattern initialization. A small number uses zero, of which you’ll note XNU. We’ve only chosen zero in cases where size or performance were measured issues.
  • Automatic variable initialization which might sometimes trap (as Richard suggests) is a proposal we’d like to see implemented, but we’d like to see it under its own separate flag, something like UBSan does with developers choosing trapping or logging modes. The main reason is that pure trapping with zero-init will make deployment significantly harder (it’s no longer a low-risk mitigation), and it’ll make updating our compiler significantly harder (because it might change where it generates traps over time). We also think that trapping behavior would need good tooling, for example through opt remarks, to help find and fix parts of the code where the compiler added traps. A logging mode would ease some of this burden. As well, we’re not convinced about the size+performance cost of either trapping or logging, which complicates the adoption of the mitigation.
  • We don’t think the design space has been explored enough. We might want to differentiate initialization more than just “floating point is 0xFF…FF, everything else is 0xAA…AA”. For example:
      • We could pattern-init pointers (maybe with compile-time random different patterns), and zero-init scalars. This has a good mix of performance and security upsides.
      • We could key off heuristics to choose how to initialize, such as variable names, function names, or some machine learning model (and for those who know me: I’m not joking).
      • We could use a variety of runtime pseudo-random sources to initialize values.
      • We could add a new IR “undef” type, or different late-stage treatment of “undef”, to apply initialization after optimizations have found interesting facts about the program.
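
As a sketch of the value-choice trade-offs mentioned above for size variables and pointer sentinels (invented examples, not code from the thread):

#include <cstddef>
#include <cstdio>

void bounds_check(std::size_t idx, const char *buf) {
  std::size_t size;                    // oops: never set on this path
  // Zero-init: size == 0, so the check rejects every index (fails safe).
  // Pattern-init: size == 0xAAAA...AA, effectively "infinite", so the check
  // passes and buf[idx] can read out of bounds.
  if (idx < size)
    std::printf("%c\n", buf[idx]);
}

void pointer_case() {
  char *p;                             // oops: never set on this path
  // Zero-init: p equals the common nullptr sentinel, so existing "is it
  // null?" checks silently treat it as "not present" -- unexpected data flow.
  // Pattern-init: p points into poisoned territory and the use faults loudly.
  if (p != nullptr)
    std::printf("%c\n", *p);
}

A mixed policy (pattern-init for pointers, zero-init for scalars) would pick the safer value in each of these two cases.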

We’d like to see work continue on improving this mitigation, optimizations around it, and other similar mitigations.

JF: To Richard’s point about raising this at the C++ committee level, has that already been explored? I got the impression that Richard mostly wanted to see an attempt to standardize the behavior across implementations. If an attempt has already been made and there is evidence of disagreement, then it seems like we may already have fulfilled point 4 of the clang language extension policy, and we can go forward with renaming the flag and making it permanent. It seems clear that there is adequate appetite for this feature.

I suppose Richard, as the C++ project editor, would know if an attempt has been made, but I do not personally have any visibility into the WG21 proceedings or communications.

I think that standardizing this in the C++ language is independent of the
implementation discussed here. In particular, having zero
initialization standardized (as is currently the case for globals) would no
longer allow clang to emit warnings, or MemorySanitizer to report uses of
uninitialized memory, since with such a change projects would
legitimately start relying on it. -ftrivial-auto-var-init=zero, in contrast,
just changes clang's choice of behavior for something that remains undefined.

Michael

I’ve consulted with folks in security, compilers, and developers of security-sensitive codebases. A few points:

  • They like that automatic variable initialization provides a security mitigation for a significant percentage of observed zero-day exploits, at extremely low cost, with little chance of regression.
  • They like that pattern initialization is a “smoking gun” when seen in a crash log. The size+performance costs have decreased in the last year, but they’d like to see it improve further.
  • They like the lower size+performance cost of zero initialization as well as the safety it offers for e.g. size variables (because a size of 0xAA…AA is “infinite” which is bad for bounds checking, whereas zero isn’t). They don’t like that zero is often a valid pointer sentinel value (i.e. initializing pointers to zero can be used in unexpected data flow). They don’t like the silly long compiler flag.
  • We’ve deployed automatic variable initialization in a significant amount of code. The vast majority of our deployment uses pattern initialization. A small number uses zero, of which you’ll note XNU. We’ve only chosen zero in cases where size or performance were measured issues.
  • Automatic variable initialization which might sometimes trap (as Richard suggests) is a proposal we’d like to see implemented, but we’d like to see it under its own separate flag, something like UBSan does with developers choosing trapping or logging modes. The main reason is that pure trapping with zero-init will make deployment significantly harder (it’s no longer a low-risk mitigation), and it’ll make updating our compiler significantly harder (because it might change where it generates traps over time). We also think that trapping behavior would need good tooling, for example through opt remarks, to help find and fix parts of the code where the compiler added traps. A logging mode would ease some of this burden. As well, we’re not convinced about the size+performance cost of either trapping or logging, which complicates the adoption of the mitigation.

I threw together an implementation here: https://reviews.llvm.org/D79249

It’s pretty quick and dirty but it seems to do the right thing on at least a small selection of test cases. I’ve not tried it on any nontrivial codebases yet. (I’m not sure in what ways it doesn’t work, but at least the InstCombine approach seems likely to fight with other InstCombines.)

Just a few points of my own on the topic of trap-vs-init:

  • Once we agree that we want to harden against uninitialized uses, we’re out of the security space entirely. The question of whether we would prefer a crash or a program that somehow keeps going after hitting undefined behavior (absent a security bug) is a software engineering question, not a security question, given that we agree they provide the same security mitigation. So if the only users you asked are security-focused ones, you have sampling bias.
  • As I understand it, automatic initialization to zero or to pattern-with-high-bits-set were chosen, in part, because they are very likely to lead to clean crashes. Given that, it doesn’t really make sense to me to be concerned about the “trap” risk of the mitigation introducing new crashes, since crashing on bad programs was already part of the goal.
  • The performance of the trapping mode is certainly unproven, but if the trapping mode doesn’t introduce new branches and the “potentially trap” markers are removed early enough to not get in the way of other optimizations, it’s not obvious to me that there should be any systematic effect.
  • The size of the trapping mode is likewise unproven, but if it only ever replaces a branch destination with a trap, it seems plausible to me that it could reduce code size compared to the zeroing mode.
  • In the presence of a bug, “crash early, crash often” is, in (empirically and subjectively) most software domains, the right answer – continuing after your program’s invariants are not met is not sound software engineering practice. But whether we do continue in such cases is exactly the difference between “zero” and “zero-or-maybe-trap”. Programs that really need to be robust against things going wrong, and recover in some way, typically install a SIGSEGV and SIGABRT handler anyway. It’s probably better to jump to those handlers rather than continue with broken invariants and risk hitting more problems (maybe security problems) later on.
  • Per http://blog.llvm.org/2011/05/what-every-c-programmer-should-know_21.html warning or producing opt remarks when the optimizer inserts a trap is not likely to be all that useful. (I think opt remarks would be OK, but I also think it’s going to be hard to expose them in a way that helps the user understand what happened well enough to know if they’re false positives.) Places where traps are added don’t necessarily need to be fixed; the case in question might be unreachable due to program invariants in a way the optimizer can’t determine, and in those cases, the remarks will just be non-useful noise. Perhaps a project could keep track of newly-introduced remarks and ask the developer to take a look at them though?

If we want a separate flag to control whether / how much we use such a trapping mode, I think that could be reasonable, subject to having a good answer to the language dialect concern I expressed previously (eg, maybe there’s never a guarantee that we won’t insert a trap, even though with the flag on its lowest setting that is actually what happens).

  • We don’t think the design space has been explored enough. We might want to differentiate initialization more than just “floating point is 0xFF…FF, everything else is 0xAA…AA”. For example:
      • We could pattern-init pointers (maybe with compile-time random different patterns), and zero-init scalars. This has a good mix of performance and security upsides.
      • We could key off heuristics to choose how to initialize, such as variable names, function names, or some machine learning model (and for those who know me: I’m not joking).
      • We could use a variety of runtime pseudo-random sources to initialize values.
      • We could add a new IR “undef” type, or different late-stage treatment of “undef”, to apply initialization after optimizations have found interesting facts about the program.

We’d like to see work continue on improving this mitigation, optimizations around it, and other similar mitigations.

+1. =)