[RFC] automatic variable initialization

Hello security fans!

I’ve just uploaded a patch proposing opt-in automatic variable initialization. I’d appreciate comments on the overall approach, as well as on the specific implementation.

Here’s the patch:
https://reviews.llvm.org/D54604

And here’s the description:

Automatic variable initialization

Add an option to initialize automatic variables with either a pattern or with
zeroes. The default is still that automatic variables are uninitialized. Also
add attributes to request pattern / zero / uninitialized on a per-variable
basis, mainly to disable initialization of large stack arrays when deemed too
expensive.

This isn’t meant to change the semantics of C and C++. Rather, it’s meant to be
a last-resort when programmers inadvertently have some undefined behavior in
their code. This patch aims to make undefined behavior hurt less, which
security-minded people will be very happy about. Notably, this means that
there’s no inadvertent information leak when:

  • The compiler re-uses stack slots, and a value is used uninitialized.
  • The compiler re-uses a register, and a value is used uninitialized.
  • Stack structs / arrays / unions with padding are copied.
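As a rough illustration of the first scenario, stack-slot reuse can be simulated with a union. This is a deliberately contrived, well-defined stand-in for behavior that in real code is undefined and performed silently by the compiler; the function name is made up:

```c
/* Simulate two locals sharing one stack slot. In real code the compiler
   performs this reuse silently; here a union makes the aliasing explicit
   and deterministic so the effect can be observed. */
union slot {
    long secret;      /* written by "earlier" code */
    long fresh_local; /* a later, never-initialized variable */
};

long simulate_slot_reuse(long secret_value) {
    union slot s;
    s.secret = secret_value;  /* earlier code leaves a secret behind */
    return s.fresh_local;     /* the "uninitialized" read observes it */
}
```

Under -ftrivial-auto-var-init=pattern, the analogous real uninitialized read would observe 0xAA… bytes instead of the stale secret.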

This patch only addresses stack and register information leaks. There are many
more infoleaks that we could address, and much more undefined behavior that
could be tamed. Let’s keep this patch focused; I’m happy to address related
issues elsewhere.

To keep the patch simple, only some undef is removed for now; see
replaceUndef. The padding-related infoleaks are therefore not all gone yet.
This will be addressed in a follow-up, mainly because addressing padding-related
leaks should be a stand-alone option which is implied by variable
initialization.
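To make the padding leak concrete, here is a small sketch (struct name is illustrative) showing where the uninitialized bytes live on typical ABIs:

```c
#include <stddef.h>

/* A struct whose layout contains padding on typical ABIs: 1 byte for
   `tag`, then (usually) 3 bytes of padding so `value` is 4-byte aligned.
   Copying the whole struct (e.g. with memcpy or a plain assignment)
   copies those padding bytes too, whatever stale data they hold. */
struct padded {
    char tag;
    int  value;
};

/* Number of padding bytes the compiler inserted. */
size_t padding_bytes(void) {
    return sizeof(struct padded) - (sizeof(char) + sizeof(int));
}
```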

There are three options when it comes to automatic variable initialization:

  1. Uninitialized

This is C and C++’s default. It’s not changing. Depending on code
generation, a programmer who runs into undefined behavior by using an
uninitialized automatic variable may observe any previous value (including
program secrets), or any value which the compiler saw fit to materialize on
the stack or in a register (this could be to synthesize an immediate, to
refer to code or data locations, to generate cookies, etc.).

  2. Pattern initialization

This is the recommended initialization approach. Pattern initialization’s
goal is to initialize automatic variables with values which will likely
transform logic bugs into crashes down the line, are easily recognizable in
a crash dump, without being values which programmers can rely on for useful
program semantics. At the same time, pattern initialization tries to
generate code which will optimize well. You’ll find the following details in
patternFor:

  • Integers are initialized with repeated 0xAA bytes (infinite scream).
  • Vectors of integers are also initialized with infinite scream.
  • Pointers are initialized with infinite scream on 64-bit platforms because
    it’s an unmappable pointer value on all architectures I’m aware of. Pointers
    are initialized to 0x000000AA (small scream) on 32-bit platforms because
    32-bit platforms don’t consistently offer unmappable pages. When they do,
    it’s usually the zero page. As people try this out, I expect that we’ll
    want to allow different platforms to customize this; let’s do so later.
  • Vectors of pointers are initialized the same way pointers are.
  • Floating point values and vectors are initialized with a vanilla quiet NaN
    (e.g. 0x7ff00000 and 0x7ffe000000000000). We could use other NaNs, say
    0xfffaaaaa (negative NaN, with infinite scream payload). NaNs are nice
    (here, anyway) because they propagate on arithmetic, making it more likely
    that entire computations become NaN when a single uninitialized value
    sneaks in.
  • Arrays are initialized to their homogeneous elements’ initialization
    value, repeated. Stack-based Variable-Length Arrays (VLAs) are
    runtime-initialized to the allocated size (no effort is made for negative
    size, but zero-sized VLAs are untouched even if technically undefined).
  • Structs are initialized to their heterogeneous elements’ initialization
    values. Zero-size structs are initialized as 0xAA since they’re allocated
    a single byte.
  • Unions are initialized using the initialization for the largest member of
    the union.
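A small C sketch of what these choices look like in memory. The exact NaN payload clang emits may differ; 0x7FF8000000000000 is just the canonical quiet NaN, used here for illustration:

```c
#include <math.h>
#include <stdint.h>
#include <string.h>

/* "Infinite scream": an int whose every byte is 0xAA. */
int pattern_int(void) {
    int x;
    memset(&x, 0xAA, sizeof x);  /* repeated-byte patterns memset well */
    return x;
}

/* A quiet NaN built from its bit pattern. Clang's exact payload may
   differ; the property that matters is that NaN propagates through
   arithmetic, tainting whole computations. */
double pattern_double(void) {
    uint64_t bits = 0x7FF8000000000000ULL; /* canonical quiet NaN */
    double d;
    memcpy(&d, &bits, sizeof d);
    return d;
}
```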

Expect the values used for pattern initialization to change over time, as we
refine heuristics (both for performance and security). The goal is truly to
avoid injecting semantics into undefined behavior, and we should be
comfortable changing these values when there’s a worthwhile point in doing
so.

Why so much infinite scream? Repeated byte patterns tend to be easy to
synthesize on most architectures, and otherwise memset is usually very
efficient. For values which aren’t entirely repeated byte patterns, LLVM
will often generate code which does memset + a few stores.

  3. Zero initialization

Zero initialize all values. This has the unfortunate side-effect of
providing semantics to otherwise undefined behavior; programs might
therefore start to rely on this behavior, and that’s sad. However, some
programmers believe that pattern initialization is too expensive for them,
and data might show that they’re right. The only way to make these
programmers wrong is to offer zero-initialization as an option, figure out
where they are right, and optimize the compiler into submission. Until the
compiler provides acceptable performance for all security-minded code, zero
initialization is a useful (if blunt) tool.

I’ve been asked for a fourth initialization option: user-provided byte value.
This might be useful, and can easily be added later.

Why is an out-of-band initialization mechanism desired? We could instead use
-Wuninitialized! Indeed we could, but then we’re forcing the programmer to
provide semantics for something which doesn’t actually have any (it’s
uninitialized!). It’s then unclear whether int derp = 0; lends meaning to 0,
or whether it’s just there to shut that warning up. It’s also way easier to use
a compiler flag than it is to manually and intelligently initialize all values
in a program.
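A hypothetical example of that ambiguity: once `= 0` is added to silence -Wuninitialized, a reader can no longer tell whether the zero is a meaningful default or just warning suppression (the function and names here are made up):

```c
/* Does `0` carry program meaning, or does it just quiet -Wuninitialized?
   Without the `= 0`, the have_input == 0 path reads derp uninitialized
   and the warning fires; with it, the reader can't recover the intent. */
int parse_flag(int have_input, int raw) {
    int derp = 0;            /* real default, or warning suppression? */
    if (have_input)
        derp = raw & 0xFF;
    return derp;
}
```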

Why not just rely on static analysis? Because it cannot reason about all dynamic
code paths effectively, and it has false positives. It’s a great tool, could get
even better, but it’s simply incapable of catching all uses of uninitialized
values.

Why not just rely on memory sanitizer? Because it’s not universally available,
has a 3x performance cost, and shouldn’t be deployed in production. Again, it’s
a great tool, it’ll find the dynamic uses of uninitialized variables that your
test coverage hits, but it won’t find the ones that you encounter in production.

What’s the performance like? Not too bad! Previous publications [0] have cited
2.7 to 4.5% averages. We’ve committed a few patches over the last few months to
address specific regressions, both in code size and performance. In all cases,
the optimizations are generally useful, but variable initialization benefits
from them a lot more than regular code does. We’ve got a handful of other
optimizations in mind, but the code is in good enough shape and has found enough
latent issues that it’s a good time to get the change reviewed, checked in, and
have others kick the tires. We’ll continue reducing overheads as we try this out
on diverse codebases.

Is it a good idea? Security-minded folks think so, and apparently so does the
Microsoft Visual Studio team [1], who say “Between 2017 and mid 2018, this
feature would have killed 49 MSRC cases that involved uninitialized struct data
leaking across a trust boundary. It would have also mitigated a number of bugs
involving uninitialized struct data being used directly.” They seem to use pure
zero initialization, and claim to have taken the overheads down to within noise.
Don’t just trust Microsoft though; here’s another relevant person asking for
this [2]. It’s been proposed for GCC [3] and LLVM [4] before.

What are the caveats? A few!

  • Variables declared in unreachable code, and used later, aren’t initialized.
    Think goto, Duff’s device, and other objectionable uses of switch. This
    should instead be a hard error in any serious codebase.
  • Volatile stack variables are still weird. That’s pre-existing; it’s really
    the language’s fault and this patch keeps it weird. We should deprecate
    volatile [5].
  • As noted above, padding isn’t fully handled yet.

I don’t think these caveats make the patch untenable because they can be
addressed separately.

Should this be on by default? Maybe, in some circumstances. It’s a conversation
we can have when we’ve tried it out sufficiently, and we’re confident that we’ve
eliminated enough of the overheads that most codebases would want to opt-in.
Let’s keep our precious undefined behavior until that point in time.

How do I use it:

  1. On the command-line:

-ftrivial-auto-var-init=uninitialized (the default)
-ftrivial-auto-var-init=pattern
-ftrivial-auto-var-init=zero

  2. Using an attribute:

int dont_initialize_me __attribute__((trivial_auto_init("uninitialized")));
int zero_me __attribute__((trivial_auto_init("zero")));
int pattern_me __attribute__((trivial_auto_init("pattern")));

I disagree with this. I think this is essentially defining a new
dialect of C++, which I have massive concerns about. Additionally, as
much as we might claim it's a transitional measure, we all know that's
not how it'll be used in practice.

Tim.

Hi Tim!

I suspected that you’d disagree with zero-initialization. :slight_smile:

I tried to outline why I think it’s important:

  - Security-minded people think they need it. They might be right.
  - We need data to prove that they don’t need it.

Forgive my use of silly rhetoric: I’m sure you agree that more data is good! Here’s what I propose, since you cared enough to voice your opinion: let’s work together to narrow the performance gap. I’ve got a handful of optimization opportunities, and you know the backend better than I do. Once we’ve addressed the issues we know about, I’m sure other types of codebases will surface other performance issues; let’s address those too.

Once people who think they need zero initialization are proven wrong through compiler optimizations, we can deprecate (and eventually remove / repurpose) the flag. If they’re not proven wrong… we’ll have learned something.

Sounds fair?

No, it doesn't. It's putting the entire burden on backend optimizers,
with the goal of removing zero-init at some unspecified future date.
It's nothing even remotely approaching a compromise.

The fragmentation issues need to be considered up front.

Tim.

Very exciting, and long overdue. Thanks for doing this!
Countless security bugs would have been mitigated by this, see below.

Agree with the rationale: UUMs remain bugs, and we need to try hard to not let developers rely on auto-initialization.
(e.g. in future patches we may decide to change the patterns, or to make them different between the runs, etc)
All the old goodness (msan, -Wuninitialized, static analyses) is still relevant.

I am separately excited with this work because it is essentially a precursor to efficient support for ARM’s memory tagging extension (MTE).
If we can make enough compiler optimizations to auto-initialize locals with low overhead, then MTE stack instrumentation will come for ~free.
http://llvm.org/devmtg/2018-10/talk-abstracts.html#talk16

Does -Wuninitialized still work with -ftrivial-auto-var-init=pattern|zero?

In later patches we may need to have flags to separately control auto-init of scalars, PODs, arrays of data, arrays of pointers, etc.
because in some cases we could achieve 90% of benefit at 10% of cost.

I think that zero-init is going to be substantially cheaper than pattern-init, but happy to be wrong.

Here are some links to bugs, vulnerabilities and full exploits based on uses of uninitialized memory.
The list is not exhaustive by any means, and we keep finding them every day.
The problem is, of course, that we don’t find all of them.

  • Linux kernel: KMSAN trophies, more trophies, CVEs

  • Chrome: 700+ UMRs Chromium found by fuzzing

  • Android: userspace: CVE-2018-9345/CVE-2018-9346, CVE-2018-9420, CVE-2018-9421, CVE-2017-13252; kernel: CVE-2017-9075, CVE-2017-9076, 12% of all bugs (as of 2016).

  • OSS: 700+ bugs in various OSS projects found by fuzzing

  • Project Zero (P0) findings: ~139 total.

  • Mozilla: 100+ bugs

  • “Detecting Kernel Memory Disclosure with x86 Emulation and Taint Tracking” (Sections 3.5 and 6.1.2)

  • Leaks of sensitive information

  • Linux kernel:

  • P0#1431: disclosure of large chunks of kernel memory.

  • https://alephsecurity.com/vulns/aleph-2016005: Android, uninitialized kernel memory leak over USB

  • Windows kernel:

  • CVE-2018-8493

  • P0#1276 (CVE-2017-8685) - a continuous leak of 1kB from the Windows kernel stack, discovered by diffing win32k.sys between Windows 7 and Windows 10. It enabled an attacker to e.g. perform system-wide keyboard sniffing to some extent. Mentioned in P0 blog post about bindiffing.

  • P0#1352 (CVE-2017-11817) - a leak of ~4kB of uninitialized Windows kernel pool memory to NTFS metadata upon mounting the file system, without requiring user interaction. Made it possible to “exfiltrate” kernel memory from a powered-on but locked Windows machine through the USB port.

  • P0#1500 (CVE-2018-1037) - 3 kB of uninitialized user-mode heap memory leaking from Microsoft build servers into a small percentage of .pdb symbol files publicly available through the Microsoft Symbol Server.

  • P0#1267 (CVE-2017-8680) - disclosure of a controlled number of uninitialized bytes from the Windows kernel pool.

  • P0#176, #248, #259, #277, #281 (CVE-2015-0089, many other CVEs) - a disclosure of uninitialized user/kernel-mode heap memory in the OpenType glyph outline VM program, which affected the Windows kernel, user-mode DirectWrite and WPF components, Adobe Reader, and Oracle Java. Discussed in detail in a P0 blog post.

  • User space:

  • *bleed continues

  • Leaks of pointers (allows further attacks)

  • Windows kernel:

  • P0#825 (CVE-2016-3262) - rendering of uninitialized heap bytes as pixels in EMF files parsed by user-mode Microsoft GDI+. Considered a WontFix by Microsoft until it turned out that Office Online was vulnerable and could leak memory from Microsoft servers, at which point they fixed the bug.

  • P0#480 (CVE-2015-2433) - a 0-day Windows kernel memory disclosure that was discovered in the Hacking Team dump in July 2015, and was independently found by ex-P0 member Matt Tait. It was used in an exploit chain to defeat KASLR and reveal the base address of win32k.sys.

  • P0#1153, #1159, #1191, #1268, #1275, #1311 - various examples of relatively long (~100+ bytes) continuous disclosure of Windows kernel memory, which could be easily used to de-aslr the kernel, leak stack cookies etc.

  • MacOS kernel:

  • CVE-2017-2357, CVE-2017-{13836, 13840, 13841, 13842},

  • P0#1410

  • (more of such)

  • User space:

  • P0#711: Android, uninitialized heap memory which could help break ASLR in a privileged process

  • Privilege escalation / code execution

  • Linux kernel: unauthorized access to IPC objects

  • Windows kernel:

  • CVE-2016-0040

  • P0#177 (CVE-2015-0090) - off-by-one in the OpenType glyph outline VM in the Windows kernel, which led to arbitrary read/write thanks to accessing uninitialized pointers. Successfully exploited for privilege escalation on Windows 8.1 64-bit, as shown here.

  • MacOS kernel:

  • CVE-2017-2358

  • P0#618 (CVE-2016-1721): a local user may be able to execute arbitrary code with kernel privileges

–kcc

I’m super excited about all of the non-zeroing options here. I’d actually like to mention some more options that I want to see explored (in future work):

  1. An enhancement to the pattern suggestion:

I’d like variables to be initialized to one of N patterns (using the very nice pattern scheme you outline). The N patterns need to include zero. The selection of the pattern needs to be very hard to rely on. My suggestion would be to rotate between a shuffled list of patterns for each initialization in the function (even better to do this in LLVM after inlining etc). And shuffle the list of patterns using various inputs: the version of the compiler, some user-provided input (random seed?), and the (mangled) name of the function.
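A rough sketch of that selection scheme. The hash (FNV-1a), the pattern list, and the exact inputs mixed in are all made up for illustration; nothing here reflects an actual implementation:

```c
#include <stdint.h>

/* Hypothetical per-initialization pattern selection: mix the compiler
   version, a user-provided seed, and the mangled function name through
   FNV-1a, then index a pattern list that deliberately includes zero. */
static const uint64_t kPatterns[] = {
    0x0000000000000000ULL, /* zero must be among the candidates */
    0xAAAAAAAAAAAAAAAAULL, /* infinite scream */
    0x5555555555555555ULL,
    0xFEFEFEFEFEFEFEFEULL,
};
enum { kNumPatterns = sizeof kPatterns / sizeof kPatterns[0] };

uint64_t select_pattern(uint32_t compiler_version, uint64_t user_seed,
                        const char *mangled_name, unsigned var_index) {
    uint64_t h = 1469598103934665603ULL;            /* FNV offset basis */
    h = (h ^ compiler_version) * 1099511628211ULL;  /* FNV prime */
    h = (h ^ user_seed) * 1099511628211ULL;
    for (const char *p = mangled_name; *p; ++p)
        h = (h ^ (unsigned char)*p) * 1099511628211ULL;
    /* rotate through the shuffled list per-initialization in the function */
    return kPatterns[(h + var_index) % kNumPatterns];
}

/* Helper: is v one of the candidate patterns? */
int is_known_pattern(uint64_t v) {
    for (int i = 0; i < kNumPatterns; ++i)
        if (kPatterns[i] == v) return 1;
    return 0;
}
```

Keeping the inputs explicit (version, seed, name) is what makes builds reproducible when desired, while still being hard for programmers to rely on.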

The reason I want this is that I think even all 0xAA can be relied upon inadvertently by programmers. As an example: it will reliably initialize booleans to true.

I would like to see something like this as the default instead of the all-0xAA option.

  2. An extension to this pattern (maybe call it “dynamic-pattern”) would be to read the pattern (or some of the N patterns) from a buffer initialized at program start time.

While #2 may imply some overhead, it may be lower than expected – copying memory is in some weird cases faster than materializing patterns and then setting memory. And it may have some advantages.

All of that said:

Responding to Kostya and Chandler inline:

Very exciting, and long overdue. Thanks for doing this!
Countless security bugs would have been mitigated by this, see below.

Agree with the rationale: UUMs remain bugs, and we need to try hard to not let developers rely on auto-initialization.
(e.g. in future patches we may decide to change the patterns, or to make them different between the runs, etc)

Agreed. Chandler has good suggestions along those lines in his reply.

All the old goodness (msan, -Wuninitialized, static analyses) is still relevant.

I am separately excited with this work because it is essentially a precursor to efficient support for ARM’s memory tagging extension (MTE).
If we can make enough compiler optimizations to auto-initialize locals with low overhead, then MTE stack instrumentation will come for ~free.
http://llvm.org/devmtg/2018-10/talk-abstracts.html#talk16

Does -Wuninitialized still work with -ftrivial-auto-var-init=pattern|zero?

AFAIK, yes. When you run this:

clang -cc1 test/Sema/uninit-variables.c -fblocks -Wuninitialized -Wconditional-uninitialized -ftrivial-auto-var-init=pattern

You get the same 41 initialization warnings as you do without -ftrivial-auto-var-init={pattern|zero} (i.e. -verify passes).

In later patches we may need to have flags to separately control auto-init of scalars, PODs, arrays of data, arrays of pointers, etc.
because in some cases we could achieve 90% of benefit at 10% of cost.

Maybe? I’m hoping that we can quantify the cost and drive it close enough to zero that you’ll be wrong :slight_smile:
Adding the flags and collecting the data as you suggest won’t be hard, but likely not worth doing before we’ve spent some time driving down costs.

I think that zero-init is going to be substantially cheaper than pattern-init, but happy to be wrong.

I suspect that you’re right for now, and as I discussed with Tim I’d like to get to a point where you’re happily wrong.

Here are some links to bugs, vulnerabilities and full exploits based on uses of uninitialized memory.
The list is not exhaustive by any means, and we keep finding them every day.
The problem is, of course, that we don’t find all of them.

Neat! It’s as if you had that list already, and were waiting to send it out.

[long quoted list of uninitialized-memory bugs snipped; see the full list in Kostya’s message above]

I’m super excited about all of the non-zeroing options here. I’d actually like to mention some more options that I want to see explored (in future work):

  1. An enhancement to the pattern suggestion:

I’d like variables to be initialized to one of N patterns (using the very nice pattern scheme you outline). The N patterns need to include zero. The selection of the pattern needs to be very hard to rely on. My suggestion would be to rotate between a shuffled list of patterns for each initialization in the function (even better to do this in LLVM after inlining etc). And shuffle the list of patterns using various inputs: the version of the compiler, some user-provided input (random seed?), and the (mangled) name of the function.

At a high level this is totally doable and pretty neat, I like it.

Details:

  • Why do you think that it’s important to do after inlining?
  • It seems like you want a “pattern memset” intrinsic, and we’d let it survive early optimizations? It would have to take in the type so it can treat pointers / FP differently from other types. We’d then move the initialization logic from clang to whatever LLVM lowering pass. I guess we’d pass in some nonce too, which clang derives as you’ve suggested (user input, compiler version, mangled name).
  • Added benefit: non-clang frontends could use this.
  • I want to avoid making the builds non-reproducible, or rather I’d like users to opt-in to this.
  • I’m worried that this makes incremental software updates much harder, because the random values change so much. We probably need a way to “stabilize” the randomness.
  • Security-wise, this is still fairly predictable in that an attacker can disassemble their binary to see which values are where. Agreed it’s less reliable than infinite scream, and it likely is different for different builds. For a JIT it would be great.

The reason I want this is that I think even all 0xAA can be relied upon inadvertently by programmers. As an example: it will reliably initialize booleans to true.

I would like to see something like this as the default instead of the all-0xAA option.

“Infinite scream” :wink:

  2. An extension to this pattern (maybe call it “dynamic-pattern”) would be to read the pattern (or some of the N patterns) from a buffer initialized at program start time.

While #2 may imply some overhead, it may be lower than expected – copying memory is in some weird cases faster than materializing patterns and then setting memory. And it may have some advantages.

Totally agreed. Even just (random) byte-read + broadcast + store should be relatively cheap. I expect that we’d create some weak global that’s initialized with a low init_priority. The linker would make sure there’s just one of those. Each initialization could read a byte from that buffer chosen at compile time, different from the byte other initializations use.
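A sketch of that byte-read + broadcast lowering. The buffer name and layout are invented for illustration; the broadcast trick (one multiply replicates a byte into every lane of a word) is the cheap part:

```c
#include <stdint.h>

/* Hypothetical "dynamic pattern" source: a buffer filled at program
   start (e.g. via a low init_priority constructor). The name is made up. */
extern const uint8_t __pattern_buffer[64];

/* Replicate one byte into all 8 bytes of a word with a single multiply,
   suitable as the source value for the initialization stores. */
uint64_t broadcast_byte(uint8_t b) {
    return (uint64_t)b * 0x0101010101010101ULL;
}
```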

All of that said:

Sounds fair?

No, it doesn’t. It’s putting the entire burden on backend optimizers,
with the goal of removing zero-init at some unspecified future date.
It’s nothing even remotely approaching a compromise.

The fragmentation issues need to be considered up front.

FWIW, I agree about the zero case. I’m deeply concerned about fragmentation here.

But I also really want to be able to get the data and measurements needed to address performance problems with non-zero initialization.

I would love to see a way to get the zero initialization behavior for performance testing, but not expose this as a supported flag to users. I can imagine many ways to do that. Tim, would that address your concerns? In that way, we could actually refuse to support the zero behavior long term by making it much more apparent that it is only intended to gather data.

I expected this to be the main sticking point, and I agree there’s a bunch of ways we can hide the option, or purposefully break in the future. I’m open to suggestions on which approach seems more palatable to people. I absolutely want zero-init for performance measurements in the near-medium term, though.

Some very good points about mixing in 0s with the general case there.

FWIW, I agree about the zero case. I'm deeply concerned about fragmentation here.

But I *also* really want to be able to get the data and measurements needed to address performance problems with non-zero initialization.

Yes, having a mechanism to get data does seem essential, but we need
to do everything we can to stop it being abused.

I would love to see a way to get the zero initialization behavior for performance testing, but *not* expose this as a supported flag to users. I can imagine many ways to do that. Tim, would that address your concerns?

Yes; personally I think something stronger than a simple -cc1 flag is
probably needed to achieve that though. A (non-suppressible?) warning
comes to mind; mostly because I haven't come up with anything more
obnoxious yet.

Cheers.

Tim.

Would it be that drastic to have this require a code change/compiler rebuild to enable? It could be designed so the change is small/easy (changing a constant) but that the default compilers we all ship around (& especially the official releases) don’t allow access to this functionality.

Anyone wanting to gather data would have to make this small change, rebuild their compiler, build their target with this feature & gather results from there.


  3. Zero initialization

Zero initialize all values. This has the unfortunate side-effect of
providing semantics to otherwise undefined behavior, programs therefore
might start to rely on this behavior, and that’s sad. However, some
programmers believe that pattern initialization is too expensive for them,
and data might show that they’re right. The only way to make these
programmers wrong is to offer zero-initialization as an option, figure out
where they are right, and optimize the compiler into submission. Until the
compiler provides acceptable performance for all security-minded code, zero
initialization is a useful (if blunt) tool.

I disagree with this. I think this is essentially defining a new
dialect of C++, which I have massive concerns about. Additionally, as
much as we might claim it’s a transitional measure, we all know that’s
not how it’ll be used in practice.

How is it different from other options like -fno-strict-aliasing or trapping on signed integer wrap?
These are all options that are “defining new dialect of C++”, yet are useful for some customers.
Why is it that bad to support a “new dialect” in these conditions?

IMO, it’s not very different but we were forced to support those for
compatibility with GCC. I don’t think people properly considered the
implications of either of those when adding them. Where possible we’ve
been trying to educate users and move them to sanitizers so that they
can fix their code.

Also, each additional dialect axis we add makes the problem exponentially worse.

FWIW, I think there are other distinctions:

  1. The strict aliasing difference is much smaller IMO, and in practice causes fewer issues. This may be happenstance because we relatively rarely optimize based on this, but the fact remains that I think this divergence will be much more rapid and significant.

  2. I think wrapping math is relatively rarely depended on, and quite easy to fix by casting to unsigned.

On the flip side, I think automatic initialization is a much more slippery and steeper slope.

But perhaps the biggest issue is what Tim mentions at the end – more dialects makes this worse, so the existence of initial dialects doesn’t seem to argue for more being OK.

Also, we forgot to mention the biggest one: -fno-exceptions. Despite being very valuable to some users (including myself), it is … extremely painful to have this divergence.

-Chandler

These are all options that are “defining new dialect of C++”, yet are useful for some customers.
Why is it that bad to support a “new dialect” in these conditions?

Also, we forgot to mention the biggest one: -fno-exceptions. Despite being very valuable to some users (including myself), it is … extremely painful to have this divergence.

Yes, new dialects are bad and, yes, exceptions are the worst. Herb Sutter explains this in tremendous detail in his C++ proposal for replacing exceptions: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p0709r0.pdf

-Troy

Would it be that drastic to have this require a code change/compiler rebuild to enable? It could be designed so the change is small/easy (changing a constant) but that the default compilers we all ship around (& especially not the official releases) don’t allow access to this functionality.

Anyone wanting to gather data would have to make this small change, rebuild their compiler, build their target with this feature & gather results from there.

This will cripple our ability to do measurements because in many cases we can only build things with whatever is the production compiler.
I’d rather just rename the flag to something like -ftrivial-auto-var-init=zero-SCARY-WARNING-ABOUT-VOID-WARRANTY-GOES-HERE

Reminds me of -fheinous-gnu-extensions. :wink:

Then you might as well maintain a patchset outside the main repository and require patching the sources.
What is time-consuming and discouraging is not the complexity of the changes, but the fact that one has to rebuild the compiler in the first place, and make any changes at all.

It would also make it much harder to build only part of a complex environment with the feature enabled - for example, building the underlying libraries with the default compiler, and the tools on top with the patched compiler.

-Andrea

Would it be that drastic to have this require a code change/compiler rebuild to enable? It could be designed so the change is small/easy (changing a constant) but that the default compilers we all ship around (& especially not the official releases) don’t allow access to this functionality.

Anyone wanting to gather data would have to make this small change, rebuild their compiler, build their target with this feature & gather results from there.

Then you might as well maintain a patchset outside the main repository and require patching the sources.

For the zero-out case, that’s what I’m suggesting. But I’m also suggesting the design of the non-zero-fill could be chosen to make that patchset relatively small/easy to maintain (essentially an intentional source-level extension point).
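The "intentional source-level extension point" could look something like the following sketch (illustrative only; these are not actual Clang identifiers): the zero-fill mode is gated behind a compile-time constant, so enabling it means flipping one line in a downstream patch and rebuilding the compiler.

```cpp
// Illustrative-only sketch of gating the zero-init mode behind a
// compile-time constant in the compiler sources. Names are invented
// for the example, not taken from Clang.
namespace autoinit_sketch {

enum class AutoVarInitKind { Uninitialized, Pattern, Zero };

// Flip to true in a local patch to expose the zero-fill mode.
constexpr bool EnableZeroAutoVarInit = false;

// Stock builds reject the zero mode; pattern stays available everywhere.
inline bool isModeAllowed(AutoVarInitKind K) {
    if (K == AutoVarInitKind::Zero)
        return EnableZeroAutoVarInit;
    return true;
}

} // namespace autoinit_sketch
```

The point of the design is that the patch stays a one-line diff, which keeps it cheap to maintain out of tree while still making zero-init inconvenient enough that build systems don't quietly come to depend on it.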

What is time-consuming and discouraging is not the complexity of the changes, but the fact that one has to rebuild the compiler in the first place, and make any changes at all.

It would also make it much harder to build only part of a complex environment with the feature enabled - for example, building the underlying libraries with the default compiler, and the tools on top with the patched compiler.

That’s kind of the goal - to increase the cost of usage somewhat so it’s less easy for it to become a baked in/depended on feature in the future.

  • Dave

Pretty much - and I don’t think any choice of spelling is bad enough that it’d make it substantially less likely that people would end up depending on the behavior in their non-buggy codepaths. Once the flag is written into someone’s build system, there it is: even if it’s absurdly long/verbose/angry/whatever, it’ll generally slip under the radar from then on.

  • Dave

One more data point: among the bugs found by MSAN in Chrome over the past few years, 449 were uninitialized-heap bugs and 295 were uninitialized-stack bugs.
So the proposed functionality would prevent ~40% (295 / (449 + 295) ≈ 39.7%, i.e. quite a bit!) of all uses of uninitialized memory (UUMs) in software like Chrome.

Would it be that drastic to have this require a code change/compiler rebuild to enable? It could be designed so the change is small/easy (changing a constant) but that the default compilers we all ship around (& especially not the official releases) don’t allow access to this functionality.

Anyone wanting to gather data would have to make this small change, rebuild their compiler, build their target with this feature & gather results from there.

Then you might as well maintain a patchset outside the main repository and require patching the sources.
What is time-consuming and discouraging is not the complexity of the changes, but the fact that one has to rebuild the compiler in the first place, and make any changes at all.

It would also make it much harder to build only part of a complex environment with the feature enabled - for example, building the underlying libraries with the default compiler, and the tools on top with the patched compiler.

+1

I just lurk here, but I think the proposed functionality would be greatly appreciated by C/C++/Obj-C developers on macOS, where MemorySanitizer is not supported and valgrind can't even launch TextEdit. If I'm not mistaken, it would be the *only* tool on macOS to catch UUMs.

Cheers,

Sean