[RFC] Diagnosing use of VLAs in C++

Variable length arrays (VLAs) are a feature of C99 and we treat them as a conforming language extension in C++. The feature reuses typical array declaration syntax, as in:

void func(int n) {
  int array[n]; // This is a VLA
}

void other(int n, int array[n]); // This is a variably modified type,
                                 // also a kind of VLA but without stack allocation.

Use of a VLA will perform a stack allocation to create an array whose size is determined by the argument passed to the function. However, the reuse of constant array declaration syntax coupled with the potential for a user-controlled stack allocation, make use of VLAs a security concern. For example:

https://nvd.nist.gov/vuln/detail/CVE-2015-5147
https://nvd.nist.gov/vuln/detail/CVE-2020-11203
https://nvd.nist.gov/vuln/detail/CVE-2021-3527

Given the recent advice from certain government agencies to avoid C and C++ due to poor security practices and a lack of coverage with tooling (Consumer Reports, National Security Agency) and our recent efforts to improve security related diagnostics, I am proposing we diagnose use of this extension in C++ more aggressively than we have previously. Specifically, I would like to warn about the extension by default in -std=c++NN language modes and add the warning to -Wall for -std=gnu++NN language modes. We currently issue diagnostics for this under -Wvla-extensions; I’m not proposing changing the behavior of what code gets diagnosed, just whether -Wvla-extensions is enabled by default or not.

I have put up a patch for the changes at https://reviews.llvm.org/D156565 and there has been some positive feedback, but I wanted to see if there were wider concerns from the community before moving forward. Note, I have filed an issue with GCC to consider making the same changes. That issue is not resolved, but discussion has died down over this past month and does seem to have some support as well but it is not guaranteed they’ll make the same changes. The primary concern with enabling the diagnostic by default is that it’s a conforming language extension, so you can enable warnings for it with -pedantic, and we don’t typically issue a diagnostic by default for non-problematic use of an extension (aka, a congratulatory diagnostic). While I agree with this logic in general, I think use of VLAs in C++ should still be discouraged by default in favor of more idiomatic features (like std::vector, etc) given the ease of accidental usage coupled with the security concerns.

If the community doesn’t have concerns with moving forward, I would like to land these changes early in the 18.x cycle so that we have a long bake time in case there’s surprising fallout from enabling the diagnostic.

8 Likes

I’m in favor of the change. Refactoring can result in a declaration that previously declared an array to be unintentionally changed to a VLA and even senior programmers declare VLAs unintentionally occasionally. Elevating awareness of possible unintended use makes sense to me.

3 Likes

This sounds good to me. In our code, most instances seem unintentional (1477195 - chromium - An open-source project to help move the web forward. - Monorail).

1 Like

One place where this might give a little trouble is in code that uses VLAs as compile-time assertions:

#include <assert.h>

#define fancy_assert(X) do { \
  if (__builtin_constant_p(X)) { \
    char compile_time_assert[(X) ? 1 : -1]; \
    (void)compile_time_assert; \
  } else { \
    assert(X); \
  }\
} while(0)

void check_dynamic(bool v) {
    fancy_assert(v);
}

void check_static() {
    fancy_assert(true);
    fancy_assert(false);
}

These aren’t using VLAs for VLAs sake, but will break under -Werror if the warning is turned on by default in those language modes.

I’m not aware of an alternative means of enabling this pattern (maybe we should have a builtin to make it a first-class construct?) That said, this is easy enough to work around with _Pragma("clang diagnostic ignored \"-Wvla\""), so I don’t see a big issue.

1 Like

Thanks, that’s an interesting use case! I believe those usages do exist in the wild and they would be caught out by the behavior change, though I suspect many uses can be replaced by static_assert because they’re less fancy than this. Having a first-class replacement for that would help people migrating, but wouldn’t help the diagnostic from being chatty in the first place.

That said, I’m not certain whether this would be too chatty or not. We do have the suppression mechanism for it, but perhaps this is a case where we’d want to recognize the pattern (specifically, creating a VLA whose extent is specified by a conditional expression where one branch is a negative integer literal) and issue a diagnostic under a separate group. Then we can suggest replacing:

int old_style_static_assert[p ? 1 : -1];

with

static_assert(p);

with that diagnostic, which might be a kindness for users in language modes that support the construct.

3 Likes

If you’ll forgive a slight tangent (I recently encountered this exact pattern)…one reason this construct seems dangerous is neither C nor C++ warns if someone forgot the __builtin_constant_p check:

extern int foo;

#define fancy_assert(X) do { \
    char compile_time_assert[(X) ? 1 : -1]; \
    (void)compile_time_assert; \
} while(0)

void check_static() {
    fancy_assert(foo == 1);
    fancy_assert(foo == 2);
}

Admittedly I’m not exactly sure what clang would warn on outside of VLA usage if this happened, but the above compiles (without -Wvla).

In C++ -Wvla does helpfully provide a note alongside the VLA warning:

<source>:9:18: note: read of non-const variable 'foo' is not allowed in a constant expression
    fancy_assert(foo == 1);
                 ^
<source>:1:12: note: declared here
extern int foo;
           ^

But it’s an easy mistake to make, especially if macros are involved in the expressions being asserted. (Out of curiosity I searched GitHub and found several examples of macros with this issue, e.g. postgres.)

Naturally, static_assert does fail in this situation (error: static assertion expression is not an integral constant expression).

With that scenario in mind and back to the topic at hand, I agree that pointing people to using static_assert instead would be a welcome kindness. :slight_smile:

I’m surprised that coroutines weren’t mentioned in this discussion.

As i know, its impossible to determine frame size for coroutine using VLAs and because of that it must be ill-formed, but i tryed to check:

And… Its internal compiler error:

1 Like