[RFC] Control -Wvla for conformant parameters

Hello,

It appears that -Wvla tends to be used for two different purposes,

- Code compatibility with pre- and post-C99 code
- Good practices since VLA can have security implications

-Wvla seems to only serve the former one satisfactorily now. For the latter
purpose, conformant array parameters would still generate VLA diagnosis
although they degrade into pointers and have no runtime implications. There is
currently no way to inhibit such false positives.

  $ clang -Wvla -x c -std=c99 - <<< "void f(unsigned n, int a[n]);"
  <stdin>:1:26: warning: variable length array used [-Wvla]
  void f(unsigned n, int a[n]);
                          ^

I propose that a separate warning against conformant parameters be extracted
from -Wvla, so that it is possible to negate it. I would be more than glad if
I could attempt to work out the implementation should this proposal be
accepted.

Best regards,
Hu Jialun

Hello,

It appears that -Wvla tends to be used for two different purposes,

- Code compatibility with pre- and post-C99 code
- Good practices since VLA can have security implications

-Wvla seems to only serve the former one satisfactorily now. For the latter
purpose, conformant array parameters would still generate VLA diagnosis
although they degrade into pointers and have no runtime implications. There is
currently no way to inhibit such false positives.

        $ clang -Wvla -x c -std=c99 - <<< "void f(unsigned n, int a[n]);"
        <stdin>:1:26: warning: variable length array used [-Wvla]
        void f(unsigned n, int a[n]);
                                ^

I propose that a separate warning against conformant parameters be extracted
from -Wvla, so that it is possible to negate it. I would be more than glad if
I could attempt to work out the implementation should this proposal be
accepted.

I think this makes a fair amount of sense. In the standard, VLAs are
split into two concepts: a variably-modified type and a variable
length array. (Basically, the type of a VLA is a variably modified
type, so you get a VM type in function parameters but a VLA in local
object declarations.) A variably modified type is not the same level
of concern as a variable length array, so I think it makes sense to
give a distinction with our warning groups.

One interesting trouble is with the diagnostic names. Users are used
to -Wvla warning them about VMs already, so removing warnings they may
have wanted to get seems like a bad thing. Also, VM types cannot exist
without VLA support in the compiler (aka `__STDC_NO_VLA__`), and -Wvla
tells you all of the places you're forming a type that may not be
portable to other compilers. So my gut instinct is to leave -Wvla
alone and make a new subset grouping for VMs. But the downside to this
is: the names are exactly backwards. All VLAs are a VM, but not all
VMs are a VLA, so we'd want -Wvm to be the superset and -Wvla to be
the subset. I'm not certain what the best way forward here is, but I
suspect we can pick a defensible path later,

~Aaron

It appears that -Wvla tends to be used for two different purposes,

  • Code compatibility with pre- and post-C99 code
  • Good practices since VLA can have security implications

-Wvla seems to only serve the former one satisfactorily now. For the latter
purpose, conformant array parameters would still generate VLA diagnosis
although they degrade into pointers and have no runtime implications. There is
currently no way to inhibit such false positives.

$ clang -Wvla -x c -std=c99 - <<< “void f(unsigned n, int a[n]);”
:1:26: warning: variable length array used [-Wvla]
void f(unsigned n, int a[n]);
^

FWIW, I personally do consider this code to have a “security implication,” in the moral-hazard sense. People frequently pass “array parameters” thinking that that’s what’s happening, when in fact they’re silently decaying to pointers. Particularly with VLAs, but also with non-VLA arrays, this can be a big deal.
https://quuxplusone.github.io/blog/2020/03/12/contra-array-parameters/

void helper(char mcs_mask[IEEE80211_HT_MCS_MASK_LEN]) {
for (int i=0; i < sizeof(mcs_mask); ++i) { // totally wrong
mcs_mask[i] &= SOME_MASK;
}
}

So I disagree with your “only serve the former [code compatibility bullet point]”; I think the current situation is fine and dandy.
But I see your point about there being two technically-different features at the machine level, and maybe it would be nice to put them under different warnings. A backward-compatible way to do that would be

  • Change -Wvla from a single warning to a warning group
  • Add -Wvla-parameter as a new warning; include it in the -Wvla warning group

Alternatively, and closer to your original suggestion:

  • Add -Warray-parameter as a new warning, for things like
    void f(int n, int a[n]);
    void f(int n, int a[2]);
    which are equally error-prone. In fact, the Linux kernel issue mentioned above uses the non-VLA syntax, and currently gets no warning from Clang at all, even in -Weverything mode.

  • People who enable -Wvla should probably also enable -Warray-parameter. Clang might enable it under -Wextra (with a special carveout for main(int, char *[])), or just leave it off-by-default.

I think this makes a fair amount of sense. In the standard, VLAs are
split into two concepts: a variably-modified type and a variable
length array.

(I’m coming from the C++ side with no particular C-standardese expertise.) FWIW, I don’t see a benefit to introducing the term “variably modified type” all the way up in the command-line warning-option interface. I think it may make sense to mention the term in the wording of the diagnostic message, just not in its name.
IIUC, Clang currently misdiagnoses
https://godbolt.org/z/6x7399eqP

return sizeof(int (*)(int[n])); // -Wvla warns here (wrongly)
because int (*)(int[n]) is not a variably modified type — is that right? So it would be cool to clean up some of this code at the same time.

–Arthur

>
> It appears that -Wvla tends to be used for two different purposes,
> - Code compatibility with pre- and post-C99 code
> - Good practices since VLA can have security implications
>
> -Wvla seems to only serve the former one satisfactorily now. For the latter
> purpose, conformant array parameters would still generate VLA diagnosis
> although they degrade into pointers and have no runtime implications. There is
> currently no way to inhibit such false positives.
>
> $ clang -Wvla -x c -std=c99 - <<< "void f(unsigned n, int a[n]);"
> <stdin>:1:26: warning: variable length array used [-Wvla]
> void f(unsigned n, int a[n]);
> ^

FWIW, I personally do consider this code to have a "security implication," in the moral-hazard sense. People frequently pass "array parameters" thinking that that's what's happening, when in fact they're silently decaying to pointers. Particularly with VLAs, but also with non-VLA arrays, this can be a big deal. https://quuxplusone.github.io/blog/2020/03/12/contra-array-parameters/

    void helper(char mcs_mask[IEEE80211_HT_MCS_MASK_LEN]) {
        for (int i=0; i < sizeof(mcs_mask); ++i) { // totally wrong
            mcs_mask[i] &= SOME_MASK;
        }
    }

So I disagree with your "only serve the former [code compatibility bullet point]"; I think the current situation is fine and dandy.

FWIW, I suspect we can be smarter here. As in your example above, if
we notice that the array index is a constant (either a literal or via
a macro as above), we could suggest a fix-it to turn it into a static
array extent, or a fixit to turn it into a pointer to an array of
specific size, etc.

But I see your point about there being two technically-different features at the machine level, and maybe it would be nice to put them under different warnings. A backward-compatible way to do that would be
- Change -Wvla from a single warning to a warning group
- Add -Wvla-parameter as a new warning; include it in the -Wvla warning group

Yup, I think this is the important bit -- there are two features here,
and folks (including WG14 members) have wished there was more
granularity in terms of what gets diagnosed.

Alternatively, and closer to your original suggestion:
- Add -Warray-parameter as a new warning, for things like
    void f(int n, int a[n]);
    void f(int n, int a[2]);
  which are equally error-prone. In fact, the Linux kernel issue mentioned above uses the non-VLA syntax, and currently gets no warning from Clang at all, even in -Weverything mode.
- People who enable -Wvla should probably also enable -Warray-parameter. Clang might enable it under -Wextra (with a special carveout for `main(int, char *[])`), or just leave it off-by-default.

"Error prone" will require some data to back it up. Some users
consider both of those signatures to be perfectly fine at expressing
their intent, others find them to be horribly confusing. Whatever we
do, we should run it over a large corpus of C code to see what the
diagnostic behavior looks like.

I think this makes a fair amount of sense. In the standard, VLAs are
split into two concepts: a variably-modified type and a variable
length array.

(I'm coming from the C++ side with no particular C-standardese expertise.) FWIW, I don't see a benefit to introducing the term "variably modified type" all the way up in the command-line warning-option interface. I think it may make sense to mention the term in the wording of the diagnostic message, just not in its name.

I don't have a particular care about the name of the diagnostic
itself. I was mostly pointing out that there are separable features.
In fact: http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2778.pdf

IIUC, Clang currently misdiagnoses
https://godbolt.org/z/6x7399eqP
    return sizeof(int (*)(int[n])); // -Wvla warns here (wrongly)
because `int (*)(int[n])` is not a variably modified type — is that right? So it would be cool to clean up some of this code at the same time.

I'd have to stare at the C standard a while to be sure, but I think
that is a variably modified type, but is not a VLA. Regardless, I
agree that there's likely terminology that needs to be cleaned up as
part of this.

~Aaron

IIUC, int (*)(int[n]) is the same type as int (*)(int*) — and yes, I should have thought to spell that out clearly in my first message. :slight_smile:

–Arthur

>
> IIUC, Clang currently misdiagnoses
> https://godbolt.org/z/6x7399eqP
> return sizeof(int (*)(int[n])); // -Wvla warns here (wrongly)
> because `int (*)(int[n])` is not a variably modified type — is that right? So it would be cool to clean up some of this code at the same time.

I'd have to stare at the C standard a while to be sure, but I think
that is a variably modified type, but is not a VLA. Regardless, I
agree that there's likely terminology that needs to be cleaned up as
part of this.

IIUC, `int (*)(int[n])` is the same type as `int (*)(int*)` — and yes, I should have thought to spell that out clearly in my first message. :slight_smile:

You nerd sniped me into checking. :smiley: It's a VLA... which decays into
a pointer... so everyone is correct.

C2x (N2596) 6.7.6.2p4 specifies that the type of the parameter is a
variable length array type.
C2x (N2596) 6.7.6.3p6 is what does the array adjustments that convert
it from an array to a pointer.

I should note, GCC also diagnoses that as a VLA.

~Aaron