RFC: Make Multiple Unsequenced Modifications an Error by Default

Currently, Clang is able to correctly diagnose multiple, unsequenced modifications to the same variable (see below code and warning).

int foo(int a) {
    return (++a) * (++a);
}
// warning: multiple unsequenced modifications to 'a' [-Wunsequenced]

Because this is undefined behavior that is shown to be diagnosed in some cases, I suggest that the default diagnostic level of this be turned from warning to error. This will force developers to either explicitly acknowledge and accept the UB (or preferably fix it), which should be the approach we take to UB in our code bases.

Pinging @cjdb, as we had discussed this and wanted to get more thoughts from the community.

3 Likes

Provided -Wunsequenced only diagnoses undefined behaviour, I’m very much in favour of having -Werror=unsequenced by default.

Just a quick test, this produces the same warning:

int foo(int a, int b) {
    return a + b;
}

int main() {
    int a = 3;

    std::cout << foo(++a, ++a) << std::endl;

    return 0;
}

Here, we’ve moved from undefined to unspecified behavior.

Right, that shouldn’t be an error by default (even if I’d like it to be). Should we split the cases up?

Definitely agree that I’d prefer it to be an error by default, but it shouldn’t be an error by default. I’m fine with splitting the cases up. Is there precedent for having the same diagnostic be an error in one case and a warning in another?

1 Like

I’m in favor of exploring whether we can turn this into a warning that defaults to an error. We will have to ensure the false positive rate is acceptable, so I would recommend trying to build a significant corpus of code to see what breaks as a result and whether the breakages are expected/reasonable. (I would sure hope that not a lot of code breaks as a result of this, but I’ve been surprised in the past.)

2 Likes

I’d also like that to be an error by default, but I don’t think it’s practical. I think the cases would have to be split up. The unspecified behavior is certainly a portability warning (and possibly a correctness one), but the undefined behavior is pretty squarely a correctness diagnostic.

Just for clarity since I read this wrong the first time: you’re saying that the current -Werror=unspecified by default isn’t practical, not splitting them up is impractical?

Sorry for introducing confusion! I’m saying that unsequenced modifications resulting in unspecified behavior (rather than undefined behavior) are not probably practical to diagnose as an error by default (due to how much existing code I would expect to break as a result). I think splitting -Wunsequenced into the undefined and unspecified cases would be a good idea so we can make the undefined variant of it default to an error.

So the end result I’m imagining would be to add two new diagnostic groups under -Wunsequenced, one for the undefined behavior case (error by default) and one for unspecified behavior (warning by default).

2 Likes

This seems perfectly reasonable. As for cases where we look for false positives, do we want to consider unspecified behavior triggering this diagnostic as a false positive (such as my example above), or just cases that are well defined that trigger the diagnostic as a false positive.

I’m envisioning two separate diagnostic groups, one for undefined behavior, one for unspecified behavior. I’d consider it a false positive to diagnose something that is unspecified under the undefined flag or to diagnose something that is undefined under the unspecified flag. (Or something that is neither undefined nor unspecified under either flag.)

1 Like

I’ve gone ahead and made a few cases to baseline any changes against in Godbolt. Compiler Explorer

I’ve got 2 true positive cases of undefined behavior showing up, a false positive for the diagnostic appearing in C++17 (see below), and an true positive case of unspecified behavior showing up.

void false_positive_17() {
    int i = 2;
    auto fn = [](int a, int b) {
        
    };

    fn(i = -2, i = -2); // multiple unsequenced modifications to 'i' [-Wunsequenced]
}

If someone could clarify for me, if we called fn(i = -2, i = -3), would this be defined or undefined behavior? Having a bit of trouble parsing the wording of the standard for this.