Patches for enabling clang's noundef analysis by default

Hello all,

I would like to get feedback for two patches that enable clang’s noundef analysis flag by default.
The two patches are: https://reviews.llvm.org/D105169 , https://reviews.llvm.org/D108453 .
They are splitted for readability, but they will be merged into one commit and pushed if accepted.

The noundef analysis flag was added by D81678 <https://reviews.llvm.org/D81678> in the past.
Its goal is to mark arguments and return values in C/C++ as noundef if legal.

Besides its performance benefit to sanitizers, which was the main motivation, attaching noundef is beneficial because it allows quite a few optimizations that are unsound w.r.t. undef or poison (they are usually guarded with isGuaranteedNotToBeUndefOrPoison check).

Since the fact that arg/ret values are noundef is derived from the source language (C/C++)'s specification, the information is permanently lost unless explicitly attached by clang (as other dereferenceable/align/… attributes from C/C++ do).

Previously, the flag was not activated by default because it required a lot of tests to be updated.
This can raise conflicts with downstream patches (discussed in this thread: D82317 <https://reviews.llvm.org/D82317>)

To avoid conflicts, I’d like to update requested tests by simply adding -disable-noundef-analysis at // RUN: %clang_cc1 ... rather than updating their texts (which is what D108453 is already doing).
For the requested directories or tests, I’ll simply add the flag to the RUN: command.

This will reduce the opportunity of raising conflicts with downstreams.

Any questions or concerns about enabling the flag are appreciated.
The two patches are written by Hyeongyu Kim, and I’m a messenger for the patches. :slight_smile:

Sincerely,
Juneyoung

I can add one thought regarding why this direction makes sense and why it doesn't constrain optimizations.

Traditionally we don't want to mark too many things as UB as it restricts code movement and thus limits optimizations. That's why we have poison for e.g. signed arithmetic overflow rather than using UB as allowed by the C standard.
For function calls, however, optimizers are already super constrained: in general we cannot move them around because they may not return, they may throw an exception, they may modify memory, do a syscall, and so on. So adding another restriction to function calls isn't a big deal; optimizers don't touch them that much anyway.

This proposal adds more UB, as it turns undef/poison into UB on function calls, but that doesn't limit optimizations. So it seems like a good tradeoff: we learn some values can't be undef/poison "for free". Plus that UB can be dropped if needed; dropping noundef is legal! So there are really no downsides.
That's why I believe this is a good direction for clang. I'm sure other frontends will also find this 'noundef' useful.

Nuno