Exposing the diagnostic engine to C++

cjdb · August 28, 2023, 5:25pm

⚙ D157572 [clang] Add `[[clang::library_extension]]` attribute (author: @philnik) aims to add an attribute that lets standard libraries mark symbols as extensions. Based on the thread, it seems like the intention is to identify entities that have been backported, rather than to allow libc++ to identify support genuine extensions.

@AaronBallman has suggested that we take a more general approach to the matter, which would allow any librarian to warn on misuse of their library. This would be similar to __attribute__((diagnose_if(...))), but would also allow for the author to provide a warning group.

Given the amount of open design space here, I think it’s best to migrate the discussion from a Phabricator patch, into a Discourse thread, which will give it more visibility. Here are a few thoughts of mine:

D157572 is currently very specific: something more general would be ideal, but we should be prepared to see an ecosystem where user-defined diagnostics will have varying levels of quality
any new attribute we add should be in the C++11 attribute form, rather than the GNU attribute form (either disqualifying __attribute__((diagnose_if, or requiring it to be spelt as [[clang::diagnose_if as well)
there should be a way to error as well (I think this brings __attribute__((enable_if along for the ride)
we should consider whether these diagnostics have licence to add their own warning groups, or if they’re only allowed to opt into existing ones

Aaron also has pointed out that there are some knobs on diagnostics that might want to consider surfacing as well. I don’t have strong opinions here, other than that we should default to the existing defaults, and that named parameters would be preferable to [[clang::warn_if(condition, "text", "-Wgroup", true, true, false, true)]].

bcardosolopes · August 28, 2023, 6:26pm

Nice, this would be great!

Any specific reason why that shouldn’t just be a usual -Werror for a given warning / diag group?

+1

cjdb · August 28, 2023, 6:36pm

IIUC enable_if can diagnose preconditions at compile-time in ways that static_assert can’t. If that’s the case, then unconditionally erroring on those might be more appropriate.

constexpr int div(int const num, int const denom) noexcept
[[clang::error_if(denom == 0, "dividing by zero is an undefined operation")]]
{
  RUNTIME_PRECONDITION(denom != 0);
  return num / denom;
}

vvassilev · August 28, 2023, 6:44pm

Thanks for bringing this up, it seems a very good idea.

Have you thought about using this approach in the context of template instantiations? If we have such a facility could a library author provide a better template instantiation diagnostics?

tahonermann · August 28, 2023, 7:05pm

The proposed functionality appears to overlap with the std::constexpr_print() and std::constexpr_error() facilities Barry Revzin has proposed in P2758 (Emitting messages at compile time)

Endill · August 29, 2023, 7:51am

I like the idea. Library authors trying to give their users a good experience would love this.
Whatever we do, we should be explicit about what we promise our users, so that they know what can be relied upon, and what we reserve for ourselves to change in the future. Ideally this interface should be thoroughly checked during compilation.

+1

AaronBallman · August 29, 2023, 12:18pm

This is a hard request to fulfil – this attribute requires access to the parameter list so it can name parameters, and the only syntactic place that’s allowed for [[]]-style attributes is after the parameter list, which makes the attribute apply to the function type rather than declaration. It’s possible that this is actually the correct design (e.g., perhaps we want function pointer assignment to fail if the types aren’t similarly diagnosed) but this is also a pretty invasive approach in various ways – it would impact template specialization behavior, overloading, etc. Without WG21 or WG14 providing a way to write a function declaration attribute which can access the parameter list, I think we should continue to use __attribute__((diagnose_if)) with its current spelling.

Concretely, what I’m proposing is changing the current attribute from:

attr: diagnose_if ( expr , string-literal , string-literal )

which takes the condition to be tested, a string literal message to display to the user, and a string literal (enumerator) specifying warning vs error, into something more along these lines:

cat_enum : warning | error
category : string-literal | cat_enum
option_enum: ShowInSystemHeader | DefaultIgnore | DefaultError | SFINAEFailure
option-list: option_enum | option-list option

attr: diagnose_if ( expr , constant-expr , category ) |
      diagnose_if ( expr , constant-expr , category , string-literal ) |
      diagnose_if ( expr , constant-expr , category , string-literal , option-list )

which continues to take the condition to be tested, message to display to the user (now accepting any kind of C-string-producing constant expression), and category (spelled as either an enumerator or a string literal), but adds a string literal to specify the diagnostic group and a trailing list of enumeration options for the diagnostic. These trailing options are a subset of the ones we expose in our diagnostic .td files (and I’m not strongly tied to the spelling of the options so long as the enumerators are clearly named and documented).

I believe we can extend this interface in the future to cover other situations like diagnostic notes and diagnostic remarks, but those can be handled at a later date so we can focus on warnings and errors.

As a concrete example, this can be used like:

int isalpha(int ch)
  __attribute__((diagnose_if(ch > (2 << sizeof(char) * (CHAR_BIT - 1)),
                                             "can only pass 7-bit ASCII values",
                                              warning, "conversion", DefaultError)));

which would produce a warning diagnostic under the -Wconversion warning group which defaults to being an error whenever the function is passed a value that’s outside of the range of what ASCII encodes.

Some design considerations:

The trailing options can be accepted in any order, which also allows us to extend that list in the future should we need to.
I was thinking that the user can only specify existing warning groups. If they specify a warning group we do not support, I think the user should get an “unknown warning group” warning on the attribute, but the attribute otherwise continues to work, just issued under the generic warning group for diagnose_if. At some point in the future, we can consider extending the grammar to support InGroup ( string-literal ) as the warning group so the user can produce ad hoc groups, perhaps with an ability to optionally specify a parent diagnostic group.
We would need to make sure that diagnostic suppression mechanisms work with the attribute (command line, pragmas, fatal error count cut-offs, etc).
We may want to consider allowing users to pass in arguments to the diagnostic for diagnostic formatting similar to our C++ interface and uses of %select, etc, but for right now, I think accepting a constant expression is sufficient for configurability of the message itself.
This currently only supports warning and error diagnostics, but we may want to someday extend this to allow users to emit notes; I’m not proposing that right now because notes are typically generated at other locations along with another diagnostic and we have no cross-language way to specify where to emit such notes from the context of the error|warning attribute.

Thank you for sharing that! There is overlap but I don’t think this proposal would meet our needs (as such, I do not support the WG21 proposal in its current form). Specifically, we need the ability to control things the standard traditionally won’t talk about, like diagnostic groups, diagnostic categories, and the various control knobs we need. Further, that proposal is entirely unusable in C. We should keep an eye on this in WG21, but I don’t think we need to follow that direction at this time.

cjdb · August 29, 2023, 6:01pm

I was going to hold off on asking for this till the initial design had landed. I’d really like this for a bunch of stuff though!

Agreed. I think this should wait until the restructuring happens. I have a very vague idea of how to achieve this, but it shouldn’t be given more thought until both of these efforts are completed.

jankorous · August 29, 2023, 6:49pm

This is an important aspect for us at Apple - glad you’re thinking of it! The users need to have a way to suppress diagnostics coming from APIs outside of their control.

jyknight · September 22, 2023, 3:17pm

I think this is generally a great idea, but I’m worried about letting libraries add diagnostics to existing warning groups. It feels like that’s just going to lead to trouble, when users actually want the compiler emitted diagnostics, but not some random library-defined diagnostic that happened to use the same diagnostic group.

I wonder if it would be better to have all the diagnostics always use a consistent prefix. E.g. something like -Wlibrary-. Then, the group-name passed to diagnose_if can be any string the author likes, and the prefix will be prepended to it. So, for your example, the diagnose_if(..., warning, "conversion", DefaultError) would be emitted under the tag -Wlibrary-conversion, and be controllable (enabled, disabled, downgraded to warning) in the usual manner via that name.

cor3ntin · September 24, 2023, 1:55pm

This proposal offers a model for when side effects at compile time should be observed. And there are some exploration there to do.

Some warning should be emitted for template declarations, some on instantiation. Libc++ folks recently asked for a warning that is only emitted when a variable that is constant initialized is potentially odr used in a special context, which probably can’t be model by something user facing.
We also had requests about warning on defining deprecated template instantiations which are incorrectly silenced when the point of instantiation is a system header.

Adding a bunch of ways to control diagnostic groups and such is neat (although I am very concerned about the potential for abuse there, and we don’t actually have a way to define a “library”), but we also need to understand when the feature is useful as there are many places diagnostics can be useful, during constant evaluation and upon declarations being only two of them.

I do hope we do not try to reimplement format in there, or some sort of clang specific reflection/string interpolation mechanism for this feature.

AaronBallman · September 25, 2023, 7:38pm

I think it depends on whether we trust users or not, really. It’s certainly a feature that’s open to abuse, and so having some form of escape hatch makes some degree of sense. But at the same time, as a library author, it would be incredibly frustrating to not be able to add diagnostics under an existing diagnostic group when such a group is the best possible choice. There’s not much need to update -Wvla from a library, but I can definitely imagine wanting to update -Waddress for pointer-related diagnostics or -Wconversion for conversion-related diagnostics, etc.

What about this general idea but with a twist? Instead of prefixing automatically, we consider any user-defined warning group to automatically be a subgroup under the other warning group.

For example: some library adds a new diagnostic under -Wconversion, but a user wants to turn it off without turning off builtin -Wconversion diagnostics, so they use -Wno-conversion-from-user-defined-diagnostics, where -from-user-defined-diagnostics is automatically supported on all warning groups. Same idea applies as what you were thinking: -Werror -Wno-error=conversion-from-user-defined-diagnostics would work, etc.

What concerns me about automatically adding a prefix is how much stuff depends on the name of the warning group. e.g., pragmas to disable -Wconversion shouldn’t have to be updated to also disable -Wlibrary-conversion, same for command line arguments, etc. If we treat the user-defined warnings as a subgroup of the parent, then -Wno-conversion will behave the way a user expects by disabling both the builtin and the user-defined ones, instead of the user having to do -Wno-conversion -Wno-library-conversion, but there’s still control to do -Wconversion -Wno-conversion-from-user-defined-diagnostics.

(Note, I picked a suffix out of a hat and am not suggesting that exact phrasing; I think we can find a way to make it shorter.)

WDYT?

rnk · October 2, 2023, 5:31pm

I can’t speak for James, but I share his concern. I’d be fine with your solution. As long as there is a way to separate library and compiler diagnostics, that works for me.

I don’t have great ideas for a clean spelling, but for brainstorming purposes, here are some ideas:

-Wno-library=conversion: Similar to -Wno-error=, we could make -Wno-library=. I like the stronger separator between the group name and the “library” indicator.
-wlibrary=conversion: Same idea, extending -w, but this is probably worse than extending -Wno-.
-Wno-conversion,library: Different separator, blech.
Other ideas for “library”: “user-defined”, “extension”

cjdb · October 2, 2023, 5:38pm

We probably also want -Wno-library-error=conversion or -Wno-error-library=conversion.

Topic		Replies	Views
RFC: Improving Clang’s Diagnostics Clang Frontend	44	6892	April 28, 2023
RFC: c++ diagnostics Clang Frontend	2	122	April 5, 2010
Diagnostic Improvements Clang Frontend	9	99	November 18, 2008
[Feature request] Enable libclang to report specific diagnostic error Clang Frontend	0	78	March 21, 2014
Proposal: Add diagnostic enumeration to libclang Clang Frontend	9	97	January 10, 2012

Exposing the diagnostic engine to C++

Related topics