RFC: Making explicit the relationship between modules, checks and check aliases

This may have been brought up before…

Currently there exists an implicit relationship between the names of clang-tidy modules (e.g. in the module registry, the names are “ZZZ-module” and all the checks registered in that module are named “ZZZ-…”).

No explicit relationship is given between a check and its aliases; rather the aliases are simply registered in the module containing the alias referencing the type of the check from the other module’s namespace, e.g.

CheckFactories
    .registerCheck<clang::tidy::readability::BracesAroundStatementsCheck>(
        "google-readability-braces-around-statements");

Without manually chasing the type BracesAroundStatementsCheck in your IDE it’s difficult to see where a check’s aliases are present and the relationship between check names and module names is only discoverable by scanning registration text strings.

Can we make these relationships more explicit? Namely:

  • Currently modules all name themselves “ZZZ-module”; I suggest that the “ZZZ” part of the module be modeled explicitly as part of the ClangTidyModule.
  • Registering a check in a module prefixes the module’s name to the check before it enters the registry, or at the very least verifies that it uses the same prefix as the module’s prefix.
  • The name of a check is encapsulated in the check type. Currently the name of a check is hard-coded in the module’s addCheckFactories method.
  • The aliases of a check are encapsulated in the check type. Currently the names of aliases are hard-coded in the addCheckFactories method of the module containing the alias.

Explicitly modeling the alias relationship has the advantage that you can avoid running the same check multiple times when check wildcards are used, provided they don’t have different configured options. (Configuring an alias differently than the check for which it is an alias seems goofy, but allowed in the current config file mechanism.)

The implication is that the addCheckFactories method becomes a sequence of statements like this:

BracesAroundStatementsCheck::register(this, CheckFactories);

for each check that is registered in the module. The module’s this is used to obtain the module’s name prefix for use when registering the check/alias.

The specific syntax/calls are simply proposed from the idea that occurred to me today and not necessarily the result of any BDUF thinking.

Thoughts?

2 Likes

Personally I would change only one thing.
Make aliases a real aliases instead of double check registration.
This mean that if I enable alias Y to check X, then X check should be created and enabled. And when I run --list-checks, then I should see there X, not Y.

In check factories it could be like:
.registerAlias(“readability-my-x-check”, “cppguidelines-my-y-alias”)
.registerAlias(“readability-my-x-check”, “cert-my-next-alias”);

As for moving name of check from Factory to for example Check class (like in constructor argument to base class), it can make one issue, that you may need to create all checks, this mean also call constructors, that could parse options, that could not be configured for check, and cause checks to fail/crash.
So best way would be to leave names outside of check class.

But if this registration would be per-check, like in this “register” static function, and all aliases also would be there, then this could be fine.

Basically that would be tiny change…

Check names (and aliases) are static properties of the class, not of an instance. So getting the name of a check doesn’t mean instantiating a check. It’s essentially meta-data for a check. It never changes at runtime.

This mean that if I enable alias Y to check X, then X check should be created and enabled. And when I run --list-checks, then I should see there X, not Y.

I think users will find this confusing because they shouldn’t be forced to understand the aliasing relationship between checks.

1 Like

I’ts already confusing when you have same check enabled multiple times under different names, different modules with different configurations.
Simply check aliases should be aliases, not “new checks”.
This mean that they should not be printed with --list-checks, or they could be printed with things like --list-checks-details, that could print info like some short description, aliases, configuration options.

What you propose is fine, it doesn’t impact end user. I would just do one more step, and cleanup current mess with aliases.

1 Like

With regards to aliases, do remember that sometimes they are not pure aliases. They are indeed “new checks”, which can differ in default configuration settings, set by the module configuration.

The definition of a check has 2 parts:

  • The check implementation.
  • The check options.

Only when both parts are identical can we say that an alias is a true alias. And this is why it’s so hard to get rid of them :slight_smile:

1 Like

IMO, an alias with different defaults should be explicitly modeled as such and not through alternative mechanisms.

Would be good to move with this…
I would see this more like this:

In module file just leave:

CheckFactories.registerCheck<SomeCheck>();

On check level add static method:

struct CheckInfo
{
   StringRef Name; // including module prefix
   std::vector<StringRef> Aliases; // including module prefix
};
static CheckInfo getCheckInfo();

Then whatever check is enabled via check name or it alias, we enable it always as under main check name.
Same goes for check configuration, we take first one found, with preference of check name config.
Same for NOLINT, we allow all, check name or alias (backward compatibility).
For current aliases with different config we register them as new check (derive from original class & re-implement getCheckInfo).

Method can be different, it’s fine if it would be a register but main thing is to check class to control everything, and aliases become well just aliases, not new checks, therefore check would be enabled if itself is enabled or any aliases are enabled. Aliases wouldn’t be listed under --list-checks, there we would list only original check name. we could have something like --list-aliases to just list aliases, or --checks-info to list some more detailed, maybe include simple check description or supported options.

Anyway I think that We should do something with this problem, because its very problematic to manage multiple check aliases.

Hi all! I’m new to llvm and have done few contributions in lldb and now moving to clang. I’m coming here from this issue:

I’m willing to work on the problem discussion above. Any general advice or guidance for me on how to kick-start my journey into solving this problem, I’d be grateful for the assistance. Thanks in advance!

This it’s a long standing problem in clang-tidy, with quite a few patches trying to fix it without success.

I don’t believe this is solved by throwing code at the problem. Check aliases are at the core of clang-tidy, and some rearchitecting will be needed to change that.

Thus, what i believe is needed is some proper design proposal that we iterate on and agree upon. Once that’s in place, coding will be the easier part.

1 Like

Yes, explicit modeling of a check and it’s named aliases is what’s needed to avoid running the same check repeatedly under different names.