Yet another static analysis tool

Hi, everyone!
I’m working in a fin-tech company called Itiviti (now Broadridge Trading & Connectivity Solutions). And in Itiviti, we care about code quality a lot, so, we already use a bunch of static analyzers such as clang-tidy, PVS, etc. But none of these tools had enough knowledge about our codebase and they couldn’t catch well-known bugs specific to the patterns that are frequently used in our code. So two years ago we started to develop a clang plugin the main goal of which was to close this gap in static analysis.

At first, all checks were specific to our codebase, but as the tool was developed, we started to add more general-purpose checks, e.g. catching inefficient patterns while working with maps. Later this plugin was used in the ITMO university as a homework checking tool in a C++ course.

As it stands now, the tool is built enough for it becoming an open-source project: GitHub link. We have started this project with almost zero knowledge about Clang tooling and LLVM in general, so we are asking you to take a look at it and see if it sparks your interest. We will be equally happy to receive feedback whether good, bad or ugly. Any thoughts are much appreciated!

Thanks in advance.

2 Likes

Hi!

I agree that some of those checks could be really useful for the general audience. In case you are interested in giving back to the LLVM community, the most impactful would be to open patches to add these checks to Clang Tidy. Tidy is already integrated with many IDEs, and the upstreaming process could be a great way to get feedback for each individual check from the reviewers. In the past, at one of my employers we also had a custom plugins with checks that we considered generally useful and ended up contributing them back to Tidy. We found the upstreaming process really rewarding and often ended up with a better quality check after the reviews.

1 Like

I agree with @Xazax-hun , If these checks will help the general C++ programmers , finding bad patterns in the general code. We really have to add these checks in the Clang Tidy, thus some patches will really appreciateble , and there are lot of folks (including me) would love to take part in the review/Convo.

Hi,

Looking over your list of checks these would be great additions to clang-tidy. Recently the documentation for contributing to clang-tidy was significantly extended, so we hope that lowers the bar for contributions.

Some of your checks seem style related, such as warning on functions declared static in a translation unit. The LLVM convention is to do exactly this – declare local functions static instead of inside an anonymous namespace. Other code bases have different style preferences, of course.

Hello!
I’m sorry, it is completely off-topic, but why? Is there a special reason for it?

The reasoning goes that it is easier to spot that the function is local to the TU when marked static, than when it is buried inside a large namespace block that is off-screen when looking at the function in question.

1 Like

A lot of these checks look universally useful indeed. Yes, clang-tidy is the natural habitat of such checks.

I see you went with visitors instead of ASTMatchers. It’s purely a matter of taste, you probably could have saved quite a bit of code if you went for ASTMatchers but in some cases visitors are definitely necessary due to higher flexibility. If you decide to put these checks into clang-tidy you’ll have to at least make a thin layer of matchers but it’s up to you how deeply do you want to convert.

If a check is specific to your proprietary codebase, there’s a good chance that it can be generalized via attributes. Eg., if you have a check that says

“don’t call top_secret_function()

it can often be generalized to

“don’t call any functions annotated as [[clang::dont_call_me]]

which is nicely upstreamable and may eventually attract attention of a lot of other contributors and be generally useful for everybody. (Of course this example is oversimplified and such attribute already exists).

2 Likes

I’m sorry, it is completely off-topic, but why? Is there a special reason for it?

See LLVM Coding Standards — LLVM 18.0.0git documentation, to quote:

The problem with anonymous namespaces is that they naturally want to encourage indentation of their body, and they reduce locality of reference: if you see a random function definition in a C++ file, it is easy to see if it is marked static, but seeing if it is in an anonymous namespace requires scanning a big chunk of the file.

Because of this, we have a simple guideline: make anonymous namespaces as small as possible, and only use them for class declarations.

1 Like

Also note that, in general, a suggestion to trade static for an anonymous namespace can cause subtle code breakage and shouldn’t be done blindly. See e.g. https://git.libreoffice.org/core/+/f213092d5a0b784df1b12ae62b7c5dcfb7d27280^! “loplugin:external: Extend comment about the perils of moving a function” for some inspiration.

For the benefit of readers: it can break ADL because ADL is namespace sensitive and you’re changing the namespace containing the function.