Who is using Static Analyzer? Where are our users?

As an open-source analyzer, our primary goal is to improve precision and usability. However, another question worth considering is how the analyzer is used in industry. I cannot find a page that introduces our users and the influence of our tool. Do we need to set up such a doc page? And if you are a user of CSA, either during coding or in the CI process, could you please leave a message here telling us about your project?

Here’s my current understanding:

  • There are IDEs that integrate the static analyzer, so all or some users of these IDEs may potentially use it:
    • Xcode (Apple’s IDE, the necessary choice for macOS and iOS apps.)
      • So that’s a lot of people, but hard to tell how many actually enable static analysis.
        • There are multiple WWDC videos attracting attention to it.
      • Static analysis is off by default; users can enable it on every rebuild in project settings or make one-off runs.
      • It’s enabled by default during build in some “project templates”, mostly driver projects where security is important.
    • QtCreator (a cross-platform open-source IDE, designed for Qt but also generally a good C++ IDE)
      • Enabled by default, together with clang-tidy.
        • Unlike static analyzer, most clang-tidy checks are disabled by default.
          • But they ship a few custom Qt-specific checks that are enabled.
      • Analysis results displayed “as you type”, with surprisingly low latency, I’m a big fan!
  • There’s a few large corporations that use the static analyzer internally:
    • Apple, through Xcode.
    • Ericsson, through CodeChecker.
      • They seem to have a healthy partnership with a university in Hungary that provides them with interns to work on the static analyzer, whom I regularly meet on Phabricator.
    • I heard Samsung considered it as part of its internal static analysis solution, but I haven’t heard from them in a while.
    • Google is probably using it together with clang-tidy, but I don’t know how much.
  • There’s a hard-to-estimate number of individual users:
    • People who install scan-build through Linux repositories.
      • A lot of them don’t even know what the Clang Static Analyzer is; they just know that scan-build is an open-source static analysis tool they can use.
      • It might be possible to find numbers of downloads from Linux repos.
    • A lot of clang-tidy users enable clang-analyzer-* checks. Text output is not ideal but we do receive a lot of valuable bug reports from them.
    • I don’t know what the situation with CodeChecker is, how many individual users use it at home, maybe Ericsson folks have some insights on it.
      • Is it provided in Linux repos?
  • While talking to people in GitHub bug reports, I’ve met a few maintainers of popular open-source projects who said they use it on a regular basis and are happy with it, but I don’t remember the exact details.
  • I’m embarrassingly oblivious about the situation with GitHub automation. Like, how easy is it to integrate the static analyzer into your GitHub pull request automation workflow? If it’s easy enough, maybe a lot of people are already using it?

There’s also always been some action from CodeSonar/GrammaTech. They contributed the SARIF output mode to the static analyzer, so I suspect they’re consuming our output for their needs, but I’ve no idea what happens next. And they’ve been sending some really good patches lately!

I use it to find bugs and cleanup opportunities in OpenZFS. Here is an example of a cleanup opportunity (which exposes a bug in Clang’s static analyzer):

Here is an example of a bug:

I hope to eventually eliminate all of the reports so that the analyzer can be integrated into our CI in a way that only new reports are generated. Failing that, I will probably find a way (possibly through CodeChecker) to have our CI show only new reports. For now, I am slowly working on eliminating the reports by modifying the source code. Often, even when a report does not point at an actual bug, changing the code to eliminate it is an overall improvement, so the rest of the project’s contributors respond positively to those changes.

I know of three other projects using Clang’s static analyzer:

zstd’s CI is using scan-build according to their documentation:

lz4 also uses scan-build in its CI, which is no surprise given that the same person made both zstd and lz4:

FreeBSD uses the static analyzer via make analyze:

Many thanks for your replies. However, I am asking for the names of our users here, rather than detailed issues.

For false positives and functionality improvements, you can submit issues to the LLVM repo here, and we will try to fix them in future versions. Besides, you can also apply temporary code fixes during CI as a workaround [1].

[1] van Tonder, Rijnard, and Claire Le Goues. “Tailoring programs for static analysis via program transformation.” In Proceedings of the ACM/IEEE 42nd International Conference on Software Engineering, pp. 824-834. 2020. (https://doi.org/10.1145/3377811.3380343)

At SonarSource, we also make heavy use of the clang-frontend as a library. The same applies to the static analyzer and its internals like the ExplodedGraph.

@ryao thanks for the writeup! Happy users make me happy!

Note that in scan-build HTML report filenames (report-XXXXXX.html), the XXXXXX part is a stable hash that you can use to identify the report. It’s stable across runs (assuming more or less the same build environment) and even somewhat resilient across source code changes. So you can find newly introduced reports by simply diffing your ls listings. This is a somewhat recent change; the names used to be random. I’m pretty sure CodeChecker uses the same hash to match the reports. Also, the HTML reports contain more machine-readable info about the problem, encoded into them as magic <!-- comments -->; in particular, that’s what scan-build uses to generate index.html in the first place, and you can regenerate index.html at any time (say, after deleting known reports) via scan-build --generate-index-only ..
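
To make the "diff your ls listings" idea concrete, here is a minimal Python sketch. It assumes you kept the scan-build output directory from an earlier (baseline) run next to the one from the current run, and that both contain files named report-XXXXXX.html as described above. The function names and directory arguments are made up for illustration; they are not part of scan-build.

```python
from pathlib import Path


def report_names(report_dir):
    """Collect the report-*.html file names found directly in one output directory."""
    return {p.name for p in Path(report_dir).glob("report-*.html")}


def new_reports(baseline_dir, current_dir):
    """Return file names present in the current run but missing from the baseline.

    This relies on the hash in the file name being stable across runs, so a
    name that only appears in the current directory is treated as a new report.
    """
    return sorted(report_names(current_dir) - report_names(baseline_dir))
```

A CI job could then fail only when new_reports(...) returns a non-empty list, which is roughly the "only report new reports" workflow mentioned earlier in the thread. Since the hash is only somewhat resilient to source changes, an occasional old report may show up as new after a large refactoring.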

The Linux kernel is wired to run the static analyzer. IIRC clang-tidy is used as the driver; it can run clang-analyzer checks.
