Taking ownership of /clang/test/Analysis

In a github comment @Xazax-hun mentioned that the directory /clang/test/Analysis “has tests that are independent of the CSA”. I became curios and investigated this, which showed that practically all testcases “belong to us” and we can take ownership of the directory.

Most tests (i.e. files that contain “RUN” lines) contained either %clang_analyze_cc1 calls or %clang_cc1 calls with a separately added -analyze flag. (By the way should we standardize all these tests to use %clang_analyze_cc1?)

I listed the “exceptional” test files with

for i in $(find . -type f); do
  if grep -q 'RUN' $i && ! grep -q 'clang_analyze_cc1\|-analyze\b' $i; then
   echo $i;
  fi;
done                     

and got the following list:

  • PR9741.cpp
  • analyzer-list-configs.c
    • tests -analyzer-config-help
  • cxx-for-range-cfg.cpp
    • added in 2012
    • tests another crash in CFG construction
  • dead-stores.c
    • analyzer test that uses %check_analyzer_fixit
  • func-mapping-test.cpp
    • uses %clang_extdef_map – if I understand correctly that also “belongs to” the analyzer
  • method-arg-decay.m
    • strange testcase from 2008 which invokes clang like %clang_cc1 -analyzer-checker=core -verify %s
    • does -analyzer-checker automatically imply -analyze ??
  • preprocessor-setup.c
    • tests behavior of __clang_analyzer__
  • show-checker-list.c
    • tests -analyzer-checker-help
  • uninit-asm-goto.cpp
    • relatively recent test file from 2020…2023
    • tests -Wuninitialized
    • apparently not related to the static analyzer
  • uninit-sometimes.cpp
    • old test file from 2012…13
    • tests -Wsometimes-uninitialized
    • apparently not related to the static analyzer
  • builtin_signbit.cpp
  • analyzer-checker-option-help.c
    • tests -analyzer-checker-option-help

plus 15 .dot files under exploded-graph-rewriter/ and 9 .test files under scan-build/. (Those .dot and .test files test subprojects that “belong to” the static analyzer, so I think they clearly deserve to be here.)

Based on this I’ll find more suitable places for the three test files (uninit-asm-goto.cpp, uninit-sometimes.cpp, builtin_signbit.cpp) that are not related to the static analyzer – probably by contacting the contributors who have recent commits connected to them.

However, I think we can already start “taking ownership” of the directory as these three exceptions don’t represent a significant external interest. @Xazax-hun (or somebody else who knows the process) could you configure the github bots to tag commits editing files under /clang/test/Analysis with clang:static analyzer?

Moreover, what do you think about standardizing use of %clang_analyze_cc1 instead of %clang_cc1 ... -analyze or the strange solution in method-arg-decay.m? Should I create a commit that ensures that the analyzer is started consistently in all the test files?

2 Likes

Thanks for looking into this! I will try to get the bots updated.

PR open: Include test folder in the Clang Static Analyzer team mentions by Xazax-hun · Pull Request #127810 · llvm/llvm-project · GitHub

Quick status update:

  • Github bots assign clang:static analyzer to commits that edit /clang/test/Analysis.
  • The test file builtin_signbit.cpp was relocated to a more suitable place; the relocation of the two uninit- testcases is under review.

Remaining action points:

  1. Should I standardize the use of %clang_analyze_cc1 instead of %clang_cc1 -analyze?
  2. Should we rename the clang/test/Analysis directory to clang/test/StaticAnalyzer or would this cause too much trouble e.g. in downstream forks?

Should I standardize the use of %clang_analyze_cc1 instead of %clang_cc1 -analyze?

Yes. Thanks.

Should we rename the clang/test/Analysis directory to clang/test/StaticAnalyzer or would this cause too much trouble e.g. in downstream forks?

I’m not exactly sure how many tests we have for CFG, and liveness.
For example: live-stmts.cpp.
The would need to be relocated to a suitable place too to some other place.

Should I standardize the use of %clang_analyze_cc1 instead of %clang_cc1 -analyze?

Yes. Thanks.

I dug into the Python scripts that manage the execution of these lit tests, and I realized that %clang_cc1 -analyze is not completely equivalent to %clang_analyze_cc1 because %clang_analyze_cc1 expands to %clang_cc1 -analyze %analyze and then later the logic in clang/test/Analysis/analyzer_test.py will replace %analyze with either “-analyzer-constraints=range” or “-analyzer-constraints=z3 -DANALYZER_CM_Z3”. (So if the test uses %clang_cc1 -analyze, this range-or-z3 logic wouldn’t work.)

FYI the Z3 logic is already broken. Probably beyond repair.

FYI the Z3 logic is already broken. Probably beyond repair.

That is what I suspected. Then – by the way – WDYT about removing this “tests can be executed with either range or z3 constraint manager” logic from the repository?

(The code quality of these Python scripts is also very questionable – if in the future somebody wanted to have this feature in their testing, they would be better off with writing their own 50 lines of python instead of fixing the existing 100 lines of spaghetti.)

I’m for removing that effectively dead code. No bots check it, nor would pass the tests anyway if you would force them to run.

I accidentally found this discussion after forgetting about it for a month; and I decided that I will try to eliminate the dead code that tries to support -analyzer-constraints=z3 and standardize all tests to use either %clang_analyze_cc1 or %clang_cc1 -analyze.

However, if I remove the constraint manager handling trickery, then %clang_analyze_cc1 won’t have any benefit over %clang_cc1 -analyze, so I’m leaning toward switching every testcase to %clang_cc1 -analyze and removing the alias %clang_analyze_cc1 (because I don’t think that it provides any benefit, so I think it’s better to be more transparent).

@steakhal What would you think about this direction?

I find the change from %clang_analyze_cc1 to %clang_cc1 -analyze not really justified at this point. They are about the same length. I think I’d opt for less merge-conflicts for downstream users because the benefit is not quite there to offset the downsides.

Ok, I’ll standardize %clang_analyze_cc1 then. I vaguely remembered that both options appear in many test files, but it turns out that %clang_analyze_cc1 has a huge majority (928:30), so keeping it is clearly a better choice.