Taint analysis


I was playing with the experimental taint analyzer and found a simple case
where the taint checker fails:

void test_bad() {
  char s[80];
  sprintf(s, "%s", "aaa");
  fscanf(stdin, "%s", s);
  printf(s); // expected-warning {{Uncontrolled Format String}}
}

If the sprintf call is commented out, the diagnostic is produced as expected.

Full testcase attached.

Dmitri Gribenko

taint-checker-fail.c (675 Bytes)

You should get the correct behavior if you include the header that declares 'stdin' and 'sprintf' (<stdio.h>) instead of declaring them yourself.
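
As a minimal sketch of that suggestion: the attached test case is not reproduced in this thread, so the function body below is taken from the snippet in the original report, and the assumption is that the only change needed is where the declarations come from. The uncontrolled format string is left in on purpose, since it is what the checker is expected to flag.

```c
#include <stdio.h>  /* declares stdin, sprintf, fscanf and printf */

/* Same body as the failing case; only the source of the declarations changes. */
void test_bad(void) {
  char s[80];
  sprintf(s, "%s", "aaa");  /* library call made before the tainted read */
  fscanf(stdin, "%s", s);   /* s now holds data read from stdin */
  printf(s);                /* intentionally unsafe: the warning is expected here */
}
```

A regular compiler may also warn about the non-literal format string here; that is separate from the analyzer diagnostic discussed in this thread.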

Here is a bit of background if you are interested.

Each variable has an associated symbol that represents its value. Function calls might change the values of global variables, and the analyzer represents this by replacing the symbols for global variables with new symbols. (Note that the analyzer is intraprocedural by default.) However, that rule is relaxed so that calls do not invalidate specific globals defined in system headers, such as 'stdin' (see commit 147569).

To recognize 'stdin' as one of the taint sources, we rely on the fact that it is the symbol first bound to an extern declaration with FILE* type, etc. If 'stdin' is not recognized as a system global, the corresponding symbol is reset after the call to 'sprintf' (or any other non-system call), so it no longer matches the symbol we look for.
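
The invalidation rule described above can be pictured with a toy model. This is only a sketch for intuition, not the analyzer's actual data structures: the names `global_var`, `invalidate_on_call`, and `is_taint_source` are hypothetical, and symbols are reduced to plain integer ids.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a global variable as tracked during analysis. */
struct global_var {
  const char *name;
  bool in_system_header;  /* true if the declaration came from a system header */
  int symbol;             /* id of the symbol currently standing for its value */
};

static int next_symbol = 100;  /* arbitrary starting id for fresh symbols */

/* Model a call to an unknown function: every global that is not recognized
   as a system global gets a fresh symbol. Returns how many were invalidated. */
size_t invalidate_on_call(struct global_var *globals, size_t count) {
  size_t invalidated = 0;
  for (size_t i = 0; i < count; ++i) {
    if (!globals[i].in_system_header) {
      globals[i].symbol = next_symbol++;
      ++invalidated;
    }
  }
  return invalidated;
}

/* Taint-source check modelled as "still bound to its first symbol". */
bool is_taint_source(const struct global_var *g, int initial_symbol) {
  return g->symbol == initial_symbol;
}
```

In this model, a hand-declared 'stdin' has `in_system_header` set to false, so after `invalidate_on_call` its symbol differs from the initial one and `is_taint_source` returns false, which mirrors the missing warning in the report.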