[clang-tidy] a simple strncmp checker

Hello cfe-dev,

I saw a presentation from someone using another static analysis tool (http://www.slideshare.net/Andrey_Karpov/200-open-sourceprojectslater) and in that presentation they mentioned finding a relatively simple error with strncmp. They searched for calls to strncmp where one or both of the string parameters were shorter than the limit provided. That is, we have something like:

bar = strncmp(foo, “asdf”, 6);

That this was flagged as an error by the static analyzer. I did a simple regex search through one of the larger code bases at work and found several errors of this type.

I’ve got a basic patch to clang-tidy to add this as a checker.

A few questions:

  1. Is this type of checking sound? If so, should I put it up for review?
  2. In the above example, would strncmp(foo, “asdf”, 5) be an error? That is, should we include the null byte at the end of the string?
  3. One possible valid use case of strncmp is to do prefix checking. In that case the limit n would be less than one or both of the strings. However, in my completely unscientific sample at work I saw what appear to be copy-paste errors where the string was changed but not the size with the resulting strncmp only checking a prefix of the given string. Should we warn in these cases as well?
  4. My clang-tidy checker is implemented with ASTMatchers for very simplistic checking. Would it be better to implement this for context-sensitive checking? If so, can someone point me to an existing checker to study?

Thanks,

Bob Kuo

In article <CAGuPgHKheLELEbvtsa0U9F8SxnRXydxCFyXiRhtfL8FHkEN0CA@mail.gmail.com>,
    Bob Kuo via cfe-dev <cfe-dev@lists.llvm.org> writes:

something like:

    bar = strncmp(foo, "asdf", 6);

That this was flagged as an error by the static analyzer. I did a simple
regex search through one of the larger code bases at work and found several
errors of this type.

Nice!

All my opinions (who else's opinions would they be?) follow...

1) Is this type of checking sound? If so, should I put it up for review?

Yes!

2) In the above example, would strncmp(foo, "asdf", 5) be an error? That
is, should we include the null byte at the end of the string?

IMO, it should not be an error. Perhaps a configurable parameter to
warn/error on this edge case.

If I was reading the code strncmp(foo, "asdf", 5) I would interpet
that to mean a check against the entire string "asdf\0". With 4 instead
of 5, I would interpret that to mean "asdf" was compared as a prefix.

3) One possible valid use case of strncmp is to do prefix checking. In
that case the limit n would be less than one or both of the strings.
However, in my completely unscientific sample at work I saw what appear to
be copy-paste errors where the string was changed but not the size with the
resulting strncmp only checking a prefix of the given string. Should we
warn in these cases as well?

IMO, only if I ask for such warnings via a configuration parameter.

4) My clang-tidy checker is implemented with ASTMatchers for very
simplistic checking. Would it be better to implement this for
context-sensitive checking? If so, can someone point me to an existing
checker to study?

For new clang-tidy checks, I recommend starting simple and working up
to more complicated situations based on feedback from actual usage.
My "simplify boolean expression" check benefitted greatly from
applications on the clang code base and submitting those changes as
diffs for review and incorporating feedback from those diffs into the
tool.

This code is not erroneous. It’s mildly suspicious, but given the second operand is a constant, only vaguely so. A style warning would be reasonable. No. Possibly when one of the arguments is a constant, but not otherwise.