Greetings Clang Front End Developers!
My name is Benjamin Bales, and I am the CTO and co-founder of
QbitLogic, a startup company in Atlanta. We are interested in
sponsoring feature bounties to further develop the clang static
analyzer. Can someone recommend a point of contact with whom I can
open a dialogue? Let me know. Thanks!
Nice to hear! Would you consider clang-tidy, as the second part of clang's
static analysis framework, worthwhile too?
All the best, Jonas
Nice to hear from you. We could certainly add clang-tidy to our
roadmap. The primary reason we are interested in improving the clang
static analyzer is because we build our commercial product, CodeAI
(www.mycode.ai), on top of it. I am hiring an internal work force to
enhance the checkers, but I am interested in exploring open source
alternatives to push forward this work. I have some ideas, but I
think it would be best if we could arrange a phone call next week
sometime to discuss things. Are you the maintainer of this project?
No. I don't have any major position; I would just potentially benefit
from the bounties.
I don't know who has the power to decide about financial questions, but
the LLVM decision makers are most likely involved in it!
All the best, Jonas
Yeah, I guess the LLVM Foundation might be the right contact for the business/commercial side of things (not sure).
On the technical side - I'm currently reviewing a large portion of patches for the analyzer. I'd be happy to review and accept any patches, regardless of where they came from - under the usual terms of the LLVM developer policy (you should totally have a look at it). Most importantly, please discuss any non-trivial work before you start coding, and split it up into small incremental patches that gradually improve an experimental feature until it's ready to be turned on by default - this is called "developing under the flag". For example, a new checker will stay in the alpha package during development, and an experimental analyzer core feature will stay disabled by default behind an -analyzer-config option. The main point here is to speed up reviews and make sure you don't need to re-do anything - because it's extremely hard to do anything right with LLVM in isolation. Of course, I cannot guarantee that any particular patch is going to be accepted, at least not before I see it.
Thanks for your insightful responses! We’ve decided for now to first learn more about how development is done on the analyzer. We are primarily interested in contributing to the development of new checkers, particularly buffer issues (e.g. buffer overflow). Aside from looking at http://clang-analyzer.llvm.org/potential_checkers.html and http://clang-analyzer.llvm.org/checker_dev_manual.html, is there anything else we need to be aware of before we begin development? Also, what is the procedure for getting new checkers approved to add to the list of potential checkers?
All code that is to land must pass review, which is done via Phabricator:
I think you could get inspired by this check: (final commit) and (review process). Getting started with clang-tidy might be simpler; see for an introduction. Both clang and clang-tidy are good for different things, but it's better if an expert explains that; I think is a good overview of the basics.
The Analyzer's documentation is a bit fragmented, but not entirely lacking. The "Building a Checker in 24 Hours" video is a must-see (https://youtu.be/kdxlsP5QVPw).
Apart from what you've already found and what's inevitably present in the global llvm doxygen (I recommend googling clang class names, at least at first), there are also in-tree text documents explaining the overall design of some parts: https://github.com/llvm-mirror/clang/tree/master/docs/analyzer - if you're interested in buffer overflows you should definitely read the one about RegionStore.
There's also my out-of-tree workbook at https://github.com/haoNoQ/clang-analyzer-guide/releases/download/v0.1/clang-analyzer-guide-v0.1.pdf which reflects the current state of things (moderately up-to-date, some things have changed slightly), but my opinions on what was a good idea and what should work differently have changed since then.
There were two attempts to implement buffer overflow checks (namely ArrayBoundChecker and ArrayBoundCheckerV2), but neither of them has become feature-complete enough to be enabled by default yet. Not only are there false positives, but the warning messages are very hard to understand - they often require intermediate notes along the execution path that led to the bug (which are normally implemented via checker-specific BugReporterVisitor objects), and in this case they're relatively tricky to implement. I strongly suspect that any sensible work on buffer overflows should start with proposing a good bug report visitor for one of these checkers, then understanding its reports and seeing what parts are broken based on that. Generally, buffer overflows are hard to find statically; even though we seem suitable for that purpose overall and have some useful facilities, details may get very tricky.
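To make the point about intermediate notes a bit more concrete, here is a minimal sketch of code where an out-of-bounds index only arises along one specific path. Everything in it (kSize, pick_index, read_element) is hypothetical and written for illustration; it is not taken from either checker or its tests. The comments mark the places where a report would probably need a note for the final warning to make sense.

```cpp
#include <cstddef>

constexpr std::size_t kSize = 4;

// Hypothetical helper: computes the index used for the access below.
std::size_t pick_index(bool use_last, bool skip_one) {
  // Note 1 would go here: idx is kSize - 1 (3) when use_last is true.
  std::size_t idx = use_last ? kSize - 1 : 0;
  if (skip_one)
    idx += 1;  // Note 2 would go here: idx becomes 4 on this branch.
  return idx;
}

// Returns buf[idx], or -1 when idx is out of range.
int read_element(const int (&buf)[kSize], bool use_last, bool skip_one) {
  std::size_t idx = pick_index(use_last, skip_one);
  // This guard is added only so the sketch stays well-defined when run.
  // Without it, buf[idx] would read past the end of the array on the
  // single path where use_last and skip_one are both true.
  if (idx >= kSize)
    return -1;
  return buf[idx];
}
```

With the guard removed, a bare "out-of-bounds access" warning on the last line says little by itself: the reader has to know that idx was 3 and was then incremented, which is exactly the information a path note would carry.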
The list of potential checkers is stored in the clang source tree, much like the whole website, e.g.: https://github.com/llvm-mirror/clang/blob/master/www/analyzer/potential_checkers.html
So you can post improvements as code reviews.
Currently this list is not in great shape and needs cleanup: since its last major update we've realized that not all of these checkers are something that the analyzer is really good at. So please discuss before you start working on any of them. Be careful about checkers that require checking an invariant on all paths through a piece of code (e.g. SameResLogicalExpr - "the condition is true on all paths"). The analyzer is not guaranteed to explore all paths, we don't have good infrastructure for figuring out whether any paths were dropped, and for such bugs we cannot provide the good user experience that our additional bug report notes give for normal single-path bugs, where we explain the path to the user.
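As a rough illustration of why all-paths invariants are risky, consider the two hypothetical functions below (the names and the threshold 1000 are made up for this sketch). In the first, the condition really is true on every path. In the second, it is true only on the paths that an analysis which gives up after a few loop iterations would ever see, so claiming "always true" there would be a false positive.

```cpp
// The condition x == 0 holds on every path: both branches assign 0.
bool always_zero(bool flag) {
  int x;
  if (flag)
    x = 0;
  else
    x = 0;
  return x == 0;
}

// The condition x == 0 holds on most paths, but not on all of them.
// The path that sets x to 1 is reached only after 1000 loop iterations,
// so an analysis that drops paths after a few iterations never sees it.
bool usually_zero(unsigned n) {
  int x = 0;
  for (unsigned i = 0; i < n; ++i) {
    if (i == 1000)
      x = 1;
  }
  return x == 0;
}
```

A single-path bug needs only one explored path as evidence. An "always true" claim needs every path, including the ones that were dropped.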