I’m new to the mailing list (been lurking a few weeks). I just compiled the code and started looking at it.
At this point in life, I’m working with software analysis. In the past I’ve written interpreters for Domain Specific Languages, done code analysis, and written a lot of C and C++.
My desire is to help clang provide for software analysis what Parasoft’s C++Test does, but with a better scripting language than the “symbolic” language that C++Test provides. Most importantly, clang’s licensing will allow anyone to make use of it without costing several body parts. I’d love to be able to write out some language analysis rules that would be the equivalent of:
“flag any class that is missing a (copy constructor | assignment operator) when any of its (base class | contained class | base-of-contained class) has a non-POD data type.”
Once a set of predicates was written in such a scripting language, the set could be evaluated against a stored form of the AST, and those parts of software analysis which can be automated could then be put into the test process. Even those parts which are harder and require a CFG could be aided by such an approach, instead of hand-coding each query into a distinct binary.
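As a rough illustration of the kind of predicate I have in mind, here is a minimal C++ sketch over a toy class model. ClassInfo, selfOrBaseHasNonPod, reachesNonPod, and shouldFlag are all hypothetical names invented for this example; none of them are clang APIs, and a real version would run against the stored AST instead. The sketch also follows base classes transitively, which is only one possible reading of the rule.

```cpp
#include <string>
#include <vector>

// Hypothetical, simplified record of a class; NOT clang's actual AST types.
struct ClassInfo {
    std::string name;
    bool hasCopyCtor = false;              // user-declared copy constructor
    bool hasCopyAssign = false;            // user-declared assignment operator
    bool hasNonPodMember = false;          // directly holds a non-POD data type
    std::vector<const ClassInfo*> bases;   // base classes
    std::vector<const ClassInfo*> members; // contained class types
};

// True if the class itself, or any of its bases (followed transitively),
// directly holds non-POD data.
bool selfOrBaseHasNonPod(const ClassInfo& c) {
    if (c.hasNonPodMember) return true;
    for (const ClassInfo* b : c.bases)
        if (selfOrBaseHasNonPod(*b)) return true;
    return false;
}

// True if non-POD data is reachable through a base class, a contained
// class, or a base of a contained class.
bool reachesNonPod(const ClassInfo& c) {
    for (const ClassInfo* b : c.bases)
        if (selfOrBaseHasNonPod(*b)) return true;
    for (const ClassInfo* m : c.members)
        if (selfOrBaseHasNonPod(*m)) return true;
    return false;
}

// The rule: flag a class that lacks a copy constructor or an assignment
// operator while non-POD data is reachable as described above.
bool shouldFlag(const ClassInfo& c) {
    return (!c.hasCopyCtor || !c.hasCopyAssign) && reachesNonPod(c);
}
```

In a scripting language the three functions would collapse into a line or two of predicates; the point of the sketch is only that the rule reduces to simple queries over class relationships.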
Question 1) Does this sound interesting? I’d be working on such features only when I have no specific tasks to do at work - such a tool would aid my work once it matured, so I’m allowed to use my “down time” to work on it. [And my family time is far too valuable.]
I have a question about the parser design - it looks like it was written by hand: was it?
For C, that’s not so hard, as the C spec is pretty straightforward. But for C++, a hand-written parser seems to me to be a bit more difficult - especially with C++0x’s changes.
Question 2) Is there any interest in using a lex/yacc type of approach?
I’ve both written parsers by hand and used parser generators to create front ends - and the latter makes for considerably less work when the language is well-defined.
Question 3) How does clang know when it’s being targeted at C vs C++? There are some areas in which valid C is invalid C++, so the parser (if exact for both languages) would either have to know the difference or have to switch modes. Or is C++ perhaps being treated as a superset of C?
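To make concrete the kind of difference I mean, here is a small C++ sketch with the C-only forms left in comments. The function names makeInt and charLiteralSize are made up for this example.

```cpp
#include <cstddef>
#include <cstdlib>

// Allocates one int on the heap; the caller releases it with std::free.
int* makeInt(int value) {
    // Valid C, rejected by a C++ compiler, because void* does not convert
    // implicitly to int* in C++:
    //     int *p = malloc(sizeof(int));
    int* p = static_cast<int*>(std::malloc(sizeof(int)));
    if (p) *p = value;
    return p;
}

// Valid C, invalid C++, because 'class' is a C++ keyword:
//     int class = 0;

// Accepted by both languages but with different meaning: in C++ a character
// literal has type char, so this returns 1; in C the literal has type int,
// so the same expression yields sizeof(int).
std::size_t charLiteralSize() {
    return sizeof('a');
}
```

A parser that is exact for both languages has to get each of these cases right, which is why I am asking how the mode is chosen.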