Hand-Written Compiler Frontend

I have little experience with compiler frontends. I am curious why
Clang's frontend is hand-written instead of automatically generated by
compiler compilers such as lex and yacc. What are some advantages of a
hand-written compiler frontend?

Thanks

Xin

A couple reasons off the top of my head:

1. Parsing C++ (C/ObjC to a lesser extent) is extremely difficult with
a generated parser since the languages are context-dependent. Also,
with C++ things get really nasty because there are some syntactic
constructs that have to be disambiguated by the compiler in rather
elaborate ways; see lib/Parse/ParseTentative.cpp for the most
significant ones (consider the disambiguation that has to happen to
resolve the ambiguity described in the comment in
<http://clang.llvm.org/doxygen/ParseTentative_8cpp_source.html#l00440>).

2. It's really hard (if not impossible) to give good diagnostics with
a compiler compiler. Good diagnostics are extremely important for
Clang.

3. Clang's parser simultaneously handles C/C++/ObjC/ObjC++ and a huge
variety of different language options for each one, all of which need
to be properly recognized or diagnosed if valid/invalid given the
current language and settings. This would be difficult to do with a
compiler compiler.

4. You can hack on it without knowing any fancy compiler compiler's
domain-specific language for expressing parsing and the theory behind
it (LR, LALR, LL(k), etc.). All you need to know is C++. This is not
to say that hacking on it is "easy" (for a variety of reasons, one
being that it is poorly documented), but you can at least look at the
code and understand what it is doing.

5. This is similar to the last one, but worth stating separately: it
is easy to reason about performance, and it's easy to optimize using
"normal" techniques for optimizing C++. Clang is very fast.

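To make points 1 and 2 a bit more concrete, here is a minimal sketch of
the kind of thing a hand-written parser can do. It is a toy, not Clang's
code: every name in it (StmtKind, ParseResult, tokenize, parseStatement)
is made up for illustration, and it only handles statements of the shape
"A * B;". The point is that the parser can ask a symbol table whether
the first identifier names a type before deciding what the statement
means, and it can word each diagnostic for the exact spot where parsing
went wrong.

```cpp
#include <cctype>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical toy types; none of these names come from Clang.
enum class StmtKind { Declaration, Expression, Error };

struct ParseResult {
  StmtKind kind;
  std::string diagnostic;  // empty unless kind == StmtKind::Error
};

static bool isIdentStart(char c) {
  return std::isalpha(static_cast<unsigned char>(c)) || c == '_';
}

static bool isIdentChar(char c) {
  return std::isalnum(static_cast<unsigned char>(c)) || c == '_';
}

// Split the input into identifiers and single-character punctuation.
std::vector<std::string> tokenize(const std::string &src) {
  std::vector<std::string> toks;
  std::size_t i = 0;
  while (i < src.size()) {
    if (std::isspace(static_cast<unsigned char>(src[i]))) {
      ++i;
    } else if (isIdentStart(src[i])) {
      std::size_t start = i;
      while (i < src.size() && isIdentChar(src[i])) ++i;
      toks.push_back(src.substr(start, i - start));
    } else {
      toks.push_back(std::string(1, src[i]));
      ++i;
    }
  }
  return toks;
}

// Parse a statement of the shape `A * B ;`.
// The same token sequence is a pointer declaration if A names a type and
// a multiplication otherwise, so the parser has to consult typeNames.
ParseResult parseStatement(const std::string &src,
                           const std::set<std::string> &typeNames) {
  const std::vector<std::string> toks = tokenize(src);

  if (toks.size() < 3 || !isIdentStart(toks[0][0]) || toks[1] != "*" ||
      !isIdentStart(toks[2][0])) {
    return {StmtKind::Error, "expected a statement of the form 'A * B;'"};
  }
  // A targeted message: we know exactly which token the ';' should follow.
  if (toks.size() < 4 || toks[3] != ";") {
    return {StmtKind::Error, "expected ';' after '" + toks[2] + "'"};
  }
  if (toks.size() > 4) {
    return {StmtKind::Error, "unexpected token '" + toks[4] + "' after ';'"};
  }

  // Context-dependent decision: same syntax, different meaning.
  if (typeNames.count(toks[0]) != 0) {
    return {StmtKind::Declaration, ""};
  }
  return {StmtKind::Expression, ""};
}
```

With "T" registered as a type name, parseStatement("T * x;", ...) comes
back as a declaration and parseStatement("a * b;", ...) as an
expression; leaving off the semicolon gives an error that names the
token the ';' was expected after. A real C++ parser needs far more than
this (tentative parsing, backtracking, scopes), but the control flow is
still ordinary C++ that you can step through in a debugger.
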
--Sean Silva

And the generated parser code is ugly. :(