Recursive Descent Parser

Hi,

I am a newbie to clang and LLVM.

I understand that clang uses recursive-descent parsing to compile a C file.

I was wondering whether there is any way to print the production rules, or any other parser-related details, while clang parses a C file.

Thanks.

Bye,

Raghavan V

I don’t think there is. Have a look at clang::ParseAST: it keeps track of the call stack in case of a crash. It also keeps some statistics, but I think that’s all there is.

I suppose that someone could write a utility using Clang to dump the call graph starting at the base rule.

I’m not sure what Raghavan is trying to achieve, but I don’t think the graph would be very helpful. The grammar productions are available in Annex A of the standard (if that’s all he’s after), but AFAIK clang’s parser doesn’t map 1:1 to them.

Hi,

I just wanted to see how the recursive descent parser functions get activated. This is more for showing students how a recursive descent parser works in a production compiler rather than checking correctness.

I did something similar for gcc's 'cc1' by compiling 'cc1' with the '-finstrument-functions' option. This gives me control each time a function is entered or exited. Using that, I was able to generate something like this.

# The input C source file
$ cat -n test3.c
     1 int var1,var2;

# The output from my instrumented 'cc1' compiler when I compile the above file.

{ enter c_parser_translation_unit
   { enter c_parser_external_declaration
      { enter c_parser_declaration_or_fndef
         { enter c_parser_declspecs
            { enter c_parser_consume_token
               Token No:1 Lexeme:'int' Type:CPP_NAME
            } exit c_parser_consume_token
         } exit c_parser_declspecs
         { enter c_parser_declarator
            { enter c_parser_direct_declarator
               { enter c_parser_consume_token
                  Token No:2 Lexeme:'var1' Type:CPP_NAME
               } exit c_parser_consume_token
               { enter c_parser_direct_declarator_inner
               } exit c_parser_direct_declarator_inner
            } exit c_parser_direct_declarator
         } exit c_parser_declarator
         { enter c_parser_consume_token
            Token No:3 Lexeme:',' Type:CPP_COMMA
         } exit c_parser_consume_token
         { enter c_parser_declarator
            { enter c_parser_direct_declarator
               { enter c_parser_consume_token
                  Token No:4 Lexeme:'var2' Type:CPP_NAME
               } exit c_parser_consume_token
               { enter c_parser_direct_declarator_inner
               } exit c_parser_direct_declarator_inner
            } exit c_parser_direct_declarator
         } exit c_parser_declarator
         { enter c_parser_consume_token
            Token No:5 Lexeme:';' Type:CPP_SEMICOLO
         } exit c_parser_consume_token
      } exit c_parser_declaration_or_fndef
   } exit c_parser_external_declaration
} exit c_parser_translation_unit
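
For readers who have not used '-finstrument-functions': GCC inserts calls to two hooks, __cyg_profile_func_enter and __cyg_profile_func_exit, around every instrumented function. Below is a minimal sketch of hooks that could produce an indented enter/exit trace like the one above. It is not the code used for the cc1 experiment. It prints raw function addresses rather than names (mapping addresses to names, for example with addr2line, is left out), and trace_depth and trace_indent are names made up for this sketch.

```c
#include <stdio.h>

/* Current nesting depth of instrumented calls (hypothetical helper state). */
static int trace_depth = 0;

/* Print three spaces per nesting level, matching the layout of the trace.
 * The attribute keeps GCC from instrumenting the tracing code itself. */
__attribute__((no_instrument_function))
static void trace_indent(void)
{
    for (int i = 0; i < trace_depth; i++)
        fputs("   ", stderr);
}

/* Called on entry to every function compiled with -finstrument-functions. */
__attribute__((no_instrument_function))
void __cyg_profile_func_enter(void *this_fn, void *call_site)
{
    (void)call_site;
    trace_indent();
    fprintf(stderr, "{ enter %p\n", this_fn);
    trace_depth++;
}

/* Called just before every instrumented function returns. */
__attribute__((no_instrument_function))
void __cyg_profile_func_exit(void *this_fn, void *call_site)
{
    (void)call_site;
    if (trace_depth > 0)
        trace_depth--;
    trace_indent();
    fprintf(stderr, "} exit %p\n", this_fn);
}
```

The file holding these hooks would be linked into the program whose sources are compiled with '-finstrument-functions'. Printing the token lines seen in the trace needs extra code inside the token-consuming function and is not covered here.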

The trace maps closely to the following productions from the standard at
http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf. The section number at the end of each production refers to the section of that document.

Production                                                                   Section
translation-unit       : external-declaration                                6.9
external-declaration   : declaration                                         6.9
external-declaration   : function-definition                                 6.9
declaration            : declaration-specifiers init-declarator-list(opt) ;  6.7
declaration-specifiers : type-specifier declaration-specifiers(opt)          6.7
type-specifier         : int                                                 6.7.2
init-declarator-list   : init-declarator                                     6.7
                       : init-declarator-list , init-declarator              6.7
init-declarator        : declarator                                          6.7
declarator             : pointer(opt) direct-declarator                      6.7.6
direct-declarator      : identifier                                          6.7.6
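
To show how productions like these turn into functions, here is a toy recursive-descent parser in C that handles only the subset listed above (the 'int' type specifier and plain identifiers, no pointers or initializers) and prints an enter/exit trace as it goes. It is a teaching sketch, not code from gcc or clang; every function and variable name in it is made up for illustration.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

static const char *src;  /* unread part of the input */
static char tok[64];     /* current token; empty string at end of input */
static int depth;        /* nesting level used to indent the trace */
static int n_tokens;     /* tokens consumed so far */

static void enter(const char *fn) { printf("%*s{ enter %s\n", 3 * depth++, "", fn); }
static void leave(const char *fn) { printf("%*s} exit %s\n", 3 * --depth, "", fn); }
static int at(const char *s) { return strcmp(tok, s) == 0; }

/* Lexer: a token is a word (identifier or keyword) or one punctuation char. */
static void next_token(void)
{
    size_t n = 0;
    while (isspace((unsigned char)*src)) src++;
    if (isalpha((unsigned char)*src) || *src == '_') {
        while ((isalnum((unsigned char)*src) || *src == '_') && n < sizeof tok - 1)
            tok[n++] = *src++;
    } else if (*src) {
        tok[n++] = *src++;
    }
    tok[n] = '\0';
}

/* Log the current token and advance to the next one. */
static void consume_token(void)
{
    enter("consume_token");
    printf("%*sToken No:%d Lexeme:'%s'\n", 3 * depth, "", ++n_tokens, tok);
    next_token();
    leave("consume_token");
}

/* direct-declarator : identifier */
static int direct_declarator(void)
{
    enter("direct_declarator");
    int ok = (isalpha((unsigned char)tok[0]) || tok[0] == '_') && !at("int");
    if (ok) consume_token();
    leave("direct_declarator");
    return ok;
}

/* declarator : pointer(opt) direct-declarator   (pointers not handled) */
static int declarator(void)
{
    enter("declarator");
    int ok = direct_declarator();
    leave("declarator");
    return ok;
}

/* init-declarator-list: the left-recursive rule is written as a loop. */
static int init_declarator_list(void)
{
    enter("init_declarator_list");
    int ok = declarator();
    while (ok && at(",")) {
        consume_token();
        ok = declarator();
    }
    leave("init_declarator_list");
    return ok;
}

/* declaration : declaration-specifiers init-declarator-list(opt) ; */
static int declaration(void)
{
    enter("declaration");
    int ok = at("int");                 /* declaration-specifiers: only 'int' */
    if (ok) consume_token();
    if (ok && !at(";")) ok = init_declarator_list();
    if (ok) ok = at(";");
    if (ok) consume_token();
    leave("declaration");
    return ok;
}

/* translation-unit: one or more external declarations. Returns 1 on success. */
int parse_translation_unit(const char *text)
{
    src = text; depth = 0; n_tokens = 0;
    next_token();
    enter("translation_unit");
    int ok = tok[0] != '\0';
    while (ok && tok[0] != '\0') ok = declaration();
    leave("translation_unit");
    return ok;
}
```

Calling parse_translation_unit("int var1,var2;") from a main function prints a nested trace in the same spirit as the cc1 output, with one function per production. The loop in init_declarator_list is the usual way a recursive-descent parser copes with a left-recursive rule, which is one reason a real parser does not match the grammar one-to-one.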

Thanks.

Bye,
Raghavan V

If you only care about C: long ago, clang’s Parser talked to an abstract “Action” interface, and Sema was only one possible implementation of it. There also used to be a ParserPrintActions implementation that could be requested via -parse-print-callbacks. It was deleted in http://lists.cs.uiuc.edu/pipermail/cfe-commits/Week-of-Mon-20100719/032534.html , so if you check out a revision older than r109391 you can play with it. It might do what you want.