Clang tutorial

Hi everyone,
is there any Clang tutorial that explains how to embed the parser into another C/C++ program?

I started with this: clang tutorial

but with the new build of Clang my program doesn't work anymore because it can no longer parse some headers!

I guess the best option is to have a look at clang-cc.cpp, but it's rather huge. Is there a stripped-down version?

thanks, Simone

Here's a highly simplified version of clang-cc.cpp that contains the
bare essentials to get things working. It takes C code from stdin,
and writes LLVM assembly to stdout. Hopefully, it's a decent starting
point.

-Eli

clang-cc-simple.cpp (3.02 KB)

I am basically 10 days ahead of you. I have modified the tutorial files from amnoid.de.
You can also go through Steve Naroff's presentations on the Clang website. If you are looking into AST traversal, look at the ASTConsumers; the AST dumper in particular is a good example. If you are looking into rewriting, look at the ObjC rewriter.
You can use the file that Eli sent, or modify the clang-cc top level to act as your own top level.
Clang relies on the LLVM command-line interface to parse command-line arguments, and you can easily extend it to parse your own switches.

The doxygen documentation will be useful as you go on.

Have fun.

Moataz

PPContext.h (1.34 KB)

tut01_pp.cpp (112 Bytes)

tut02_pp.cpp (803 Bytes)

Hello,
thanks for the example, but this is what I am already doing in my program.

The problem is that every time I include some header like stdio.h in my code I get this error from the parser:

Adding /home/spellegrini/shared/openmpi-1.3.2-gcc433/include
** Parsing file: bench/example.c **
initialize
In file included from bench/example.c:7:
/home/spellegrini/shared/llvm/lib/clang/1.0/include//stdarg.h:29:9: error: unknown type name '__builtin_va_list'

This did not happen with previous versions of Clang, so what has changed? What's the problem with __builtin_va_list?
How can I make Clang parse it without complaining?

cheers, Simone

Eli Friedman wrote:

My best guess is that you're missing a call to InitializePreprocessor.

-Eli
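A small test input can help confirm whether the predefines are in place once the InitializePreprocessor call has been added. The sketch below is only an illustration: the function name sum_ints is hypothetical, and it uses C++ with <cstdarg> rather than the C input from the thread. With Clang's own stdarg.h, va_list is a typedef of __builtin_va_list, so a file like this exercises the same type that the error above complains about.

```cpp
#include <cassert>
#include <cstdarg>

// Hypothetical helper: sums `count` int arguments passed variadically.
// Any use of va_list pulls in the compiler's builtin va_list type, which is
// the type the parser reported as unknown in the error message above.
int sum_ints(int count, ...) {
  assert(count >= 0 && "count must not be negative");
  va_list args;
  va_start(args, count);
  int total = 0;
  for (int i = 0; i < count; ++i) {
    total += va_arg(args, int);  // read the next int argument
  }
  va_end(args);
  return total;
}
```

If the embedded parser accepts a file like this without the "unknown type name" error, the builtin type is most likely being defined as expected.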

Your guess was correct! :)

but now I have another error when clang is parsing the standard headers:

** Parsing file: bench/example.c **
initialize
In file included from bench/example.c:8:
In file included from /usr/include/stdio.h:72:
In file included from /usr/include/libio.h:32:
/usr/include/_G_config.h:50:25: warning: field '__cd' with variable sized type 'struct __gconv_info' not at the end of a struct or class is a GNU extension
0 my_app 0x000000000063dca0
1 my_app 0x000000000063e181
2 libpthread.so.0 0x000000300090c5b0
3 m_app 0x00000000005909ff
...
25 libc.so.6 0x000000300021c40b __libc_start_main + 219
26 my_app 0x000000000040d6fa std::ios_base::Init::~Init() + 90
Stack dump:
0. Program arguments: ./build/my_app -I/home/spellegrini/shared/openmpi-1.3.2-gcc433/include bench/example.c
1. /usr/include/_G_config.h:52:5: current parser token '__combined'
2. /usr/include/_G_config.h:45:9: parsing struct/union body
3. /usr/include/_G_config.h:48:3: parsing struct/union body '::'
Segmentation fault (core dumped)

So the warning should be just a warning; why is the parser exiting with a segmentation fault?

Here is the backtrace:

(gdb) backtrace
#0 0x00000000005909ff in Lexer (this=0x7fbfff7140, fileloc={ID = 91216}, features=@0x0,
    BufStart=0x8fb780 "/* This file is needed by libio to define various configuration parameters.\n These are always the same in the GNU C library. */\n\n#ifndef _G_config_h\n#define _G_config_h 1\n\n/* Define types for libio"...,
    BufPtr=0x8fbc4a "__cd;\n struct __gconv_step_data __data;\n } __combined;\n} _G_iconv_t;\n\ntypedef int _G_int16_t __attribute__ ((__mode__ (__HI__)));\ntypedef int _G_int32_t __attribute__ ((__mode__ (__SI__)));\ntypede"..., BufEnd=0x8fc1f0 "") at Lexer.cpp:116
#1 0x000000000059445f in clang::Lexer::MeasureTokenLength (Loc={ID = 91216}, SM=@0x7fbfffec90, LangOpts=@0x0) at Lexer.cpp:235
#2 0x00000000005bdc88 in clang::TextDiagnosticPrinter::EmitCaretDiagnostic (this=0x8c2610, Loc={ID = 91216}, Ranges=0x7fbfff74b0, NumRanges=0, SM=@0x7fbfffec90,
    Hints=0x0, NumHints=0, Columns=170) at TextDiagnosticPrinter.cpp:296
#3 0x00000000005bebaf in clang::TextDiagnosticPrinter::HandleDiagnostic (this=0x8c2610, Level=clang::Diagnostic::Warning, Info=@0x7fbfff7710)
    at TextDiagnosticPrinter.cpp:706
#4 0x00000000005fc0ed in clang::Diagnostic::ProcessDiag (this=0x7fbfffe658) at Diagnostic.cpp:476
#5 0x000000000042dbe4 in clang::DiagnosticBuilder::Emit (this=0x7fbfff7a00) at /home/spellegrini/llvm/tools/clang/lib/Parse/../../include/clang/Basic/Diagnostic.h:468
#6 0x000000000042a8c0 in ~SemaDiagnosticBuilder (this=0x7fbfff7a00) at Sema.cpp:319
#7 0x0000000000434382 in clang::Sema::ActOnFields (this=0x7fbfffde10, S=0x8c5150, RecLoc={ID = 91181}, RecDecl={Ptr = 0x92dcf0}, Fields=0x7fbfff9b10, NumFields=2,
    LBrac={ID = 91190}, RBrac={ID = 91261}, Attr=0x0) at SemaDecl.cpp:4060
#8 0x00000000005d2ad2 in clang::Parser::ParseStructUnionBody (this=0x7fbfffe410, RecordLoc={ID = 91181}, TagType=13, TagDecl={Ptr = 0x92dcf0}) at ParseDecl.cpp:1392
#9 0x00000000005d8f48 in clang::Parser::ParseClassSpecifier (this=0x7fbfffe410, TagTokKind=clang::tok::kw_struct, StartLoc={ID = 91181}, DS=@0x7fbfffcf20,
    TemplateInfo=@0x7fbfffa490, AS=clang::AS_none) at ParseDeclCXX.cpp:598
#10 0x00000000005cf4c3 in clang::Parser::ParseDeclarationSpecifiers (this=0x7fbfffe410, DS=@0x7fbfffcf20, TemplateInfo=@0x7fbfffa490, AS=clang::AS_none)
    at ParseDecl.cpp:931
#11 0x00000000005cf86c in clang::Parser::ParseSpecifierQualifierList (this=0x7fbfffe410, DS=@0x7fbfffcf20) at ParseDecl.cpp:454
#12 0x00000000005d20e0 in clang::Parser::ParseStructDeclaration (this=0x7fbfffe410, DS=@0x7fbfffcf20, Fields=@0x7fbfffae40) at ParseDecl.cpp:1236
#13 0x00000000005d25ec in clang::Parser::ParseStructUnionBody (this=0x7fbfffe410, RecordLoc={ID = 91143}, TagType=12, TagDecl={Ptr = 0x8fd7d0}) at ParseDecl.cpp:1337
#14 0x00000000005d8f48 in clang::Parser::ParseClassSpecifier (this=0x7fbfffe410, TagTokKind=clang::tok::kw_union, StartLoc={ID = 91143}, DS=@0x7fbfffdae0,
    TemplateInfo=@0x7fbfffdb60, AS=clang::AS_none) at ParseDeclCXX.cpp:598
#15 0x00000000005cf4c3 in clang::Parser::ParseDeclarationSpecifiers (this=0x7fbfffe410, DS=@0x7fbfffdae0, TemplateInfo=@0x7fbfffdb60, AS=clang::AS_none)
    at ParseDecl.cpp:931
#16 0x00000000005d327a in clang::Parser::ParseSimpleDeclaration (this=0x7fbfffe410, Context=0, DeclEnd=@0x7fbfffdc20, RequireSemi=true) at ParseDecl.cpp:270
#17 0x00000000005d3517 in clang::Parser::ParseDeclaration (this=0x7fbfffe410, Context=0, DeclEnd=@0x7fbfffdc20) at ParseDecl.cpp:250
#18 0x00000000005c73d9 in clang::Parser::ParseExternalDeclaration (this=0x7fbfffe410) at Parser.cpp:428
#19 0x00000000005c74a5 in clang::Parser::ParseTopLevelDecl (this=0x7fbfffe410, Result=@0x7fbfffe5b0) at Parser.cpp:334
#20 0x0000000000429962 in clang::ParseAST (PP=@0x8c6780, Consumer=0x7fbffff2d0, Ctx=@0x7fbfffee80, PrintStats=false, CompleteTranslationUnit=true) at ParseAST.cpp:64
#21 0x000000000040fd62 in main (argc=3, argv=0x7fbffff4e8) at main.cpp:446

You might want to use "Diags.setSuppressSystemWarnings(true);" to get
rid of warnings from system headers, since they're generally not
useful.
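The effect of such a switch can be sketched with a toy model. Everything below is a hypothetical stand-in written for illustration (MockDiag, MockDiagEngine and their members are not Clang's real classes); it only assumes that the switch drops warnings located in system headers while still letting errors and user-code warnings through.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration only; NOT Clang's real types.
enum class Level { Warning, Error };

struct MockDiag {
  Level level;
  bool inSystemHeader;  // true if the diagnostic points into a system header
  std::string message;
};

class MockDiagEngine {
  bool suppressSystemWarnings = false;
  std::vector<std::string> reported;

public:
  // Named after the call mentioned in the thread; behavior here is assumed.
  void setSuppressSystemWarnings(bool v) { suppressSystemWarnings = v; }

  void report(const MockDiag &d) {
    // Drop only warnings that originate in system headers.
    if (suppressSystemWarnings && d.inSystemHeader &&
        d.level == Level::Warning) {
      return;
    }
    reported.push_back(d.message);
  }

  const std::vector<std::string> &getReported() const { return reported; }
};
```

In this model, the GNU-extension warning from /usr/include/_G_config.h would be filtered out, while a warning in bench/example.c would still be shown.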

As for the crash: frame #1 of the backtrace shows LangOpts=@0x0, so I think
you want something like the "DiagClient.setLangOptions(&LangInfo);" call in
my example.

-Eli
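The failure mode in the backtrace (a diagnostic printer that reaches for language options it was never given) can be illustrated with a toy model. The classes below are hypothetical stand-ins, not Clang's real TextDiagnosticPrinter or LangOptions; the sketch only assumes that the printer keeps a pointer which stays null until setLangOptions is called, and it uses an assertion where the real program hit a segmentation fault.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins for illustration only; NOT Clang's real classes.
struct MockLangOptions {
  bool GNUMode = true;
};

class MockTextDiagPrinter {
  const MockLangOptions *LangOpts = nullptr;  // unset until setLangOptions()

public:
  void setLangOptions(const MockLangOptions *LO) { LangOpts = LO; }
  bool isConfigured() const { return LangOpts != nullptr; }

  // Formats a warning. Reading LangOpts while it is still null is the
  // analogue of the LangOpts=@0x0 frame in the backtrace; here the
  // assertion fails with a readable message instead.
  std::string render(const std::string &Msg) const {
    assert(LangOpts && "call setLangOptions() before emitting diagnostics");
    std::string Out = "warning: " + Msg;
    if (LangOpts->GNUMode) {
      Out += " [GNU extension]";
    }
    return Out;
  }
};
```

The design point is ordering: the printer has to be handed its language options before the first diagnostic is emitted, which in the thread happens as soon as a system header triggers a warning.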