Adding new keywords to the lexer

Hello,

I am attempting to modify Clang to add a simple language extension which requires, as a first step, adding a few additional keywords to the lexer. I have modified:

./include/clang/Basic/TokenKinds.def

To add my keywords, e.g.:

KEYWORD(myfirstkeyword, KEYALL)
KEYWORD(mysecondkeyword, KEYALL)

After recompiling, I try to run clang with this simple C++ program to verify that clang still works on source files that do not use these keywords:

#include <iostream>

using namespace std;

int main(int argc, char** argv){
    cout << "Hello, Clang!" << endl;
    return 0;
}

$ clang++ hello.cpp

And the command fails with the following assertion / stack trace:

Assertion failed: (isAnnotation() && "Used AnnotVal on non-annotation token"), function setAnnotationValue, file /Users/nickw/llvm/tools/clang/lib/Parse/…/…/include/clang/Lex/Token.h, line 209.

0 clang 0x0000000101a0f0d0 PrintStackTrace(void*) + 38

1 clang 0x0000000101a0f7bd SignalHandler(int) + 254

2 libSystem.B.dylib 0x00007fff80af51ba _sigtramp + 26

3 clang 0x000000010040704d bool llvm::isa<clang::ObjCCompatibleAliasDecl, clang::NamedDecl*>(clang::NamedDecl* const&) + 21

4 clang 0x0000000101a0f00d raise + 27

5 clang 0x0000000101a0f01d abort + 14

6 clang 0x0000000101a0f0aa PrintStackTrace(void*) + 0

7 clang 0x000000010035c3b2 clang::Token::setAnnotationValue(void*) + 156

8 clang 0x000000010034a46e clang::Parser::setExprAnnotation(clang::Token&, clang::ActionResult<clang::Expr*, true>) + 66

9 clang 0x000000010034521c clang::Parser::ParseStatementOrDeclaration(clang::ASTOwningVector<clang::Stmt*, 32u>&, bool) + 1344

10 clang 0x00000001003488a6 clang::Parser::ParseCompoundStatementBody(bool) + 932

11 clang 0x0000000100348e2d clang::Parser::ParseCompoundStatement(clang::ParsedAttributes&, bool) + 131

12 clang 0x000000010034561c clang::Parser::ParseStatementOrDeclaration(clang::ASTOwningVector<clang::Stmt*, 32u>&, bool) + 2368

13 clang 0x000000010034a4b2 clang::Parser::ParseStatement() + 66

14 clang 0x00000001003473e2 clang::Parser::ParseIfStatement(clang::ParsedAttributes&) + 626

15 clang 0x00000001003456a1 clang::Parser::ParseStatementOrDeclaration(clang::ASTOwningVector<clang::Stmt*, 32u>&, bool) + 2501

16 clang 0x00000001003488a6 clang::Parser::ParseCompoundStatementBody(bool) + 932

17 clang 0x0000000100348d27 clang::Parser::ParseFunctionStatementBody(clang::Decl*, clang::Parser::ParseScope&) + 211

18 clang 0x00000001003568fd clang::Parser::ParseFunctionDefinition(clang::Parser::ParsingDeclarator&, clang::Parser::ParsedTemplateInfo const&) + 2291

19 clang 0x000000010030df99 clang::Parser::ParseDeclGroup(clang::Parser::ParsingDeclSpec&, unsigned int, bool, clang::SourceLocation*, clang::Parser::ForRangeInit*) + 499

20 clang 0x0000000100353b91 clang::Parser::ParseDeclarationOrFunctionDefinition(clang::Parser::ParsingDeclSpec&, clang::AccessSpecifier) + 959

21 clang 0x0000000100353c09 clang::Parser::ParseDeclarationOrFunctionDefinition(clang::ParsedAttributes&, clang::AccessSpecifier) + 95

22 clang 0x0000000100354562 clang::Parser::ParseExternalDeclaration(clang::Parser::ParsedAttributesWithRange&, clang::Parser::ParsingDeclSpec*) + 2364

23 clang 0x0000000100320b9c clang::Parser::ParseInnerNamespace(std::__debug::vector<clang::SourceLocation, std::allocator<clang::SourceLocation> >&, std::__debug::vector<clang::IdentifierInfo*, std::allocator<clang::IdentifierInfo*> >&, std::__debug::vector<clang::SourceLocation, std::allocator<clang::SourceLocation> >&, unsigned int, clang::SourceLocation&, clang::SourceLocation&, clang::ParsedAttributes&, clang::SourceLocation&) + 162

24 clang 0x0000000100321866 clang::Parser::ParseNamespace(unsigned int, clang::SourceLocation&, clang::SourceLocation) + 2844

25 clang 0x00000001003159ba clang::Parser::ParseDeclaration(clang::ASTOwningVector<clang::Stmt*, 32u>&, unsigned int, clang::SourceLocation&, clang::Parser::ParsedAttributesWithRange&) + 658

26 clang 0x00000001003541da clang::Parser::ParseExternalDeclaration(clang::Parser::ParsedAttributesWithRange&, clang::Parser::ParsingDeclSpec*) + 1460

27 clang 0x00000001003548a2 clang::Parser::ParseTopLevelDecl(clang::OpaquePtr<clang::DeclGroupRef>&) + 256

28 clang 0x0000000100304b30 clang::ParseAST(clang::Sema&, bool) + 340

29 clang 0x000000010007ec71 clang::ASTFrontendAction::ExecuteAction() + 233

30 clang 0x00000001002d1977 clang::CodeGenAction::ExecuteAction() + 793

31 clang 0x000000010007ed7e clang::FrontendAction::Execute() + 262

32 clang 0x000000010006272e clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) + 710

33 clang 0x0000000100010219 clang::ExecuteCompilerInvocation(clang::CompilerInstance*) + 787

34 clang 0x0000000100001cb0 cc1_main(char const**, char const**, char const*, void*) + 897

35 clang 0x000000010000b852 main + 500

36 clang 0x0000000100001434 start + 52

37 clang 0x0000000000000025 start + 4294962213

Stack dump:

  1. Program arguments: /Users/nickw/llvm-run/bin/clang -cc1 -triple x86_64-apple-macosx10.6.8 -emit-obj -mrelax-all -disable-free -main-file-name hello.cpp -pic-level 1 -mdisable-fp-elim -masm-verbose -munwind-tables -target-cpu core2 -target-linker-version 123.2 -resource-dir /Users/nickw/llvm-run/bin/…/lib/clang/3.0 -fdeprecated-macro -ferror-limit 19 -fmessage-length 80 -stack-protector 1 -fblocks -fcxx-exceptions -fexceptions -fdiagnostics-show-option -fcolor-diagnostics -o /var/folders/5m/5m7eYxMpHliIX3JMlBEiH++++TI/-Tmp-/cc-1XdoBa.o -x c++ hello.cpp

  2. /usr/include/c++/4.2.1/i686-apple-darwin10/x86_64/bits/c++locale.h:70:2: current parser token '__sav'

  3. /usr/include/c++/4.2.1/i686-apple-darwin10/x86_64/bits/c++locale.h:53:1 <Spelling=/usr/include/c++/4.2.1/i686-apple-darwin10/x86_64/bits/c++config.h:80:38>: parsing namespace 'std'

  4. /usr/include/c++/4.2.1/i686-apple-darwin10/x86_64/bits/c++locale.h:65:3: parsing function body '__convert_from_v'

  5. /usr/include/c++/4.2.1/i686-apple-darwin10/x86_64/bits/c++locale.h:65:3: in compound statement ('{}')

  6. /usr/include/c++/4.2.1/i686-apple-darwin10/x86_64/bits/c++locale.h:69:7: in compound statement ('{}')

clang: error: unable to execute command: Illegal instruction

clang: error: clang frontend command failed due to signal 2 (use -v to see invocation)

I added some extra debug output, and the token at which this fails is the “unknown” token. I have grepped through the clang codebase to see whether there are any other places I need to modify when adding new keywords, and I have tried everything I can think of. I have tried several variations of the change to TokenKinds.def, and the keywords themselves don’t seem to matter: whenever I add two new keywords, clang can no longer compile a simple test source like the one above.

I greatly appreciate any help / insights on this. Thanks!

Nick

Never mind. I figured it out. I had to change the size of Token’s Kind field in Token.h to accommodate my added keywords.

Nick