How can I make Clang compile another language? I want it to support ACSL (a specification language for C).

Hi everyone.
I want to produce LLVM IR for ACSL programs using Clang. I am new to
Clang and don’t know much about it. What I am looking for is a way to learn the architecture of Clang so that I can change its source code to support ACSL constructs. ACSL is a specification language
for C: the specifications of a program are written in annotated comments. I want to produce LLVM IR for this kind of code using Clang, and I think that means adding code to the existing Clang sources. So I want to learn enough about Clang to know what changes to make and where. Here is one simple ACSL program:

/* Program verification examples from the book "Software Foundations"
   http://www.cis.upenn.edu/~bcpierce/sf/

   Example: Reduce to Zero
*/

int reduce_to_zero (int x)
/*
  @requires { x >= 0 }
  @ensures { result = 0 }
*/
{
    while (x != 0) {
        /*
          @invariant { x >= 0 }
          @variant { x }
        */
        x = x - 1;
    }
    return x;
}

The “requires” annotation states that the caller of reduce_to_zero() must pass a parameter value >= 0, and the “ensures” annotation states that reduce_to_zero() has result = 0 when it finishes executing.
This is one of the programs for which I want to produce LLVM IR.
I have studied the Clang internals manual but didn’t find it very useful for
my work. I have also been reading the Clang documentation, but I don’t think it’s a great starting point for beginners like me.
I took a course on compiler design as part of my studies and know the basics of lexing, parsing, semantic analysis and compiler architecture. Apart from that, this is my first practical experience with compilers.
This is part of my academic research work (B.Tech) on compilers, which I may continue further because I like this area.
Please help me with this, as I am seriously interested in working with Clang.

Thanks in advance.

Surbhi

  1. You have to spend a considerable amount of time playing with Clang (as with any software) in order to learn its internals.

  2. Based on my understanding of your previous mail, as a first step it looks like you have to preserve the comments in the original C program, which the Clang preprocessor probably throws away.

  3. Then you probably have to come up with a mechanism to parse the comments, represent them as AST nodes, and process those ASTs further as your requirements dictate.

  4. In my opinion, it could involve a considerable amount of work to tune Clang before it accepts your ACSL programs. That said, I am not an expert in Clang. Please consider others’ opinions.
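
As a rough illustration of steps 2 and 3 above, the first task is simply to locate the comments that carry annotations. The sketch below is a standalone C helper, not Clang code: the name first_annotated_comment is hypothetical, and it assumes an annotation comment is any /* ... */ block whose body contains an '@'. It ignores string literals and // comments, which a real lexer would have to handle.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of Clang): copies the body of the first
   block comment in `src` that contains an '@' into `out`.
   Returns 1 if such a comment was found, 0 otherwise. */
int first_annotated_comment(const char *src, char *out, size_t out_size)
{
    const char *p = src;

    if (out_size == 0)
        return 0;

    while ((p = strstr(p, "/*")) != NULL) {
        const char *body = p + 2;
        const char *end = strstr(body, "*/");
        size_t len;

        if (end == NULL)
            return 0;                /* unterminated comment: give up */

        len = (size_t)(end - body);

        /* Assumption: an annotation comment has '@' somewhere in its body. */
        if (memchr(body, '@', len) != NULL) {
            if (len >= out_size)
                len = out_size - 1;  /* truncate to fit the buffer */
            memcpy(out, body, len);
            out[len] = '\0';
            return 1;
        }
        p = end + 2;                 /* skip this comment, keep scanning */
    }
    return 0;
}
```

Called on the reduce_to_zero example, this would skip the header comment and return the body of the comment holding the requires and ensures clauses; turning that text into AST nodes is the much larger part of the work.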

libclang contains APIs for introspecting comments, so they are evidently preserved.

http://clang.llvm.org/doxygen/group__CINDEX__COMMENT.html

Hi Mahesha.

Thanks for this information. Where can I find information about Clang’s preprocessor? Specifically, I want to know where the tokens are specified in the code. I have searched Clang’s documentation but could not find the right code to start with.

Thanks

Surbhi

Hi Mahesha.
Thanks for this information. Where can I find
information about Clang's preprocessor?

Preprocessor.h is a useful entry point for the preprocessor.
PPCallbacks is a useful extension point.

Specifically, I want to know where
the tokens are specified in the code. I have searched Clang's
documentation but could not find the right code to start with.

Searching for "Lex" should find you the code that generates the tokens.

-- James

Hi all.

Is there any way to check which tokens Clang generates for a given test file? I made some changes to the TokenKinds.def file and added some more keywords to it. Now I want to check whether Clang tokenizes those new keywords. Is there a command that would help?

Thanks in advance.

If you run with:
  -Xclang -dump-tokens

then you'll get debugging information from the lexer on every token that is recognized.

-Hal

Thanks Hal.
I am now facing another problem. I defined my new keywords in TokenKinds.def and checked the tokens generated for a test file with the command:
-Xclang -dump-tokens test_file.c
It treats my keywords as identifiers. What can I do so that they are treated as keywords rather than identifiers?
Thanks in advance.
Surbhi

Did you predicate these tokens on language options that are not enabled?

In any case, I recommend that you put a breakpoint in IdentifierTable::AddKeywords (to make sure that your keywords are being added) and Preprocessor::LookUpIdentifierInfo and try to figure out what's happening.

-Hal
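
To make the question about language options concrete, here is a simplified model in C of how a word can end up as an identifier even though it is listed as a keyword. This is not Clang's actual data structure or code: the option bits LANG_C and LANG_ACSL, the table, and classify_word are all hypothetical names chosen for illustration. The idea it sketches is only that each keyword carries a set of flags and becomes a keyword when at least one of those flags is enabled for the current compilation.

```c
#include <string.h>

/* Hypothetical language-option bits (not Clang's real flags). */
enum { LANG_C = 1u << 0, LANG_ACSL = 1u << 1 };

enum token_kind { TK_IDENTIFIER, TK_KEYWORD };

struct keyword_entry {
    const char *spelling;
    unsigned    required;  /* keyword is active if any of these bits is on */
};

/* Toy keyword table: "return" is always a keyword, the ACSL words
   only when the (hypothetical) ACSL option is enabled. */
static const struct keyword_entry keyword_table[] = {
    { "return",   LANG_C | LANG_ACSL },
    { "requires", LANG_ACSL },
    { "ensures",  LANG_ACSL },
};

/* Classify a word given the set of enabled option bits. */
enum token_kind classify_word(const char *word, unsigned enabled)
{
    size_t n = sizeof keyword_table / sizeof keyword_table[0];
    size_t i;

    for (i = 0; i < n; ++i) {
        if (strcmp(word, keyword_table[i].spelling) == 0 &&
            (keyword_table[i].required & enabled) != 0)
            return TK_KEYWORD;
    }
    return TK_IDENTIFIER;  /* unknown word, or its option is disabled */
}
```

In this model, classify_word("requires", LANG_C) yields TK_IDENTIFIER while classify_word("requires", LANG_C | LANG_ACSL) yields TK_KEYWORD, which is the kind of mismatch the breakpoints suggested above would help to confirm or rule out.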

Hi Sir.

I found a new problem. It seems that none of the changes made in the TokenKinds.def file are visible anywhere. To check this, I removed the line that defines the keyword ‘return’. The line I removed is as follows:

KEYWORD(return , KEYALL)

After this I ran this command on my test file:
clang -Xclang -dump-tokens test_file.c
and it still tokenizes ‘return’ as a keyword. This means the changes I made in the TokenKinds.def file are not reflected anywhere. What is the reason for this?

Thanks

Surbhi

How are you attempting to rebuild? Did you try running 'make clean'?

-Hal

Hi Sir.

I tried running make clean to delete all the previously built objects, but it fails with an error:

Makefile.config:257: *** missing separator. Stop

I don't know how to proceed.

One more question: do I have to run make clean followed by make every time I change the source code?

Thanks
Surbhi

Thanks Sir.

I had been opening the Makefile with the gedit editor, which was adding its own indentation to the lines. When I opened it with vim, the error went away.

But now it gives no error; it just enters the test directory and leaves it in the next step.

I have made changes in the TokenKinds.def file and added some more keywords, so I have to recompile this file. How do I do it?

Run make from the top-level directory; the .def file is included from several other source files, and you need to make sure that they're all rebuilt and everything is relinked appropriately. Running make at the top level should do this.

-Hal
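
A small sketch may help show why editing a .def file has no effect until everything that includes it is rebuilt. The C code below imitates the pattern with a toy keyword list; it is not Clang's TokenKinds.def, and the names TOY_KEYWORDS, toy_token, and toy_lookup are made up for illustration. In a real tree the list would live in its own .def file and be pulled in with #include by each source file that needs it; here it is a macro so the example fits in one file.

```c
#include <string.h>

/* Stand-in for a .def file: a single list of keywords that is
   expanded several times with different definitions of KEYWORD. */
#define TOY_KEYWORDS(KEYWORD) \
    KEYWORD(return)           \
    KEYWORD(while)            \
    KEYWORD(requires)

/* Expansion 1: one enumerator per keyword, plus a count. */
#define AS_ENUM(name) kw_##name,
enum toy_token { TOY_KEYWORDS(AS_ENUM) kw_count };

/* Expansion 2: the matching table of spellings, in the same order. */
#define AS_STRING(name) #name,
static const char *const toy_spellings[] = { TOY_KEYWORDS(AS_STRING) };

/* Returns the enumerator for `word`, or -1 if it is not in the list. */
int toy_lookup(const char *word)
{
    int i;

    for (i = 0; i < kw_count; ++i)
        if (strcmp(word, toy_spellings[i]) == 0)
            return i;
    return -1;
}
```

Both the enum and the spelling table are generated at compile time from the same list, so adding or removing an entry changes nothing in an already compiled object file. Every file that expands the list has to be recompiled and the binary relinked, which is what running make at the top level takes care of.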