Creating a source-to-source compiler with the Clang libraries

Hi everyone,

I would like to (initially) create a src-to-src compiler to compile some language extension I thought up for parallel programming. What I had in mind was to extend the C grammar and have a compiler produce legal C (gnu/c99) from it again. So I wanted to ask whether Clang could help me do this in an intuitive/easy way.

I would like to create a simple program that does just a src-to-src transformation. As a starting point, a program that uses the Clang libraries to lex, parse, build the AST, and typecheck, and then generates legal C/C++ by simply printing the AST nodes would be great. Could you point me in the right direction and indicate the complexity of both creating the starting point and the src-to-src compiler?
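To make the shape of such a pipeline concrete, here is a minimal toy sketch in Python. It does not use Clang at all and skips typechecking; the `spawn` keyword, the `runtime_spawn` function, and the tiny call-only language are all hypothetical and only stand in for a real extension. It lexes the input, builds a small AST, lowers the extension node, and prints plain C-like calls.

```python
import re
from dataclasses import dataclass

# Hypothetical toy language: a program is a sequence of `name(args);` calls,
# optionally prefixed by the made-up extension keyword `spawn`.

@dataclass
class Call:
    name: str
    args: list
    spawned: bool  # True if written with the hypothetical `spawn` keyword

TOKEN = re.compile(r"\s*([A-Za-z_]\w*|\d+|[();,])")

def lex(src):
    """Split source text into identifier, number, and punctuation tokens."""
    tokens, pos = [], 0
    src = src.rstrip()
    while pos < len(src):
        m = TOKEN.match(src, pos)
        if not m:
            raise SyntaxError(f"unexpected input near offset {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens

class Parser:
    def __init__(self, tokens):
        self.toks, self.i = tokens, 0

    def peek(self):
        return self.toks[self.i] if self.i < len(self.toks) else None

    def next(self):
        tok = self.peek()
        if tok is None:
            raise SyntaxError("unexpected end of input")
        self.i += 1
        return tok

    def expect(self, tok):
        got = self.next()
        if got != tok:
            raise SyntaxError(f"expected {tok!r}, got {got!r}")

    def parse_program(self):
        nodes = []
        while self.peek() is not None:
            nodes.append(self.parse_call())
        return nodes

    def parse_call(self):
        spawned = self.peek() == "spawn"
        if spawned:
            self.next()
        name = self.next()
        self.expect("(")
        args = []
        while self.peek() != ")":
            args.append(self.next())
            if self.peek() == ",":
                self.next()
        self.expect(")")
        self.expect(";")
        return Call(name, args, spawned)

def lower(node):
    """Rewrite the extension node into an ordinary call (hypothetical runtime)."""
    if node.spawned:
        return Call("runtime_spawn", [node.name] + node.args, False)
    return node

def emit(node):
    """Print one AST node back out as source text."""
    return f"{node.name}({', '.join(node.args)});"

def translate(src):
    return "\n".join(emit(lower(n)) for n in Parser(lex(src)).parse_program())

print(translate("spawn work(1, x); done();"))
```

A real tool would replace `lex` and `Parser` with Clang's lexer and parser and do the lowering on Clang's AST; the point here is only the lex, parse, transform, print structure.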

Kind regards,
Chi

> I would like to (initially) create a src-to-src compiler to compile some language extension I thought up for parallel programming. What I had in mind was to extend the C grammar and have a compiler produce legal C (gnu/c99) from it again. So I wanted to ask whether Clang could help me do this in an intuitive/easy way.

Yes, definitely.

I should say that if you’re hoping to make a source-to-object compiler in the end, you may find that it’s not that much more complicated to start out with one. People often underestimate the complexity of writing source-to-source compilers that do any sort of interesting transformation, and it’s quite easy to run into the limits of the approach. Of course, this only works if your target architecture has an LLVM backend. That said, you can use a Clang-based parser with either approach.

> I would like to create a simple program that does just a src-to-src transformation. As a starting point, a program that uses the Clang libraries to lex, parse, build the AST, and typecheck, and then generates legal C/C++ by simply printing the AST nodes would be great. Could you point me in the right direction and indicate the complexity of both creating the starting point and the src-to-src compiler?

It’s not really possible to modify Clang’s parser as a library. That’s okay — there are several successful projects based on local modifications to the Clang sources — but it does have implications for your project, because it means you should use a source-control system (like git) that makes it relatively painless to integrate patches from trunk, or even switch to the latest release sources (note that we’ll be making a new release relatively soon).

(You should definitely implement your parser by modifying Clang’s instead of trying to make your own that incidentally produces Clang’s ASTs. I presume that your goal is to allow programmers to decorate existing code with your language extension; parsing arbitrary C as it’s accepted by typical compilers is a very complicated task which many people have underestimated to their peril.)
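As a small illustration of why ad hoc handling of C source is fragile, here is a minimal Python sketch of a purely textual rewrite. The `spawn` keyword and the `runtime_spawn` function are hypothetical stand-ins for an extension, not anything from Clang. Because the rewrite does not understand string literals (or comments, macros, and so on), it also rewrites text it should leave alone.

```python
import re

def naive_rewrite(src):
    """Textually replace the hypothetical `spawn f(args);` with
    `runtime_spawn(f, args);`, with no knowledge of C's lexical structure."""
    return re.sub(r"\bspawn\s+(\w+)\((.*?)\);", r"runtime_spawn(\1, \2);", src)

code = 'spawn work(1);\nputs("usage: spawn work(n);");\n'
print(naive_rewrite(code))
# The first line is rewritten as intended, but the string literal on the
# second line is rewritten as well, which changes the program's output.
```

A parser that tokenizes the input the way a C compiler does would see the second occurrence as part of a string literal and leave it untouched.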

The best way to familiarize yourself with how the Clang parser works is to launch Clang in a debugger and just track its operation as it parses some interesting bit of code.

John.

Quoting John McCall <rjmccall@apple.com>:

>> I would like to (initially) create a src-to-src compiler to compile some language extension I thought up for parallel programming. What I had in mind was to extend the C grammar and have a compiler produce legal C (gnu/c99) from it again. So I wanted to ask whether Clang could help me do this in an intuitive/easy way.

> Yes, definitely.

> I should say that if you're hoping to make a source-to-object compiler in the end, you may find that it's not that much more complicated to start out with one. People often underestimate the complexity of writing source-to-source compilers that do any sort of interesting transformation, and it's quite easy to run into the limits of the approach. Of course, this only works if your target architecture has an LLVM backend. That said, you can use a Clang-based parser with either approach.

Thanks John,

A source-to-object compiler would indeed be nicer, but I might run into some portability problems, as we might want to use it on a Microblaze or Tilera platform. I see you have experimental support for Microblaze; is there any development toward Tilera support?

>> I would like to create a simple program that does just a src-to-src transformation. As a starting point, a program that uses the Clang libraries to lex, parse, build the AST, and typecheck, and then generates legal C/C++ by simply printing the AST nodes would be great. Could you point me in the right direction and indicate the complexity of both creating the starting point and the src-to-src compiler?

> It's not really possible to modify Clang's parser as a library. That's okay - there are several successful projects based on local modifications to the Clang sources - but it does have implications for your project, because it means you should use a source-control system (like git) that makes it relatively painless to integrate patches from trunk, or even switch to the latest release sources (note that we'll be making a new release relatively soon).

> (You should definitely implement your parser by modifying Clang's instead of trying to make your own that incidentally produces Clang's ASTs. I presume that your goal is to allow programmers to decorate existing code with your language extension; parsing arbitrary C as it's accepted by typical compilers is a very complicated task which many people have underestimated to their peril.)

> The best way to familiarize yourself with how the Clang parser works is to launch Clang in a debugger and just track its operation as it parses some interesting bit of code.

I will =).

Regarding development toward Tilera support: not that I know of. That's indeed a pretty strong objection to using a source-to-object compiler; I guess you don't have much choice but to deal with the source-to-source issues.

Good luck!

John.