Parsing, pretty-printing, and compiling the result should work, right?

Hi there,

from some of the replies we got to bug reports we submitted I gather
that it is best if we ask the advice of this list concerning what we
are trying to do.

Our first objective is to understand whether clang can be the parser
for a program analyzer we are building. In order to ascertain that,
our plan was to build confidence in clang by:

1) Writing a pretty-printer that produces parsable C code from the AST.
2) Writing a driver that calls the parser, then the pretty-printer, and
    then invokes gcc over the pretty-printed program. In other words, this
    driver behaves exactly like plain gcc, with the only difference
    that it will compile the program produced by clang + pretty-printer
    instead of what gcc would normally obtain by running its own preprocessor.
3) Using this driver, compile as much code as possible (Linux, GNU software
    a go-go, everything we can get our hands on) and check whether the result
    runs as expected.
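
The driver in point 2) can be sketched roughly as follows. This is only a minimal illustration under stated assumptions, not the actual driver: `clang-pp` is a hypothetical name for a "parse with clang, pretty-print the AST to stdout" tool, the `.pp.c` suffix is an arbitrary choice, and only joined preprocessor flags such as `-Iinc` or `-DNAME` are forwarded to it.

```python
import os
import subprocess

def plan(argv, pretty_printer="clang-pp", real_compiler="gcc"):
    """Turn a gcc-style argument list into pretty-print steps plus a gcc command.

    `clang-pp` is a hypothetical tool name; substitute whatever parses with
    clang and writes the pretty-printed program to stdout.
    """
    # Assumption: only joined forms (-Ifoo, -DX, -UX) are handled, not "-I foo".
    pp_flags = [a for a in argv if a.startswith(("-I", "-D", "-U"))]
    steps = []
    final = [real_compiler]
    for arg in argv:
        if arg.endswith(".c") and not arg.startswith("-"):
            # Replace each C source by its pretty-printed copy.
            printed = os.path.splitext(arg)[0] + ".pp.c"
            steps.append(([pretty_printer] + pp_flags + [arg], printed))
            final.append(printed)
        else:
            # Every other argument is passed to gcc unchanged.
            final.append(arg)
    return steps, final

def run(argv):
    """Execute the plan: pretty-print every source, then call the real compiler."""
    steps, final = plan(argv)
    for command, out_path in steps:
        with open(out_path, "w") as out:
            subprocess.run(command, stdout=out, check=True)
    return subprocess.run(final).returncode
```

Calling `run(["-O2", "-c", "foo.c", "-o", "foo.o"])` would then behave like `gcc -O2 -c foo.c -o foo.o`, except that gcc sees `foo.pp.c` instead of `foo.c`. Keeping the planning step separate from execution makes it easy to log or inspect the exact commands when a build misbehaves.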

Unless we have grossly misinterpreted the objectives of the clang project,
this kind of exercise should be regarded as valuable: right?
Point 2) is basically complete. Full realization of point 1) is currently
impeded by a number of problems for which we have pending bug reports.
For those we are unsure whether it is best to:

a) wait for a fix;
b) implement a workaround in the pretty-printer.

Solution b) is not very attractive, since the only solution we see for some
of the open problems would be to complicate the pretty-printer to a point
where it is no longer a pretty-printer (but something that, instead of
working sequentially, has to walk through the entire AST in order to gather
information that has been moved by clang to places that are far from where
it should be printed).

Please, let me know what you think about the above: perhaps we have made
some wrong choices and we do not want to waste our time or anyone else's.
Thanks a lot,

    Roberto

For those we are unsure whether it is best to:

a) wait for a fix;
b) implement a workaround in the pretty-printer.

I think you missed another possibility:

c) fix it in clang and submit a patch

-Alexei

Hi there,

from some of the replies we got to bug reports we submitted I gather
that it is best if we ask the advice of this list concerning what we
are trying to do.

Hi Roberto,

I agree that this is a better place to discuss this than in individual bugzillas.

Our first objective is to understand whether clang can be the parser
for a program analyzer we are building. In order to ascertain that,
our plan was to build confidence in clang by:

1) Writing a pretty-printer that produces parsable C code from the AST.
2) Writing a driver that calls the parser, then the pretty-printer, and
   then invokes gcc over the pretty-printed program. In other words, this
   driver behaves exactly like plain gcc, with the only difference
   that it will compile the program produced by clang + pretty-printer
   instead of what gcc would normally obtain by running its own preprocessor.
3) Using this driver, compile as much code as possible (Linux, GNU software
   a go-go, everything we can get our hands on) and check whether the result
   runs as expected.

Ok, this sounds like a great project. I don't know of anyone doing anything similar, but it seems very useful to a variety of different people.

Unless we have grossly misinterpreted the objectives of the clang project,
this kind of exercise should be regarded as valuable: right?

Absolutely!

Point 2) is basically complete. Full realization of point 1) is currently
impeded by a number of problems for which we have pending bug reports.
For those we are unsure whether it is best to:

a) wait for a fix;
b) implement a workaround in the pretty-printer.

This is a general problem with Open Source projects. You're interested in doing something that would be very valuable for Clang to do, however, Clang isn't quite there yet. Further, you are the first person to want to do this, and you're hitting problems.

Unfortunately, clang is developed by volunteers, and most of them have their own specific agenda that they prioritize. For example, mine is "make clang a production quality C compiler sooner rather than later". This means that you can't really "force" someone to fix a bug or implement a feature that you're lacking. Your options basically come down to:

1) fix it yourself. The various issues you have identified would all be great to fix in clang, so I'm sure that the patches would be welcome.
2) pay someone to fix it.
3) try to stir up interest in the problem, so that someone else feels compelled to look into it.
4) find a way to work around the problem.

I think your email has the effect of doing #3, as I think that many people would find this functionality useful. For #4, you might want to consider using the rewriter infrastructure, which does not suffer from this class of problem and is already in use for several projects.

-Chris

Chris Lattner wrote:

Point 2) is basically complete. Full realization of point 1) is currently
impeded by a number of problems for which we have pending bug reports.
For those we are unsure whether it is best to:

a) wait for a fix;
b) implement a workaround in the pretty-printer.

This is a general problem with Open Source projects. You're interested in doing something that would be very valuable for Clang to do, however, Clang isn't quite there yet. Further, you are the first person to want to do this, and you're hitting problems.

Unfortunately, clang is developed by volunteers, and most of them have their own specific agenda that they prioritize.

Hi there,

thanks to all who responded. Please note that we know how Open Source
and Free Software projects work, and my group is actually contributing
significant pieces of code to the community. So we are all volunteers.

For example, mine is "make clang a production quality C compiler sooner rather than later".

Good. But since you cannot compile what you cannot parse, our agendas
have a significant overlap: actually, mine is a strict subset of yours.
This was actually my argument when I convinced my group to use clang
("they certainly will want to correctly parse all the code out there").

This means that you can't really "force" someone to fix a bug or implement a feature that you're lacking.

Of course.

Your options basically come down to:

1) fix it yourself. The various issues you have identified would all be great to fix in clang, so I'm sure that the patches would be welcome.

The problem is that we do not have the manpower to do that. All the manpower
we have is invested in other (Free Software) projects.

2) pay someone to fix it.

ROTFL: see "Cut-throat savings" (Nature).
Do you know that many Italian universities will have serious trouble
paying salaries one year from now? Really, this is not an option.

3) try to stir up interest in the problem, so that someone else feels compelled to look into it.

For at least two bug reports, I really hope this will be the case
because I really do not think a workaround exists.

4) find a way to work around the problem.

We will do that for all the remaining problems. This is actually a
tough decision for me: if 2749 and 3261 are not fixed in a reasonable
time frame we will have to drop clang and I will have wasted the
time of several people.
Cheers,

     Roberto

For example, mine is
"make clang a production quality C compiler sooner rather than later".

Good. But since you cannot compile what you cannot parse, our agendas
have a significant overlap: actually, mine is a strict subset of yours.
This was actually my argument when I convinced my group to use clang
("they certainly will want to correctly parse all the code out there").

Yep, we do. However, while I'd love to see designated initializers happen sooner rather than later (and I'm thrilled that they may be happening soon!), I'd personally prioritize basic x86-64 codegen above designators.

4) find a way to work around the problem.

We will do that for all the remaining problems. This is actually a
tough decision for me: if 2749 and 3261 are not fixed in a reasonable
time frame we will have to drop clang and I will have wasted the
time of several people.

Hopefully they will get fixed then!

-Chris

Chris Lattner wrote:

4) find a way to work around the problem.

We will do that for all the remaining problems. This is actually a
tough decision for me: if 2749 and 3261 are not fixed in a reasonable
time frame we will have to drop clang and I will have wasted the
time of several people.

Hopefully they will get fixed then!

Hi there,

thanks to your support we are now able to parse everything we need to parse.
Now, for the program "parse, pretty-print, compile and compare the results",
we are dealing with pretty-printing. We are stuck due to the problem
explained in

     http://llvm.org/bugs/show_bug.cgi?id=3261

If this is not fixed, we believe a "local" pretty-printer (i.e., one
that does not have to reconstruct the visible portion of the environment)
is not possible. The solution proposed in Comment #2 in the bug report
would solve all our problems. What do you think?
All the best,

     Roberto

Implement it and try it out. If it works nicely for you, it seems like it'd be nice to get into the tree. I was fixing up -ast-print a bit and noticed the same issue.

DeclGroups were meant to solve this problem, if they were carried through to other parts of the AST.

  - Doug