No clang::Action no party

Dear all,
I noticed the old clang::Action interface has been wiped out from the new release of clang. This was a big blow for me, as I rely on the virtual dispatch of Actions and Sema to implement a mechanism to attach pragmas to statements.

My solution was rather simple and effective: a custom class extending Sema in which a couple of methods of the Action interface were overridden (and from reading this mailing list frequently I know this is something several people did for the same kind of problem). Now that the virtual dispatch has been removed, my overridden functions are not called anymore. I temporarily solved the problem by modifying the Sema interface and redeclaring as virtual those methods I need to override. But I would prefer to make my code work without patching clang.

I know you are rather picky about performance and virtual dispatch was too expensive, but you have to give us at least another way to call an overridden Sema. Could templates be an option here?

template <class Action = clang::Sema>
class Parser { };

By using this mechanism you still keep the static method dispatch and the ability to provide overridden behavior for Sema. Is there another way to get the behavior I am looking for?
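As a rough illustration of the templated-parser idea, here is a toy sketch. It is not clang code: `ToySema`, `PragmaSema`, and `ToyParser` are made-up stand-ins, and the "statement" is reduced to a count. It only shows that a parser parameterized on its action type resolves the call at compile time, with no virtual functions involved.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-ins for illustration; these are NOT the real clang classes.
struct ToySema {
    std::vector<std::string> log;
    void ActOnCompoundStmt(int numStmts) {
        log.push_back("compound(" + std::to_string(numStmts) + ")");
    }
};

// A customized Sema. It hides the base method (no virtual dispatch), does
// its own pragma bookkeeping, then forwards to the base implementation.
struct PragmaSema : ToySema {
    int pendingPragmas = 0;
    void ActOnCompoundStmt(int numStmts) {
        if (pendingPragmas > 0) {
            log.push_back("attach-pragma");
            --pendingPragmas;
        }
        ToySema::ActOnCompoundStmt(numStmts);
    }
};

// The parser is parameterized on the action type, so the call below is
// resolved statically against ActionT.
template <class ActionT = ToySema>
class ToyParser {
public:
    explicit ToyParser(ActionT &actions) : actions_(actions) {}

    void ParseCompoundStatement(int numStmts) {
        assert(numStmts >= 0);
        actions_.ActOnCompoundStmt(numStmts);
    }

private:
    ActionT &actions_;
};
```

Instantiating `ToyParser<PragmaSema>` calls the customized method, while `ToyParser<>` calls the plain one. The cost is that every member of the parser has to be visible wherever it is instantiated.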

thanks, Simone

> I know you are rather picky about performance and virtual dispatch was
> too expensive, but you have to give us at least another way to call an
> overridden Sema.

Performance wasn't actually the issue, here, and I highly doubt that we gained any measurable performance with the devirtualization itself.

The change was an architectural fix, because there will never be another "Action" that implements enough of C/C++ to be useful. Sema is the only one, and it can grow extension points to allow customization.

> Could templates be an option here?
>
> template <class Action = clang::Sema>
> class Parser { };
>
> By using this mechanism you still keep the static method dispatch and
> the ability to provide overridden behavior for Sema.

Turning the parser into a template would be a maintenance nightmare. It would then force us to move all of the parser code into a header, making it a compile-time nightmare, too.

> Is there another way to have the behavior I am looking for?

Attaching pragmas to statements seems like something that Clang should provide a "handler" mechanism for, just as it provides a handler mechanism for parsing the pragmas themselves. That would be cleaner and more composable than inheriting from Sema.
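To make the "handler" idea concrete, here is a minimal sketch of what such an extension point could look like. It is purely hypothetical: `ToySema`, `ToyStmt`, `addStmtHandler`, and `demoAttach` are invented for illustration and do not correspond to any existing clang interface.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Toy stand-ins for illustration only; none of these names are clang APIs.
struct ToyStmt {
    std::string kind;
    std::vector<std::string> attachedPragmas;
};

class ToySema {
public:
    using StmtHandler = std::function<void(ToyStmt &)>;

    // Extension point: clients register callbacks instead of subclassing.
    void addStmtHandler(StmtHandler handler) {
        handlers_.push_back(std::move(handler));
    }

    // Non-virtual action; it notifies every registered handler in order.
    ToyStmt ActOnCompoundStmt() {
        ToyStmt stmt{"CompoundStmt", {}};
        for (auto &handler : handlers_)
            handler(stmt);
        return stmt;
    }

private:
    std::vector<StmtHandler> handlers_;
};

// Hypothetical client: attaches one remembered pragma to the next statement.
inline ToyStmt demoAttach() {
    ToySema sema;
    std::vector<std::string> pending{"omp parallel"};
    sema.addStmtHandler([&pending](ToyStmt &stmt) {
        if (!pending.empty()) {
            stmt.attachedPragmas.push_back(pending.back());
            pending.pop_back();
        }
    });
    ToyStmt result = sema.ActOnCompoundStmt();
    assert(pending.empty());
    return result;
}
```

Several independent clients can register handlers on the same object, which is the composability that inheritance from a single Sema subclass does not give.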

  - Doug

> Attaching pragmas to statements seems like something that Clang should provide a "handler" mechanism for, just as it provides a handler mechanism for parsing the pragmas themselves. That would be cleaner and more composable than inheriting from Sema.

I do not agree with this statement. Clang does a rather poor job (no offense) when it comes to pragmas. I spent some time designing a framework to handle pragmas on top of clang, and it needs a couple of core changes for which I had to patch clang.

For example, the OpenMP standard allows C expressions to be written as part of the pragmas. Someone can write:

#pragma omp parallel num_threads(3*2+1)

Anyone interested in implementing the full OpenMP standard would have to re-implement a parser for C expressions (among other things), which seems a duplication of work since the Clang parser already does it pretty well. To avoid that, I had to make my pragma handler class a friend of clang::Parser; in this way I can directly use the private ParseExpression() method.

So, in short, custom pragma handlers need low-level access to the Parser (or some methods in the Parser should be made public)!

Secondly, when the pragma handler is called, the attached statement has not yet been created by the parser, so the association cannot be done by the handler; someone else has to take care of it (that's the reason why the Action interface was useful).
Either someone keeps a list of pragmas and does the matching at the end in the ASTConsumer or, for a more efficient solution that minimizes the number of checks, Sema should take care of it.

Anyway, if there is an interest in making pragma handling more flexible and powerful in clang I could contribute with some ideas and code (of course).

cheers, Simone

OpenMP is a somewhat extreme example, because it *does* involve full expressions and many deep tie-ins with the AST, ultimately affecting IR generation as well. I think it would be great if the pragma interface could be extended to handle OpenMP, because that means that many other pragma handlers would also be possible.

> So, in short, custom pragma handlers need low-level access to the Parser (or some methods in the Parser should be made public)!
>
> Secondly, when the pragma handler is called, the attached statement has not yet been created by the parser, so the association cannot be done by the handler; someone else has to take care of it (that's the reason why the Action interface was useful).
> Either someone keeps a list of pragmas and does the matching at the end in the ASTConsumer or, for a more efficient solution that minimizes the number of checks, Sema should take care of it.

I'd love to see a general mechanism for this in Sema; most of the time, people want to attach pragmas to statements/expressions/declarations, and having a general way to do that would be great.

> Anyway, if there is an interest in making pragma handling more flexible and powerful in clang I could contribute with some ideas and code (of course).

I think a better pragma handling mechanism, which makes it easier to tie in with the parser, would be a great benefit to Clang and to developers who want to extend Clang with new pragmas.

  - Doug

> OpenMP is a somewhat extreme example, because it *does* involve full expressions and many deep tie-ins with the AST, ultimately affecting IR generation as well. I think it would be great if the pragma interface could be extended to handle OpenMP, because that means that many other pragma handlers would also be possible.

In order to make pragma processing more useful in clang there are actually two main aspects which need to be improved.

The first is to give users a way to specify new pragmas without reinventing the wheel every time, and to let them call the Clang parser directly to parse complex expressions, without having to mess around with low-level implementation details of the lexer/parser. To address this we developed a mechanism that allows new pragmas to be defined the way Boost.Spirit defines grammars. We took some of its concepts, but we wrote from scratch a parser generator that works very closely with clang's lexer (and parser).

The main idea is the following. Suppose you want to define a new pragma with this grammar:
#pragma mypragma ((awesomeness = (yes | no)) | (digit (',' digit)*))

so that you can write things like this in your code:
#pragma mypragma awesomeness = yes
or
#pragma mypragma 2,3,4

What we do is let the user specify the grammar in a declarative way, building up a parse tree simply by combining expressions:
auto matcher = (
                   (kwd("awesomeness") >> equal >> (kwd("yes") | kwd("no")))
                 |
                   (numeric_constant >> *(comma >> numeric_constant))
               ) >> eom;

This object is passed to a pragma handler. When the parser calls the handler, the Preprocessor is passed to the matcher object, which tries to match the rule by consuming tokens. Things can get more complicated than this: there are several other operators (!, +, ~). If the pragma is matched, the object creates a map for you in which you can find all the parsed information in the form of strings or Clang AST nodes.
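The combinator style described above can be sketched in a few lines of standalone C++. This is a toy reconstruction under stated assumptions, not the framework from this thread: tokens are plain strings rather than clang tokens, `tok`, `number`, and `matchesMyPragma` are hypothetical names, nothing is recorded in a map, and `|` is an ordered choice that does not backtrack once its first alternative has succeeded.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

using Tokens = std::vector<std::string>;

// A matcher consumes tokens starting at pos and reports success; on failure
// it leaves pos unchanged.
struct Matcher {
    std::function<bool(const Tokens &, std::size_t &)> run;
};

// Match one exact token.
inline Matcher tok(std::string text) {
    assert(!text.empty());
    return {[text](const Tokens &t, std::size_t &pos) {
        if (pos < t.size() && t[pos] == text) { ++pos; return true; }
        return false;
    }};
}

// Match a token made only of decimal digits.
inline Matcher number() {
    return {[](const Tokens &t, std::size_t &pos) {
        if (pos >= t.size() || t[pos].empty()) return false;
        for (char c : t[pos])
            if (c < '0' || c > '9') return false;
        ++pos;
        return true;
    }};
}

// Match the end of the pragma line (consumes nothing).
inline Matcher eom() {
    return {[](const Tokens &t, std::size_t &pos) { return pos == t.size(); }};
}

// Sequence: a then b; pos is restored if either part fails.
inline Matcher operator>>(Matcher a, Matcher b) {
    return {[a, b](const Tokens &t, std::size_t &pos) {
        std::size_t start = pos;
        if (a.run(t, pos) && b.run(t, pos)) return true;
        pos = start;
        return false;
    }};
}

// Ordered choice: try a, otherwise b.
inline Matcher operator|(Matcher a, Matcher b) {
    return {[a, b](const Tokens &t, std::size_t &pos) {
        return a.run(t, pos) || b.run(t, pos);
    }};
}

// Repetition: zero or more occurrences of a; stops if a consumes nothing.
inline Matcher operator*(Matcher a) {
    return {[a](const Tokens &t, std::size_t &pos) {
        for (;;) {
            std::size_t before = pos;
            if (!a.run(t, pos) || pos == before) break;
        }
        return true;
    }};
}

// The "mypragma" grammar from the text, over already-split tokens.
inline bool matchesMyPragma(const Tokens &tokens) {
    Matcher grammar =
        ((tok("awesomeness") >> tok("=") >> (tok("yes") | tok("no")))
         | (number() >> *(tok(",") >> number())))
        >> eom();
    std::size_t pos = 0;
    return grammar.run(tokens, pos);
}
```

With this sketch, the token sequences for `awesomeness = yes` and `2,3,4` are accepted, while `awesomeness = maybe` or a trailing comma are rejected.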

The second part of the problem is the association of pragmas with AST nodes (statements or declarations). We solved it by having the pragma handler call a method we added to the (old) Action interface:

// The return type was not given in the original post; void is assumed here.
template <class PragmaTy>
void ActOnPragma(SourceLocation start, SourceLocation end, MatcherMap mmap);

When this method is called, Sema creates an object of type PragmaTy and stores it internally in a list of pending pragmas, i.e. pragmas that have not yet found their correct placement. Here a tricky issue arises from the way Sema creates the AST nodes: the next statement created by Sema is not always the one that has to be attached to the most recent pending pragma. For example, in the following case:

#pragma omp parallel
{
     int a = 0;
}

Sema will parse the pragma, then create a DeclStmt for int a..., and only at the end build the CompoundStmt, which is the one we want to attach to the pragma.
To solve the problem we overrode a couple of methods in Sema (such as ActOnCompoundStmt or ActOnForStmt...). What we do is basically filter the list of pending pragmas for those that fall within the range of the statement, and start the matching of those pragmas inside that range. It is quite easy.
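The range-based filtering can be illustrated with a small standalone model. This is a sketch under simplifying assumptions, not the actual patch: source locations are plain integer offsets instead of clang::SourceLocation, and `ToyPragma`, `ToyStmt`, and `matchPending` are hypothetical names. Each pending pragma inside the enclosing statement's range is attached to the first child statement that begins after it.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy model: source locations are plain offsets, not clang::SourceLocation.
struct ToyPragma {
    std::string name;
    unsigned loc;      // offset at which the pragma appears
    int attachedTo;    // index of the matched child statement, -1 if pending
};

struct ToyStmt {
    std::string kind;
    unsigned begin, end;   // source range covered by the statement
};

// Hypothetical helper, called when a statement with the given children and
// range [begin, end) is built: every still-pending pragma located inside the
// range is attached to the first child that starts after it.
inline void matchPending(std::vector<ToyPragma> &pending,
                         const std::vector<ToyStmt> &children,
                         unsigned begin, unsigned end) {
    assert(begin <= end);
    for (ToyPragma &p : pending) {
        // Skip pragmas already placed or outside this statement's range.
        if (p.attachedTo != -1 || p.loc < begin || p.loc >= end)
            continue;
        for (std::size_t i = 0; i < children.size(); ++i) {
            if (children[i].begin > p.loc) {
                p.attachedTo = static_cast<int>(i);
                break;
            }
        }
    }
}
```

In this model, a pragma at offset 10 stays pending while the inner block covering [20, 50) is built, because it lies outside that range, and is attached only when the enclosing statement covering [0, 100) is built. A pragma with no statement after it inside the range remains pending.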

The only tricky part is dealing with situations like the following:

{
     int a;
     #pragma omp barrier
}

where we modify the structure of the CompoundStmt by adding a NullStmt so that we can match the pragma with it:

{
     int a;
     #pragma omp barrier
     ;
}

The matching algorithm is not that complex; the only question is whether it is computationally too expensive to satisfy clang's requirements. I think this was the best solution I could come up with that is efficient and keeps the impact on the Clang code base to a minimum (it only requires making one class a friend of the parser).

cheers, Simone P.