Adding a new pragma to Clang

Hi Folks,

I’m adding the pragma vectorize to Clang to print specific metadata on loops (later any lexical block), and I’m having some trouble identifying the steps.

I’ve added my pragma to ParsePragma, then added a vectorizer context to Sema that temporarily holds these values for use on the next lexical block (one set at a time, with the most recent replacing the old).

But the only way I found to add this information into the AST was via traits, which doesn’t look correct. Can anyone point me in the right direction?

Who is the best person to review these changes?

cheers,
–renato
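The "one at a time, most recent replaces old" holder described above could look roughly like the sketch below. All names here (`LoopHints`, `PendingLoopHints`, `set`, `take`) are hypothetical and are not Clang's Sema API; the sketch only illustrates the state handling, not the parsing.

```cpp
#include <cassert>

// Hypothetical hint record; a value of 0 means "not specified".
struct LoopHints {
    bool enable = true;  // vectorize on/off
    int width = 0;       // requested vector width
    int unroll = 0;      // requested unroll factor
};

// Hypothetical holder for the hints of the most recently parsed pragma.
class PendingLoopHints {
public:
    // Called when a pragma is parsed; a newer pragma replaces an older one.
    void set(const LoopHints &hints) {
        hints_ = hints;
        pending_ = true;
    }

    bool hasPending() const { return pending_; }

    // Called when the next lexical block is reached; hands the hints over once.
    LoopHints take() {
        assert(pending_ && "take() called with no pending pragma");
        pending_ = false;
        return hints_;
    }

private:
    LoopHints hints_;
    bool pending_ = false;
};
```

Because `take()` clears the pending flag, a pragma can only ever attach to the first block that follows it, which matches the behaviour described in the message.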

Hi Renato,

Pragmas in clang are monolithic with their own parse rules and state management so it's a bit of work. As such it's better to come up with a plan to support different kinds of optimization flags beyond just vectorization that may be needed in future, and ideally to reuse an existing format if a sane one exists.

One possibility is to re-use the #pragma clang diagnostic machinery, and along with it the command line parser. It already covers a lot of the work of tracking source locations and on/off states, which you're better off not reimplementing.

It'd give us something like:

#pragma clang optimize push
#pragma clang optimize "-vectorize-loops"
     while (...) { }
#pragma clang optimize pop

Or closer to the Microsoft pragma*:

#pragma clang optimize("vectorize-loops", on)
     while (...) { }
#pragma clang optimize("vectorize-loops", off)

gcc* has some interesting pragmas in this space already. Would any of those do what you need, or did you have a different grammar in mind?

* http://msdn.microsoft.com/en-us/library/chh3fb0k.aspx
* http://gcc.gnu.org/onlinedocs/gcc/Function-Specific-Option-Pragmas.html

Alp.
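For readers unfamiliar with the machinery being referred to, this is how the existing `#pragma clang diagnostic` push/pop pair is used in ordinary code. The pragma spelling is existing Clang syntax; the function name `scopedDiagnosticDemo` is only illustrative, and the `__clang__` guard is there so other compilers never see the pragma.

```cpp
// Hosts a pragma-delimited region; the return value is arbitrary.
int scopedDiagnosticDemo() {
#if defined(__clang__)
// Save the current diagnostic state, then silence one warning.
#pragma clang diagnostic push
#pragma clang diagnostic ignored "-Wunused-variable"
#endif
    int scratch = 0;  // unused on purpose; the warning is silenced under Clang
#if defined(__clang__)
// Restore whatever state was active before the push.
#pragma clang diagnostic pop
#endif
    return 42;
}
```

The proposed `#pragma clang optimize push` / `pop` would follow the same shape, with an optimization flag in place of the warning flag.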

From: "Alp Toker" <alp@nuanti.com>
To: "Renato Golin" <renato.golin@linaro.org>, "Clang Dev" <cfe-dev@cs.uiuc.edu>
Cc: "Richard Smith" <richard@metafoo.co.uk>
Sent: Monday, January 6, 2014 9:12:04 AM
Subject: Re: [cfe-dev] Adding a new pragma to Clang

gcc* has some interesting pragmas in this space already. Would any of
those do what you need, or did you have a different grammar in mind?

I think he'll need something very close to the OpenMP syntax.

-Hal

Wasn't there an Intel project that implemented something along the lines of what he needed already?

Pragmas in clang are monolithic with their own parse rules and state
management so it's a bit of work. As such it's better to come up with a
plan to support different kinds of optimization flags beyond just
vectorization that may be needed in future, and ideally to reuse an
existing format if a sane one exists.

Hi Alp,

I'd love to re-use pragma optimize, is it implemented already? I have to
say, this is well beyond my knowledge in Clang to do it myself, if not.

I'm already going out of my way to get the vectorize pragmas into Clang, TBH... :frowning:

#pragma clang optimize push
#pragma clang optimize "-vectorize-loops"
    while (...) { }
#pragma clang optimize pop

This will only work for the first pragma, "vectorize on/off".

We have a list of other pragmas that need implementing, such as:
1. width(N)
2. unroll(N)
3. safe-program-order-dist(N)
4. safe-any-program-order
5. safe-any-order

The first two are available as metadata already in IR, the rest have not
been completely discussed and agreed on their semantics.

We should be able to combine any number of pragmas, with local context
overriding global, etc. But this is far too distant in the future. The main
target now is to get vectorize, width and unroll done at a loop level,
since this is the only thing that is used in IR today.

Or closer to the Microsoft pragma*:

#pragma clang optimize("vectorize-loops", on)
    while (...) { }
#pragma clang optimize("vectorize-loops", off)

One would hope that, regardless of the syntax, the representation in the
AST, and therefore in IR, would be exactly the same.

cheers,
--renato

Or that.

I don't really mind *how* it's represented in C code, as long as it does
what we need in IR in the long run: lexical block level metadata.

cheers,
--renato

    I think he'll need something very close to the OpenMP syntax.

Or that.

I don't really mind *how* it's represented in C code, as long as it does what we need in IR in the long run: lexical block level metadata.

The C frontend syntax is the one that's going to ship and get used in tens of thousands of software projects, and which we might still be having to support 15 years later when the CPU architectures have all changed. It's quite likely in that lifetime that other vendors will try to do a compatible implementation too if it sees adoption.

So from my point of view the backend implementation, and even the AST representation if there is any, is a secondary almost trivial issue compared to specifying a new extension to the language. But that's my take as a parser maintainer :slight_smile:

Incidentally I don't think there needs to be an AST representation as such. This looks like it can be supported with a lookup map for statements / expressions, and synthesized attributes for functions.

The list of commands you posted in the previous email is a good starting point, but I'd like to see real code samples of what you want to work.

Can it annotate statements, declarations or both? Is it expected to survive through template instantiations? Do the directives require semantic analysis? Those are the questions that'll determine how this is implemented.

Hal's suggestion to plug into the OpenMP directives is also very interesting.

Alp.
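The "lookup map for statements" idea mentioned above can be pictured as a side table keyed by statement pointer, kept outside the AST nodes themselves. This is only a sketch under assumed names: `Stmt` here is a stand-in struct, not `clang::Stmt`, and `HintTable`, `attach` and `lookup` are hypothetical.

```cpp
#include <cassert>
#include <unordered_map>

// Stand-in for an AST statement node (hypothetical, not clang::Stmt).
struct Stmt {
    int id;
};

// Hypothetical hint record; 0 means "not specified".
struct LoopHints {
    int width = 0;
    int unroll = 0;
};

// Side table that associates hints with statements without changing the AST.
class HintTable {
public:
    void attach(const Stmt *S, const LoopHints &H) {
        assert(S && "cannot annotate a null statement");
        table_[S] = H;
    }

    // Returns nullptr when the statement carries no hints.
    const LoopHints *lookup(const Stmt *S) const {
        auto It = table_.find(S);
        return It == table_.end() ? nullptr : &It->second;
    }

private:
    std::unordered_map<const Stmt *, LoopHints> table_;
};
```

One consequence of this design is visible in the sketch: anything that clones a statement, such as template instantiation, produces a new pointer that the table knows nothing about, which is why the template question raised in the message matters.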

From: "Alp Toker" <alp@nuanti.com>
To: "Renato Golin" <renato.golin@linaro.org>, "Hal Finkel" <hfinkel@anl.gov>
Cc: "Richard Smith" <richard@metafoo.co.uk>, "Clang Dev" <cfe-dev@cs.uiuc.edu>
Sent: Monday, January 6, 2014 9:42:01 AM
Subject: Re: [cfe-dev] Adding a new pragma to Clang

>
> I think he'll need something very close to the OpenMP syntax.
>
>
> Or that.
>
> I don't really mind *how* it's represented in C code, as long as it
> does what we need in IR in the long run: lexical block level
> metadata.

The C frontend syntax is the one that's going to ship and get used in
tens of thousands of software projects, and which we might still be
having to support 15 years later when the CPU architectures have all
changed. It's quite likely in that lifetime that other vendors will try
to do a compatible implementation too if it sees adoption.

I recommend that, where practical, we use a syntax compatible with other implementations. For example, Intel's compiler implements some of these which I believe we'd like to have:

  ivdep
  loop_count
  vector/novector
  unroll/nounroll

http://software.intel.com/sites/products/documentation/doclib/iss/2013/compiler/cpp-lin/GUID-346BAAA5-CF2D-4A26-9194-CA840BFB34E5.htm

IBM's compiler also has a few of these (unroll, novector, etc.):

http://publib.boulder.ibm.com/infocenter/comphelp/v8v101/index.jsp?topic=%2Fcom.ibm.xlcpp8a.doc%2Fcompiler%2Fref%2Frupragen.htm

-Hal

So from my point of view the backend implementation, and even the AST
representation if there is any, is a secondary almost trivial issue
compared to specifying a new extension to the language. But that's my
take as a parser maintainer :slight_smile:

Incidentally I don't think there needs to be an AST representation as
such. This looks like it can be supported with a lookup map for
statements / expressions, and synthesized attributes for functions.

The list of commands you posted in the previous email is a good starting
point, but I'd like to see real code samples of what you want to work.

Can it annotate statements, declarations or both? Is it expected to
survive through template instantiations?

Yes.

Do the directives require semantic analysis?

This is an interesting point; some of them might.

Those are the questions that'll determine how this is implemented.

Hal's suggestion to plug into the OpenMP directives is also very
interesting.

Just trying to figure out how to not reinvent the wheel yet again :wink:

-Hal

So from my point of view the backend implementation, and even the AST
representation if there is any, is a secondary almost trivial issue
compared to specifying a new extension to the language. But that's my take
as a parser maintainer :slight_smile:

Right... "I don't care" == "I'm not qualified to choose the best
representation; whatever the front-end folks think is best, I'm fine with
it". :smiley:

I agree with your rationale to the letter, and whatever you guys think is
the most future-proof, we'll go with it.

Can it annotate statements, declarations or both? Is it expected to survive
through template instantiations? Do the directives require semantic
analysis? Those are the questions that'll determine how this is implemented.

Adding Arnold, as he was the one planning all this...

AFAIK, it's supposed to annotate statements only, not declarations. It
should survive anything the pre-processor can throw, but templates will be
a big problem if the types change between instantiations for enable, unroll
and width. If I got it right, safety pragmas should be on. A
safe-with-dist(N) will only apply if the type * lanes can fit into the
vector registers, so it should be safe to annotate a templated loop and let
the vectorizer decide if it's still safe or not.

The idea is to proceed in these steps:

1. Add enable/width/unroll to loops only. So far, only the loop vectorizer
will be looking at these. No nesting, since the loop vectorizer can only
work on innermost loops. Metadata added to outer loops will be ignored and
a warning should be printed to the user; I don't expect the front-end to
know all use cases and descend the loop tree automatically.

2. Add enable/width/unroll to lexical blocks (functions, ifs, blocks) but
not namespaces, structures, classes, declarations, definitions, etc. This
way, the SLP-vectorizer can also be controlled via these flags. Similar
metadata semantics as above.
2.a Work out if we want to nest pragmas. If so, an *if* inside a
*function* (both with pragmas) should have their metadata either:
  * plain, and it'll be the job of the vectorizer to walk down the lexical
blocks to find additional parameters, or
  * merged, and the front-end should annotate in a local-override-global
manner and the vectorizer only reads the local metadata.
  * This will depend on how inlining will change the semantics of the code.
It's too early to think about this, I think.

3. Add safe pragmas with the same constraints as the ones above. This
should be done only *after* safety analysis is implemented in the
vectorizer, and should be the simplest part of the front-end change, since
all the work should've been done already by 1 and 2.

cheers,
--renato
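The "merged" option in step 2.a, where the front-end resolves nested pragmas in a local-override-global manner, amounts to a small merge rule. The sketch below is an assumption about what such a rule could be, not an agreed design; `LoopHints` and `mergeHints` are hypothetical names, and a value of 0 is used to mean "not specified".

```cpp
// Hypothetical hint record for one lexical block.
struct LoopHints {
    bool hasEnable = false;  // whether enable/disable was given at all
    bool enable = false;     // the enable/disable value, if given
    int width = 0;           // 0 = not specified
    int unroll = 0;          // 0 = not specified
};

// Start from the enclosing block's hints and let the inner block override
// every field it specifies; unspecified fields are inherited.
LoopHints mergeHints(const LoopHints &outer, const LoopHints &inner) {
    LoopHints merged = outer;
    if (inner.hasEnable) {
        merged.hasEnable = true;
        merged.enable = inner.enable;
    }
    if (inner.width != 0)
        merged.width = inner.width;
    if (inner.unroll != 0)
        merged.unroll = inner.unroll;
    return merged;
}
```

With the "plain" option the front-end would skip this step and attach each block's own hints unchanged, leaving the vectorizer to walk outward and apply the same rule itself.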


I recommend that, where practical, we use a syntax compatible with other implementations. For example, Intel's compiler implements some of these which I believe we'd like to have:

ivdep
loop_count
vector/novector
unroll/nounroll

http://software.intel.com/sites/products/documentation/doclib/iss/2013/compiler/cpp-lin/GUID-346BAAA5-CF2D-4A26-9194-CA840BFB34E5.htm

IBM's compiler also has a few of these (unroll, novector, etc.):

http://publib.boulder.ibm.com/infocenter/comphelp/v8v101/index.jsp?topic=%2Fcom.ibm.xlcpp8a.doc%2Fcompiler%2Fref%2Frupragen.htm

I think it is okay for clang to develop its own coherent syntax. If we take syntax from different compilers we get non-coherent syntax, and users will probably still have to use macros because different compilers and compiler versions will support a different set of those directives (ivdep, simd, openmp, etc.) with possibly slightly different semantics or requirements (for example, Intel’s simd pragma requires the loop induction variable to be signed).

Renato is trying to add pragmas that control the vectorizer. For this, I think an intuitive syntax interface would be a "#pragma subject attribute(value), attribute(value), ..." structure.

#pragma vectorize enable/disable
for () {
  a[i] = ...
}

#pragma vectorize unroll(4)
for () {
  a[i] = ...
}

#pragma vectorize width(4)
for () {
  a[i] = …
}

#pragma vectorize safe(...)
for () {
  a[i] = …
}

http://llvm.org/bugs/show_bug.cgi?id=18086#c0 has some comments about what those attributes do.
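To make the "attribute(value), attribute(value), ..." shape concrete, here is a minimal sketch of splitting the text that follows the pragma subject into name/value pairs. It is a plain string parser written for illustration, not Clang's preprocessor-token-based pragma handling, and `parseHintList` is a hypothetical name.

```cpp
#include <cctype>
#include <map>
#include <string>

// Parses text such as "width(4), unroll(2)" into {"width": "4", "unroll": "2"}.
// A bare keyword such as "enable" is stored with an empty value.
// Returns false on malformed input; `out` may then be partially filled.
bool parseHintList(const std::string &text,
                   std::map<std::string, std::string> &out) {
    std::size_t i = 0;
    const std::size_t n = text.size();
    auto skipSpace = [&] {
        while (i < n && std::isspace(static_cast<unsigned char>(text[i])))
            ++i;
    };

    for (;;) {
        skipSpace();
        // Attribute name: letters, digits, '_' or '-'.
        const std::size_t start = i;
        while (i < n && (std::isalnum(static_cast<unsigned char>(text[i])) ||
                         text[i] == '_' || text[i] == '-'))
            ++i;
        if (i == start)
            return false;  // expected a name here
        const std::string name = text.substr(start, i - start);

        // Optional parenthesised value.
        std::string value;
        skipSpace();
        if (i < n && text[i] == '(') {
            const std::size_t close = text.find(')', i);
            if (close == std::string::npos)
                return false;  // unterminated value
            value = text.substr(i + 1, close - i - 1);
            i = close + 1;
            skipSpace();
        }
        out[name] = value;

        if (i == n)
            return true;   // consumed the whole list
        if (text[i] != ',')
            return false;  // only a comma may separate attributes
        ++i;
    }
}
```

A trailing comma or a missing closing parenthesis is rejected, which is the kind of diagnostic decision a real grammar for these pragmas would also have to make.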


Hal's suggestion to plug into the OpenMP directives is also very
interesting.

Just trying to figure out how to not reinvent the wheel yet again :wink:

I think OpenMP is a separate language extension that can reuse common infrastructure. I think it is beneficial for clang to have its own syntax tailored to the functionality of clang (for example: clang recognizes reductions and inductions; in OpenMP you seem to have to annotate them as reduction and linear).

In my opinion, clang vector pragmas should be easy to understand and use and be tailored to the functionality that the clang/llvm infrastructure provides.

Thanks,
Arnold

From: "Arnold Schwaighofer" <aschwaighofer@apple.com>
To: "Hal Finkel" <hfinkel@anl.gov>
Cc: "Alp Toker" <alp@nuanti.com>, "Richard Smith" <richard@metafoo.co.uk>, "Clang Dev" <cfe-dev@cs.uiuc.edu>, "Renato Golin" <renato.golin@linaro.org>, "Nadav Rotem" <nrotem@apple.com>
Sent: Tuesday, January 7, 2014 5:19:26 PM
Subject: Re: [cfe-dev] Adding a new pragma to Clang


I think it is okay for clang to develop its own coherent syntax. If
we take syntax from different compilers we get non-coherent syntax,
and users will probably still have to use macros because different
compilers and compiler versions will support a different set of
those directives (ivdep, simd, openmp, etc.) with possibly slightly
different semantics or requirements (for example, Intel’s simd
pragma requires the loop induction variable to be signed).

And you'll note that I specifically omitted Intel's simd pragma from the list above :wink:

Renato is trying to add pragmas that control the vectorizer. For
this, I think an intuitive syntax interface would be a "#pragma
subject attribute(value), attribute(value), ..." structure.

I agree with this, up to a point. We need to be careful about exposing implementation details at this level, not only because it is unfriendly, but because those details might change in the future. The fact that the vectorizer, as a monolithic object, handles these things now, and is essentially the only user of this information currently, does not mean it will always be this way.

For unrolling, I'm specifically against tying the syntax in any way to the vectorizer. The vectorizer can unroll without vectorizing, and that's an implementation detail. We also have a generic (concatenation) unroller, and that should also be controlled by the same syntax. I propose that, for unrolling we accept something like this:

#pragma nounroll[(interleave/concatenate/all)] -- all is the default
#pragma unroll[(n[, interleave/concatenate/any])] -- any is the default

so that the user can, optionally, specify what kind of unrolling to perform.

If you really want a 'subject' in the pragma, we could use 'loop' to give us something like #pragma loop nounroll, etc.
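The defaults in the proposed unroll grammar ("all" for nounroll, "any" for unroll) can be written down as a small data model. This is only a sketch of the proposal as stated in this message; the type and function names are hypothetical and nothing here reflects an implemented Clang interface.

```cpp
#include <cassert>

// The unrolling strategies named in the proposal, plus the two defaults.
enum class UnrollKind { Interleave, Concatenate, Any, All };

struct UnrollDirective {
    bool disabled;    // true for nounroll
    int count;        // 0 = no count given, let the compiler choose
    UnrollKind kind;  // which unroller(s) the directive addresses
};

// Models: #pragma nounroll[(interleave/concatenate/all)], "all" is the default.
UnrollDirective makeNoUnroll(UnrollKind kind = UnrollKind::All) {
    return UnrollDirective{true, 0, kind};
}

// Models: #pragma unroll[(n[, interleave/concatenate/any])], "any" is the default.
UnrollDirective makeUnroll(int count = 0, UnrollKind kind = UnrollKind::Any) {
    assert(count >= 0 && "unroll count cannot be negative");
    return UnrollDirective{false, count, kind};
}
```

Keeping the kind separate from the count is what lets the same directive drive either the vectorizer's interleaving or the generic concatenation unroller, which is the decoupling argued for above.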

Regarding memory access independence, we should also not tie this to the vectorizer. It can be used by alias analysis and affect many other things. It is true that taking Intel's ivdep, or similar, may have a lot of semantic baggage that we don't want, but having a syntax which specifically says 'safe for vectorization' may be too strong for cases where vectorization itself is not at issue. At the C++ standards committee meeting in Chicago, we discussed this issue somewhat (in the context of some parallelization/vectorization proposals), and I believe that we settled on the term 'unsequenced' to describe the semantics we'd like here. I think adopting that makes sense. Maybe something like:

#pragma loop unsequenced
or
#pragma loop_unsequenced
or just
#pragma unsequenced

And while you're right that people need to ifdef their code anyway for some things, we should not make it more difficult than necessary. If the syntax adopted by other vendors is not horrid, and has the semantics we desire, then we should follow the trend.

-Hal

Note: I have no skin in this game. But C++11 has sane attributes that
you may want to use instead of a pragma. They can be placed on
statements as well as declarations (they can be placed on just about
anything, actually). This would then give the information an AST
representation that might prove useful in other ways that a pragma
would not. Just a random thought.

~Aaron

From: "Aaron Ballman" <aaron@aaronballman.com>
To: "Hal Finkel" <hfinkel@anl.gov>
Cc: "Arnold Schwaighofer" <aschwaighofer@apple.com>, "Nadav Rotem" <nrotem@apple.com>, "Richard Smith"
<richard@metafoo.co.uk>, "Clang Dev" <cfe-dev@cs.uiuc.edu>
Sent: Wednesday, January 8, 2014 8:22:18 AM
Subject: Re: [cfe-dev] Adding a new pragma to Clang


Note: I have no skin in this game. But C++11 has sane attributes that
you may want to use instead of a pragma. They can be placed on
statements as well as declarations (they can be placed on just about
anything, actually). This would then give the information an AST
representation that might prove useful in other ways that a pragma
would not. Just a random thought.

I'd like to see both. We do need to support C, C++03, etc.

-Hal


This would be such a win over a new pragma that it's potentially a good idea to enable C++11 attributes in C and other language standards to get this. It solves entire chunks of the specification and implementation questions posed in this thread.

If we went that way, portable usage would still be easily achieved with a wrapper macro that gets defined away on incompatible compilers. In fact it'd be easier to make portable than code using a pragma.

I suspect the written form would be more logical as it's clear which loop the attributed statement applies to, and the AST representation / template instantiation would be handled for us.

The downside I can see to this proposal is that Objective-C would require disambiguation not only in C++ mode but also in C if such a change were made. Perhaps we can define a macro-like compatibility wrapper in C mode.

I'll think about this a bit more and see if it's viable. Very smart idea from Aaron if we can make it work in some form.

Alp.
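The portability point above can be illustrated by comparing the two wrapper macros side by side. Everything compiler-specific in this sketch is a placeholder: `HAVE_LOOP_HINT_ATTRIBUTE` and `HAVE_LOOP_HINT_PRAGMA` are hypothetical feature macros a build system would define, and the attribute and pragma spellings are invented for illustration, not syntax any compiler is known to accept. With neither macro defined, both wrappers expand to nothing and the code compiles everywhere.

```cpp
// Attribute form: the macro expands to an ordinary token sequence.
#if defined(HAVE_LOOP_HINT_ATTRIBUTE)
#define LOOP_HINT_ATTR(n) [[hypothetical::vectorize_width(n)]]
#else
#define LOOP_HINT_ATTR(n) /* defined away on incompatible compilers */
#endif

// Pragma form: a #pragma line cannot appear in a macro body, so the
// _Pragma operator (C99, C++11) is needed, and it only takes a string literal.
#if defined(HAVE_LOOP_HINT_PRAGMA)
#define LOOP_HINT_PRAGMA _Pragma("hypothetical_vectorize width(4)")
#else
#define LOOP_HINT_PRAGMA
#endif

// The attribute sits directly on the statement it applies to.
int sumWithAttribute(const int *a, int n) {
    int total = 0;
    LOOP_HINT_ATTR(4) for (int i = 0; i < n; ++i)
        total += a[i];
    return total;
}

// The pragma applies to whatever statement happens to follow it.
int sumWithPragma(const int *a, int n) {
    int total = 0;
    LOOP_HINT_PRAGMA
    for (int i = 0; i < n; ++i)
        total += a[i];
    return total;
}
```

Note that the attribute wrapper can take its argument as a normal macro parameter, while the pragma wrapper has to bake the value into a string, which is one reason the attribute form is easier to wrap.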

For unrolling, I'm specifically against tying the syntax in any way to the
vectorizer. The vectorizer can unroll without vectorizing, and that's an
implementation detail. We also have a generic (concatenation) unroller, and
that should also be controlled by the same syntax. I propose that, for
unrolling we accept something like this:

This is a good point.

#pragma nounroll[(interleave/concatenate/all)] -- all is the default

#pragma unroll[(n[, interleave/concatenate/any])] -- any is the default

But this is getting *seriously* out of my area of expertise... :wink:

I agree with Arnold that any generic alternative will be too complex for
current purposes, but I also agree with the front-end crowd that we should
do it right for once. Given my lack of experience with parsers and utter
ignorance of Clang, I'd rather defer that implementation to someone more
knowledgeable.

I like the idea of using C++11 attributes, but this could complement,
not replace, the pragmas, as Hal pointed out.

cheers,
--renato

Hi Alp,

Following the discussions on IRC, I think we might have some entry
point into the vectorizer metadata using C++11 attributes first, then
pragmas later.

Having tried to add a pragma, I know how hard it is, and given the high
levels of community interest, I think starting with the attributes will be
the fastest way to get it working.

Arnold,

Do you think this would be a set-back? Having C++11 handled, we can at
least start creating specialized tests and benchmarks, which is the most
important thing for the vectorizer right now. How the pragmas will behave
is a matter for further discussion, but in the end, both attribute and
pragma should generate the same metadata in IR.

cheers,
--renato

From: "Renato Golin" <renato.golin@linaro.org>
To: "Alp Toker" <alp@nuanti.com>, "Arnold Schwaighofer" <aschwaighofer@apple.com>
Cc: "Hal Finkel" <hfinkel@anl.gov>, "Aaron Ballman" <aaron@aaronballman.com>, "Richard Smith"
<richard@metafoo.co.uk>, "Clang Dev" <cfe-dev@cs.uiuc.edu>, "Nadav Rotem" <nrotem@apple.com>
Sent: Wednesday, January 8, 2014 10:43:28 AM
Subject: Re: [cfe-dev] Adding a new pragma to Clang


Having tried to add a pragma, I know how hard it is, and given the
high levels of community interest, I think starting with the
attributes will be the fastest way to get it working.

I agree that adding attributes is much easier; and I'm fine with taking this approach. I think that, in general, we should work on an infrastructure that lets the pragma parsing build off of the attribute handling. I don't see any real reason to have separate code paths for the two (in those cases where they are doing similar things).

In the long run we should, however, definitely support a pragma syntax. That is the proper, standard, way of providing these kinds of extensions in pre-C++11 languages.

I'll add that, as I understand it, one of the original design motivations for C++11 attributes was so that extensions like OpenMP could be specified in terms of C++ attributes instead of or in addition to pragmas. I don't think the OpenMP ARB has yet acted on this, but I expect that as C++11 support is specifically addressed in upcoming versions of the OpenMP specification, we might see movement in this direction.

-Hal

I agree that adding attributes is much easier; and I'm fine with taking
this approach. I think that, in general, we should work on an
infrastructure that lets the pragma parsing build off of the attribute
handling. I don't see any real reason to have separate code paths for the
two (in those cases where they are doing similar things).

Agreed.

In the long run we should, however, definitely support a pragma syntax.
That is the proper, standard, way of providing these kinds of extensions in
pre-C++11 languages.

Agreed.

I'll add that, as I understand it, one of the original design motivations
for C++11 attributes was so that extensions like OpenMP could be specified
in terms of C++ attributes instead of or in addition to pragmas. I don't
think the OpenMP ARB has yet acted on this, but I expect that as C++11
support is specifically addressed in upcoming versions of the OpenMP
specification, we might see movement in this direction.

This is interesting. We can add the attributes in a "beta" stage, and change
them to be closer to whatever OpenMP comes up with, just to make things
easier for users. Since this attribute is going to be used by only a handful
of people for now, mostly us and tests, I think it should be OK.

cheers,
--renato

Since it sounds like the initial use is for internal testing and "bleeding edge" users, I'd suggest naming these in ways which clearly denote their experimental (and likely to change) status. (e.g. "__clang_experimental_unroll" rather than "unroll".) Once we're happy with the usage model, we can come back to the bikeshed discussion about naming.

Philip

Hi,

I understand that adding vectorization attributes will be easier than pragmas. But, what will the syntax look like? Can someone provide a small example?

Thanks,
Nadav