Filter optimization remarks by the hotness of the code region

This idea came up a few times recently [1][2], so I’d like to start prototyping it. To summarize, we can emit optimization remarks using the -Rpass* options. These are currently emitted by optimizations like vectorization [3], unrolling, inlining and, since last week, loop distribution.

For large programs, however, this can amount to a lot of diagnostic output to sift through. Filtering it by the hotness of the region can help focus the user on the performance opportunities that are likely to pay off.

The approach I am thinking of taking is to install a wrapper as the diagnostics handler that will only forward to the original handler if the region of code is considered hot. This will be installed by a new pass that will use BlockFrequencyInfo to determine the top N hot regions.
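
To make the wrapper idea concrete, here is a minimal, self-contained C++ sketch. It does not use the real LLVM diagnostics API: `Remark`, `RemarkHandler`, `topNThreshold` and `makeHotnessFilter` are hypothetical names, and the raw `Frequency` field merely stands in for whatever BlockFrequencyInfo would report for the region. Picking the cutoff as the frequency of the N-th hottest region is one possible reading of "top N hot regions", not a settled design.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for an optimization remark; this is not LLVM's
// DiagnosticInfo class.
struct Remark {
  std::string Pass;        // e.g. "loop-vectorize"
  std::string Message;     // human-readable text of the remark
  std::uint64_t Frequency; // assumed block frequency of the remark's region
};

using RemarkHandler = std::function<void(const Remark &)>;

// Returns the frequency of the N-th hottest region, to be used as a cutoff.
// With N or fewer regions, the cutoff is 0 and every region counts as hot.
std::uint64_t topNThreshold(std::vector<std::uint64_t> Frequencies,
                            std::size_t N) {
  assert(N > 0 && "N must be positive");
  if (Frequencies.size() <= N)
    return 0;
  std::nth_element(Frequencies.begin(), Frequencies.begin() + (N - 1),
                   Frequencies.end(), std::greater<std::uint64_t>());
  return Frequencies[N - 1];
}

// Wraps Original so that only remarks whose region is at least as hot as
// Threshold are forwarded; colder remarks are dropped.
RemarkHandler makeHotnessFilter(RemarkHandler Original,
                                std::uint64_t Threshold) {
  return [Original, Threshold](const Remark &R) {
    if (R.Frequency >= Threshold)
      Original(R);
  };
}
```

A pass would compute the threshold once per module and then install the handler returned by `makeHotnessFilter` in place of the original one; the original handler itself stays unchanged.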

This is at a very early stage right now. I would appreciate any feedback.

Thanks,
Adam

[1] http://lists.llvm.org/pipermail/llvm-dev/2016-April/098492.html
[2] http://lists.llvm.org/pipermail/cfe-dev/2016-April/048526.html
[3] Loop Vectorization: Diagnostics and Control - The LLVM Project Blog

I think it is a good idea, and it reminds me of a discussion about Polly at the last llvm-dev meeting, where we considered limiting the compile-time impact by running Polly only on the code that is deemed to be "hot".
There could be the same kind of logic for things like LoopVersioningLICMPass, or for specific optimizations such as vectorization: if the remark is not relevant because the user should not care about this loop, why does the optimizer care in the first place?

Sure. The vectorizer has LoopVectorizeWithBlockFrequency, which was meant to adapt the aggressiveness of the vectorizer to the hotness of the code. I think it got turned off by default because at the time BFI didn’t really provide a good measure of hotness. This should probably be looked at again, especially if as a result we could turn on vectorization *with versioning* for -Os.

Also, just to be clear, the main use case is to apply -Rpass-missed/-Rpass-analysis with PGO and see why we miss in hot regions.

Adam

Hi Adam,

I think this would be a really useful feature to have. I don't think that the backend should be responsible for filtering, but should pass the relative hotness information to the frontend. Given that these diagnostics are not just going to be used for -Rpass and friends, but also for generating reports by other tools (see the discussion around D19678, for example), I think it is important to allow the frontend to filter. The frontend, or other tool, might also want to collect information from different compilation jobs to provide the user with an overall ranking.

The default diagnostic, furthermore, does not provide enough information to identify a code "region". I think that the pass generating the diagnostic needs to provide the information; however, we could certainly create some utility functions that take a pointer to the BFI analysis and a Value* and do the right thing in most simple cases.
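
As a rough illustration of what such a utility could look like, here is a small C++ sketch. It is deliberately decoupled from LLVM: `BlockFrequencies` is a hypothetical stand-in for the BFI analysis result, blocks are identified by name rather than through a Value*, and "relative hotness" is assumed to mean the block frequency divided by the entry frequency. None of these choices comes from an existing API.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical stand-in for the per-function result of a block frequency
// analysis; LLVM's BlockFrequencyInfo has a different interface.
struct BlockFrequencies {
  std::uint64_t EntryFrequency;                   // frequency of the entry block
  std::map<std::string, std::uint64_t> Frequency; // keyed by basic block name
};

// Computes the hotness of Block relative to the function entry. Returns false
// (leaving Hotness untouched) when no usable frequency data exists, so the
// caller can still emit the remark, just without a hotness value.
bool getRelativeHotness(const BlockFrequencies *BFI, const std::string &Block,
                        double &Hotness) {
  assert(!Block.empty() && "block name must not be empty");
  if (!BFI || BFI->EntryFrequency == 0)
    return false;
  std::map<std::string, std::uint64_t>::const_iterator It =
      BFI->Frequency.find(Block);
  if (It == BFI->Frequency.end())
    return false;
  Hotness = static_cast<double>(It->second) /
            static_cast<double>(BFI->EntryFrequency);
  return true;
}
```

Accepting a null analysis pointer keeps the helper usable in builds without profile data, where the remark would simply carry no hotness.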

Thanks again,
Hal

Hi Hal,

> Hi Adam,
>
> I think this would be a really useful feature to have. I don't think that the backend should be responsible for filtering, but should pass the relative hotness information to the frontend. Given that these diagnostics are not just going to be used for -Rpass and friends, but also for generating reports by other tools (see the discussion around D19678, for example), I think it is important to allow the frontend to filter.

I am not sure I follow; can you please elaborate? Are you saying that, for example, in the listing use case we don’t want the filtered diagnostics? In other words, it should be up to the remark handler to decide whether it wants filtered or unfiltered remarks?

> The frontend, or other tool, might also want to collect information from different compilation jobs to provide the user with an overall ranking.

Strictly speaking about PGO, the different compilation jobs get the same PGO with the aggregated profile, thus the hotness calculated should be global. I am not sure why an extra aggregation step is necessary.

> The default diagnostic, furthermore, does not provide enough information to identify a code "region". I think that the pass generating the diagnostic needs to provide the information; however, we could certainly create some utility functions that take a pointer to the BFI analysis and a Value* and do the right thing in most simple cases.

Good point.

Thanks for your feedback.

Adam

From: "Adam Nemet" <anemet@apple.com>
To: "Hal Finkel" <hfinkel@anl.gov>
Cc: "llvm-dev (llvm-dev@lists.llvm.org)" <llvm-dev@lists.llvm.org>
Sent: Wednesday, May 11, 2016 1:15:42 AM
Subject: Re: Filter optimization remarks by the hotness of the code region

> Hi Hal,
>
> > Hi Adam,
> >
> > I think this would be a really useful feature to have. I don't think
> > that the backend should be responsible for filtering, but should
> > pass the relative hotness information to the frontend. Given that
> > these diagnostics are not just going to be used for -Rpass and
> > friends, but also for generating reports by other tools (see the
> > discussion around D19678, for example), I think it is important to
> > allow the frontend to filter.
>
> I am not sure I follow, can you please elaborate. Are you saying
> that for example in the listing use case we don’t want the filtered
> diagnostics? In other words it should be up to the remark handler
> to decide whether it wants filtered or unfiltered remarks?

We might or might not want them. The user might want to select different ratios and filters.

> > The frontend, or other tool, might also want to collect information
> > from different compilation jobs to provide the user with an
> > overall ranking.
>
> Strictly speaking about PGO, the different compilation jobs get the
> same PGO with the aggregated profile, thus the hotness calculated
> should be global. I am not sure why an extra aggregation step is
> necessary.

I agree. However, I think that the frontend might employ a combination of factors in deciding what information to present. We might, for example, have to pick different hotness thresholds for different kinds of remarks.

Especially since we're likely going with a design for the optimization reports where the frontend just creates some YAML files with the diagnostic information, and then a separate tool processes the files to produce reports, I think that we should give those tools the maximum amount of practical flexibility. Such a tool might provide the user with non-trivial filtering options.
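
To illustrate the kind of filtering a separate report tool could do once hotness travels with each remark, here is a hedged C++ sketch that applies a different hotness threshold per remark kind and ranks the survivors. The `RemarkRecord` and `Thresholds` types, the `selectRemarks` function and the 0-to-1 hotness scale are all assumptions made for the example; nothing here reflects an existing tool.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

enum class RemarkKind { Passed, Missed, Analysis };

// Hypothetical record, as a report tool might hold it after parsing its input.
struct RemarkRecord {
  RemarkKind Kind;
  std::string Pass;
  double Hotness; // relative hotness; the scale is an assumption
};

// Minimum hotness a remark of each kind needs in order to be shown.
struct Thresholds {
  double Passed;
  double Missed;
  double Analysis;
};

static double thresholdFor(RemarkKind Kind, const Thresholds &T) {
  switch (Kind) {
  case RemarkKind::Passed:
    return T.Passed;
  case RemarkKind::Missed:
    return T.Missed;
  case RemarkKind::Analysis:
    return T.Analysis;
  }
  assert(false && "unknown remark kind");
  return 0.0;
}

// Keeps the remarks that meet the threshold for their kind and orders them
// from hottest to coldest; ties keep their original relative order.
std::vector<RemarkRecord> selectRemarks(const std::vector<RemarkRecord> &All,
                                        const Thresholds &T) {
  std::vector<RemarkRecord> Selected;
  for (const RemarkRecord &R : All)
    if (R.Hotness >= thresholdFor(R.Kind, T))
      Selected.push_back(R);
  std::stable_sort(Selected.begin(), Selected.end(),
                   [](const RemarkRecord &A, const RemarkRecord &B) {
                     return A.Hotness > B.Hotness;
                   });
  return Selected;
}
```

Because the policy lives entirely in the tool, a user could lower the threshold for missed optimizations while keeping passed ones nearly silent, without recompiling anything.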

Thanks again,
Hal

I think this all makes sense. So I guess the steps are:

  1. YAML support for the existing remarks
  2. Add optional relative hotness to the opt-remark API
  3. Expose relative hotness in the YAML output

Are you working on 1 or should I get started?
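
For the sake of discussion, the three steps above could be sketched in C++ as follows: a remark record with an optional hotness value (step 2) and a serializer that writes it out as a YAML document, including the hotness only when it is present (steps 1 and 3). The `Remark` struct, its field names and the YAML layout are purely illustrative assumptions, not a proposed or existing format, and the quoting helper assumes single-line strings.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <string>

// Hypothetical remark record; the field names and the YAML layout below are
// illustrative only.
struct Remark {
  std::string Kind;      // "Passed", "Missed" or "Analysis"
  std::string Pass;      // e.g. "loop-vectorize"
  std::string File;      // source file of the remark
  unsigned Line;         // source line of the remark
  std::string Message;   // human-readable text
  bool HasHotness;       // hotness is optional
  std::uint64_t Hotness; // only meaningful when HasHotness is true
};

// Wraps S in double quotes, escaping backslashes and double quotes.
// Assumes S contains no newlines or other control characters.
static std::string quoted(const std::string &S) {
  std::string Out = "\"";
  for (char C : S) {
    if (C == '\\' || C == '"')
      Out += '\\';
    Out += C;
  }
  Out += '"';
  return Out;
}

// Serializes one remark as a YAML document; the Hotness key is omitted
// entirely when no profile information was available.
std::string toYAML(const Remark &R) {
  assert(!R.Pass.empty() && "a remark must name the pass that produced it");
  std::ostringstream OS;
  OS << "---\n"
     << "Kind: " << R.Kind << "\n"
     << "Pass: " << R.Pass << "\n"
     << "File: " << quoted(R.File) << "\n"
     << "Line: " << R.Line << "\n"
     << "Message: " << quoted(R.Message) << "\n";
  if (R.HasHotness)
    OS << "Hotness: " << R.Hotness << "\n";
  return OS.str();
}
```

Omitting the key instead of writing a zero lets a consumer tell "cold" apart from "no profile data".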

There was another issue that came up while discussing this with John McCall. We could have a pretty large number of such remarks. I imagine that if the post-processing is performed by an external tool, it may be useful to effectively enable all optimization remarks (i.e. -Rpass/-Rpass-missed/-Rpass-analysis=loop-vectorize/inline/etc.). For large programs this may not be feasible, and we may need to filter the remarks locally in LLVM. This, however, seems like a somewhat orthogonal issue that we could probably postpone for now.

Thanks for your input!
Adam

From: "Adam Nemet" <anemet@apple.com>
To: "Hal Finkel" <hfinkel@anl.gov>
Cc: "llvm-dev (llvm-dev@lists.llvm.org)" <llvm-dev@lists.llvm.org>, "John McCall" <rjmccall@apple.com>
Sent: Wednesday, May 11, 2016 12:45:32 PM
Subject: Re: Filter optimization remarks by the hotness of the code region

> I think this all makes sense. So I guess the steps are:
>
> 1. YAML support for the existing remarks
> 2. Add optional relative hotness to the opt-remark API
> 3. Expose relative hotness in the YAML output
>
> Are you working on 1 or should I get started?

This is on my TODO list, but I've not started yet. Next few days seem unlikely too.

One thing we should now think about is where this YAML encoding happens. It could be in the backend, or it could be in the remark handler in the frontend. If there'll be no real programmatic interaction in the frontend with the remark content (other than serializing it into YAML), it might make sense to make the YAML-serialization capability a property of the remark itself, handled by its implementation in the backend.

> There was another issue that came up while discussing this with John
> McCall. We could have a pretty large number of such remarks. I
> imagine that if the post-processing is performed by the external
> tool, it may be useful to effectively enable all optimization
> remarks (i.e.
> -Rpass/-Rpass-missed/-Rpass-analysis=loop-vectorize/inline/etc.).
> For large programs this may not be feasible and we may need to
> locally filter the remarks in LLVM. This however seems like a
> somewhat orthogonal issue that we could probably postpone for now.

I've also thought about this, and this is certainly something we'll need to keep an eye on. It is specifically why I thought at first that we'd prefer to put the report-generation functionality into Clang. I'm happy, however, to postpone this unless and until it proves to be a problem. There's certainly a benefit to separating the functionality like this.

Thanks again,
Hal


> One thing we should now think about is where this YAML encoding happens. It could be in the backend, or it could be in the remark handler in the frontend. If there’ll be no real programmatic interaction in the frontend with the remark content (other than serializing it into YAML), it might make sense to make the YAML-serialization capability a property of the remark itself, handled by its implementation in the backend.

I was under the impression that your notion of source locations is opaquely encoded and so something in the frontend would need to interpret them. If that’s not the case, then it doesn’t matter much, I think, as long as the format supports certain kinds of query. For example, a tool should be able to start with a source location and find the interesting remarks about it + surrounding code. As long as the format allows that query to occur without needing pass-specific knowledge of a remark’s schema, it’s probably not all that important who outputs the data.
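
As one way to picture the query described above, here is a small C++ sketch of an index that a tool could build from the remark stream and then ask for everything near a given source location. It relies only on fields that every remark would share (pass name, file, line, message), so no pass-specific schema knowledge is needed. The `Remark` struct and the `RemarkIndex` class are hypothetical, and a real format might well carry columns or richer locations.

```cpp
#include <cassert>
#include <limits>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical generic remark: only the fields every pass would share.
struct Remark {
  std::string Pass;
  std::string File;
  unsigned Line;
  std::string Message;
};

// Indexes remarks by (file, line) so that a tool can start from a source
// location and find the remarks about it and the surrounding code.
class RemarkIndex {
public:
  void add(const Remark &R) {
    ByLocation.emplace(std::make_pair(R.File, R.Line), R);
  }

  // Returns the remarks in File whose line is within Context lines of Line,
  // ordered by line number.
  std::vector<Remark> near(const std::string &File, unsigned Line,
                           unsigned Context) const {
    assert(Line <= std::numeric_limits<unsigned>::max() - Context &&
           "line range overflows");
    unsigned First = Line > Context ? Line - Context : 0;
    unsigned Last = Line + Context;
    std::vector<Remark> Result;
    auto It = ByLocation.lower_bound(std::make_pair(File, First));
    auto End = ByLocation.upper_bound(std::make_pair(File, Last));
    for (; It != End; ++It)
      Result.push_back(It->second);
    return Result;
  }

private:
  std::multimap<std::pair<std::string, unsigned>, Remark> ByLocation;
};
```

Whether the backend or the frontend writes the data matters little to such an index, as long as the location fields are present in a uniform place for every remark.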

> > There was another issue that came up while discussing this with John McCall. We could have a pretty large number of such remarks. I imagine that if the post-processing is performed by the external tool, it may be useful to effectively enable all optimization remarks (i.e. -Rpass/-Rpass-missed/-Rpass-analysis=loop-vectorize/inline/etc.). For large programs this may not be feasible and we may need to locally filter the remarks in LLVM. This however seems like a somewhat orthogonal issue that we could probably postpone for now.

> I’ve also thought about this, and this is certainly something we’ll need to keep an eye on. It is specifically why I thought at first that we’d prefer to put the report-generation functionality into Clang. I’m happy, however, to postpone this unless and until it proves to be a problem. There’s certainly a benefit to separating the functionality like this.

Right, I think we can handle this as optional configuration data for the pass’s remark generation: the default is to generate all remarks (if this is enabled at all), but we can evolve ways to filter that, e.g. opting specific places into more comprehensive remarks.

John.

From: "John McCall" <rjmccall@apple.com>
To: "Hal Finkel" <hfinkel@anl.gov>
Cc: "Adam Nemet" <anemet@apple.com>, "llvm-dev (llvm-dev@lists.llvm.org)" <llvm-dev@lists.llvm.org>
Sent: Thursday, May 12, 2016 9:30:27 PM
Subject: Re: Filter optimization remarks by the hotness of the code region

> I was under the impression that your notion of source locations is
> opaquely encoded and so something in the frontend would need to
> interpret them. If that's not the case, then it doesn't matter much,
> I think, as long as the format supports certain kinds of query. For
> example, a tool should be able to start with a source location and
> find the interesting remarks about it + surrounding code. As long as
> the format allows that query to occur without needing pass-specific
> knowledge of a remark's schema, it's probably not all that important
> who outputs the data.

Right now, this works by forcing the frontend to at least produce line/column debug information, and the locations are derived from that.

-Hal

> I think this all makes sense. So I guess the steps are:
>
>   1. YAML support for the existing remarks
>   2. Add optional relative hotness to the opt-remark API

I submitted the first patch to implement this: http://reviews.llvm.org/D21771

Adam

>   3. Expose relative hotness in the YAML output

The RFC patch implementing 1 and 3 is at https://reviews.llvm.org/D24587.

Adam