Move InlineCost.cpp out of Analysis?

Hi,

After r256521, which removed the InlineCostAnalysis class, I think there is no strong reason for InlineCost.cpp to be part of the Analysis library. Is it fine to make it part of TransformUtils?

I submitted r266477 (which has now been reverted), which made Analysis depend on ProfileData in order to obtain the ProfileSummary for the module; but there is an existing dependency of ProfileData on Analysis (through Object and Bitcode), so this created a cycle. Moving InlineCost.cpp under Transforms/Utils would fix this issue. There are other ways to fix it (make Inliner.cpp get the ProfileSummary and pass it to InlineCost, for example), but I think it makes sense to move InlineCost.

Thoughts?

Thanks,
Easwaran

From: "Easwaran Raman" <eraman@google.com>
To: "via llvm-dev" <llvm-dev@lists.llvm.org>
Cc: "Chandler Carruth" <chandlerc@gmail.com>, "Hal Finkel" <hfinkel@anl.gov>, "Philip Reames"
<listmail@philipreames.com>, "David Li" <davidxl@google.com>
Sent: Monday, April 18, 2016 2:39:49 PM
Subject: Move InlineCost.cpp out of Analysis?

Hi,

After r256521 - which removes InlineCostAnalysis class - I think
there is no strong reason for InlineCost.cpp to be part of the
Analysis library. Is it fine to make it part of TransformUtils?

Given that InlineCost is not really an analysis any longer, I think this is fine.

-Hal

From: "Easwaran Raman" <eraman@google.com>
To: "via llvm-dev" <llvm-dev@lists.llvm.org>
Cc: "Chandler Carruth" <chandlerc@gmail.com>, "Hal Finkel" <hfinkel@anl.gov>, "Philip Reames"
<listmail@philipreames.com>, "David Li" <davidxl@google.com>
Sent: Monday, April 18, 2016 2:39:49 PM
Subject: Move InlineCost.cpp out of Analysis?

Hi,

After r256521 - which removes InlineCostAnalysis class - I think
there is no strong reason for InlineCost.cpp to be part of the
Analysis library. Is it fine to make it part of TransformUtils?

Given that InlineCost is not really an analysis any longer, I think this is fine.

Isn't it? It is not a pass, but I see it as an analysis utils.

I submitted r266477 (which has now been reverted) that made Analysis
depend on ProfileData in order to obtain ProfileSummary for the
module, but there is an existing dependency of ProfileData on
Analysis (through Object and BitCode).

The real issue is that BitCode depends on Analysis I think.
I'm not sure about ProfileData that depends on Bitcode, do you know why?

The difference between Analysis and Transforms is not about passes, but about what the code does.

Code for mutating the IR should be in Transforms, and code that analyzes the IR without mutating it should be in Analysis. This is why, for example, InstructionSimplify is in Analysis – it does not mutate the IR in any way.

So I think InlineCost and similar things should stay in the Analysis library regardless of whether they are passes or not.

Thanks for the comments.

Chandler Carruth replied:

The difference between Analysis and Transforms is *not* about passes, but
about what the code *does*.

Code for mutating the IR should be in Transforms, and code that analyzes
the IR without mutating it should be in Analysis. This is why, for example,
InstructionSimplify is in Analysis -- it does not mutate the IR in any way.

So I think InlineCost and similar things should stay in the Analysis
library regardless of whether they are passes or not.

Is that the only criterion (IR modification or not)? Most transformations
have pass-specific analyses (not shared with other clients) -- that code
stays with the transformation -- it does not make sense to move it into
Analysis. For the same reason, we should first ask: is InlineCost intended
to be a shared utility?

David


Mehdi Amini replied:

> Given that InlineCost is not really an analysis any longer, I think this
> is fine.

Isn't it? It is not a pass, but I see it as an analysis utility.

Yes, I meant that it is not an analysis pass. It does perform analysis. I
see one such example of something that performs analysis being added under
Transforms/Utils: CmpInstAnalysis.cpp.

If CmpInstAnalysis.cpp is not mutating the IR, it shouldn't sit here in my
opinion.

> I submitted r266477 (which has now been reverted) that made Analysis
> depend on ProfileData in order to obtain ProfileSummary for the module,
> but there is an existing dependency of ProfileData on Analysis (through
> Object and Bitcode).

The real issue is that Bitcode depends on Analysis, I think.

I think that is due to ThinLTO's use of getBlockProfileCount.

Yeah, I know but we should find a way to break this...

I'm not sure why ProfileData depends on Bitcode -- do you know why?

ProfileData (specifically CoverageMapping{Reader|Writer}) depends on
Object which depends on Bitcode.

Some refactoring or a split could help here; the current dependence does
not seem desirable to me.

Does moving the CoverageMapping code to a Coverage library seem
reasonable? The rest of ProfileData does not depend on this or on Object,
and this would break the ProfileData->Analysis dependence.

In the current case at stake: the issue is that we can’t make the Analysis library using anything from the ProfileData library. Conceptually there is a problem IMO.

Moving something from Analysis to TransformUtils just because it needs to access ProfileData is a slippery slope…

Thanks for the comments.

> Yeah, I know but we should find a way to break this...

I'm not sure how. With r265941 I moved summary creation out of the
BitcodeWriter and into a new analysis pass. However, I realized I needed to
keep the dependence on Analysis (r265945), since the WriteBitcodePass in
BitcodeWriter now invokes getAnalysis on the ModuleSummaryIndexWrapperPass.
How else should that dependence be expressed, so that WriteBitcodePass both
ensures that ModuleSummaryIndexWrapperPass builds the index and can get
access to the index?
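For concreteness, the coupling looks roughly like this under the legacy
pass manager (a from-memory sketch, not the code in the tree; in
particular the getIndex() accessor and the WriteBitcodeToFile parameter
list here are approximations):

  #include "llvm/Analysis/ModuleSummaryAnalysis.h" // ModuleSummaryIndexWrapperPass
  #include "llvm/Bitcode/ReaderWriter.h"           // WriteBitcodeToFile (2016-era header)
  #include "llvm/IR/Module.h"
  #include "llvm/Pass.h"
  #include "llvm/Support/raw_ostream.h"

  using namespace llvm;

  // A pass living in the Bitcode library...
  class WriteBitcodePass : public ModulePass {
    raw_ostream &OS;

  public:
    static char ID;
    explicit WriteBitcodePass(raw_ostream &OS) : ModulePass(ID), OS(OS) {}

    void getAnalysisUsage(AnalysisUsage &AU) const override {
      AU.setPreservesAll();
      // ...requires a pass living in the Analysis library. This is what
      // creates the Bitcode -> Analysis link dependence in question.
      AU.addRequired<ModuleSummaryIndexWrapperPass>();
    }

    bool runOnModule(Module &M) override {
      const ModuleSummaryIndex &Index =
          getAnalysis<ModuleSummaryIndexWrapperPass>().getIndex();
      WriteBitcodeToFile(&M, OS, /*ShouldPreserveUseListOrder=*/false, &Index);
      return false;
    }
  };

  char WriteBitcodePass::ID = 0;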

Thanks,
Teresa

What you said makes sense.

David

> In the current case at stake: the issue is that we can't make the Analysis
> library using anything from the ProfileData library. Conceptually there is
> a problem IMO.

Yes -- this is a very good point.

David

From: "Chandler Carruth" <chandlerc@gmail.com>
To: "Xinliang David Li" <davidxl@google.com>
Cc: "Mehdi Amini" <mehdi.amini@apple.com>, "Hal Finkel"
<hfinkel@anl.gov>, "via llvm-dev" <llvm-dev@lists.llvm.org>
Sent: Monday, April 18, 2016 4:31:05 PM
Subject: Re: [llvm-dev] Move InlineCost.cpp out of Analysis?

> > So I think InlineCost and similar things should stay in the Analysis
> > library regardless of whether they are passes or not.

> Is that the only criterion (IR modification or not)?

Given a public API, yes, that should be the only criterion IMO.

> Most transformations have pass-specific analyses (not shared with other
> clients) -- that code stays with the transformation -- it does not make
> sense to move it into Analysis.

But I would expect these to also not expose public APIs to the analyses.
Provided the public API is limited to the transformation, I think it makes
sense for the code to be closely located.

I'm not sure the situation is as clear cut as you make it out to be. We can factor common components out of different transformations (e.g. out of different inliner implementations), including common components that don't modify the IR, without that component forming what I would generally consider an analysis. In this case, we're really talking about the encapsulation of the inliner's cost heuristic, and the generality of this information is questionable. To put it another way, while I'm fine with someone reusing this logic to build a new inliner, I'd be very skeptical of using it for any other purpose. That's why it does not really feel like an analysis to me. This is not a strong feeling, however, and I'm also fine with it staying in Analysis.

> For the same reason, we should first ask: is InlineCost intended to be a
> shared utility?

Sure, but in fact at least some parts of it are shared.

There is also some reason to prefer surfacing nice high-level public
APIs when reasonable to do so. For example, if someone wanted to
write a custom inlining pass that re-used the cost analysis, having
it separated seems useful. So unless there is a significant problem
solved by sinking this to be a private API only used by the inliner,
I would keep it as-is.

I did not take moving to Transforms/Utils to mean making the API private. The things in Transforms/Utils generally have public APIs.

-Hal

From: "Xinliang David Li" <davidxl@google.com>
To: "Mehdi Amini" <mehdi.amini@apple.com>
Cc: "Chandler Carruth" <chandlerc@gmail.com>, "Hal Finkel"
<hfinkel@anl.gov>, "via llvm-dev" <llvm-dev@lists.llvm.org>
Sent: Monday, April 18, 2016 4:38:12 PM
Subject: Re: [llvm-dev] Move InlineCost.cpp out of Analysis?

> In the current case at stake: the issue is that we can't make the
> Analysis library using anything from the ProfileData library.
> Conceptually there is a problem IMO.

Yes -- this is a very good point.

Independent of anything else, +1.

-Hal

I don't think that everything that is an analysis needs to be completely general, though. Certainly, we need to be very clear and careful about the terms in which we describe the API, to make the quality and nature of the results unsurprising, but I think the API around the inline cost is reasonably good about that: it pretty clearly indicates that this is a threshold-determining mechanism and not something that provides a holistic view of the impact of a potential inlining decision.

I actually think it would be good if essentially everything we can put a reasonable public API around, and which doesn't mutate the IR, were moved to the Analysis library. As an example of why I think this: without it, it is very hard to know what code can be re-used in analysis passes. And the constraint there is really exactly this: it has a public API that can be re-used, and it doesn't mutate the IR. Nothing more or less, as I see it.

Naturally, we don't need to go on a campaign to move everything right now, and we can even be fairly lazy about it, but I do think that this direction is the correct long-term design of the libraries here.

The design of ProfileData and of reading profile information in the entire middle end had some really fundamental invariants that folks seem to have lost track of:

a) There is exactly one way to get at profile information from general analyses and transforms: a dedicated analysis pass that manages access to the profile info.

b) There is exactly one way for this analysis to compute this information from an external profile source: profile metadata attached to the IR.

c) There could be many external profile sources, but all of them should be read and then translated into metadata annotations on the IR so that serialization / deserialization preserve them in a common format and we can reason about how they work.

This layering is why it is only a transform that accesses ProfileData – it is responsible for annotating the IR and nothing else. Then the analysis uses these annotations and never reads the data directly.
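A minimal sketch of invariants (b) and (c) in code: the profile-reading
transform lowers the external source onto the IR once, and any later
consumer reads only the metadata. MDBuilder, createBranchWeights, and
MD_prof are the real APIs; the helper names and weights are made up:

  #include "llvm/IR/Instructions.h"
  #include "llvm/IR/LLVMContext.h"
  #include "llvm/IR/MDBuilder.h"

  using namespace llvm;

  // The profile-reading *transform* annotates the IR once, whatever the
  // external source (instrumentation, sampling, ...).
  void annotateBranch(BranchInst *BI, uint32_t TakenWeight,
                      uint32_t NotTakenWeight) {
    MDBuilder MDB(BI->getContext());
    BI->setMetadata(LLVMContext::MD_prof,
                    MDB.createBranchWeights(TakenWeight, NotTakenWeight));
  }

  // An *analysis* later consumes the annotation; it never touches the
  // external profile file or the ProfileData library.
  bool hasBranchWeights(const BranchInst *BI) {
    return BI->getMetadata(LLVMContext::MD_prof) != nullptr;
  }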

I think this is a really important separation of concerns as it ensures that we don’t get an explosion of different analyses supporting various different subsets of profile sources.

Now, the original design only accounted for profile information *within* a function body; clearly it needs to be extended to support interprocedural information. But I would still expect that to follow a similar layering, where we first read the data into IR annotations, then have an analysis pass (this time a module analysis pass in all likelihood) that brokers access to these annotations through an API that can do intelligent things like synthesizing the information from the "cold" attribute or whatever when it is missing.

-Chandler

In general I agree, but the inliner's cost heuristic is a bit special in this regard. It is special because it contains tuned thresholds that are meaningful only in the context of inlining as performed by the current inliner. Were we to change the inliner's overall scheme (e.g. made it more top-down), those would need to change. If someone wanted to build some higher-level analysis that made use of the InlineCost information, the fact that they had to move things from Transforms/Utils to Analysis would be a good red flag that they're probably using the wrong tool. Obviously it might be the right tool, but it is probably something we'd want to think about. A second inliner, however, might be sufficiently close in design to the current one that the tuned thresholds might be applicable.

In short, I like your proposal, but I'd further propose that we say that anything that does not mutate the IR and does not make assumptions about its ultimate client should be an analysis.

-Hal

> a) There is exactly *one* way to get at profile information from general
> analyses and transforms: a dedicated analysis pass that manages access to
> the profile info.

Invariants b) and c) above are still true, but a) is not, since InlineCost
directly calls into ProfileSummary instead of going through an analysis
pass. I'll add a module-level pass as you suggest, but that still requires
breaking the ProfileData->Analysis dependence chain.
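Roughly, I'm imagining something like the following under the legacy pass
manager (the class name, the module-flag key, and the accessor are
hypothetical; getModuleFlag is the real Module API):

  #include "llvm/IR/Metadata.h"
  #include "llvm/IR/Module.h"
  #include "llvm/Pass.h"

  using namespace llvm;

  namespace {
  // Hypothetical broker pass restoring invariant (a): clients query this
  // pass for summary data instead of calling into ProfileData directly.
  class ProfileSummaryWrapperPass : public ModulePass {
    Metadata *SummaryMD = nullptr;

  public:
    static char ID;
    ProfileSummaryWrapperPass() : ModulePass(ID) {}

    bool runOnModule(Module &M) override {
      // Invariant (b): a profile-reading transform already attached the
      // summary to the module; we only read it back.
      SummaryMD = M.getModuleFlag("ProfileSummary"); // flag name is assumed
      return false; // analysis only; the IR is not mutated
    }

    Metadata *getSummaryMD() const { return SummaryMD; }
  };
  } // end anonymous namespace

  char ProfileSummaryWrapperPass::ID = 0;

  // A client such as the inliner would then
  // addRequired<ProfileSummaryWrapperPass>() and read the summary through
  // getSummaryMD(), with no direct ProfileData dependence.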

- Easwaran


From: "Chandler Carruth" <chandlerc@gmail.com>
To: "Hal Finkel" <hfinkel@anl.gov>
Cc: "Mehdi Amini" <mehdi.amini@apple.com>, "via llvm-dev" <llvm-dev@lists.llvm.org>, "Xinliang David Li" <davidxl@google.com>
Sent: Monday, April 18, 2016 4:54:36 PM
Subject: Re: [llvm-dev] Move InlineCost.cpp out of Analysis?



> In short, I like your proposal, but I'd further propose that we say that
> anything that does not mutate the IR and does not make assumptions about
> its ultimate client should be an analysis.

But everything makes assumptions about its client? That's the whole point of having a public API: you need to be able to document your assumptions well enough to have a reasonable public API. But that's not about analysis vs. transforms; that's true for any public API. For example, I would expect anything in Transforms/Utils to document very clearly what assumptions it makes about its callers.

And I'm not even sure I buy that our inline cost is that specialized... It uses TTI, another analysis, for estimating cost. It is essentially a size-cost estimation tool. The closest thing to this kind of assumption is the set of thresholds it uses. Those should really be clear input parameters so that callers can customize them.
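E.g., something like the following, where the threshold becomes an explicit
caller-supplied parameter (getInlineCost and InlineCost are the existing
names; the exact parameter list below is from memory and illustrative):

  #include "llvm/Analysis/AssumptionCache.h"
  #include "llvm/Analysis/InlineCost.h"
  #include "llvm/Analysis/TargetTransformInfo.h"
  #include "llvm/IR/CallSite.h"

  using namespace llvm;

  // A hypothetical custom inliner supplies its own threshold instead of
  // relying on a tuned constant hidden inside the cost analysis.
  bool shouldInline(CallSite CS, int MyThreshold,
                    TargetTransformInfo &CalleeTTI,
                    AssumptionCacheTracker *ACT) {
    InlineCost IC = getInlineCost(CS, MyThreshold, CalleeTTI, ACT);
    // InlineCost's boolean conversion answers "is the cost under the
    // threshold?", and also accounts for the always/never special cases.
    return static_cast<bool>(IC);
  }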

Anyway, all of this is to say that I don't think the inline cost's public API is in great shape, but I don't think it is inherently incapable of having a reasonable public API. So I suspect we should focus on improving the clarity and configurability of its API rather than on moving things around.

But all of this is a digression. Let’s get back to the meat of the subject around ProfileData.