IC profiling infrastructure

I still have some issues with what we're doing with value_kind here. Let
me summarize the arguments I've heard for including the value_kind in
various places in this set of changes:

1. We would like to be able to enable multiple types of value profiling
   at the same time.

1a. We would like to be able to selectively use only particular value
    kinds from a particular profile.

1b. We would like to be able to combine profiles with different sets of
    profiling options enabled.

2. We will need to know the value kind in order to parse the raw profile
   data.

3. We'd like to avoid needing to change the .profdata format when we add
   new value kinds, since it needs to stay backwards compatible.

4. The frontend work is simpler if the value kind is explicitly encoded.

There are a couple of these points that certainly need to be addressed,
and some that are a little bit less clear.

A. We do need to handle (1) in some way, but the issue can be stated a
   little bit more generally. If there are optional parts of the profile
   data, we definitely need to know which options are enabled.

B. (1a) is only an issue if you believe (4). This might be true or it might
   not, and I don't think we can make an informed decision until we're
   actually using multiple value kinds. For now there's just a bunch of
   code that's effectively useless - "if value kind is 2, do nothing". I'm
   opposed to adding this kind of practically unreachable code "just in
   case". It adds a kind of complexity and premature generality that, in my
   experience, will have to be completely re-done when we get to actually
   generalizing the code.

C. (1b) is an interesting one. Is this actually a useful feature? If we
   want to combine multiple profiles with different subsets of profiling
   turned on, then we definitely need the information recorded
   somewhere.

D. I have issues with (2). If the different value kinds actually have
   different structures and need to be read in / interpreted
   differently, then writing the support for them now simply won't
   work. We're reading them in as if they're interpreted in the exact
   same way as the indirect call profiling - if they aren't, then we
   have to rewrite this code when we add a new one anyway. What's the
   point of writing code that won't actually work when we try to use it?

E. There is value in dealing with (3) as long as we're confident that we
   can get it right. If we're pretty confident that the value profile
   data structure we're encoding in the .profdata file is going to work
   for multiple value profile kinds, then it makes sense to allow
   multiple of them and encode which kinds they are. David, is
   "InstrProfValueRecord" in Betul's current patches going to work to
   record the other types of value profiling information you're planning
   on looking at?

I'd also like to point out that the instrprof parts of LLVM try to be
relatively frontend agnostic. We should strive to keep it such that the
frontend is the piece that decides what a particular profiling kind
means if possible, as long as the structure and way the data is laid out
is suitable. This doesn't affect the current work very much, but it's a
good idea to keep it in mind.

So, with all of that said, I have a couple of suggestions on how to move
forward.

- I don't think we should store the kind in the raw profile yet. Let's
  keep this simple, since it's easy to change and will be easier to get
  right when we're implementing or experimenting with a second kind
  (based on (D)).

- Assuming the answer to my question in (E) is "yes", let's go ahead and
  include a value kind for the value records in the .profdata format. I
  don't really like the current approach of an array of std::vectors,
  because I think we can do this more simply. I'll bring that up on the
  review thread.

- In clang, we should error if we see value kinds we don't understand
  yet.

I think that this addresses the short term concerns without forcing us
to write unreachable/untestable code that may or may not work for the
longer term concerns. It minimally handles (A ) in that the data that is
there implies which features are being used, puts off (B) without making
it harder to deal with when the time comes, allows us to implement (C)
without much trouble if we want it, avoids doing work that could cause
problems with (D), and handles (E) since avoiding it would mean more
work later.

What do you think?

Xinliang David Li <davidxl@google.com> writes:

Justin, thanks for the reply.

I would like to point out that value_kind is actually not really
stored in the raw profile data (nor do we intend to do so), other than
the minimal information about per value kind NumOfSites info in
per-function profile header. Basically the data is organized per value
kind instead of stored intermixed. That is about it.

More replies below.

I still have some issues with what we're doing with value_kind here. Let
me summarize the arguments I've heard for including the value_kind in
various places in this set of changes:

1. We would like to be able to enable multiple types of value profiling
   at the same time.

1a. We would like to be able to selectively use only particular value
    kinds from a particular profile.

1b. We would like to be able to combine profiles with different sets of
    profiling options enabled.

yes.

2. We will need to know the value kind in order to parse the raw profile
   data.

Yes -- but we don't need to store value kind in order to do that -- we
simply need to organize the profile data more structurally.

3. We'd like to avoid needing to change the .profdata format when we add
   new value kinds, since it needs to stay backwards compatible.

4. The frontend work is simpler if the value kind is explicitly encoded.

There are a couple of these points that certainly need to be addressed,
and some that are a little bit less clear.

A. We do need to handle (1) in some way, but the issue can be stated a
   little bit more generally. If there are optional parts of the profile
   data, we definitely need to know which options are enabled.

B. (1a) is only an issue if you believe (4). This might be true or it might
   not, and I don't think we can make an informed decision until we're
   actually using multiple value kinds. For now there's just a bunch of
   code that's effectively useless - "if value kind is 2, do nothing". I'm
   opposed to adding this kind of practically unreachable code "just in
   case". It adds a kind of complexity and premature generality that, in my
   experience, will have to be completely re-done when we get to actually
   generalizing the code.

In terms of value profiling code, there is no need for the check you
mentioned "if value kind is ..." at all, but 'attached' to the codegen
code for language construct of interests, for instance,

... CodeGenFunction::EmitCall(...)
{

    if (enable_indirect_profile) {
           ....profileIndirectCall(); // does both instrumentation
and profile annotation depending on the context.
    }
   ...
}

Organizing the profile data according value profile kind allows
profile data to be independently loaded and used:

.. CodeGenPGO::assignRegionCounters (...) {

   if (enable_indirect_profile) {
     loadValueProfile(VK_indirect_call);
   }
   if (enable_xxx_profile) {
    loadXXXProfile(VK_xxx);
  }

}

if we end up with inter-mixing all value kind profile data, we will
have to load all value profile data into memory regardless of the
option specified. It also makes the profile data uses really hard --
how do we easily match the value profile site with the profile data
counter index if profile kinds are selectively enabled?

Besides, in the profile-use pass, each value kind's in-memory format
of profile data may be slightly different, so the loader will have to
do different things ..

C. (1b) is an interesting one. Is this actually a useful feature? If we
   want to combine multiple profiles with different subsets of profiling
   turned on, then we definitely need the information recorded
   somewhere.

I think it is a good flexibility to have.

D. I have issues with (2). If the different value kinds actually have
   different structures and need to be read in / interpreted
   differently, then writing the support for them now simply won't
   work.

We choose to have very compact form representing the raw value data
for all value kinds, but to support efficient use of value profile of
different kinds, the raw profile reader has to do something different
here. For instance, indirect call profile data needs the translation
of the raw address which other value kinds do not care. I think this
is something we can do without.

We're reading them in as if they're interpreted in the exact
   same way as the indirect call profiling - if they aren't, then we
   have to rewrite this code when we add a new one anyway. What's the
   point of writing code that won't actually work when we try to use it?

I don't think this would be a problem as adding a new kind requires
client/use side change too, otherwise we won't be able to use it.

E. There is value in dealing with (3) as long as we're confident that we
   can get it right. If we're pretty confident that the value profile
   data structure we're encoding in the .profdata file is going to work
   for multiple value profile kinds, then it makes sense to allow
   multiple of them and encode which kinds they are. David, is
   "InstrProfValueRecord" in Betul's current patches going to work to
   record the other types of value profiling information you're planning
   on looking at?

The raw profile format is certainly good for other kinds.
InstrProfValueRecord is mostly there (for instance overloading the
Name field for other purpose, but this is minor and can be done as
follow up when support of other kinds are added), so I am ok to leave
it as it is now.

I'd also like to point out that the instrprof parts of LLVM try to be
relatively frontend agnostic. We should strive to keep it such that the
frontend is the piece that decides what a particular profiling kind
means if possible, as long as the structure and way the data is laid out
is suitable. This doesn't affect the current work very much, but it's a
good idea to keep it in mind.

yes. See the above -- FE code does not really need to care about the format.

So, with all of that said, I have a couple of suggestions on how to move
forward.

- I don't think we should store the kind in the raw profile yet. Let's
  keep this simple, since it's easy to change and will be easier to get
  right when we're implementing or experimenting with a second kind
  (based on (D)).

See my reply above -- we don't really store the value kind in raw
profile, but we need to organize the profile data according to the
kind in order to process (read) and use (profile-use) them
efficiently.

- Assuming the answer to my question in (E) is "yes", let's go ahead and
  include a value kind for the value records in the .profdata format.

yes.

I
  don't really like the current approach of an array of std::vectors,
  because I think we can do this more simply. I'll bring that up on the
  review thread.

- In clang, we should error if we see value kinds we don't understand
  yet.

yes.

I think that this addresses the short term concerns without forcing us
to write unreachable/untestable code that may or may not work for the
longer term concerns. It minimally handles (A ) in that the data that is
there implies which features are being used, puts off (B) without making
it harder to deal with when the time comes, allows us to implement (C)
without much trouble if we want it, avoids doing work that could cause
problems with (D), and handles (E) since avoiding it would mean more
work later.

What do you think?

See my reply above. For now, I think the things we need to organize
the value profile data in raw profile according to kind (not storing)
-- I think it will handle all the cases where you have concerns.

thanks,

David

Xinliang David Li <davidxl@google.com> writes:

Justin, thanks for the reply.

I would like to point out that value_kind is actually not really
stored in the raw profile data (nor do we intend to do so), other than
the minimal information about per value kind NumOfSites info in
per-function profile header. Basically the data is organized per value
kind instead of stored intermixed. That is about it.

More replies below.

I still have some issues with what we're doing with value_kind here. Let
me summarize the arguments I've heard for including the value_kind in
various places in this set of changes:

1. We would like to be able to enable multiple types of value profiling
   at the same time.

1a. We would like to be able to selectively use only particular value
    kinds from a particular profile.

1b. We would like to be able to combine profiles with different sets of
    profiling options enabled.

yes.

2. We will need to know the value kind in order to parse the raw profile
   data.

Yes -- but we don't need to store value kind in order to do that -- we
simply need to organize the profile data more structurally.

3. We'd like to avoid needing to change the .profdata format when we add
   new value kinds, since it needs to stay backwards compatible.

4. The frontend work is simpler if the value kind is explicitly encoded.

There are a couple of these points that certainly need to be addressed,
and some that are a little bit less clear.

A. We do need to handle (1) in some way, but the issue can be stated a
   little bit more generally. If there are optional parts of the profile
   data, we definitely need to know which options are enabled.

B. (1a) is only an issue if you believe (4). This might be true or it might
   not, and I don't think we can make an informed decision until we're
   actually using multiple value kinds. For now there's just a bunch of
   code that's effectively useless - "if value kind is 2, do nothing". I'm
   opposed to adding this kind of practically unreachable code "just in
   case". It adds a kind of complexity and premature generality that, in my
   experience, will have to be completely re-done when we get to actually
   generalizing the code.

In terms of value profiling code, there is no need for the check you
mentioned "if value kind is ..." at all, but 'attached' to the codegen
code for language construct of interests, for instance,

... CodeGenFunction::EmitCall(...)
{

    if (enable_indirect_profile) {
           ....profileIndirectCall(); // does both instrumentation
and profile annotation depending on the context.
    }
   ...
}

Organizing the profile data according value profile kind allows
profile data to be independently loaded and used:

.. CodeGenPGO::assignRegionCounters (...) {

   if (enable_indirect_profile) {
     loadValueProfile(VK_indirect_call);
   }
   if (enable_xxx_profile) {
    loadXXXProfile(VK_xxx);
  }

}

if we end up with inter-mixing all value kind profile data, we will
have to load all value profile data into memory regardless of the
option specified. It also makes the profile data uses really hard --
how do we easily match the value profile site with the profile data
counter index if profile kinds are selectively enabled?

Besides, in the profile-use pass, each value kind's in-memory format
of profile data may be slightly different, so the loader will have to
do different things ..

C. (1b) is an interesting one. Is this actually a useful feature? If we
   want to combine multiple profiles with different subsets of profiling
   turned on, then we definitely need the information recorded
   somewhere.

I think it is a good flexibility to have.

D. I have issues with (2). If the different value kinds actually have
   different structures and need to be read in / interpreted
   differently, then writing the support for them now simply won't
   work.

We choose to have very compact form representing the raw value data
for all value kinds, but to support efficient use of value profile of
different kinds, the raw profile reader has to do something different
here. For instance, indirect call profile data needs the translation
of the raw address which other value kinds do not care. I think this
is something we can do without.

   We're reading them in as if they're interpreted in the exact
   same way as the indirect call profiling - if they aren't, then we
   have to rewrite this code when we add a new one anyway. What's the
   point of writing code that won't actually work when we try to use it?

I don't think this would be a problem as adding a new kind requires
client/use side change too, otherwise we won't be able to use it.

This is exactly my point. Why are we writing code *now* to read values
that we have no way of using? This code *cannot* be tested, because it
doesn't do anything. It adds a bunch of loops to iterate over values
that *must* be zero, because if they aren't we don't actually know what
they mean.

The compiler-rt patches I've seen so far have all treated each value
kind as if it should be treated identically to indirect call target, and
had extra complexity so that they could do this, but you've stated
multiple times that this won't actually be correct when we use it. We
should write the raw profile writer and reader code to do exactly what
we need it to do today. We can and will change it when we want to add
new functionality.

E. There is value in dealing with (3) as long as we're confident that we
   can get it right. If we're pretty confident that the value profile
   data structure we're encoding in the .profdata file is going to work
   for multiple value profile kinds, then it makes sense to allow
   multiple of them and encode which kinds they are. David, is
   "InstrProfValueRecord" in Betul's current patches going to work to
   record the other types of value profiling information you're planning
   on looking at?

The raw profile format is certainly good for other kinds.

I'm not talking about the raw profile format here. This point is about
the stable, indexed format (.profdata, not .profraw). We have no need to
keep the raw profile format backwards compatible, but we do need to keep
readers for old indexed format files.

InstrProfValueRecord is mostly there (for instance overloading the
Name field for other purpose, but this is minor and can be done as
follow up when support of other kinds are added), so I am ok to leave
it as it is now.

What do you mean? What do you expect this structure to look like for
other value kinds? Will we need to change the serialized format for the
indexed profile format?

There are two possibilities. Either:

1. This serialized format is sufficient for other value kinds, so it is
   valuable to encode it in the indexed profile format now, because this
   will avoid needing to change that format later.

Or,

2. This is likely not how other value kinds will need to be serialized,
   so we're going to need to change the format later either way. In this
   case, we should just solve the problem we have *today*, rather than
   making changes we'll need to undo later.

I think that given the uncertainty about what the future value kinds'
formats will be, we should encode this in a way where it's obvious what
the extra data is for (we'll specify a kind in the data), but without
trying to accomodate future value kinds until we're ready to talk about
them concretely.

I'll go into more detail about this in my review for the indexed profile
reader and writer patch, which I should be able to get to tomorrow.

I'd also like to point out that the instrprof parts of LLVM try to be
relatively frontend agnostic. We should strive to keep it such that the
frontend is the piece that decides what a particular profiling kind
means if possible, as long as the structure and way the data is laid out
is suitable. This doesn't affect the current work very much, but it's a
good idea to keep it in mind.

yes. See the above -- FE code does not really need to care about the format.

So, with all of that said, I have a couple of suggestions on how to move
forward.

- I don't think we should store the kind in the raw profile yet. Let's
  keep this simple, since it's easy to change and will be easier to get
  right when we're implementing or experimenting with a second kind
  (based on (D)).

See my reply above -- we don't really store the value kind in raw
profile, but we need to organize the profile data according to the
kind in order to process (read) and use (profile-use) them
efficiently.

The only reader of the raw profile is the tool that converts it to the
indexed format, which is what profile-use consumes. The read will always
read the entire file in a streaming fashion, and efficiently reading
parts of the file is not important. We then write this to the indexed
format, which needs to be efficiently readable.

   We're reading them in as if they're interpreted in the exact
   same way as the indirect call profiling - if they aren't, then we
   have to rewrite this code when we add a new one anyway. What's the
   point of writing code that won't actually work when we try to use it?

I don't think this would be a problem as adding a new kind requires
client/use side change too, otherwise we won't be able to use it.

This is exactly my point. Why are we writing code *now* to read values
that we have no way of using? This code *cannot* be tested, because it
doesn't do anything. It adds a bunch of loops to iterate over values
that *must* be zero, because if they aren't we don't actually know what
they mean.

I think the real question is not whether the related change is needed
now or not, but whether they are needed when more than one kind is
supported in the near future. If we predict it is needed, making
future change simpler with a good foundation is a good thing to have.

The compiler-rt patches I've seen so far have all treated each value
kind as if it should be treated identically to indirect call target,

Can you be specific here? I don't see any indirect_call specific code
in there. The only thing related is to sum up the total number of
sites across all value kinds.

and
had extra complexity so that they could do this, but you've stated
multiple times that this won't actually be correct when we use it. We
should write the raw profile writer and reader code to do exactly what
we need it to do today. We can and will change it when we want to add
new functionality.

Which one of the statements about correctness issue is here?

E. There is value in dealing with (3) as long as we're confident that we
   can get it right. If we're pretty confident that the value profile
   data structure we're encoding in the .profdata file is going to work
   for multiple value profile kinds, then it makes sense to allow
   multiple of them and encode which kinds they are. David, is
   "InstrProfValueRecord" in Betul's current patches going to work to
   record the other types of value profiling information you're planning
   on looking at?

The raw profile format is certainly good for other kinds.

I'm not talking about the raw profile format here. This point is about
the stable, indexed format (.profdata, not .profraw). We have no need to
keep the raw profile format backwards compatible, but we do need to keep
readers for old indexed format files.

InstrProfValueRecord is mostly there (for instance overloading the
Name field for other purpose, but this is minor and can be done as
follow up when support of other kinds are added), so I am ok to leave
it as it is now.

What do you mean? What do you expect this structure to look like for
other value kinds? Will we need to change the serialized format for the
indexed profile format?

Basically the Name field is specific to indirect call target. For
other value kind, we may need have a summary record for each site,
which is represented by the InstrProfValueRecord. The summary record
can be something like average etc -- but they are not strictly needed
either as they can be derived by the client who uses the profile data.
In short, we are strong reason to believe InstrProfValueRecord is
sufficient for other kinds.

There are two possibilities. Either:

1. This serialized format is sufficient for other value kinds, so it is
   valuable to encode it in the indexed profile format now, because this
   will avoid needing to change that format later.

Or,

2. This is likely not how other value kinds will need to be serialized,
   so we're going to need to change the format later either way. In this
   case, we should just solve the problem we have *today*, rather than
   making changes we'll need to undo later.

I think that given the uncertainty about what the future value kinds'
formats will be, we should encode this in a way where it's obvious what
the extra data is for (we'll specify a kind in the data), but without
trying to accomodate future value kinds until we're ready to talk about
them concretely.

I believe 1 is the right way -- it will make the incremental cost of
adding new kind really low.

I'll go into more detail about this in my review for the indexed profile
reader and writer patch, which I should be able to get to tomorrow.

thanks,

David

Thanks for all the replies.

Xinliang David Li <davidxl@google.com> writes:

Justin, thanks for the reply.

I would like to point out that value_kind is actually not really
stored in the raw profile data (nor do we intend to do so), other than
the minimal information about per value kind NumOfSites info in
per-function profile header. Basically the data is organized per value
kind instead of stored intermixed. That is about it.

More replies below.

I still have some issues with what we're doing with value_kind here.

Let

me summarize the arguments I've heard for including the value_kind in
various places in this set of changes:

1. We would like to be able to enable multiple types of value profiling
   at the same time.

1a. We would like to be able to selectively use only particular value
    kinds from a particular profile.

1b. We would like to be able to combine profiles with different sets of
    profiling options enabled.

yes.

2. We will need to know the value kind in order to parse the raw

profile

   data.

Yes -- but we don't need to store value kind in order to do that -- we
simply need to organize the profile data more structurally.

3. We'd like to avoid needing to change the .profdata format when we

add

   new value kinds, since it needs to stay backwards compatible.

4. The frontend work is simpler if the value kind is explicitly

encoded.

There are a couple of these points that certainly need to be addressed,
and some that are a little bit less clear.

A. We do need to handle (1) in some way, but the issue can be stated a
   little bit more generally. If there are optional parts of the

profile

   data, we definitely need to know which options are enabled.

B. (1a) is only an issue if you believe (4). This might be true or it

might

   not, and I don't think we can make an informed decision until we're
   actually using multiple value kinds. For now there's just a bunch of
   code that's effectively useless - "if value kind is 2, do nothing".

I'm

   opposed to adding this kind of practically unreachable code "just in
   case". It adds a kind of complexity and premature generality that,

in my

   experience, will have to be completely re-done when we get to

actually

   generalizing the code.

In terms of value profiling code, there is no need for the check you
mentioned "if value kind is ..." at all, but 'attached' to the codegen
code for language construct of interests, for instance,

... CodeGenFunction::EmitCall(...)
{

    if (enable_indirect_profile) {
           ....profileIndirectCall(); // does both instrumentation
and profile annotation depending on the context.
    }
   ...
}

Organizing the profile data according value profile kind allows
profile data to be independently loaded and used:

.. CodeGenPGO::assignRegionCounters (...) {

   if (enable_indirect_profile) {
     loadValueProfile(VK_indirect_call);
   }
   if (enable_xxx_profile) {
    loadXXXProfile(VK_xxx);
  }

}

if we end up with inter-mixing all value kind profile data, we will
have to load all value profile data into memory regardless of the
option specified. It also makes the profile data uses really hard --
how do we easily match the value profile site with the profile data
counter index if profile kinds are selectively enabled?

Besides, in the profile-use pass, each value kind's in-memory format
of profile data may be slightly different, so the loader will have to
do different things ..

C. (1b) is an interesting one. Is this actually a useful feature? If we
   want to combine multiple profiles with different subsets of

profiling

   turned on, then we definitely need the information recorded
   somewhere.

I think it is a good flexibility to have.

D. I have issues with (2). If the different value kinds actually have
   different structures and need to be read in / interpreted
   differently, then writing the support for them now simply won't
   work.

We choose to have very compact form representing the raw value data
for all value kinds, but to support efficient use of value profile of
different kinds, the raw profile reader has to do something different
here. For instance, indirect call profile data needs the translation
of the raw address which other value kinds do not care. I think this
is something we can do without.

   We're reading them in as if they're interpreted in the exact
   same way as the indirect call profiling - if they aren't, then we
   have to rewrite this code when we add a new one anyway. What's the
   point of writing code that won't actually work when we try to use

it?

I don't think this would be a problem as adding a new kind requires
client/use side change too, otherwise we won't be able to use it.

This is exactly my point. Why are we writing code *now* to read values
that we have no way of using? This code *cannot* be tested, because it
doesn't do anything. It adds a bunch of loops to iterate over values
that *must* be zero, because if they aren't we don't actually know what
they mean.

I disagree that the loops are doing nothing and the code is not testable.
First IR tests will come with path #4 which is the lowering. So it all is
going to be tested in a full-fledged form. Splitting was your suggestion.

The compiler-rt patches I've seen so far have all treated each value
kind as if it should be treated identically to indirect call target, and
had extra complexity so that they could do this, but you've stated
multiple times that this won't actually be correct when we use it. We
should write the raw profile writer and reader code to do exactly what
we need it to do today. We can and will change it when we want to add
new functionality.

So I do not understand why having value kind hurts here. Just having
indirect call targets is the same as having a value_kind with size = 1.
Isn't it? We're also looking into profiling loop count histogram in the
near future. So, having a general value profiling foundation is extremely
useful. Otherwise, for each added value type the indexed reader format
will keep on changing. At least it's looking future proof in the sense
that new values up to three more are already being supported.

Knowing the value_kind is also necessary to be able to map the 64-bit
target addresses to function names. In the indexed format we either store
a String or a uint64_t value depending on the value_kind. I'm unable to
read from your arguments, how would you plan to store a name for one
specific value kind and a 64-bit value for another kind.