Can CSSPGO profile contain inlined function samples?

Background: [RFC] Context-sensitive Sample PGO with Pseudo-Instrumentation

The question is whether a profile like this is valid:

[main:3 @ _Z5funcAi:1 @ _Z8funcLeafi]:1234:11
 1: 1224
 4: bar:10    #    here bar is inlined
  1: 10

To my understanding of the SampleProfile source code, this representation is valid but has no effect: CSSPGO uses SampleContextTracker for function call matching, which builds a trie using only the context part of each function (the part inside [ ]) and ignores CallsiteSamples. This is unlike AutoFDO profiles, where CallsiteSamples are searched when a function call instruction is encountered.
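To make the trie idea concrete, here is a minimal Python sketch of how a flat context header could be split into frames and inserted into a context trie. It is only an illustration of the concept, not LLVM's SampleContextTracker: the names ContextNode, parse_context_header and insert_context are hypothetical, and the header layout ([frames]:total:head) is assumed from the example above.

```python
from dataclasses import dataclass, field


@dataclass
class ContextNode:
    """One trie node per (callsite, function) pair. Hypothetical, not an LLVM type."""
    name: str
    total_samples: int = 0
    head_samples: int = 0
    # Key: (callsite in the parent frame, callee name) -> child node.
    children: dict = field(default_factory=dict)


def parse_context_header(line):
    """Split '[a:3 @ b:1 @ leaf]:1234:11' into frames, total and head counts."""
    context, _, counts = line.strip().rpartition("]:")
    total, head = (int(x) for x in counts.split(":"))
    frames = []
    for frame in context.lstrip("[").split(" @ "):
        # Every frame except the leaf carries the callsite it calls from.
        name, sep, callsite = frame.partition(":")
        frames.append((name, callsite if sep else None))
    return frames, total, head


def insert_context(root, frames, total, head):
    """Walk or create the path for one context and attach its counts to the leaf."""
    node = root
    callsite = None  # callsite in the previous (caller) frame
    for name, next_callsite in frames:
        node = node.children.setdefault((callsite, name), ContextNode(name))
        callsite = next_callsite
    node.total_samples += total
    node.head_samples += head
    return node


root = ContextNode("<root>")
frames, total, head = parse_context_header(
    "[main:3 @ _Z5funcAi:1 @ _Z8funcLeafi]:1234:11"
)
leaf = insert_context(root, frames, total, head)
```

Only the bracketed header contributes nodes in this sketch; an indented callsite sample such as "4: bar:10" never reaches the trie, which mirrors the behaviour described above.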

If the intention of CSSPGO is not to have inlined callsite samples, then the design of Sample Profile Reader can be simplified by completely decoupling the representation of context-less and CSSPGO profiles (for example, SampleContext currently behaves like a Variant class in order to support both types of context). This could reduce compile time and memory usage, because the two types of profile are never mixed in a single compilation.

Furthermore, the above example can be parsed by Sample Profile Reader but cannot be written back correctly by Sample Profile Writer; it would become:

[main:3 @ _Z5funcAi:1 @ _Z8funcLeafi]:1234:11
 1: 1224
 4: [bar]:10    #  Malformed 
  1: 10

First, some context on CSSPGO profiles. A CSSPGO profile has two forms: 1) flat form, e.g. [main:3 @ _Z5funcAi:1 @ _Z8funcLeafi]; 2) nested form, in which the profile looks no different from a classic AutoFDO profile with the inlinee profile nested inside the inliner, except that the nesting isn’t based on previous inlining but on pre-inline decisions made in llvm-profgen.

I think you’re asking about the flat form CSSPGO profile, not CSSPGO profiles in general. For the flat form, we don’t expect any nested profile, so you’re right that the example you gave is technically invalid, though it isn’t explicitly disallowed.
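Since the flat form does not expect nested profiles but nothing rejects them, a small external check can flag them. The following is a rough Python sketch, assuming the text layout shown in the question (a bracketed header line, then indented "offset: count" body lines, with a callsite sample written as "offset: name:count"). The function name find_nested_callsite_samples is hypothetical and this is not part of any LLVM tool.

```python
import re

# An indented line of the form "<offset>[.<discriminator>]: <first token>".
BODY_LINE = re.compile(r"^\s+\d+(?:\.\d+)?:\s+(\S+)")


def find_nested_callsite_samples(text):
    """Return (line number, token) for callsite samples under flat contexts."""
    hits = []
    in_flat_context = False
    for lineno, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue
        if not line[0].isspace():
            # Unindented line: a new top-level function or context header.
            in_flat_context = line.startswith("[")
            continue
        match = BODY_LINE.match(line)
        # A plain body sample starts with a count; a callsite sample
        # starts with a function name instead.
        if in_flat_context and match and not match.group(1).isdigit():
            hits.append((lineno, match.group(1)))
    return hits


profile = "\n".join([
    "[main:3 @ _Z5funcAi:1 @ _Z8funcLeafi]:1234:11",
    " 1: 1224",
    " 4: bar:10",
    "  1: 10",
])
nested = find_nested_callsite_samples(profile)
```

For the profile from the question, the check reports the "4: bar:10" line and nothing else; a context-less (non-bracketed) function is left alone, since nested callsite samples are expected there.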

We do want SampleContext to be the abstraction for all profiles, treating a context-less profile as a special case of a context-sensitive profile. There may be some trade-offs to tweak between generality and cost. Patches are welcome, and I’m happy to look at concrete changes and results.

May I ask an off-topic question: is there any introduction to using CSSPGO? So far, all I have found is information about the design of CSSPGO. @WenleiHe

One way to create a profile with publicly available tools is to first collect perf.data on the binary with Linux perf, then use AutoFDO (google/autofdo on GitHub) to create a context-less AFDO file from perf.data and the binary. Finally, use the llvm-profdata convert-sample-profile-layout option to convert the AFDO profile to CSSPGO.