[RFC] Addition of S_POGODATA field to PDB for better PGO analysis

Hello, everyone! I’m relatively new to the community and LLVM, but I’m happy to be here and meet everyone!

I am currently working on improving Profile Guided Optimization (PGO) tooling and would like to propose a feature addition to the LLVM AsmPrinter: emitting S_POGODATA records into PDB files, so that users who perform PGO can better understand its effects at the function level when inspecting the PDB. The Microsoft Visual C++ compiler already emits this record, and I hope to enable Clang to do the same on Windows.
As per the publicly available CodeView definitions in microsoft-pdb/cvinfo.h (microsoft/microsoft-pdb on GitHub), the S_POGODATA structure is defined as:
    typedef struct POGOINFO {
        unsigned short  reclen;             // Record length
        unsigned short  rectyp;             // S_POGODATA

        unsigned long   invocations;        // Number of times function was called
        __int64         dynCount;           // Dynamic instruction count
        unsigned long   numInstrs;          // Static instruction count
        unsigned long   staInstLive;        // Final static instruction count (post inlining)
    } POGOINFO;
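For readers building outside MSVC, here is a minimal sketch of a fixed-width mirror of this layout. The names PogoRecord and pogoRecLen are hypothetical and used only for illustration. It assumes that `unsigned long` is 4 bytes as on Windows, that the record is packed to 1 byte, and that reclen follows the usual CodeView convention of excluding the length field itself.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical portable mirror of POGOINFO (illustrative name, not from cvinfo.h).
// Fixed-width types are used because `unsigned long` is 8 bytes on LP64 targets.
#pragma pack(push, 1)
struct PogoRecord {
  uint16_t reclen;       // Record length
  uint16_t rectyp;       // S_POGODATA
  uint32_t invocations;  // Number of times function was called
  int64_t  dynCount;     // Dynamic instruction count
  uint32_t numInstrs;    // Static instruction count
  uint32_t staInstLive;  // Final static instruction count (post inlining)
};
#pragma pack(pop)

// 2 + 2 + 4 + 8 + 4 + 4 bytes.
static_assert(sizeof(PogoRecord) == 24, "unexpected record size");
static_assert(offsetof(PogoRecord, dynCount) == 8, "unexpected field offset");

// Assumption: reclen counts the bytes that follow the reclen field.
constexpr uint16_t pogoRecLen() {
  return static_cast<uint16_t>(sizeof(PogoRecord) - sizeof(uint16_t));
}
```

The static assertions make a layout mismatch a compile-time error rather than a silently malformed record.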

My hope is to update the CodeViewDebug::emitDebugInfoForFunction method to access the per-function information and emit the record into the PDB.

Computing the invocation count is relatively unambiguous, as it corresponds to the recorded function entry count.
The static instruction count also appears to be available. However, it is not clear whether we should count IR-level instructions or lowered machine instructions.
The same question affects how the live instruction count and the dynamic instruction count should be computed.

Another concern is whether to compute the dynamic instruction count from basic blocks or from machine basic blocks. Currently, I have updated the AsmPrinter::SetupMachineFunction method to iterate over a function’s basic blocks. The dynamic instruction count is the sum over all blocks of (block instruction count * block invocation count), and the live instruction count is the sum of the instruction counts of the blocks whose invocation count is nonzero. I would like to ask whether this is an appropriate way to compute both values, and whether this method is the right place for the logic. I have also updated the LLVM PDB parser so that it can read the record back out of the PDB.
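The computation described above can be sketched in standalone C++ as follows. This is only an illustration under stated assumptions: BlockProfile, PogoCounts, and computePogoCounts are hypothetical names, and the per-block instruction and execution counts are assumed to be already available (in LLVM they would come from whichever block representation and profile source is chosen).

```cpp
#include <cstdint>
#include <vector>

// Hypothetical per-block summary; not an LLVM type.
struct BlockProfile {
  uint32_t numInstrs;  // instructions in the block
  uint64_t execCount;  // times the block ran during profiling
};

// Hypothetical result holder matching the three instruction fields of POGOINFO.
struct PogoCounts {
  int64_t  dynCount = 0;     // sum of (block instructions * block executions)
  uint32_t numInstrs = 0;    // all instructions in the function
  uint32_t staInstLive = 0;  // instructions in blocks that ran at least once
};

PogoCounts computePogoCounts(const std::vector<BlockProfile> &blocks) {
  PogoCounts counts;
  for (const BlockProfile &block : blocks) {
    counts.numInstrs += block.numInstrs;
    counts.dynCount += static_cast<int64_t>(block.numInstrs) *
                       static_cast<int64_t>(block.execCount);
    if (block.execCount != 0)
      counts.staInstLive += block.numInstrs;
  }
  return counts;
}
```

For example, three blocks with (instructions, executions) of (4, 10), (3, 0), and (5, 2) give a static count of 12, a dynamic count of 4*10 + 5*2 = 50, and a live count of 4 + 5 = 9. The same loop applies whether the blocks are IR basic blocks or machine basic blocks; only the inputs change.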

An alternative would be to add an analysis pass that computes these values so that they can be read at a later point, but I’m not sure that would be the best approach. It is also unclear whether we should use machine basic blocks or IR basic blocks, since the results differ and are sometimes unavailable at certain points.

I would greatly appreciate any comments or feedback on the idea and the proposed approach.
Cheers!

What’s the purpose of emitting the PGO information into PDBs?
Is it the same as the pgo info in profdata file?
Is it to allow developers to look at pgo info in debuggers?

Hello Zequan and thank you for your reply!

Adding S_POGODATA to the PDB allows users to understand each function’s impact on inlining and code alignment based on its dynamic instruction count (the number of instructions in the function that were actually executed during profiling, multiplied by the number of invocations of the function during profiling) along with its “live instruction count”. This allows users to verify whether functions placed in the “hot” COFF section group genuinely belong there.
Furthermore, adding S_POGODATA to the PDB will enable tools that work with output from the Microsoft Visual C++ compiler, like WPA, to read the PDB and provide users with useful insights after running their analysis on it.
I hope this helps clear up why it would be beneficial to add the S_POGODATA record to the Clang PDB.

Hi Chandrasekar, thanks for the clarification.
I’m not familiar with PGO and IR, so I can’t comment on the questions you raised.