Hello, everyone! I’m relatively new to the community and LLVM, but I’m happy to be here and meet everyone!
I am currently working on components to improve Profile Guided Optimization (PGO) workflows, and I would like to propose a feature addition to the LLVM AsmPrinter: emitting S_POGODATA records into PDB files. This would let users who perform PGO inspect its effects at the function level when looking at the PDB. The Microsoft Visual C++ compiler already does this, and I hope to enable clang to do the same on Windows.
According to the publicly available CodeView definitions in cvinfo.h (microsoft/microsoft-pdb on GitHub, master branch), the S_POGODATA record is defined as:
typedef struct POGOINFO {
unsigned short reclen; // Record length
unsigned short rectyp; // S_POGODATA
unsigned long invocations; // Number of times function was called
__int64 dynCount; // Dynamic instruction count
unsigned long numInstrs; // Static instruction count
unsigned long staInstLive; // Final static instruction count (post inlining)
} POGOINFO;
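One thing to keep in mind when mirroring this layout outside of MSVC: `unsigned long` is 32 bits on Windows but 64 bits on most other 64-bit targets, so a cross-platform reader or writer probably wants fixed-width types. Below is a minimal sketch of such a mirror. The name `PogoInfoFixed` is hypothetical and is not an existing LLVM or Microsoft type; the field widths assume the Windows sizes of the types in the definition above.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical fixed-width mirror of POGOINFO (not an LLVM or Microsoft type).
// Field widths assume the Windows sizes of the original declaration:
// unsigned short = 16 bits, unsigned long = 32 bits, __int64 = 64 bits.
struct PogoInfoFixed {
  uint16_t reclen;      // Record length
  uint16_t rectyp;      // S_POGODATA
  uint32_t invocations; // Number of times function was called
  int64_t dynCount;     // Dynamic instruction count
  uint32_t numInstrs;   // Static instruction count
  uint32_t staInstLive; // Final static instruction count (post inlining)
};

// With these widths the fields pack naturally, with no padding.
static_assert(sizeof(PogoInfoFixed) == 24, "unexpected record size");
static_assert(offsetof(PogoInfoFixed, dynCount) == 8, "unexpected offset");
```

The `static_assert`s are only there to catch an accidental layout change at compile time; they say nothing about how LLVM should serialize the record.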
My plan is to update the CodeViewDebug::emitDebugInfoForFunction method to access the per-function information and emit the record into the PDB.
Computing function invocation is relatively unambiguous as it corresponds to the recorded Function Entry count.
The number of instructions seems to also be stored. However, it is not clear if we should be considering IR level instruction counts or the Machine Lowered Instruction counts.
This also feeds into what should be considered for the Live Instruction Counts and the Dynamic Instruction Counts.
Another concern is how to compute the dynamic instruction count: over IR basic blocks or over machine basic blocks. Currently, I have updated the AsmPrinter::SetupMachineFunction method to iterate over a function’s basic blocks. The dynamic instruction count is the sum of (basic block instruction count * basic block invocation count) over all blocks, and the live instruction count is the sum of the instruction counts of the blocks whose invocation count is nonzero. Is this an appropriate way to compute both values, and is this the right place to add the logic? I have also updated the LLVM PDB parser so that it can read the record back out of the PDB.
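To make the arithmetic above concrete, here is a minimal standalone sketch. It does not use any LLVM APIs: `BlockProfile`, `PogoCounts`, and `computePogoCounts` are hypothetical names for illustration, and the per-block instruction and invocation counts are assumed to have been gathered already (from either IR basic blocks or machine basic blocks). Treating the static count as the sum over all blocks is also an assumption on my part.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical per-block summary; in a real implementation these values
// would come from the (machine) basic block and its profile count.
struct BlockProfile {
  uint32_t InstrCount; // Instructions in the block
  uint64_t ExecCount;  // Times the block was executed per the profile
};

// Hypothetical result holder for the three instruction-related fields.
struct PogoCounts {
  uint64_t DynCount = 0;   // Sum of InstrCount * ExecCount
  uint32_t LiveInstrs = 0; // Instructions in blocks with ExecCount != 0
  uint32_t NumInstrs = 0;  // Instructions in all blocks (assumed meaning)
};

PogoCounts computePogoCounts(const std::vector<BlockProfile> &Blocks) {
  PogoCounts C;
  for (const BlockProfile &B : Blocks) {
    C.NumInstrs += B.InstrCount;
    // Widen before multiplying so the product is computed in 64 bits.
    C.DynCount += static_cast<uint64_t>(B.InstrCount) * B.ExecCount;
    if (B.ExecCount != 0)
      C.LiveInstrs += B.InstrCount;
  }
  return C;
}
```

For example, three blocks with (instructions, invocations) of (3, 10), (5, 0), and (2, 4) give a dynamic count of 3*10 + 2*4 = 38, a live count of 3 + 2 = 5, and a static count of 10. Whether the inputs come from IR or machine blocks changes the numbers but not this computation.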
Alternative ideas might be to add an analysis pass to compute these values and access them at a later point. But I’m not sure if that would be the best approach. Also, it’s not clear if we should be using machine basic blocks or regular basic blocks as the results vary and are sometimes unavailable at certain points.
I would greatly appreciate any comments or feedback on the idea and the proposed approach.
Cheers!