Do we expect these __llvm_profile_ functions to be stable across LLVM releases in terms of interface and semantics? Is it safe to assume that if a user calls such functions in their code, when they upgrade to a later version of LLVM, their program behaviour is unchanged when they compile with the newer compiler?
There are quite a few __llvm_profile_ functions, as you noted; strictly speaking, not all of them remain stable (e.g., the signature of __llvm_profile_get_padding_sizes_for_counters has changed over time, as the commit history shows).
Relatedly, with new LLVM releases, the data format of raw profiles (which might be part of the semantics of __llvm_profile_dump, depending on how the function is used) could be updated, and the raw profile format has no backward or forward compatibility guarantees [1]. For instance, if the raw profile version is updated in a new LLVM release, raw profile data generated by the old compiler cannot be parsed by the llvm-profdata from the new release. llvm-profdata is the LLVM command-line tool that converts raw profiles to the indexed format (to be used by the compiler).
Note that the indexed format used by the compiler does have a backward compatibility guarantee.
Could you elaborate more on how the user code calls these APIs, and what kind of program behavior change or API change is not desired for user code?
The APIs in the __llvm namespace (prefix) are public and are intended to be stable in signature and semantics (at least the primary dumping APIs). Internal APIs are named lprofXXX. If there are changes in signature, they might be unintentional.
Those public APIs are intended to be called by user programs to explicitly control profile dumping, merging, etc. For instance, a user can start profile collection after the startup/initialization phase.
As @davidxl noted in his reply, the user program can call such APIs to control the exact timing of profile dumping to collect profile data only for a part of the program execution. For example, the user program can have
// Introduce the API names by declaring them.
void __llvm_profile_reset_counters(void);
int __llvm_profile_dump(void);
int main(...) {
  initialization(...);
  // Reset PGO counters so that profile collection starts after initialization.
#ifdef RESET_COUNTER
  __llvm_profile_reset_counters();
#endif
  // Profile is collected during kernel execution.
  kernel(...);
  // Dump the profile right after kernel execution.
#ifdef DUMP_BEFORE_CLEANUP
  __llvm_profile_dump();
#endif
  // No profile is collected during program cleanup.
  cleanup(...);
}
Ah this is awesome! Thanks!
Thanks for the confirmation! I see that currently, an LLVM installation does not include an InstrProfiling.h file that introduces these API names to a user program. This makes sense to me because InstrProfiling.h contains a lot of PGO implementation details in addition to the public APIs. Is it correct that at this time, the user has to know such names to use them? Is the example above how people typically use these APIs in their programs (introducing the names by declaring them, and using macros to guard the calls)?
Does it sound reasonable to introduce an installed header that contains a list of such “primary dumping APIs”? Maybe we can introduce some macro mechanism so that such API calls are guarded automatically and the user does not have to write the guards around the calls. See the example below.
// New Header. InstrProfileControl.h.
// List of function names.
int __llvm_profile_dump(void);
// Macro wrapper so that the user can avoid the guards in their programs.
// -fprofile-generate can define PROFILE_GENERATE_ON.
// Users can call __llvm_pgo_profile_dump instead of calling __llvm_profile_dump
// directly.
#ifdef PROFILE_GENERATE_ON
static inline void __llvm_pgo_profile_dump(void) { (void)__llvm_profile_dump(); }
#else
#define __llvm_pgo_profile_dump()
#endif
Since the API signatures of these functions are stable, the common way of calling them from user programs is to declare them as extern with a weak reference, like so:
extern "C" __attribute__((weak)) int __llvm_profile_dump(void);
Then to call it, you can do the following:
// Check whether this build was linked against the profiling runtime
if (__llvm_profile_dump)
  __llvm_profile_dump();
This avoids the need for a header and avoids the dependency on LLVM (or the need for macros) in the non-instrumented build.
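For reference, here is a slightly fuller sketch of the same weak-reference pattern in plain C, covering both the reset and the dump calls. The helper names profile_begin_region and profile_end_region are made up for illustration and are not part of any LLVM API; the sketch also assumes a toolchain and platform (e.g. GCC or Clang on ELF) where an unresolved weak function reference evaluates to a null address.

```c
#include <stdio.h>

/* Weak declarations: if the profiling runtime is not linked in (i.e. the
   build is not instrumented), these symbols are expected to resolve to a
   null address instead of causing a link error. */
__attribute__((weak)) int __llvm_profile_dump(void);
__attribute__((weak)) void __llvm_profile_reset_counters(void);

/* Hypothetical helper: discard counters collected so far.
   Returns 1 if the runtime was present, 0 otherwise. */
static int profile_begin_region(void) {
    if (__llvm_profile_reset_counters) {
        __llvm_profile_reset_counters();
        return 1;
    }
    return 0;
}

/* Hypothetical helper: write the profile out now.
   Returns whatever the runtime returns, or -1 if the runtime is absent. */
static int profile_end_region(void) {
    if (__llvm_profile_dump)
        return __llvm_profile_dump();
    return -1;
}
```

A program would call profile_begin_region() after initialization and profile_end_region() right after the code of interest; in a non-instrumented build both calls reduce to a null check.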
Since the number of most useful APIs is very limited, I am leaning towards introducing a header like you suggested, to reduce the number of files to be maintained.
The PGO_GEN and PGO_USE macros themselves are useful features that may find other uses. The caveat is to avoid introducing code (control flow) divergence between prof-gen and prof-use with these macros, especially in hot code regions.
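To make the caveat concrete, here is a rough sketch in C. PGO_GEN is just the macro name used in this thread and is assumed to be defined only in the instrumented build; the function names are invented for the example. My understanding is that the profile is matched back to functions based on their control flow, so a branch that exists only in the prof-gen build can make the collected profile mismatch the prof-use build for that function.

```c
#include <stddef.h>

long negatives_seen = 0;

/* Discouraged: the extra branch inside the hot loop exists only when
   PGO_GEN is defined, so the control flow of this function differs
   between the prof-gen and prof-use builds. */
static long sum_positive_divergent(const int *v, size_t n) {
    long sum = 0;
    for (size_t i = 0; i < n; ++i) {
#ifdef PGO_GEN
        if (v[i] < 0)
            ++negatives_seen;
#endif
        if (v[i] > 0)
            sum += v[i];
    }
    return sum;
}

/* Preferred: the hot loop is identical in both builds. */
static long sum_positive(const int *v, size_t n) {
    long sum = 0;
    for (size_t i = 0; i < n; ++i) {
        if (v[i] > 0)
            sum += v[i];
    }
    return sum;
}

/* The macro only affects a cold function outside the hot path, e.g. the
   place where the profile dump API would be called. */
static int after_kernel(void) {
#ifdef PGO_GEN
    return 1; /* instrumented build: dump the profile here */
#else
    return 0; /* regular build: nothing to do */
#endif
}
```

Both sum functions compute the same result; the point is only where the build-specific code lives.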