[RFC] Dynamic Type Profiling and Optimizations in LLVM

An update (cc @modimo @teresajohnson)

I have uploaded the end-to-end implementation to a branch of my LLVM fork. Besides a squashed version of the type instrumentation PR, the branch contains the ThinLTO import of vtables and the actual ICP transformations (commits 1 and 2). I’ll send out the remaining commits as stacked reviews (supported by spr). The first PR is not managed by spr and has already been reviewed, so I’m still figuring out whether, in this scenario, I should use spr to create a duplicate of it so that the remaining patches can go out for formal review as a stack. The duplicate would only make the diffs of the later patches visible; it is not meant to be reviewed again.

Meanwhile, with safer WPD, we are planning to look into an optimization that removes the indirect call fallback [1] if there are no other implementations. Presumably, with profile-guided indirect-call promotion, the basic block for the indirect fallback is already cold (and therefore split out of .text.hot by the machine function splitter). However, removing it would still be useful as a general optimization.

[1]

vptr = ptr->_vptr;
if (vptr == &vtable_HotType1)
  HotType1::func()   // hot path
else if (vptr == &vtable_HotType2)
  HotType2::func()   // hot path
else {
  // If there are no other implementations with safer WPD, this fallback could be optimized away.
  func_ptr = *(vptr + function-offset)
  call func_ptr
}
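
To make the idea in [1] concrete, here is a rough, source-level sketch of the two shapes of the promoted call site. It is only an illustration, not the code in the branch: the hand-rolled `VTable`/`Object` structs and the names `call_with_fallback` and `call_closed_world` are hypothetical, and the real transformation works on LLVM IR rather than on C++ source. The closed-world variant assumes whole-program information has proven that `HotType1` and `HotType2` are the only implementations.

```cpp
#include <cassert>

// Hand-rolled vtable model (hypothetical, for illustration only).
struct Object;
struct VTable {
  int (*func)(const Object *); // the single virtual function slot
};
struct Object {
  const VTable *vptr; // stands in for the compiler-managed _vptr
};

int HotType1_func(const Object *) { return 1; }
int HotType2_func(const Object *) { return 2; }

const VTable vtable_HotType1{&HotType1_func};
const VTable vtable_HotType2{&HotType2_func};

// Shape after vtable-comparison-based promotion: direct calls for the
// hot types, plus the indirect call fallback for anything else.
int call_with_fallback(const Object *ptr) {
  const VTable *vptr = ptr->vptr;
  if (vptr == &vtable_HotType1)
    return HotType1_func(ptr); // hot path
  if (vptr == &vtable_HotType2)
    return HotType2_func(ptr); // hot path
  return vptr->func(ptr);      // indirect fallback (expected to be cold)
}

// Shape once the fallback is removed. Assumption: whole-program analysis
// has shown there are no implementations besides the two hot types.
int call_closed_world(const Object *ptr) {
  const VTable *vptr = ptr->vptr;
  if (vptr == &vtable_HotType1)
    return HotType1_func(ptr);
  // Only one candidate remains, so neither a compare nor an indirect
  // call is needed; the assert documents the closed-world assumption.
  assert(vptr == &vtable_HotType2 && "closed-world assumption violated");
  return HotType2_func(ptr);
}
```

In the second version the last candidate is called unconditionally, so the call site loses both the final vtable compare and the indirect call.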