Incorrect code generation when using -fprofile-generate on code which contains exception handling (Windows target)


I've run into a bug with the LLVM backend that causes incorrect code generation when -fprofile-generate is used on programs that contain C++ exception handling and the build targets Windows.

The problem occurs when value profiling inserts function calls into exception handling blocks. The instrumentation inserts value profiling intrinsic calls, which are subsequently lowered into calls to target library routines. However, these library calls do not get a funclet operand bundle attached to them. As a result, the Windows Exception Handling Preparation Pass drops every instruction in the exception handler from the PGO instrumentation call onward and replaces them with 'unreachable'. This is done by the function removeImplausibleInstructions (WinEHPrepare.cpp).
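
To make the pruning behavior concrete, here is a minimal sketch of it as a toy model in plain C++. It does not use LLVM; ToyInst, pruneFunclet and catchBlock are hypothetical names invented for illustration, and the model covers only the behavior described above (calls inside a funclet must carry that funclet's token, except for nounwind intrinsics), not everything the real pass does.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-in for an IR instruction. Illustrative only, not an LLVM class.
struct ToyInst {
  std::string text;
  bool isCall;
  int funcletToken;          // token named by a "funclet" bundle, -1 if none
  bool isNounwindIntrinsic;  // nounwind intrinsic calls are exempt from the check
};

// Simplified model of the pruning: keep instructions until the first call that
// is neither tagged with this pad's token nor an exempt intrinsic, then emit
// 'unreachable' in its place and drop everything after it.
std::vector<ToyInst> pruneFunclet(const std::vector<ToyInst>& block, int padToken) {
  std::vector<ToyInst> out;
  for (const ToyInst& inst : block) {
    bool plausible = !inst.isCall || inst.isNounwindIntrinsic ||
                     inst.funcletToken == padToken;
    if (!plausible) {
      out.push_back(ToyInst{"unreachable", false, -1, false});
      break;
    }
    out.push_back(inst);
  }
  return out;
}

// The catch block from this report, reduced to its interesting instructions.
// profileCallToken is the token on the instrumentation call (-1 = no bundle).
std::vector<ToyInst> catchBlock(int profileCallToken) {
  return {
      {"counter update", false, -1, false},
      {"call @__llvm_profile_instrument_target", true, profileCallToken, false},
      {"call %25 (virtual update)", true, 20, false},
      {"call @_CxxThrowException", true, 20, false},
  };
}
```

Calling pruneFunclet(catchBlock(-1), 20) models the failing case: the untagged instrumentation call cuts the handler short. Calling pruneFunclet(catchBlock(20), 20) models the same block with the bundle attached, which survives unchanged.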

A simple reproducer of the problem is shown here; it leads to incorrect code for the method test::run(). In this example, the virtual function call inside the exception handler triggers the bug when -fprofile-generate is used.

  #include <stdexcept>
  #include <iostream>

  extern void may_throw(int);

  class base {
  public:
    base() : x(0) {}
    int get_x() const { return x; }
    virtual void update() { x++; }
    int x;
  };

  class derived : public base {
  public:
    derived() {}
    virtual void update() { x--; }
  };

  class test {
  public:
    void run(base* b, int count) {
      try {
        for (int i = 0; i < count; ++i)
          may_throw(i);
      }
      catch (std::exception& e) {
        // Virtual function call in exception handler for value profiling.
        b->update();
        // Rethrow (the _CxxThrowException(null, null) call in the IR below).
        throw;
      }
    }
  };

  void run_test() {
    test tester;
    base *obj = new derived;
    try {
      tester.run(obj, 100);
    }
    catch (...) {
      // Swallow the rethrown exception so the result can be checked.
    }
    std::cout << "Value in obj (should be -1): " << obj->get_x() << "\n";
    if (obj->get_x() == -1)
      std::cout << "test passed\n";
    else
      std::cout << "test failed\n";
  }

  int main() {
    // Without PGO, test runs and prints result.
    // With -fprofile-generate, program seg-faults without printing.
    run_test();
    return 0;
  }

  void may_throw(int x) {
    if (x > 10)
      throw std::range_error("value out of range");
  }

On Windows, build with: clang -O2 -fprofile-generate test.cpp

When profiling is enabled the program will seg fault without printing anything. Without the -fprofile-generate flag, the program will run successfully.

The compiler problem is as follows: Prior to the Windows Exception Handling Preparation Pass, the IR for the function "test::run" contains the following:

19: ; preds = %17
  %20 = catchpad within %18 [%rtti.TypeDescriptor19* @"??_R0?AVexception@std@@@8", i32 8, %"class.std::exception"** %6]
  %21 = load i64, i64* getelementptr inbounds ([3 x i64], [3 x i64]* @"__profc_?run@test@@QEAAXPEAVbase@@H@Z", i64 0, i64 2), align 8
  %22 = add i64 %21, 1
  store i64 %22, i64* getelementptr inbounds ([3 x i64], [3 x i64]* @"__profc_?run@test@@QEAAXPEAVbase@@H@Z", i64 0, i64 2), align 8
  %23 = bitcast %class.base* %1 to void (%class.base*)***
  %24 = load void (%class.base*)**, void (%class.base*)*** %23, align 8, !tbaa !9
  %25 = load void (%class.base*)*, void (%class.base*)** %24, align 8
  %26 = ptrtoint void (%class.base*)* %25 to i64
  call void @__llvm_profile_instrument_target(i64 %26, i8* bitcast ({ i64, i64, i64*, i8*, i8*, i32, [2 x i16] }* @"__profd_?run@test@@QEAAXPEAVbase@@H@Z" to i8*), i32 0)
  call void %25(%class.base* %1) [ "funclet"(token %20) ]
  call void @_CxxThrowException(i8* null, %eh.ThrowInfo* null) #15 [ "funclet"(token %20) ]

After this pass, the block has been reduced to the IR below, which breaks the original program: everything from the instrumentation call onward is gone. This happens because the instrumentation function call, "__llvm_profile_instrument_target", is not marked with the funclet operand bundle [ "funclet"(token %20) ].

19: ; preds = %17
  %20 = catchpad within %18 [%rtti.TypeDescriptor19* @"??_R0?AVexception@std@@@8", i32 8, %"class.std::exception"** %6]
  %21 = load i64, i64* getelementptr inbounds ([3 x i64], [3 x i64]* @"__profc_?run@test@@QEAAXPEAVbase@@H@Z", i64 0, i64 2), align 8
  %22 = add i64 %21, 1
  store i64 %22, i64* getelementptr inbounds ([3 x i64], [3 x i64]* @"__profc_?run@test@@QEAAXPEAVbase@@H@Z", i64 0, i64 2), align 8
  unreachable

Possible solutions:
  1) Avoid value profiling of calls within exception handling blocks
    Pros: Solves the problem
    Cons: Could lose some cases of value profiling, but since the exception code is not supposed to be the primary execution path, this should not be a significant performance issue.

  2) Propagate the funclet information onto the value profiling intrinsics that are created, and then also propagate it to the library routines these intrinsics are lowered into.
      For indirect function calls, the funclet information can be copied from the original function call.
      However, MemIntrinsic calls, whose operands are also value profiled, do not have funclet operand bundles attached to them by the front-end. (I'm not sure whether that is possible, because the interfaces used to create them do not take operand bundles.) Therefore, PGO would need to compute the funclet coloring with colorEHFunclets to identify the funclet operand bundle to attach to the instrumentation calls. Unfortunately, because a basic block can be associated with multiple funclets, or with both a funclet and code outside any funclet, this may also require cloning some of the basic blocks, similar to the WinEHPrepare.cpp routine cloneCommonBlocks(), before computing the instrumentation.

    Pros: does not disable value profiling opportunities.
    Cons: complex to implement due to the need to determine the appropriate funclet to place on the memory operand value profiling calls. The same cloning would also have to be done for the PGO use compilation.

  3) Teach the Windows Exception Preparation Pass about the value profiling library functions. Currently this pass ignores LLVM intrinsic functions that are marked with the 'does not throw' attribute, but by the time this pass runs, the value profiling intrinsic calls have already been lowered into calls to target-specific runtime library functions.
    Pros: does not disable value profiling opportunities
    Cons: requires exposing function names from InstrProf.h to the WinEHPrepare.cpp file, or requires a new attribute on the function calls to identify them as instrumentation library calls. Also, the IR would not accurately reflect the funclet operand bundle information for the PGO-inserted function calls.
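
As a rough illustration of how options 1 and 2 differ for indirect call sites, here is a small toy model in plain C++. It does not use LLVM; ToyCall, Strategy and instrumentIndirectCall are hypothetical names made up for this sketch, and it deliberately leaves out the MemIntrinsic case, where there is no existing bundle to copy.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical toy model of a call site; none of these names are LLVM APIs.
struct ToyCall {
  std::string callee;
  int funcletToken;  // -1 when the call carries no "funclet" bundle
};

enum class Strategy { SkipInFunclet, CopyBundle };

// Returns the instrumentation calls to emit ahead of one indirect call site.
std::vector<ToyCall> instrumentIndirectCall(const ToyCall& site, Strategy s) {
  std::vector<ToyCall> emitted;
  bool inFunclet = site.funcletToken != -1;
  if (s == Strategy::SkipInFunclet && inFunclet)
    return emitted;  // option 1: no value profiling inside EH funclets
  // Option 2, and any site outside a funclet: reuse the site's own token so
  // the instrumentation call is tagged the same way as the call it profiles.
  emitted.push_back(ToyCall{"__llvm_profile_instrument_target", site.funcletToken});
  return emitted;
}
```

For a site such as ToyCall{"%25", 20}, SkipInFunclet emits nothing, while CopyBundle emits one instrumentation call carrying token 20. A site outside any funclet is instrumented identically under both strategies.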

For option 2 or 3 to work, the PGO indirect call promotion pass used for -fprofile-use must also preserve the 'funclet' operand bundle on the specialized direct call that it inserts. Fortunately, the code in that pass clones the original indirect call, so the 'funclet' operand bundle is already maintained on it.

Any thoughts on which of these options should be taken, or other suggestions for resolving this problem?