RFC: Use closures to delay construction of optimization remarks

For better readability we typically create remarks and call OptimizationRemarkEmitter::emit unconditionally. E.g.:

Transforms/IPO/Inliner.cpp:
  ORE.emit(OptimizationRemarkMissed(DEBUG_TYPE, "TooCostly", Call)
           << NV("Callee", Callee) << " not inlined into "
           << NV("Caller", Caller) << " because too costly to inline (cost="
           << NV("Cost", IC.getCost())
           << ", threshold=" << NV("Threshold", IC.getThreshold()) << ")");

Then inside ORE we return right away if remarks for the given pass are not enabled.

This is nice and concise; however, there is still some overhead involved when remarks are not enabled:

  1. Constructing the remark object

  2. Computing and inserting the strings, named-value arguments

  3. Performing the comparison between the pass name and the passes enabled by the user

  4. Virtual destructor

Now that Vivek has added support for asking LLVMContext which remarks are enabled by the user [1], we can improve this. The idea is to allow ORE::emit to take a closure. This delays all of the work above until we know remarks are enabled. E.g. the above code with a closure:

ORE.emit([&]() {
  return OptimizationRemarkMissed(DEBUG_TYPE, "TooCostly", Call)
         << NV("Callee", Callee) << " not inlined into "
         << NV("Caller", Caller) << " because too costly to inline (cost="
         << NV("Cost", IC.getCost())
         << ", threshold=" << NV("Threshold", IC.getThreshold()) << ")";
});
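To make the deferral concrete, here is a minimal standalone sketch of the idea, not the code from the patch. The names Remark, Emitter and countCostQueries are hypothetical stand-ins, and the enabled check is reduced to a single bool. The closure overload only calls the builder once it knows remarks are enabled, so nothing inside the lambda runs otherwise.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a remark class; not the LLVM type.
struct Remark {
  std::string Msg;
  Remark &operator<<(const std::string &S) {
    Msg += S;
    return *this;
  }
};

// Hypothetical stand-in for OptimizationRemarkEmitter.
class Emitter {
  bool Enabled;
  std::vector<std::string> Out;

public:
  explicit Emitter(bool Enabled) : Enabled(Enabled) {}

  // Eager form: the caller has already paid for building the remark.
  void emit(const Remark &R) {
    if (Enabled)
      Out.push_back(R.Msg);
  }

  // Lazy form: only participates in overload resolution when Build is
  // callable with no arguments; the builder runs only if remarks are enabled.
  template <typename BuilderT>
  auto emit(BuilderT &&Build) -> decltype(Build(), void()) {
    if (!Enabled)
      return;
    auto R = Build();
    emit(R);
  }

  const std::vector<std::string> &emitted() const { return Out; }
};

// Returns how many times the (pretend) expensive cost query ran.
int countCostQueries(bool RemarksEnabled) {
  Emitter ORE(RemarksEnabled);
  int Queries = 0;
  auto Cost = [&] {
    ++Queries;
    return 42;
  };
  ORE.emit([&] {
    return Remark() << "not inlined (cost=" << std::to_string(Cost()) << ")";
  });
  return Queries;
}
```

With remarks disabled, countCostQueries reports zero cost queries; with remarks enabled, it reports one.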

I have a proof-of-concept implementation at https://reviews.llvm.org/D37921.

The main change is that since in the lambda the remark is now returned by value, we need to preserve its type in the insertion operator. This requires making the insertion operator generic. I am also curious if people see C++ portability problems with the code.
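A rough illustration of why the insertion operator has to be generic, using hypothetical classes (RemarkBase, MissedRemark, buildKind) rather than the real LLVM hierarchy: if operator<< returned a reference to the base class, a lambda written as "return MissedRemark() << ..." would deduce the base type as its return type. A templated operator that returns the concrete remark type avoids that.

```cpp
#include <string>
#include <type_traits>

// Hypothetical base class; stands in for the common remark base.
struct RemarkBase {
  std::string Msg;
  void insert(const std::string &S) { Msg += S; }
  virtual ~RemarkBase() = default;
  virtual const char *kind() const = 0;
};

// Hypothetical concrete remark.
struct MissedRemark : RemarkBase {
  const char *kind() const override { return "missed"; }
};

// Generic insertion operator: accepts lvalue or rvalue remarks and returns a
// reference to the *derived* type, so the static type survives chaining.
template <typename RemarkT,
          typename = typename std::enable_if<std::is_base_of<
              RemarkBase, typename std::decay<RemarkT>::type>::value>::type>
typename std::decay<RemarkT>::type &operator<<(RemarkT &&R,
                                               const std::string &S) {
  R.insert(S);
  return R;
}

// Builds a remark through a lambda and reports its kind and message.
std::string buildKind() {
  auto Build = [] { return MissedRemark() << "callee" << " not inlined"; };
  // The lambda returns the concrete type by value, not the base class.
  static_assert(std::is_same<decltype(Build()), MissedRemark>::value,
                "remark type should be preserved");
  auto R = Build();
  return std::string(R.kind()) + ": " + R.Msg;
}
```

In this sketch the enable_if constraint keeps the operator from matching unrelated types; whether the actual patch constrains it the same way is not something this example claims.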

Feedback welcome.

Adam

[1] https://reviews.llvm.org/D33514

Makes sense to me. This will also give us a more-natural way to deal with constructing remarks where the remark construction itself is potentially expensive (e.g., involves doing some extra analysis or traversing data structures). -Hal

Another alternative could be:

ORE.emitMissed(DEBUG_TYPE, …) << …

Then the first line of emitMissed checks whether remarks are enabled and, if not, returns a dummy stream that does nothing for operator<< (and short-circuits all the stream operations).
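A minimal sketch of what such a dummy stream might look like, with hypothetical names (RemarkStream, Emitter, runNullStreamDemo) and a plain bool for the enabled check. One caveat the sketch makes visible: the bodies of operator<< are skipped when disabled, but C++ still evaluates the operand expressions, so a call such as a cost query on the right-hand side runs either way.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stream handle: appends to a message and emits it on
// destruction. A null sink means remarks are disabled and every
// insertion is a no-op.
class RemarkStream {
  std::vector<std::string> *Sink;
  std::string Msg;

public:
  explicit RemarkStream(std::vector<std::string> *Sink) : Sink(Sink) {}
  RemarkStream(RemarkStream &&Other)
      : Sink(Other.Sink), Msg(std::move(Other.Msg)) {
    Other.Sink = nullptr; // the moved-from handle must not emit
  }
  ~RemarkStream() {
    if (Sink)
      Sink->push_back(Msg);
  }
  RemarkStream &operator<<(const std::string &S) {
    if (Sink)
      Msg += S;
    return *this;
  }
};

// Hypothetical emitter with a single on/off switch.
struct Emitter {
  bool Enabled;
  std::vector<std::string> Out;
  RemarkStream emitMissed() {
    return RemarkStream(Enabled ? &Out : nullptr);
  }
};

struct NullStreamResult {
  int CostQueries;
  std::size_t Emitted;
};

// Streams one remark and reports how often the cost query ran and how many
// remarks were actually emitted.
NullStreamResult runNullStreamDemo(bool Enabled) {
  Emitter ORE{Enabled, {}};
  int Queries = 0;
  auto Cost = [&] {
    ++Queries;
    return 42;
  };
  ORE.emitMissed() << "too costly (cost=" << std::to_string(Cost()) << ")";
  return {Queries, ORE.Out.size()};
}
```

In the disabled case nothing is emitted, yet the cost query still runs once, because the arguments to operator<< are computed before the no-op operator sees them.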

Actually maybe something like:

if (auto &E = ORE.emitMissed(DEBUG_TYPE)) {
E.emit(…) << …;
}

Credit to Chandler for that, if I'm remembering it right. This is vaguely recollected from an LLVM social conversation a long time ago about reducing the overhead of clang diagnostics that are not enabled, which sounds like the same problem.
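The guarded form can be sketched roughly as follows. RemarkHandle, Emitter and runGuardedDemo are hypothetical names, and emit here takes a finished string rather than returning a stream, purely to keep the example short. The sketch also binds the handle by value, since a non-const lvalue reference cannot bind to a temporary returned by value.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical handle that is "true" only when remarks are enabled.
class RemarkHandle {
  std::vector<std::string> *Sink;

public:
  explicit RemarkHandle(std::vector<std::string> *Sink) : Sink(Sink) {}
  // explicit, so the handle converts to bool only in conditions.
  explicit operator bool() const { return Sink != nullptr; }
  // Simplified: takes the finished message instead of returning a stream.
  void emit(const std::string &Msg) { Sink->push_back(Msg); }
};

// Hypothetical emitter with a single on/off switch.
struct Emitter {
  bool Enabled;
  std::vector<std::string> Out;
  RemarkHandle emitMissed() {
    return RemarkHandle(Enabled ? &Out : nullptr);
  }
};

struct GuardedResult {
  int CostQueries;
  std::size_t Emitted;
};

// Emits one remark behind the guard and reports how often the cost query ran
// and how many remarks were emitted.
GuardedResult runGuardedDemo(bool Enabled) {
  Emitter ORE{Enabled, {}};
  int Queries = 0;
  auto Cost = [&] {
    ++Queries;
    return 42;
  };
  if (auto E = ORE.emitMissed()) {
    // Everything in this block is skipped when remarks are disabled.
    E.emit("too costly (cost=" + std::to_string(Cost()) + ")");
  }
  return {Queries, ORE.Out.size()};
}
```

Because the remark is built inside the if body, the cost query does not run at all when remarks are disabled.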

This also seems like a good option. -Hal

Well, the point of this interface was exactly to avoid writing a conditional. If you’re willing to use a conditional you can already write this:

if (ORE.allowExtraAnalysis(DEBUG_TYPE))
  ORE.emit(OptimizationRemark(…) << …);

But again the point was to hide all this. I find the closure-based interface more concise and easier to identify visually. One reason is that the block is contained within ORE.emit(…). I just wish you could omit the return in a lambda as in Python.

Adam

Sean, hopefully you’re OK with that reasoning. I went ahead and committed this in r313691.

No problem. It was just a suggested alternative for your consideration.

-- Sean Silva