Optimization remarks and LTO

Hi

I’m building a C++ codebase with clang and LTO, with -fsave-optimization-record enabled.

I’m surprised to see remarks like “X will not be inlined into Y because its definition is unavailable”. Here’s a public example: abstract.h

These remarks are probably generated in the per-translation-unit IR generation phase.

Do the passes in the LTO phase emit remarks? Is there a way to filter the remarks to only those that are relevant post-LTO? If not, shouldn’t there be one?

Thanks,

-Ofek

I have the same concerns. The performance of my targets relies heavily on LTO and so I am finding it difficult to use @OfekShilon’s optview2 utility to generate actionable information for these NoDefinition remarks.

I think the issue is that we generate remarks twice: once during the prepare-for-LTO phase (compiling each TU to bitcode) and again during the LTO phase.

Presumably the NoDefinition remarks come from the prepare-for-LTO phase, but are in fact superseded by successful inlining during LTO.

This is an interesting issue. It sounds like we need some kind of additional post-processing to filter out those outdated remarks. One (perhaps a bit hacky) way to distinguish between the phases is to check whether the remarks were generated for a bitcode file or for an executable.

We could also filter or disable some remarks that are likely to be inaccurate during the prepare-for-LTO phase.
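
As a rough illustration of the post-processing idea, here is a minimal Python sketch. It assumes the remarks from both phases have already been loaded into plain dicts that mirror the fields of the YAML remark records (`Pass`, `Name`, `Args` with `Caller`/`Callee` entries). The `Type` key holding the record tag (`Missed`, `Passed`) and the function names are hypothetical, chosen only for this example. It also matches on the caller/callee pair rather than on the individual call site, so it can drop a remark for a call site that LTO did not actually inline.

```python
def inline_key(remark):
    """Return (caller, callee) for an inliner remark, or None if not applicable."""
    if remark.get("Pass") != "inline":
        return None
    caller = callee = None
    for arg in remark.get("Args", []):
        caller = arg.get("Caller", caller)
        callee = arg.get("Callee", callee)
    if caller is None or callee is None:
        return None
    return (caller, callee)


def drop_superseded(prelink_remarks, lto_remarks):
    """Drop pre-link NoDefinition remarks whose caller/callee pair was inlined in LTO.

    Both arguments are lists of dicts. "Type" is a hypothetical key that holds
    the remark tag ("Missed", "Passed", ...) set by whatever loaded the YAML.
    """
    # Collect every (caller, callee) pair that the LTO phase reports as inlined.
    inlined = set()
    for remark in lto_remarks:
        if remark.get("Type") == "Passed" and remark.get("Name") == "Inlined":
            key = inline_key(remark)
            if key is not None:
                inlined.add(key)

    # Keep every pre-link remark except the NoDefinition ones that LTO resolved.
    kept = []
    for remark in prelink_remarks:
        superseded = (
            remark.get("Type") == "Missed"
            and remark.get("Name") == "NoDefinition"
            and inline_key(remark) in inlined
        )
        if not superseded:
            kept.append(remark)
    return kept
```

The remaining NoDefinition remarks would then be the ones that LTO did not resolve for that caller/callee pair, which are more likely to be actionable.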

Just had a look at optview2; it looks like it contains some nice improvements over the in-tree opt-viewer. It would be great to see some of those improvements submitted back to the LLVM repo :)


I originally thought my work was of interest to me and maybe to some general devs, but definitely not to LLVM devs. I will very gladly submit this back to the LLVM repo, and will read up on how to do that.

Regarding LTO: I think I now have a (cumbersome) way to separate the preparation opt-remarks from the LTO ones:
(1) Build with LTO, use -v to dump the list of obj files generated (IR only)
(2) $ llvm-lto -lto-pass-remarks-output=<yaml outputpath> -j=10 -O=3 <obj files list>

This creates a single very large *.obj.yaml file.
You can process it with OptView2 - but for a large project it takes forever (processing gigs of yaml, no parallelization).
@euphoria can you please say if it works for you too?
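
One possible way to cut down the processing time for such a large file is to pre-filter it as plain text before any YAML parsing. The sketch below is only an illustration and is not part of OptView2. It assumes the usual layout of the remarks file, where each record starts with a `--- !<Type>` line and carries a top-level `Pass:` line. The function names are made up for this example.

```python
def iter_remark_docs(lines):
    """Yield each YAML document in the stream as a list of its lines."""
    doc = []
    for line in lines:
        # A line starting with "---" opens a new remark record.
        if line.startswith("--- ") or line.rstrip() == "---":
            if doc:
                yield doc
            doc = [line]
        elif doc:
            doc.append(line)
    if doc:
        yield doc


def filter_by_pass(lines, pass_name):
    """Yield the text of every remark record whose Pass field equals pass_name."""
    for doc in iter_remark_docs(lines):
        for line in doc:
            # Top-level keys are not indented, so this skips nested Args entries.
            if line.startswith("Pass:"):
                if line.split(":", 1)[1].strip() == pass_name:
                    yield "".join(doc)
                break
```

Because a file object is itself an iterable of lines, something like `filter_by_pass(open(path), "inline")` would stream through the file without loading it into memory, and the yielded records could be written to a smaller file or handed out in batches to worker processes.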