Varying per function optimisation based on include path?

There’s a proposal in SG15 (http://wg21.link/p1832) with a suggestion
to more aggressively inline functions included via isystem when building
for -Og.

To me that sounds pretty outside the scope of the C++ standards
committee, but I don’t know too much about how they deal with things on the
fringe like this. It would seem simpler/more direct to provide
patches/discuss development in existing compilers to demonstrate the value
and leave it up to “market forces” to handle this sort of
quality-of-implementation thing.

I think “discuss development in existing compilers” was the purpose of
Jon’s opening this thread, yeah?

The discussion here seems fine - I was/am a bit confused by this involving
a paper to the C++ committee.

It’s context. Most of the time I’m compiler back end, but periodically I play with C++. John’s paper looked enough like a compiler bug report that I thought I’d raise it here. The paper references GCC but LLVM is more familiar to me.

Does P1832’s approach seem like a reasonable extension in the context of
Clang?

Wording pedantry, etc - but I wouldn’t describe this as an extension
because it’s not part of the observable behavior as far as C++ is
concerned. (it doesn’t make code that would be invalid (or unspecified, IB,
or UB) valid with some specific semantics)

I’d probably describe it as a clang feature request - one I personally
would be pretty hesitant about, but data could help motivate the decision -
performance data combined with some attempt to quantify debuggability.

Feature request is fair. I was imagining marking “application” functions as optnone and STL/SDK functions as normal, but don’t have a good handle on how clang distinguishes functions from different origins.
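
As a rough sketch of that idea done by hand (this is not what P1832 proposes, nor how Clang would implement it), Clang's existing `optnone` attribute can be put on application functions while standard library code is compiled at whatever level the command line asks for. `APP_NO_OPT`, `sum_squares` and `sum` are hypothetical names; the macro expands to nothing on compilers that are not Clang so the file still builds there.

```cpp
#include <numeric>
#include <vector>

// Hypothetical macro: Clang's optnone attribute where available,
// nothing elsewhere (e.g. GCC), so the file compiles either way.
#if defined(__clang__)
#define APP_NO_OPT __attribute__((optnone))
#else
#define APP_NO_OPT
#endif

// "Application" code: kept unoptimised so it stays easy to step through.
APP_NO_OPT int sum_squares(const std::vector<int>& values) {
    int total = 0;
    for (int v : values) {
        total += v * v;
    }
    return total;
}

// No attribute here: this function and the std::accumulate call inside it
// are optimised according to the -O level given on the command line.
int sum(const std::vector<int>& values) {
    return std::accumulate(values.begin(), values.end(), 0);
}
```

As far as I understand it, calls made from inside an `optnone` function are generally not inlined into it either, so the iterator calls in `sum_squares` stay as calls. Annotating by hand therefore only approximates the split being discussed, which is about deciding per function based on where it was declared.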

The goal is to improve runtime performance for debug builds that make
heavy use of the STL.
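
For a concrete picture of the kind of code in question, here is a small sketch; `dot` is a hypothetical name. Without optimisation, each `size()` and `operator[]` call below is typically emitted as a real call into instantiated library code on every iteration rather than being folded into the loop, and that is where much of the debug-build cost tends to come from.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical application function that leans on std::vector accessors.
// In an unoptimised build, a.size(), a[i] and b[i] are usually genuine
// function calls; once inlined they reduce to simple loads.
double dot(const std::vector<double>& a, const std::vector<double>& b) {
    double result = 0.0;
    for (std::size_t i = 0; i < a.size() && i < b.size(); ++i) {
        result += a[i] * b[i];
    }
    return result;
}
```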

Does clang already set various attributes based on whether a function
was found via -I or -isystem, and if not, does that seem a reasonable
extension?

Nope - pretty sure it doesn’t & probably tries fairly hard not to vary
code generation (or even diagnostics, except for a very big slice around
system headers specifically to avoid warnings users can’t fix) depending on
where the code was written.

In -Og mode, it seems that it would equally make sense to take “a very big
slice around system headers specifically to avoid” debug symbols for code
that users can’t debug.

That seems different to me - users can debug into templates and it can be
useful - if they’ve corrupted the state of a container (yeah, other tools
might be better there, like sanitizers) or the library is doing the right
thing but it’s surprising because the user didn’t realize what they were
asking for.

Mixed. The attitude of “the bug is in my code, not the standard library” is usually worth encouraging. So I sort of like the idea of optimising library code while keeping application code simpler. Splitting the two by include paths was a new idea for me.

There is work in Clang/LLVM to try to make -Og/-O1 (currently synonymous
and hope to keep them that way as long as possible) be a good/better
fast/debuggable tradeoff. But mostly that centers around avoiding
destructive optimizations & keeping as much debug-ability-related state as
possible.

Are “debug symbols / lack of inlining” for any piece of code always
“debuggability-related”?
P1832 takes the position that for code in system headers, performing
inlining and other optimizations doesn’t have much impact on
“debuggability” because that system code is never going to be debugged by
the user anyway.

Once you inline though, the code you’ve inlined can get jumbled up with the
other code - potentially placing a container in an invalid state inside
that user code - perhaps they try to print the contents of the container on
a line after something was added but it doesn’t show up because of
instruction reordering, for instance.

Sure. This suggestion is a rough line between code that is “trusted” and “suspect”, at slightly different granularity to compiling different translation units with different optimisation levels.

Naïvely, it seems to me like a reasonable extension, in the context of
Clang.

my $.02,
–Arthur

Cheers. Naively seems reasonable is where I am on this. It feels like something games dev people would like. And it feels more QoI than ISO. Good debugging experience of optimised code is, uh, difficult.

Jon