Hi all,
Maybe this is just a silly question, but I wondered why clang does not attempt to fold calls to constexpr functions in contexts that do not require a constant expression.
Example: Compiler Explorer (just compare "one" vs "two" in the IR).
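The linked example is not preserved in this thread, so the following is only a sketch of the kind of comparison meant. The function name `square` and the bodies of `one` and `two` are assumptions, not the original code.

```cpp
// Hypothetical constexpr function (an assumption, not from the original link).
constexpr int square(int x) { return x * x; }

// "one": the initializer of a constexpr variable must be a constant
// expression, so the front end has to evaluate square(7) at compile time.
int one() {
    constexpr int v = square(7);
    return v;
}

// "two": the same call in an ordinary context. Nothing requires a constant
// here, so without optimization clang typically emits a real call in the IR.
int two() {
    return square(7);
}

// Both forms compute the same value; only the emitted IR differs.
static_assert(square(7) == 49, "square(7) is usable as a constant expression");
```

Compiling this with something like `clang++ -O0 -S -emit-llvm` and comparing the two function bodies should show the difference being asked about.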
I understand that clang does not have to do this and I'm not saying it should.
But I'd like to know what is the rationale here.
- Is it a C++ Standard issue? I.e., doing that would trigger side effects that would be non-compliant.
- Is it a "separation of concerns" issue? I.e., let LLVM optimize these cases, and improve LLVM where it can't.
- Is it a speed issue? I.e., even if the original AST is preserved for fidelity, computing an associated constant value for it takes time, which is precious for some clients.
Or maybe it is something else that didn't occur to me.
Thank you,
Roger
I’m going to guess it’s mostly (3), the speed issue. Checking and trying to constant-evaluate every subexpression would be expensive - the systems to do so in Clang have to do a lot of work to properly report diagnostics, to act on the full AST form, etc. It’s easier to leave it to the LLVM optimization passes to do the work, when possible.
Thanks David for your answer. Would it make sense to do this at CodeGen under -O?
Something like: in CodeGenFunction::EmitCallExpr, check that we’re calling a constexpr function, then attempt to evaluate it, and then forward an llvm::Constant* down to CodeGenFunction::EmitCall (the one in CGCall.cpp). There, if given a non-null llvm::Constant*, use that to create and return the RValue. This seems unnecessarily complicated, but it should emit everything else required at that point except the function call itself.
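To make the control flow of that idea concrete, here is a toy model only. It does not use any Clang or LLVM API: `ToyCall`, `tryEvaluate`, and `emitCall` are hypothetical stand-ins, and the returned strings merely imitate IR.

```cpp
#include <optional>
#include <string>

// Toy stand-in for a call expression (hypothetical; NOT Clang's CallExpr).
struct ToyCall {
    bool calleeIsConstexpr;  // was the callee declared constexpr?
    bool argIsConstant;      // is the argument a known constant?
    int arg;                 // argument value, meaningful only if argIsConstant
};

// Toy stand-in for the constant evaluator. It models a callee that squares
// its argument and succeeds only when evaluation is possible at all.
std::optional<int> tryEvaluate(const ToyCall& call) {
    if (call.calleeIsConstexpr && call.argIsConstant)
        return call.arg * call.arg;
    return std::nullopt;
}

// Toy stand-in for call emission: when optimizing, try to fold first and
// fall back to emitting the call if evaluation fails.
std::string emitCall(const ToyCall& call, bool optimizing) {
    if (optimizing) {
        if (std::optional<int> value = tryEvaluate(call))
            return "ret i32 " + std::to_string(*value);
    }
    return "call i32 @square(...)";
}
```

In this model, `emitCall({true, true, 7}, true)` yields the folded form, while the same call without optimization, or with a non-constexpr callee, falls back to the call.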
Kind regards,
Roger
I doubt it - the Clang code isn’t really designed to be an efficient optimizer. If this is a missed optimization, it’d likely be best to look at why the LLVM optimization pipeline is missing the opportunity.