sincos optimization

Hi,

I'm looking at http://llvm.org/bugs/show_bug.cgi?id=13204, which involves converting paired calls to sin and cos into a single call to sincos (when available).
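
For concreteness, the source-level pattern in question looks roughly like the sketch below. The names polar_separate, polar_fused and check_agreement are made up for illustration. sincos is a GNU extension rather than ISO C, so the sketch assumes glibc and falls back to the separate calls elsewhere.

```cpp
#include <cassert>
#include <cmath>

// Before: two independent libm calls on the same argument.
void polar_separate(double theta, double &s, double &c) {
    s = std::sin(theta);
    c = std::cos(theta);
}

// After: one combined call, where the C library provides it.
void polar_fused(double theta, double &s, double &c) {
#if defined(__GLIBC__)
    // GNU extension: computes sine and cosine in a single call.
    ::sincos(theta, &s, &c);
#else
    // Assumption: no sincos on this platform, so keep the separate calls.
    polar_separate(theta, s, c);
#endif
}

// Sanity check: both versions should agree to within rounding error.
void check_agreement(double theta) {
    double s1, c1, s2, c2;
    polar_separate(theta, s1, c1);
    polar_fused(theta, s2, c2);
    assert(std::fabs(s1 - s2) < 1e-12);
    assert(std::fabs(c1 - c2) < 1e-12);
}
```

The optimization under discussion would turn code shaped like polar_separate into code shaped like polar_fused automatically.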

Initially I thought about transforming calls to sinf/cosf into calls to sincosf. However, I don't think this is a legal transformation, because a function declared with the name sinf is not necessarily the standard library function.

It therefore makes more sense to transform calls to the sin and cos intrinsics. One problem with this is that clang does not generate sin/cos intrinsics, because they do not have the same semantics as the library functions.
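
As a rough illustration of the matching such a transformation needs, here is a toy model in plain C++, not LLVM code. ToyTrigCall and find_sincos_pairs are hypothetical names, and an integer id stands in for the operand value. It only pairs each sin call with the first cos call on the same operand.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a call to a trig intrinsic; not an LLVM type.
struct ToyTrigCall {
    std::string callee;  // "sin" or "cos" (anything else is ignored)
    int operand_id;      // made-up id identifying the argument value
};

using IndexPair = std::pair<std::size_t, std::size_t>;

// Return (sin index, cos index) pairs of calls that share an operand.
// Each such pair is a candidate for replacement by one sincos call.
std::vector<IndexPair> find_sincos_pairs(const std::vector<ToyTrigCall> &calls) {
    std::vector<IndexPair> pairs;
    for (std::size_t i = 0; i < calls.size(); ++i) {
        if (calls[i].callee != "sin")
            continue;
        for (std::size_t j = 0; j < calls.size(); ++j) {
            if (calls[j].callee == "cos" &&
                calls[j].operand_id == calls[i].operand_id) {
                assert(i != j);  // one call cannot be both sin and cos
                pairs.emplace_back(i, j);
                break;  // pair with the first matching cos only
            }
        }
    }
    return pairs;
}
```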

What is the best way to approach this transformation?

(Also, experimentation shows that a single sincos call is indeed faster than separate calls to sin and cos.)

paul

[...]

I've actually just dealt with this: standard library calls, including
sinf, *are* promoted to intrinsics if the compiler sees them (and are
then usually converted back to library calls again later). The list of
recognised names is here:

  lib/Target/TargetLibraryInfo.cpp

The list of functions which are considered candidates for promotion is
in hasOptimizedCodeGen() in here:

  include/llvm/Target/TargetLibraryInfo.h

...and the code that actually does it is here:

  lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp

Look for SelectionDAGBuilder::visitUnaryFloatCall() and visitCall().

It appears that such functions are only promoted if they're declared
readnone, which provides some protection against a program overriding a
standard library function, but I haven't yet found anything that
explicitly checks whether the callee really is the library function.
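
To make that gating concrete, here is a toy model of the check as I understand it. This is not the LLVM code: ToyCall and is_promotion_candidate are made-up names, and the name list is a small illustrative subset rather than the real one.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical stand-in for a call instruction; not an LLVM type.
struct ToyCall {
    std::string callee;  // name of the called function
    bool readnone;       // declared as neither reading nor writing memory
};

// Toy version of the check: the name must be on the recognised list
// AND the declaration must be readnone.
bool is_promotion_candidate(const ToyCall &call) {
    // Illustrative subset only; the real list is much longer.
    static const std::set<std::string> recognised = {
        "sin", "sinf", "cos", "cosf"};
    assert(!call.callee.empty() && "a call needs a callee name");
    return call.readnone && recognised.count(call.callee) != 0;
}
```

Note that a user-defined sinf that happens to be readnone would still pass this toy check, which is exactly the gap mentioned above.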

Disclaimer: I learnt this yesterday...

Hi David,

I think it would be a good idea. I'd like to see more of the libm-aware optimizations move out of DAGCombine and into IR-level passes (InstCombine?). Translating library calls to intrinsics at the IR level makes that simpler.

--Owen