Math intrinsics

I tried replying to an old thread but got no response. So I’m trying someplace else.

My target machine has hardware instructions for ldexp, frexp, atan2, asin, acos, atan, rsqrt.
I want some suggestions on how to generate them.
I suppose the options (e.g. for atan2) are:

  1. Match them in TargetLowering::LowerCall().
    How does one ensure that the name “atan2” came from math.h and is not just some arbitrary name?
  2. Have the front end (clang) generate an intrinsic.
    a. Target specific - is there a target that has done this for math calls?
    b. New generic llvm.atan2… Why are some libm routines implemented with
    intrinsics, e.g., llvm.sin.*, llvm.cos.*, and some are not?
    c. Are these intrinsics dependent on __builtin_atan2?

Ideas please. Thank you.

Hi Brian,

Wrt 2) I would prefer not to add more math intrinsics but to remove the ones we have.
Wrt 1) We know libc/libm/… names are special unless the user puts a nobuiltin on the function, the call, or the module.

Does that help?

~ Johannes

I think exactly the opposite. I think we should add all the missing math intrinsics and drop all of this libcall recognition scattered around.

The problem is the “all” part. Who defines “all”? The C/C++ standard, my system [c]math.h? The target?
Once we define that, how do we deal with the inconsistencies of math headers, e.g. does llvm.isinf return bool or int?

The reason we have intrinsics is, IIRC, to overload the types with vectors. By now we have better ways and even where they fail, we need generic ways for user function vector overloading anyway.

The union of all of them. If a useful special lowering exists somewhere, a lowering can be produced elsewhere. I may look into resurrecting D14327 (Add llvm.ldexp.* intrinsic, associated SDNode and library calls) soon.

But that opens up two questions:

  1. What to do with llvm.my_special_math_function in a backend that doesn’t have special lowering? Fallback to libc calls (which means having all recognition stuff both in clang and backwards in the backends)? It also won’t work for non-libc systems as of yet (though that is hopefully fixed at some point). We could ask the target but that has other problems too. It’s also not only a frontend issue as middle end passes, e.g., Enzyme, generate such math calls and would need to know what is available, what is an intrinsic, etc. Right now, the rough assumption is that all targets have all 10 math intrinsics or none of them, and in the latter case we see things already falling apart.
  2. Why do we need intrinsics in the first place? It can’t be only about the recognition logic, right? If we start with builtins, maybe, but if we start with my_special_math_function as a call in user code we still do libcall recognition in clang. And if we want other frontends to benefit we duplicate the recognition logic for them.

Don’t we need the intrinsics for their different errno semantics? FP intrinsics do not set errno, libcalls do.

Implement the lowering. I already work on a non-libc system. The DAG code was written around the assumption there’s a library you can call into. We’ve had to implement inline expansions for many functions, which is mostly just an annoyance it wasn’t done in the first place.

The API for recognizing libcalls is garbage. I don’t want to have to look for a function name. I want to switch over an enum without doing additional parsing. Code also assumes function call = expensive, intrinsic call = cheap.

Libcalls also bind you to this garbage API from the 70s we’re stuck dealing with. For example, I would like a frexp intrinsic. This has no business using a pointer argument, an intrinsic can return a pair of fields. Not to mention errno (I don’t know why we don’t just ship a compiler-rt libm-lite that actually behaves like everyone wants with no errno)
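To make the frexp point concrete, here is a minimal sketch of what a by-value interface could look like at the source level. `frexp_pair` is a hypothetical helper name, not an LLVM or libm API; it only wraps the standard pointer-based `std::frexp` and hands both results back as a pair.

```cpp
#include <cmath>
#include <utility>

// Hypothetical helper (not part of LLVM or libm): wraps the C library's
// pointer-based frexp so the caller receives both results by value.
// Returns {fraction, exponent} such that x == fraction * 2^exponent,
// with 0.5 <= |fraction| < 1 for finite nonzero x.
std::pair<double, int> frexp_pair(double x) {
    int exponent = 0;
    double fraction = std::frexp(x, &exponent);  // out-parameter stays local
    return {fraction, exponent};
}
```

For example, `frexp_pair(8.0)` yields the fraction 0.5 and the exponent 4. Because nothing escapes through a pointer, the caller's code carries no memory operand for the exponent at all.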

TargetLibraryInfo already has an enum and IIRC recognizes calls based both on name and signature. The intrinsics don’t correlate with that?

It depends which piece of code you’re looking at. The whole thing is a mess. I think conflating the intrinsics and library calls is a problem. We do have nonsense where we consider the availability of library functions when deciding whether to emit intrinsics which have other lowerings available (memcpy is the worst of these). The llvm intrinsics should exist independently of the library functions. Knowledge of the target library should only really need consideration as a lowering option.

Right, errno: So the difference between sin and llvm.sin is the potential effect on errno. I would argue we can simply mark the calls memory(none) if fno-errno (or similar) is given.
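The errno side effect being discussed can be observed from user code. The sketch below is only an illustration: `sqrt_sets_errno` is a hypothetical helper, and whether errno is actually written depends on the C library and on compiler flags (the standard `math_errhandling` macro reports what the library promises).

```cpp
#include <cerrno>
#include <cmath>

// Hypothetical helper: reports whether calling the library sqrt on x
// wrote to errno. The volatile copies keep the compiler from folding
// the call away at compile time.
bool sqrt_sets_errno(double x) {
    volatile double in = x;
    errno = 0;
    volatile double result = std::sqrt(in);
    (void)result;
    return errno != 0;
}
```

On a library where `math_errhandling & MATH_ERRNO` is nonzero, `sqrt_sets_errno(-1.0)` is expected to be true, which is exactly the write to memory that prevents the plain call from being treated as side-effect free.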

“My target machine has hardware instructions for ldexp, frexp, atan2, asin, acos, atan, rsqrt.”

The ISA contains the following elementary functions:
EXPON                     // extract and debias the exponent
FRACT                     // insert a synthetic 1 in the exponent: 1.0 <= result < 2.0 or 0.0
EADD                      // add the integer exponent to the floating point number
CPSN                      // copy sign

SQRT                      // IEEE SQRT()
RSQRT                     // 1/SQRT() but with a single rounding
RCP                       // 1/x

Ln2P1, LnP1, Ln10P1       // logarithms that pass through <0,0>
Ln2, Ln, Ln10             // logarithms that pass through <1,0>
Exp2M1, ExpM1, Exp10M1    // exponentials that pass through <0,0>
Exp2, Exp, Exp10          // exponentials that pass through <0,1>
SIN, COS, TAN             // cyclicals based on pi
SINpi, COSpi, TANpi       // cyclicals based on 1.0
ASIN, ACOS, ATAN          // anticyclicals based on pi
ASINpi, ACOSpi, ATANpi    // anticyclicals based on 1.0

ATAN2                     // 2 argument version of ATAN
POW                       // x**y

What we want is for “reasonable function calls” to result in these instructions in the assembly.

It is unlikely that many machines will have this number of intrinsic elementary functions.
Thus, what is desired is an easy way for ISAs to drop in intrinsic names, with some kind
of matching criterion (types in particular, but also no errno, for example) and some kind of target
template so the list can expand and contract in straightforward ways for machines with
more intrinsics or with fewer.
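One way to picture such a drop-in list is a per-target table plus a single matching function. The following is a standalone sketch under stated assumptions, not an existing LLVM interface: `FPType`, `TargetOp`, `MathOpEntry`, `kTargetMathOps` and `matchMathOp` are all hypothetical names, and the table holds only a few illustrative entries.

```cpp
#include <array>
#include <optional>
#include <string_view>

// Everything here is hypothetical and for illustration only.
enum class FPType { F32, F64 };
enum class TargetOp { ATAN2, ASIN, ATAN };

// One row per libm routine the target can replace with an instruction.
struct MathOpEntry {
    std::string_view libmName;  // name as declared in math.h
    FPType type;                // floating-point type of arguments and result
    unsigned numArgs;           // expected argument count
    bool requiresNoErrno;       // instruction cannot set errno
    TargetOp op;                // instruction to emit on a match
};

// A target grows or shrinks its support by editing this table only.
constexpr std::array<MathOpEntry, 4> kTargetMathOps = {{
    {"atan2",  FPType::F64, 2, true, TargetOp::ATAN2},
    {"atan2f", FPType::F32, 2, true, TargetOp::ATAN2},
    {"asin",   FPType::F64, 1, true, TargetOp::ASIN},
    {"atan",   FPType::F64, 1, true, TargetOp::ATAN},
}};

// Returns the instruction for a call, or nothing if the name, type,
// argument count, or errno requirement does not match any row.
std::optional<TargetOp> matchMathOp(std::string_view name, FPType type,
                                    unsigned numArgs, bool callMaySetErrno) {
    for (const MathOpEntry &e : kTargetMathOps) {
        if (e.libmName == name && e.type == type && e.numArgs == numArgs &&
            !(e.requiresNoErrno && callMaySetErrno))
            return e.op;
    }
    return std::nullopt;
}
```

With this table, `matchMathOp("atan2", FPType::F64, 2, false)` selects `TargetOp::ATAN2`, while the same call with `callMaySetErrno` set to true falls through to the ordinary library call.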