llvm intrinsics/libc/libm question

Ryan_Taylor · June 7, 2016, 3:03pm

I’m trying to figure out exactly how the intrinsics/libc/libm work in llvm.

For example, this code produces a call to the user-defined function:

float acos(float x, float y)
{
return x+y;
}

float a;
void foo(float b, float c)
{
a = acos(b, c);
}

But this code produces a call to an LLVM intrinsic:

float acos(float, float);

float a;
void foo(float b, float c)
{
a = acos(b, c);
}

float acos(float x, float y)
{
return x+y;
}

What is the expected behavior here?

Also, there are certain “standard C library functions” included in LLVM that I’d like to remove without modifying core code, is this possible?

I’m also curious how LLVM handles pure functions with regard to optimizations, for example libm functions such as acos. It appears that X86 doesn’t have acos as an intrinsic and instead just calls a function (“tail call @acos…”); can this still be optimized? I see the hardcoded ‘name’ lookup for ‘acos’ in ConstantFolding.cpp. It appears that if you slightly change the name from ‘acos’ to, say, ‘XXX_acos’ in your libm, it will no longer be optimized?

Thanks,

Ryan

Ahmed_Bougacha · June 7, 2016, 4:23pm

> I'm trying to figure out exactly how the intrinsics/libc/libm work in llvm.

Intrinsics are basically "lesser" instructions that aren't guaranteed
to have first-class support (e.g., work on all targets). They are
specified in the LangRef.

Target intrinsics are similar to generic intrinsics, but are
specifically only available on one target. They are (intentionally
under-) specified in the various Intrinsics<Target>.td files.

Some functions (libc, libm, and a few others) are recognized as having
well-defined behavior for the target platform; this information is in
TargetLibraryInfo.

> For example, this code produces a call to the user-defined function:
>
> float acos(float x, float y)
> {
> return x+y;
> }
>
> float a;
> void foo(float b, float c)
> {
> a = acos(b, c);
> }
>
> But this code produces a call to an LLVM intrinsic:
>
> float acos(float, float);
>
> float a;
> void foo(float b, float c)
> {
> a = acos(b, c);
> }
>
> float acos(float x, float y)
> {
> return x+y;
> }
>
> What is the expected behavior here?

I don't see how they can behave differently. What IR are you seeing?

> Also, there are certain "standard C library functions" included in LLVM that
> I'd like to remove without modifying core code, is this possible?

If you're using clang: -fno-builtin-acos is what you're looking for.

If you're using llvm: when setting up your pass pipeline, you should
create an instance of TargetLibraryInfo (an example is in
tools/opt/opt.cpp), and either use:
- setUnavailable(LibFunc::acos): this marks acos as "unavailable",
preventing optimizations from making assumptions about its behavior.
Equivalent to clang -fno-builtin-acos
- disableAllFunctions(): this marks all known functions as
unavailable. Equivalent to clang -fno-builtin

> I'm also curious how LLVM handles pure functions in regards to
> optimizations. For example, libm functions such as acos.

Note that I don't think libm functions are pure on most platforms,
because they can modify errno (-ffast-math/-fno-math-errno disables
that, though).

> It appears that X86
> doesn't have acos as an intrinsic and instead just calls a function "tail
> call @acos..."; can this still be optimized?

Yes, because TLI knows about the name 'acos'. However, the prototype
needs to be reasonably correct ('double @acos(double)'), but isn't in
your example. (Specifically, ConstantFolding checks that @acos only
has one operand, but yours has two.)
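
To make the name-plus-prototype point concrete, here is a minimal sketch in plain C. It is a toy model and not LLVM's code: the table, the struct `known_func`, and the function `is_recognized_call` are all hypothetical names invented for illustration. It only shows why a two-operand `acos` would not be treated as the library function even though the name matches.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical table entry: a library function name and its operand count. */
struct known_func {
    const char *name;
    size_t num_operands;
};

/* Assumed, illustrative contents; a real table would be much larger. */
static const struct known_func known_funcs[] = {
    { "acos", 1 },
    { "floor", 1 },
    { "copysign", 2 },
};

/* Returns 1 only when both the name and the operand count match an entry. */
int is_recognized_call(const char *name, size_t num_operands)
{
    assert(name != NULL);
    for (size_t i = 0; i < sizeof known_funcs / sizeof known_funcs[0]; ++i) {
        if (strcmp(known_funcs[i].name, name) == 0)
            return known_funcs[i].num_operands == num_operands;
    }
    return 0;
}
```

With this model, `is_recognized_call("acos", 1)` is accepted, while `is_recognized_call("acos", 2)` and `is_recognized_call("XXX_acos", 1)` are both rejected.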

> I see the hardcoded
> 'name' lookup for 'acos' in ConstantFolding.cpp. It doesn't appear that if
> you slightly change the name from 'acos' to say 'XXX_acos' in your libm
> it'll still be optimized?

Correct, it won't be.

It's possible to make that happen with a few patches, but there has
been no need for that yet:

- replace the various calls to TLI::has() or
TLI::getLibFunc(StringRef, Func&) with TLI::getLibFunc(Function&,
Func&). Any of the former are probably hiding bugs related to
incorrect prototypes

- teach TLI::getLibFunc to check function availability using the
custom name instead of always checking the standard name. Again, this
is (arguably) hiding bugs, where we recognize the standard name even
though it's not what the target uses

- fix canConstantFoldCallTo to pass it TLI or maybe remove it entirely
(it's always used before ConstantFoldCall, which does check TLI)

- tell TLI that 'acos' is available with name 'XXX_acos' for your target
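
The idea behind these steps can be sketched as a small standalone C model. This is not the TargetLibraryInfo implementation; `struct lib_info` and the `lib_info_*` functions are hypothetical names. The sketch assumes each known function carries the name the target actually uses (or no name when unavailable), and that lookup goes through that name rather than the standard one.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical identifiers for two library functions. */
enum lib_func { FUNC_ACOS, FUNC_FLOOR, NUM_LIB_FUNCS };

/* For each function: the name this target uses, or NULL if unavailable. */
struct lib_info {
    const char *target_name[NUM_LIB_FUNCS];
};

/* Start with the standard names. */
void lib_info_init(struct lib_info *info)
{
    assert(info != NULL);
    info->target_name[FUNC_ACOS] = "acos";
    info->target_name[FUNC_FLOOR] = "floor";
}

/* Mark a function as available under a target-specific name. */
void lib_info_set_custom_name(struct lib_info *info, enum lib_func f,
                              const char *name)
{
    assert(info != NULL && name != NULL);
    info->target_name[f] = name;
}

/* Mark a function as unavailable, so nothing may be assumed about it. */
void lib_info_set_unavailable(struct lib_info *info, enum lib_func f)
{
    assert(info != NULL);
    info->target_name[f] = NULL;
}

/* Look a callee up by the name the target uses.
   Returns 1 and stores the function id in *out on a match, else 0. */
int lib_info_lookup(const struct lib_info *info, const char *callee,
                    enum lib_func *out)
{
    assert(info != NULL && callee != NULL && out != NULL);
    for (int i = 0; i < NUM_LIB_FUNCS; ++i) {
        const char *name = info->target_name[i];
        if (name != NULL && strcmp(name, callee) == 0) {
            *out = (enum lib_func)i;
            return 1;
        }
    }
    return 0;
}
```

In this model, once `FUNC_ACOS` is registered as `__xxx_acos`, a call to `__xxx_acos` is recognized and a call to plain `acos` no longer is, which is the behavior the second bullet argues for.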

-Ahmed

mats_petersson · June 7, 2016, 4:42pm

Note that the C standard [no, I can’t remember the exact section/paragraph] says that “you should not use the names of existing functions in <std*.h>, <math.h>, <memory.h> and <string.h> [probably a few others too] for your own functions - the result of that is undefined” [I’m paraphrasing the spec]. In other words, functions listed by the standard as part of those header files are reserved functions, and if you use, for example, acos as a function name, it’s not well-defined what the compiler will do. In this case, the compiler will think that it knows what acos does when it’s declared but not defined before the call. If it’s defined as well as declared before the use of it, the compiler actually knows what the function does, and thus knows that it’s not “standard acos”. But the compiler could also produce an error or do other things in this case.

So, expecting clang + LLVM to handle this “correctly” (whatever you believe is correct) is perhaps a bit of wishful thinking.

It would of course be nice to have a warning saying “redeclaration of acos, which is a library function”. But as far as I see, the compiler is doing what is within the spec.

Stephen_Canon1 · June 7, 2016, 4:46pm

7.1.3 p1, perhaps?

"All identifiers with external linkage in any of the following subclauses (including the future library directions) and errno are always reserved for use as identifiers with external linkage."

– Steve

Ryan_Taylor · June 7, 2016, 8:24pm

In the first code I see a ‘tail call @acos’; in the second code I see a ‘tail call @llvm.acos.f32’. (Sorry, there should be only one input for acos; I’ve been trying many libm/libc functions.)

Not sure why it’s called TargetLibraryInfo if it’s not in target specific code? It seems that ALL targets use this code, making it generic. Am I missing something here?

Basically you’re saying that if I changed/added XXXacos in TargetLibraryInfo::hasOptimizedCodeGen, then ConstantFolding (and other opts) could optimize this libm call?

Thanks,

Ryan

ps. The spec also states (albeit unclearly) that you can use “#undef” to omit a library function so that a user defined function of the same name can be used but LLVM doesn’t seem to support that.

TNorthover · June 7, 2016, 8:38pm

> Not sure why it's called TargetLibraryInfo if it's not in target specific
> code? It seems that ALL targets use this code, making it generic. Am I
> missing something here?

Some of the names can vary by platform, for example ARM sometimes has
__aeabi_memcpy instead of memcpy

> ps. The spec also states (albeit unclearly) that you can use "#undef" to
> omit a library function so that a user defined function of the same name can
> be used but LLVM doesn't seem to support that.

I think it says exactly the opposite (7.1.3p3):

"If the program removes (with #undef) any macro definition of an
identifier in the first group listed above, the behavior is
undefined."

Incidentally, I don't think anyone's mentioned that "-ffreestanding"
will probably inhibit the intrinsics substantially if that's what
you're after (technically, it's probably a compiler extension that it
gives them back to the user, but everyone does it as far as I know).

Cheers.

Tim.

Ryan_Taylor · June 7, 2016, 8:45pm

Tim,

Are you referring to setLibcallName? That is target specific yes but there isn’t RTLIB for most of the libm functions, for example, for acos this doesn’t apply.

Ideally what I would like is to create a libc with functions like acos called something like __xxx_acos that can still be recognized to be optimized.

RTLIB is pretty limited but it works fine, I can just use setLibcallName(RTLIB::floor, "__xxx_floor")… but again, the functions that are RTLIB are limited. Using intrinsics makes it more difficult because then you have to match the intrinsic (rather than it automatically generating a lib call). ISD is just as bad (FCOPYSIGN, FABS for example) because then they need to be manually lowered.

Thanks,

Ryan

Ryan_Taylor · June 7, 2016, 8:57pm

Tim,

Currently, I have to do multiple things:

1) create some setLibcallNames in XXXISelLowering.cpp to generate correct naming for RTLIBS.
2) lower ISD down to an RTLIB for some calls (and then do solution 1 on those to get correct names)
3) change TargetLibraryInfo for functions that aren’t covered in solutions 1 and 2 (so that they can also be optimized)

I must be missing something, I’m just not sure what it is.

Thanks,

Ryan

Ryan_Taylor · June 7, 2016, 9:54pm

per 7.1.4p1:

The use of #undef to remove any macro definition will also ensure that an actual function is referred to

And then footnote 187:

Because external identifiers and some macro names beginning with an underscore are reserved, implementations may provide special semantics for such names. For example, the identifier _BUILTIN_abs could be used to indicate generation of in-line code for the abs function. Thus, the appropriate header could specify:

#define abs(x) _BUILTIN_abs(x)

for a compiler whose code generator will accept it.

In this manner, a user desiring to guarantee that a given library function such as abs will be a genuine function may write

#undef abs

whether the implementation’s header provides a macro implementation of abs or a built-in implementation. The prototype for the function, which precedes and is hidden by any macro definition, is thereby revealed also.
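
The mechanism the footnote describes can be tried out with a small C sketch. To avoid redefining a reserved identifier, it uses a hypothetical function `xabs` as a stand-in for `abs`, and the macro body is an assumed inline implementation rather than anything a real header provides. The call counter exists only so the effect is observable.

```c
/* Stand-in for what an implementation's header might do: declare a real
   function, then hide it behind a function-like macro of the same name.
   xabs is a hypothetical name, chosen to stay clear of reserved ones. */
int xabs(int v);
#define xabs(v) ((v) < 0 ? -(v) : (v))

/* Expands the macro; the real function is never called. */
int via_macro(int v) { return xabs(v); }

/* Parenthesizing the name suppresses function-like macro expansion,
   so this calls the real function. */
int via_parens(int v) { return (xabs)(v); }

#undef xabs

/* After #undef, the prototype hidden by the macro is visible again,
   so this also calls the real function. */
int via_undef(int v) { return xabs(v); }

/* The "genuine" function, instrumented so calls can be counted. */
static int call_count;
int xabs(int v) { ++call_count; return v < 0 ? -v : v; }
int xabs_call_count(void) { return call_count; }
```

Calling `via_macro(-3)` leaves the counter at zero, while `via_parens(-4)` and `via_undef(-5)` each increment it. Note this only concerns macro definitions; it says nothing about whether a compiler treats the resulting call as a builtin.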

Ahmed_Bougacha · June 7, 2016, 11:06pm

> Tim,
>
> Currently, I have to do multiple things:
>
> 1) create some setLibcallNames in XXXISelLowering.cpp to generate correct
> naming for RTLIBS.
> 2) lower ISD down to an RTLIB for some calls (and then do solution 1 on
> those to get correct names)

These solve a related but different - CodeGen - problem.

RTLIB libcalls are used when we're not able to select some IR
instruction/intrinsic so have to rely on a runtime library helper
function (e.g., the stuff in compiler-rt/lib/builtins/).

So, #1 and #2 would make LLVM able to emit calls to __xxx_acos when
it sees "@llvm.acos.f32", but it won't let LLVM optimize (constant
fold, transform into the intrinsic, ...) "__xxx_acos()" when it sees
it.
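
The two directions can be separated in a toy C model. None of this is LLVM code: `struct libcall_table`, `set_libcall_name`, `get_libcall_name`, and `recognizes_standard_name` are hypothetical names. The sketch assumes code generation maps an operation to a name through a table that the target can override, while the optimizer side only matches a fixed list of standard names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical runtime-library operations. */
enum rt_op { RT_ACOS, RT_FLOOR, RT_NUM_OPS };

/* Codegen direction: operation -> name of the function to call. */
struct libcall_table {
    const char *name[RT_NUM_OPS];
};

void libcall_table_init(struct libcall_table *t)
{
    assert(t != NULL);
    t->name[RT_ACOS] = "acos";
    t->name[RT_FLOOR] = "floor";
}

/* A target overrides the emitted name for one operation. */
void set_libcall_name(struct libcall_table *t, enum rt_op op, const char *name)
{
    assert(t != NULL && name != NULL);
    t->name[op] = name;
}

const char *get_libcall_name(const struct libcall_table *t, enum rt_op op)
{
    assert(t != NULL);
    return t->name[op];
}

/* Optimizer direction: name -> "do I know this function?".
   It consults only the hardcoded standard names, not the table above. */
int recognizes_standard_name(const char *callee)
{
    assert(callee != NULL);
    return strcmp(callee, "acos") == 0 || strcmp(callee, "floor") == 0;
}
```

After `set_libcall_name(&t, RT_ACOS, "__xxx_acos")`, the codegen side emits `__xxx_acos`, but `recognizes_standard_name("__xxx_acos")` still fails, which mirrors the gap described above.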

It sounds like you also want to recognize and optimize these calls.
That involves (pre-CodeGen) IR-level optimizations.
No, I don't think that's supported today without changing LLVM (see
the list in my first email).

> 3) change TargetLibraryInfo for functions that aren't covered in solutions 1
> and 2 (so that they can also be optimized)
>
> I must be missing something, I'm just not sure what it is.

> Not sure why it's called TargetLibraryInfo if it's not in target
> specific
> code? It seems that ALL targets use this code, making it generic. Am I
> missing something here?

I agree the name "Target" is a bit awkward, but it's not generic in
that it behaves differently depending on the target triple, which is
usually not OK in a "generic" analysis.

If you look in TargetLibraryInfo.cpp, there are various checks for
function availability, usually predicated on OS versions.

-Ahmed

Ryan_Taylor · June 8, 2016, 5:43pm

Correct, it does check based on OS and triple, what I meant was that it might be better to have this info in the target specific files and have the LibraryInfo do a look up of that (like most other sections of the core code do, ie have the tablegen or ISelLowering specify the libs etc…)

I’m not sure I follow about the RTLIB, I’m able to use an intrinsic for floor (def int_floor::Intrinsic in IntrinsicsXXX.td) and still use RTLIB to generate the appropriate name for the function (ie __xxx_floor). It sounds like you’re implying either/or, not both?

I agree, it doesn’t seem supported. It looks like I might just need to change ‘TLI.has’ and ‘TLI.getName’ in order to make this happen (potentially removing the prefix here). This goes back to my first point, the TLI should be changed to simply get this info generically from the target information, you seem to agree with that.

Thanks,

Ryan

Ahmed_Bougacha · June 9, 2016, 6:10pm

> Correct, it does check based on OS and triple, what I meant was that it
> might be better to have this info in the target specific files and have the
> LibraryInfo do a look up of that (like most other sections of the core code
> do, ie have the tablegen or ISelLowering specify the libs etc..)

I agree it's not the best place, but one difference is that
TargetLibraryInfo is much more about OSes than architectures.

> I'm not sure I follow about the RTLIB, I'm able to use an intrinsic for
> floor (def int_floor::Intrinsic in IntrinsicsXXX.td) and still use RTLIB to
> generate the appropriate name for the function (ie __xxx_floor). It sounds
> like you're implying either/or, not both?

No, I'm just saying that RTLIB only solves the codegen problem; you'll
need something else (like your intrinsic?) to have better IR
optimizations.

> I agree, it doesn't seem supported. It looks like I might just need to
> change 'TLI.has' and 'TLI.getName' in order to make this happen (potentially
> removing the prefix here). This goes back to my first point, the TLI should
> be changed to simply get this info generically from the target information,
> you seem to agree with that.

Hmm, what are you really trying to do? If you want LLVM to recognize
your __xxx functions: yes, the cleanest solution is probably to teach
TLI and its users to recognize the "custom" names, and mark the
functions as available with your custom __xxx names.

HTH,
-Ahmed

Ryan_Taylor · June 14, 2016, 3:58pm

I’m still not sure why copysign and fabs have to be lowered to a call when they are represented as a call in the IR?

Looks like the DAG makes them into SDNodes.

Ryan_Taylor · June 14, 2016, 5:15pm

If I do

if (T.getArch() == xxx)
  TLI.setUnavailable(LibFunc::copysign);

then this works at generating a call instead of not being able to select the ISD::FCOPYSIGN, but I don’t know why I don’t need to do this for other LibFunc functions (such as floor, etc… these generate call just fine)?

Thanks,
Ryan

akorobeynikov · June 14, 2016, 11:20pm

Mostly because there are fabs / fcopysign instructions in many ISAs.
But nothing like this for floor / trunc.

Ryan_Taylor · June 14, 2016, 11:40pm

Unfortunately even TLI.setUnavailable doesn’t work for fabs, only for copysign. I would have thought this would have stopped the conversion to ISD based on the hasOptimizedCodeGen function.
