libcalls vs. size of int

Hi!

I have stumbled upon some different problems related to LLVM generating libcalls with the wrong function prototype lately.
Asking for some guidance how to tackle those problems.

One problem is related to certain builtins in compiler-rt that for example use "int" instead of machine mode types such as si_int. An example here is
  double __powidf2(double, int);

If I understand https://reviews.llvm.org/D81285 correctly this was changed in compiler-rt to match the prototypes in libgcc that seem to be using non-machine mode types in a few places. However, for my target that has 16-bit int we run into problems since LegalizeDAG isn't taking the size of int into consideration when for example legalizing/lowering a call to
  llvm.powi.f64(double, i32)
by replacing it with a call to
  __powidf2(double, i32)
That unfortunately passes the linker without problem, but we end up with runtime errors since our rt lib expects the second argument to be i16.

I guess one solution here is to find all the builtins in compiler-rt that isn't using machine mode types and put them in a list of functions that depend on size of int/long etc. Then make sure we make those unavailable for targets with int!=si_int etc (and then one has to implement an replacement if those functions are needed).

Another solution could be to break compatibility with libgcc and use the machine mode types everywhere (or is there a requirement that builtins are compatible between libclang_rt and libgcc?).

Yet another solution is to make sure that LegalizeDAG knows about "size of int". And that it is provided with the list of bulitins that has "int" as argument or return type. To make sure correct prototypes are used when doing the libcalls.

The other problem I've seen is a bit similar. Here it is libc that sometimes use "int" in function prototypes.
An example here is
  float ldexpf( float arg, int exp );

SimplifyLibCalls may rewrite powf(2.0, itofp(x)) to ldexpf(1.0, x), but SimplifyLibCall (or rather getIntToFPVal) does not take into consideration that the second argument to ldexpf may have a different size depending on the "size of int". In our case it used 32-bit argument on the caller side, while our libc expected a 16-bit value.

I think transforms that produce calls to libc is a slightly bigger problem than for the builtins. It is not only SimplifyLibCalls that assumes that an int is 32 bits (and that char is being 8 bits, etc). I think that for example ExpandMemCmp and MegeICmps is using assumptions that return value from memcmp/bcmp is an i32 (but it is an int which isn't necessarily an i32).

Here I have a downstream hack that adds some knowledge, such as SizeOfInt and SizeOfChar, in TargetLibraryInfo. And then I can use that info to make sure correct sizes are used in certain libcalls simplifications. Not sure if there is any alternative solution, otherwise I could prepare a patch in phabricator based on that.

One thing that could help here is if Module::getOrInsertFunction didn't bitcast the function pointer when the prototype isn't matching. Maybe there should be different flavors of getOrInsertFunction that either allows the bitcast or not. Otherwise I expect the compiler to complain that the types doesn't match. I actually don't know
when the bitcast would be useful (it is certainly bad to emit a call using the wrong prototype, but maybe it is ok when doing some kind of profiling/instrumentation?). But the code that adds a bitcast to the returned value from Module::getOrInsertFunction has been there since way back in time.

The sad thing with all of this is that in both cases these problems have been detected by users getting runtime failures. Had been nice if one could detect that LLVM has produced libcalls with the wrong prototype somehow. Right now I got a feeling that even though I hunt down and find solutions for problems I've observered, it is a time bomb because someone might add another builtin, or another transform from LLVM IR to a libcall, not playing by the rules tomorrow.

Regards,
Björn

Hi again.

I noticed that __builtin_powi/__builtin_powif/__builtin_powil doesn't work very well either
unless "int" is 32-bits. The builtins are defined as taking an int argument, but when
CGBuiltin.cpp is emitting code for the builtin it hits an assertion since there is a type
mismatch when emitting a call to Intrinsic::powi (that is hard-coded to use i32).

So either one has to sext/trunc the argument to match the type used in the intrinsic,
or maybe Intrinsic::powi should be overloaded on the second arguments type as well?

The problematic part with changing the type is that if we need to go back to a smaller
type when emitting the libcall, then one need to know if it simply is legal to truncate
the argument (or what should the result be if trying to pass a value that is out-of-bounds).

/Björn

Hi!

I guess one solution here is to find all the builtins in compiler-rt that isn't using machine mode types and put them in a list of functions that depend on size of int/long etc. Then make sure we make those unavailable for targets with int!=si_int etc (and then one has to implement an replacement if those functions are needed).

Another solution could be to break compatibility with libgcc and use the machine mode types everywhere (or is there a requirement that builtins are compatible between libclang_rt and libgcc?).

I do not have some definitive solution right now, but the third option could probably be implementing several compiler-rt functions such as __powsidf2 (normally aliased with __powidf2) and __powhidf2 (aliased with __powidf2 on 16-bit targets). This makes possible to only emit calls to machine mode-sized helper functions from `llc`. At the same time one will not end up having two functions with the same name in libgcc and compiler-rt and subtle differences in behavior.

Since there are no such functions in libgcc, llc would have to emit different names when using compiler-rt and libgcc. Unfortunately, it seems the knowledge whether we are targeting libgcc or compiler-rt as a support library is lost at the time of codegen. As a workaround, llc could emit calls to the aforementioned overloaded functions only if `sizeof(void*) < 32` (this should be easily accessible via DataLayout). This may be considered as a limited/explicit "breaking compatibility with libgcc" but looks rather hackish.

Anatoly

A quick inventory of the "libgcc" runtimes in compiler-rt shows
a number of functions that use "int" in the ABI.

Libcalls may be emitted for these (SHL_I64, SRL_I64, SRA_I64):
  ashldi3.c:COMPILER_RT_ABI di_int __ashldi3(di_int a, int b)
  ashrdi3.c:COMPILER_RT_ABI di_int __ashrdi3(di_int a, int b)
  lshrdi3.c:COMPILER_RT_ABI di_int __lshrdi3(di_int a, int b)

Libcalls may be emitted for these (SINTTOFP_I32_F32, SINTTOFP_I32_F128, SUNTTOFP_I32_F32, UINTTOFP_I32_F128):
  floatsisf.c:COMPILER_RT_ABI fp_t __floatsisf(int a)
  floatsitf.c:COMPILER_RT_ABI fp_t __floatsitf(int a)
  floatunsisf.c:COMPILER_RT_ABI fp_t __floatunsisf(unsigned int a)
  floatunsitf.c:COMPILER_RT_ABI fp_t __floatunsitf(unsigned int a)

Libcall is emitted via llvm.powi intrinsic (FPOWI/STRICT_FPOWI ISD nodes):
  powidf2.c:COMPILER_RT_ABI double __powidf2(double a, int b)
  powisf2.c:COMPILER_RT_ABI float __powisf2(float a, int b)
  powitf2.c:COMPILER_RT_ABI long double __powitf2(long double a, int b)
  powixf2.c:COMPILER_RT_ABI long double __powixf2(long double a, int b)

No libcalls generated to these functions afaict:
  ctzdi2.c:COMPILER_RT_ABI int __ctzdi2(di_int a)
  ctzsi2.c:COMPILER_RT_ABI int __ctzsi2(si_int a)
  ctzti2.c:COMPILER_RT_ABI int __ctzti2(ti_int a)
  ffsdi2.c:COMPILER_RT_ABI int __ffsdi2(di_int a)
  ffssi2.c:COMPILER_RT_ABI int __ffssi2(si_int a)
  ffsti2.c:COMPILER_RT_ABI int __ffsti2(ti_int a)
  paritydi2.c:COMPILER_RT_ABI int __paritydi2(di_int a)
  paritysi2.c:COMPILER_RT_ABI int __paritysi2(si_int a)
  parityti2.c:COMPILER_RT_ABI int __parityti2(ti_int a)
  popcountdi2.c:COMPILER_RT_ABI int __popcountdi2(di_int a)
  popcountsi2.c:COMPILER_RT_ABI int __popcountsi2(si_int a)
  popcountti2.c:COMPILER_RT_ABI int __popcountti2(ti_int a)

The 64-bit shift and float conversion libcalls can be avoided by backends by
using setLibcallName(Op, nullptr). I haven't checked exactly what happens and
how ISel is handling "size of int" for those, but I could not find any signs
indicating that ISel is doing anything special to ensure that the correct
"size of int" is used. So I suspect it isn't handled correctly.

For the powi family I think it is more interesting to emit libcalls even
for targets with 16-bit int. This is because the llvm.powi intrinsic is used
as a mapping to the libcall, and middle-end is doing optimizations for
llvm.powi. I've created a patch here https://reviews.llvm.org/D99439 that
solve problems around powi by changing the llvm.powi intrisic to overload
the type of the exponent argument. And making sure that any in-tree
introduction of llvm.powi uses the correct type.

I've also made another patch, as a starting point, to let SimplifyLibCalls
handle size of int better. That patch is needed as a pre-requisite for the
powi fix to let SimplifyLibCalls use the correct types when emitting calls
to llvm.powi. The patch can be found here https://reviews.llvm.org/D99438 ,
and it also fixes problems with size of int for the rewrites that emit
libcalls to ldexp/ldexpf.

I've used AVR and MSP430 in the test cases added for those patches as
those two targets are using 16-bit int types afaict.

Any reviewers for those patches (D99438 & D99439) are welcome.

/Björn

From: Anatoly Trosinenko <atrosinenko@accesssoftek.com>
Sent: den 24 mars 2021 15:20
To: Björn Pettersson A <bjorn.a.pettersson@ericsson.com>
Subject: Re: [llvm-dev] libcalls vs. size of int

Hi!

> I guess one solution here is to find all the builtins in compiler-rt that
isn't using machine mode types and put them in a list of functions that
depend on size of int/long etc. Then make sure we make those unavailable
for targets with int!=si_int etc (and then one has to implement an
replacement if those functions are needed).

> Another solution could be to break compatibility with libgcc and use the
machine mode types everywhere (or is there a requirement that builtins are
compatible between libclang_rt and libgcc?).

I do not have some definitive solution right now, but the third option
could probably be implementing several compiler-rt functions such as
__powsidf2 (normally aliased with __powidf2) and __powhidf2 (aliased with
__powidf2 on 16-bit targets). This makes possible to only emit calls to
machine mode-sized helper functions from `llc`. At the same time one will
not end up having two functions with the same name in libgcc and compiler-
rt and subtle differences in behavior.

Since there are no such functions in libgcc, llc would have to emit
different names when using compiler-rt and libgcc. Unfortunately, it seems
the knowledge whether we are targeting libgcc or compiler-rt as a support
library is lost at the time of codegen. As a workaround, llc could emit
calls to the aforementioned overloaded functions only if `sizeof(void*) <
32` (this should be easily accessible via DataLayout). This may be
considered as a limited/explicit "breaking compatibility with libgcc" but
looks rather hackish.

Yes, this could be something we could do out-of-tree (at least) as a
hacky workaround. But when I noticed that things like __builtin_powi also
were broken (clang crashing when mapping the builtin to @llvm.powi) I
ended up proposing another solution based on using the correct type,
representing the size of int, when using llvm.powi.

Hi llvm-dev!

Still looking for some feedback here (related to D99438 & D99439 and
using correct "size of int" when dealing with libcalls).

I got one review comment in D99438 about asking on llvm-dev. Since I've
already done a couple of weeks ago I'm asking again.

Regards,
Björn